Opened 5 years ago
#7185 new enhancement
Preserve codec delay/initial padding and trailing padding during muxing
Reported by: | mkver | Owned by: | |
---|---|---|---|
Priority: | wish | Component: | avformat |
Version: | git-master | Keywords: | mkv mov |
Cc: | | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no | | |
Description
Currently initial and trailing padding data is often lost during muxing even when the target container has features that can be used to retain it. I'm thinking about Matroska and mp4 here, but there might be other containers with these capabilities.
For Matroska, the CodecDelay track header field is only written for Opus, although it could be written for all audio tracks. Also, the DiscardPadding element isn't used for end trimming with most encoders (Opus, libopus, libmp3lame and the experimental vorbis encoder are exceptions: there the encoder adds the needed AV_PKT_DATA_SKIP_SAMPLES side data; there are probably other exceptions).
For mp4, edit lists could be used to signal the actual encoder delay. Sometimes this already happens, though not by design but as a side effect of shifting the timestamps to make the dts non-negative. Here is a test*:
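To illustrate what writing these fields would involve: the Matroska specification expresses both CodecDelay and DiscardPadding in nanoseconds, so sample counts have to be converted using the sample rate. The following is only a minimal sketch; the helper name `samples_to_ns` is hypothetical, and the example values (1024 samples of initial padding, a final 1024-sample frame with only 784 valid samples, 44100 Hz) are taken from the AAC test further down.

```python
from fractions import Fraction

NS_PER_SECOND = 1_000_000_000


def samples_to_ns(samples: int, sample_rate: int) -> int:
    """Convert a sample count to nanoseconds, rounded to the nearest integer.

    Hypothetical helper for illustration; it is not part of any FFmpeg API.
    """
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    # Exact rational arithmetic avoids floating-point rounding surprises.
    return round(Fraction(samples * NS_PER_SECOND, sample_rate))


# Assumed values from the AAC test in this ticket:
# 1024 samples of initial padding, last frame has 1024 - 784 padding samples.
codec_delay_ns = samples_to_ns(1024, 44100)
discard_padding_ns = samples_to_ns(1024 - 784, 44100)

print(codec_delay_ns)      # candidate value for CodecDelay
print(discard_padding_ns)  # candidate value for DiscardPadding on the last block
```

Rounding to the nearest nanosecond is a choice made for this sketch; a muxer may round differently.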

    % ffmpeg.exe -f lavfi -i anullsrc -filter:a "atrim=start_sample=0:end_sample=10000" -c:a aac -f framehash -hash crc32 -
    #format: frame checksums
    #version: 2
    #hash: CRC32
    #extradata 0, 5, c72331cb
    #software: Lavf58.13.100
    #tb 0: 1/44100
    #media_type 0: audio
    #codec_id 0: aac
    #sample_rate 0: 44100
    #channel_layout 0: 3
    #channel_layout_name 0: stereo
    #stream#, dts, pts, duration, size, hash
    0, -1024, -1024, 1024, 23, 5a2d9cda
    0, 0, 0, 1024, 6, 595037cd
    0, 1024, 1024, 1024, 6, 595037cd
    0, 2048, 2048, 1024, 6, 595037cd
    0, 3072, 3072, 1024, 6, 595037cd
    0, 4096, 4096, 1024, 6, 595037cd
    0, 5120, 5120, 1024, 6, 595037cd
    0, 6144, 6144, 1024, 6, 595037cd
    0, 7168, 7168, 1024, 6, 595037cd
    0, 8192, 8192, 1024, 6, 595037cd
    0, 9216, 9216, 784, 6, 595037cd

When muxing this into mp4, there is an edit list indicating that the first 1024 samples should be discarded, so after demuxing the dts of the first packet is again -1024:

    % ffmpeg.exe -f lavfi -i anullsrc -filter:a "atrim=start_sample=0:end_sample=10000" -c:a aac test.mp4
    ffmpeg -i test.mp4 -c copy -f framehash -hash crc32 -
    #format: frame checksums
    #version: 2
    #hash: CRC32
    #extradata 0, 5, c72331cb
    #software: Lavf58.13.100
    #tb 0: 1/44100
    #media_type 0: audio
    #codec_id 0: aac
    #sample_rate 0: 44100
    #channel_layout 0: 3
    #channel_layout_name 0: stereo
    #stream#, dts, pts, duration, size, hash
    0, -1024, -1024, 1024, 23, 5a2d9cda, S=1, 10, be66397a
    0, 0, 0, 1024, 6, 595037cd
    0, 1024, 1024, 1024, 6, 595037cd
    0, 2048, 2048, 1024, 6, 595037cd
    0, 3072, 3072, 1024, 6, 595037cd
    0, 4096, 4096, 1024, 6, 595037cd
    0, 5120, 5120, 1024, 6, 595037cd
    0, 6144, 6144, 1024, 6, 595037cd
    0, 7168, 7168, 1024, 6, 595037cd
    0, 8192, 8192, 1024, 6, 595037cd
    0, 9216, 9216, 751, 6, 595037cd

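The `S=1, 10, ...` entry on the first packet indicates one side data element of 10 bytes, which matches the size of AV_PKT_DATA_SKIP_SAMPLES (presumably what the demuxer exports here). As documented in libavcodec, that payload consists of two little-endian 32-bit sample counts (skip at start, discard at end) followed by two one-byte reason fields. A small sketch of decoding such a payload follows; the function and type names are hypothetical, and the example payload is constructed by hand because the framehash output only shows its checksum, not its bytes.

```python
import struct
from typing import NamedTuple


class SkipSamples(NamedTuple):
    skip_start: int    # samples to skip at the start of the packet
    discard_end: int   # samples to discard at the end of the packet
    reason_start: int  # reason code for the start skip
    reason_end: int    # reason code for the end skip


def parse_skip_samples(payload: bytes) -> SkipSamples:
    """Decode a 10-byte skip-samples payload (u32le, u32le, u8, u8)."""
    if len(payload) != 10:
        raise ValueError(f"expected 10 bytes, got {len(payload)}")
    return SkipSamples(*struct.unpack("<IIBB", payload))


# Hand-built example payload (an assumption, not data from this ticket):
# skip 1024 samples at the start, discard nothing at the end.
example_payload = struct.pack("<IIBB", 1024, 0, 0, 0)
print(parse_skip_samples(example_payload))
```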
But if we start the audio a bit later (at sample 500), only the first 524 samples are discarded via the edit list (because the edit list is by design only there to make the dts non-negative), and the time from 0 to 500/44100 s now contains valid audio whereas previously there was nothing:

    ffmpeg.exe -f lavfi -i anullsrc -filter:a "atrim=start_sample=500:end_sample=10500" -c:a aac test2.mp4
    ffmpeg -i test2.mp4 -c copy -f framehash -hash crc32 -
    #format: frame checksums
    #version: 2
    #hash: CRC32
    #extradata 0, 5, c72331cb
    #software: Lavf58.13.100
    #tb 0: 1/44100
    #media_type 0: audio
    #codec_id 0: aac
    #sample_rate 0: 44100
    #channel_layout 0: 3
    #channel_layout_name 0: stereo
    #stream#, dts, pts, duration, size, hash
    0, -524, -524, 1024, 23, 5a2d9cda, S=1, 10, d740a07e
    0, 500, 500, 1024, 6, 595037cd
    0, 1524, 1524, 1024, 6, 595037cd
    0, 2548, 2548, 1024, 6, 595037cd
    0, 3572, 3572, 1024, 6, 595037cd
    0, 4596, 4596, 1024, 6, 595037cd
    0, 5620, 5620, 1024, 6, 595037cd
    0, 6644, 6644, 1024, 6, 595037cd
    0, 7668, 7668, 1024, 6, 595037cd
    0, 8692, 8692, 1024, 6, 595037cd
    0, 9716, 9716, 780, 6, 595037cd

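The arithmetic behind the two mp4 tests can be written out as a short sketch. It assumes an encoder delay of 1024 samples (as suggested by the first packet's dts of -1024 when the audio starts at sample 0) and models the current behaviour as "skip exactly as many samples as needed to make the first dts non-negative". The function names are hypothetical and only serve the illustration.

```python
ENCODER_DELAY = 1024  # assumed initial padding of the AAC encoder, in samples


def edit_list_skip(first_dts: int) -> int:
    """Samples skipped when the edit list only compensates a negative dts."""
    return max(0, -first_dts)


def leaked_priming_samples(first_dts: int, encoder_delay: int = ENCODER_DELAY) -> int:
    """Priming samples that are not discarded and therefore end up being played."""
    return encoder_delay - edit_list_skip(first_dts)


# Test 1: audio starts at sample 0, first packet has dts -1024.
print(edit_list_skip(-1024), leaked_priming_samples(-1024))

# Test 2: audio starts at sample 500, first packet has dts -524.
print(edit_list_skip(-524), leaked_priming_samples(-524))
```

In the second test the skipped amount (524) falls short of the assumed encoder delay by exactly the 500-sample start offset, which is the span from 0 to 500/44100 s described above.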
*: All tests have been done with this version of ffmpeg:

    ffmpeg version N-90920-ge07b1913fc Copyright (c) 2000-2018 the FFmpeg developers
    built with gcc 7.3.0 (GCC)
    configuration: --disable-static --enable-shared --enable-gpl --enable-version3 --enable-sdl2 --enable-bzlib --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth
    libavutil 56. 18.100 / 56. 18.100
    libavcodec 58. 19.100 / 58. 19.100
    libavformat 58. 13.100 / 58. 13.100
    libavdevice 58. 4.100 / 58. 4.100
    libavfilter 7. 21.100 / 7. 21.100
    libswscale 5. 2.100 / 5. 2.100
    libswresample 3. 2.100 / 3. 2.100
    libpostproc 55. 2.100 / 55. 2.100

This is one completely unrelated commit behind git-master.