Opened 9 months ago
Closed 9 months ago
#10458 closed defect (duplicate)
MPEG4 demuxing: last audio sample's duration ignored
| Reported by: | John Regan | Owned by: | |
|---|---|---|---|
| Priority: | normal | Component: | undetermined |
| Version: | unspecified | Keywords: | |
| Cc: | John Regan | Blocked By: | |
| Blocking: | | Reproduced by developer: | no |
| Analyzed by developer: | no | | |
Description
It seems like ffmpeg is properly removing the front padding from audio in mp4 files, but doesn't account for the end padding added to audio frames to round up to the frame length.
This is signaled by the mp4 file listing a different duration for the final sample - either via the Decoding Time to Sample Box, or for fragmented mp4s, the sample duration in the track fragment run box.
How to reproduce:
% ffmpeg -f lavfi -i anullsrc=r=48000:d=2 source.wav
# verify the created audio file has exactly 96000 samples
% soxi -s source.wav
96000
# encode to aac
% ffmpeg -i source.wav -c:a aac encoded.m4a
# decode back to wav
% ffmpeg -i encoded.m4a destination.wav
# observe the sample count != 96000
% soxi -s destination.wav
96256
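For anyone reproducing this without SoX installed, the counts that soxi -s prints can also be read with Python's standard wave module. This is only a convenience sketch: the helper name count_pcm_frames is made up for illustration, and it assumes ffmpeg wrote plain PCM WAV files that the wave module can parse.

```python
import sys
import wave


def count_pcm_frames(path):
    """Return the number of PCM frames (samples per channel) in a WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes()


if __name__ == "__main__":
    # Hypothetical usage: python count_frames.py source.wav destination.wav
    for wav_path in sys.argv[1:]:
        print(wav_path, count_pcm_frames(wav_path))
```

Run against source.wav and destination.wav from the steps above, this should report the same two numbers as soxi.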
Using boxdumper from l-smash, I can verify that ffmpeg correctly added an edit list box that lists total media duration, as well as the samples to trim from the beginning of the audio (the encoder delay):
[edts: Edit Box]
position = 845
size = 36
[elst: Edit List Box]
position = 853
size = 28
version = 0
flags = 0x000000
entry_count = 1
entry[0]
segment_duration = 2000
media_time = 1024
media_rate = 1.000000
The Decoding Time to Sample Box specifies that the final sample is 768 frames long. Doing the math: (94 samples * 1024 frames) + 768 frames = 97024 frames. Subtract the 1024 frames given as media_time in the Edit List Box above and you should have 96000 frames, matching the 96000 samples in source.wav.
[stts: Decoding Time to Sample Box]
position = 1140
size = 32
version = 0
flags = 0x000000
entry_count = 2
entry[0]
sample_count = 94
sample_delta = 1024
entry[1]
sample_count = 1
sample_delta = 768
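The arithmetic above can be written down as a small sketch. It assumes the track timescale equals the 48000 Hz sample rate, so that sample_delta and media_time are both counts of PCM frames; the function name presented_length is hypothetical and only used for illustration.

```python
def presented_length(stts_entries, media_time):
    """PCM frames left after the edit list trims the encoder delay.

    stts_entries: (sample_count, sample_delta) pairs from the stts box.
    media_time:   frames skipped at the start (elst media_time).
    Assumes the track timescale equals the audio sample rate.
    """
    media_duration = sum(count * delta for count, delta in stts_entries)
    return media_duration - media_time


# Values copied from the boxdumper output for encoded.m4a.
aac_stts = [(94, 1024), (1, 768)]
aac_media_time = 1024

print(presented_length(aac_stts, aac_media_time))  # 96000
```

A decoder that honors both boxes would therefore be expected to write 96000 frames, not the 96256 observed.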
I think the issue may be the MP4 demuxer not signaling the final decoded packet's duration. This occurs if I use other codecs as well, for example mp3:
# using the same source.wav as above that's 96000 samples:
% ffmpeg -i source.wav -c:a libmp3lame encoded-in-mp3.mp4
% ffmpeg -i encoded-in-mp3.mp4 decoded-from-mp3.wav
% soxi -s decoded-from-mp3.wav
96815
Here's the edts box and stts from encoded-in-mp3.mp4:
[edts: Edit Box]
position = 32900
size = 36
[elst: Edit List Box]
position = 32908
size = 28
version = 0
flags = 0x000000
entry_count = 1
entry[0]
segment_duration = 2000
media_time = 1105
media_rate = 1.000000
[stts: Decoding Time to Sample Box]
position = 33205
size = 32
version = 0
flags = 0x000000
entry_count = 2
entry[0]
sample_count = 84
sample_delta = 1152
entry[1]
sample_count = 1
sample_delta = 337
So again doing some math: (84 samples * 1152 frames) + 337 frames = 97105 frames. Subtract the 1105 frames from the edit list and you get 96000 frames.
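As a sanity check on the theory that the last packet is treated as a full-length frame, the sketch below predicts the decoded length under that assumption and compares it with the counts soxi reported. The numbers are copied from the box dumps and command output above; the helper name is made up, and a match only shows the numbers are consistent with the theory, not where in ffmpeg the duration is lost.

```python
def length_if_last_packet_full(expected, frame_size, last_delta):
    """Decoded length if the final packet is output at full frame size
    instead of being cut to the sample_delta signaled in stts."""
    return expected + (frame_size - last_delta)


# frame_size and last_delta come from the two stts entries of each file.
cases = {
    "aac": {"frame_size": 1024, "last_delta": 768, "observed": 96256},
    "mp3": {"frame_size": 1152, "last_delta": 337, "observed": 96815},
}

for codec, c in cases.items():
    predicted = length_if_last_packet_full(96000, c["frame_size"], c["last_delta"])
    print(codec, predicted, predicted == c["observed"])
```

Both codecs come out matching: the surplus over 96000 is exactly the part of the last frame that stts says should be dropped (256 frames for AAC, 815 for MP3).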
Another example with opus:
% ffmpeg -i source.wav -c:a libopus encoded-in-opus.mp4
% ffmpeg -i encoded-in-opus.mp4 decoded-from-opus.wav
% soxi -s decoded-from-opus.wav
96648
The same issue occurs with fragmented mp4, which doesn't use the Decoding Time to Sample Box for these samples and instead relies on either the Track Fragment Header Box or the Track Fragment Run Box for sample duration signaling.
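For reference, here is a rough sketch of how a sample's duration is resolved in a fragmented file as I understand ISO/IEC 14496-12: a per-sample duration in the Track Fragment Run Box wins, then the default in the Track Fragment Header Box, then the default from the Track Extends Box. The flag values are the ones defined by the spec; the function and parameter names are hypothetical, and the trex default of 1024 in the examples is an assumption, not something read from the attached files.

```python
TRUN_SAMPLE_DURATION_PRESENT = 0x000100          # trun flag
TFHD_DEFAULT_SAMPLE_DURATION_PRESENT = 0x000008  # tfhd flag


def resolve_sample_duration(trun_flags, trun_sample_duration,
                            tfhd_flags, tfhd_default_duration,
                            trex_default_duration):
    """Pick the duration for one sample of a track fragment."""
    if trun_flags & TRUN_SAMPLE_DURATION_PRESENT:
        return trun_sample_duration      # explicit per-sample value
    if tfhd_flags & TFHD_DEFAULT_SAMPLE_DURATION_PRESENT:
        return tfhd_default_duration     # default for this fragment
    return trex_default_duration         # movie-wide default


# Final 768-frame sample signaled per sample in trun:
print(resolve_sample_duration(0x000100, 768, 0, None, 1024))
# Final 768-frame sample signaled as the tfhd default:
print(resolve_sample_duration(0, None, 0x000008, 768, 1024))
```

Either way the final sample resolves to 768 frames, so a demuxer that follows this order has what it needs to report the shorter duration.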
This does not seem to apply to codecs that carry their own duration signaling, like FLAC in mp4.
ffmpeg version info:
ffmpeg version n6.0 Copyright (c) 2000-2023 the FFmpeg developers
built with gcc 13.1.1 (GCC) 20230429
configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-amf --enable-avisynth --enable-cuda-llvm --enable-lto --enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libdav1d --enable-libdrm --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libiec61883 --enable-libjack --enable-libjxl --enable-libmfx --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librav1e --enable-librsvg --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-libzimg --enable-nvdec --enable-nvenc --enable-opencl --enable-opengl --enable-shared --enable-version3 --enable-vulkan
libavutil 58. 2.100 / 58. 2.100
libavcodec 60. 3.100 / 60. 3.100
libavformat 60. 3.100 / 60. 3.100
libavdevice 60. 1.100 / 60. 1.100
libavfilter 9. 3.100 / 9. 3.100
libswscale 7. 1.100 / 7. 1.100
libswresample 4. 10.100 / 4. 10.100
libpostproc 57. 1.100 / 57. 1.100
Attachments (2)
Change History (6)
comment:1 by , 9 months ago
| Cc: | added |
|---|
comment:2 by , 9 months ago
| Description: | modified (diff) |
|---|---|
| Summary: | MPEG4 AAC decoding: end padding not trimmed → MPEG4 demuxing: last sample's duration ignored |
by , 9 months ago
| Attachment: | fragmented-aac.mp4 added |
|---|
Example fragmented mp4 file with a final sample duration of 768 frames. The final sample is in its own fragment, and its duration is signaled in the Track Fragment Header Box as the default sample duration.
by , 9 months ago
| Attachment: | fragmented-aac-trun.mp4 added |
|---|
Example fragmented mp4 file with a final sample duration of 768 frames. The final sample duration is signaled in the Track Fragment Run Box.
comment:3 by , 9 months ago
| Summary: | MPEG4 demuxing: last sample's duration ignored → MPEG4 demuxing: last audio sample's duration ignored |
|---|
comment:4 by , 9 months ago
| Resolution: | → duplicate |
|---|---|
| Status: | new → closed |
Known bug #7828, fixed in Chromium.



Discovered this isn't limited to just AAC - I think it may apply to any codec that relies on the mp4 file to signal the last sample's duration (tested with the native aac encoder, libmp3lame, and libopus). I've updated the bug title and description accordingly.