Opened 4 years ago
Last modified 4 months ago
#9471 open defect
EAC3 native encoder is only gapless in the beginning, not in the end
| Reported by: | Balling | Owned by: | |
|---|---|---|---|
| Priority: | normal | Component: | avformat |
| Version: | git-master | Keywords: | eac3 gapless mp4 editlist |
| Cc: | MasterQuestionable | Blocked By: | |
| Blocking: | | Reproduced by developer: | no |
| Analyzed by developer: | no | | |
Description (last modified by )
Summary of the bug:
The edts atom (edit list) carries a media time and a media duration. The media time is written correctly for both EAC3 and AAC: the native EAC3 encoder prepends 256 samples of silence (a.k.a. encoder delay) and the native AAC encoder prepends 1024 samples, and in both cases they are removed from the beginning on decoding, which works great. The media duration, however, is not written correctly. Even if it were written correctly, the media duration is not applied on decoding, not even for AAC; see for example https://bugs.chromium.org/p/chromium/issues/detail?id=668999, which is still present in git-master!
How to reproduce:
% ffmpeg -f lavfi -i "sine=frequency=1000:duration=5" -c:a eac3 outeac3.mp4
ffmpeg version N-104341-g933765aa0e-20211013 Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 10-win32 (GCC) 20210408
configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static --pkg-config=pkg-config --cross-prefix=x86_64-w64-mingw32- --arch=x86_64 --target-os=mingw32 --enable-gpl --enable-version3 --disable-debug --enable-shared --disable-static --disable-w32threads --enable-pthreads --enable-iconv --enable-libxml2 --enable-zlib --enable-libfreetype --enable-libfribidi --enable-gmp --enable-lzma --enable-fontconfig --enable-libvorbis --enable-opencl --enable-libvmaf --enable-vulkan --disable-libxcb --disable-xlib --enable-amf --enable-libaom --enable-avisynth --enable-libdav1d --enable-libdavs2 --disable-libfdk-aac --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r --enable-libglslang --enable-libgme --enable-libass --enable-libbluray --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvpx --enable-libwebp --enable-lv2 --enable-libmfx --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librav1e --enable-librubberband --enable-schannel --enable-sdl2 --enable-libsoxr --enable-libsrt --enable-libsvtav1 --enable-libtwolame --enable-libuavs3d --disable-libdrm --disable-vaapi --enable-libvidstab --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libzimg --enable-libzvbi --extra-cflags=-DLIBTWOLAME_STATIC --extra-cxxflags= --extra-ldflags=-pthread --extra-ldexeflags= --extra-libs=-lgomp --extra-version=20211013
libavutil 57. 7.100 / 57. 7.100
libavcodec 59. 12.100 / 59. 12.100
libavformat 59. 6.100 / 59. 6.100
libavdevice 59. 0.101 / 59. 0.101
libavfilter 8. 14.100 / 8. 14.100
libswscale 6. 1.100 / 6. 1.100
libswresample 4. 0.100 / 4. 0.100
libpostproc 56. 0.100 / 56. 0.100
Input #0, lavfi, from 'sine=frequency=1000:duration=5':
Duration: N/A, start: 0.000000, bitrate: 705 kb/s
Stream #0:0: Audio: pcm_s16le, 44100 Hz, mono, s16, 705 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> eac3 (native))
Press [q] to stop, [?] for help
Output #0, mp4, to 'outeac3.mp4':
Metadata:
encoder : Lavf59.6.100
Stream #0:0: Audio: eac3 (ec-3 / 0x332D6365), 44100 Hz, mono, fltp, 96 kb/s
Metadata:
encoder : Lavc59.12.100 eac3
size= 60kB time=00:00:05.00 bitrate= 98.2kbits/s speed= 487x
video:0kB audio:59kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.166617%
Track duration: 5010 (0x00001392) - 5010 (0x1392) ms Media time: 256 (0x00000100) - 256 (0x100) ms Media rate: 65536 (0x00010000) - 1.000
Now compare it to aac: ffmpeg -f lavfi -i "sine=frequency=1000:duration=5" -c:a aac fileaac.mp4:
Track duration: 5000 (0x00001388) - 5000 (0x1388) ms Media time: 1024 (0x00000400) - 1024 (0x400) Media rate: 65536 (0x00010000) - 1.000
Unfortunately a) The MediaInfo tracer is buggy in that part: https://github.com/MediaArea/MediaInfoLib/issues/1441
b) I am not sure that the media duration is really buggy, since it is not applied anyway!
c) I checked all of this by decoding to WAV and inspecting the result in Audacity.
d) I don't know whether sbgp and sgpd are needed (that is, whether EAC3 depends on previous frames).
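As a rough cross-check of the numbers above, here is a minimal sketch of the arithmetic. It is not the muxer's actual code. It assumes 1536-sample EAC3 frames and 1024-sample AAC frames (neither frame size is stated in this ticket) and a movie timescale of 1000; `edit_list_values` is a hypothetical helper name.

```python
import math

def edit_list_values(input_samples, sample_rate, encoder_delay,
                     frame_size, movie_timescale=1000):
    """Values an edit list would need to trim priming and padding.

    frame_size is an assumption supplied by the caller, not read from a file.
    """
    # Whole frames needed to hold the priming samples plus the real audio.
    frames = math.ceil((input_samples + encoder_delay) / frame_size)
    coded_samples = frames * frame_size
    return {
        # Samples to skip at the start (media timescale = sample rate).
        "media_time": encoder_delay,
        # Duration of the real audio only, in movie timescale units.
        "segment_duration": round(input_samples * movie_timescale / sample_rate),
        # End of the last coded frame minus the delay, i.e. padding not trimmed.
        "untrimmed_end": round((coded_samples - encoder_delay)
                               * movie_timescale / sample_rate),
        # Trailing samples that a correct duration would cut off.
        "padding_samples": coded_samples - encoder_delay - input_samples,
    }

eac3 = edit_list_values(5 * 44100, 44100, encoder_delay=256, frame_size=1536)
aac = edit_list_values(5 * 44100, 44100, encoder_delay=1024, frame_size=1024)
print(eac3)
print(aac)
```

Under these assumptions, `untrimmed_end` for EAC3 comes out at 5010, the same value reported as the track duration above, while `segment_duration` is 5000 for both codecs, which is what the AAC file shows.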
Patches should be submitted to the ffmpeg-devel mailing list and not this bug tracker.
Change History (11)
comment:1 by , 4 years ago
| Description: | modified (diff) |
|---|
comment:3 by , 4 years ago
| Status: | new → open |
|---|
Okay, so now, after c2424b1f35a1c6c06f1f9fe5f77a7157ed84e1cd, the duration before the edit list is applied (the same as when decoding to WAV with -ignore_editlist 1) is written to mdhd, and the dumb MediaInfo warning about mdhd_Duration: xxxxx being wrong is gone. But the wrong media duration is still written in the edit list and elsewhere.
comment:4 by , 4 years ago
What is strange is that with -b:a 2048k it writes 4975 there instead of 5010! What???
comment:5 by , 4 years ago
| Component: | ffmpeg → undetermined |
|---|---|
| Keywords: | mp4 editlist removed |
| Reproduced by developer: | unset |
comment:6 by , 4 years ago
I will also point out that Adobe Audition 17, the last version that supported the native Dolby encoders (including 7.1 encoding), uses 1792 samples of initial priming (7 frames of 256 samples each).
And the Plex encoder, which uses the Mediaconcept encoder linked with the Dolby SDK (Easyaudioencoder.exe), uses 768 samples, that is, 3 frames. Most content encoded in the wild uses 768 samples.
comment:7 by , 23 months ago
| Cc: | added |
|---|---|
| Component: | undetermined → avformat |
| Keywords: | mp4 editlist added |
I think similar problems are essentially caused by the unjustifiable complexity and poor design of many things.
See also: https://trac.ffmpeg.org/ticket/11002#comment:9
comment:8 by , 17 months ago
EC3A Entry (12 bytes)
EC3A Track duration: 4994 (0x00001382) - 4994 (0x1382) ms
EC3E Media time: 256 (0x00000100) - 256 (0x100) ms
EC42 Media rate: 65536 (0x00010000) - 1.000
is now printed... The editlist is still not fully applied (media time is applied, track duration is not).
comment:9 by , 11 months ago
The padding from E-AC-3 tracks is still not being removed as of May 2025. The delay and padding can be variable. The current practice with music encoding is to prepend 1 or 2 syncframes shared with the end of the previous track to have the decoder prerolled with actual audio data and not silence.
An "elst" box may contain the following data, where 38FE80 is the correct end point in samples, and 800 is the delay:
00000000000000010038FE800000080000010000
ffmpeg reports
Chapter #0:0: start 0.042667, end 77.816000
But the decoded file has a duration of approximately 77.781 s and includes extra samples up to the end of the stream.
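The hex payload quoted above can be decoded with a few lines of Python. This is a minimal sketch that assumes a version-0 "elst" box (32-bit fields) and handles nothing else; `parse_elst_v0` is a hypothetical helper. The 48000 Hz timescale is not stated in the comment. It is inferred here because it reproduces the start and end values that ffmpeg prints.

```python
import struct

ELST_PAYLOAD_HEX = "00000000000000010038FE800000080000010000"
ASSUMED_TIMESCALE = 48000  # assumption: inferred from ffmpeg's reported times

def parse_elst_v0(payload_hex):
    """Parse the body of a version-0 'elst' box (without size/type header)."""
    data = bytes.fromhex(payload_hex)
    version = data[0]
    if version != 0:
        raise ValueError("this sketch only handles version 0")
    (entry_count,) = struct.unpack_from(">I", data, 4)
    entries = []
    offset = 8
    for _ in range(entry_count):
        # segment_duration (u32), media_time (s32), media_rate int/frac (s16 each)
        entries.append(struct.unpack_from(">Iihh", data, offset))
        offset += 12
    return entries

entries = parse_elst_v0(ELST_PAYLOAD_HEX)
for segment_duration, media_time, rate_int, rate_frac in entries:
    print("segment_duration:", segment_duration,
          "->", segment_duration / ASSUMED_TIMESCALE, "s")
    print("media_time:", media_time,
          "->", media_time / ASSUMED_TIMESCALE, "s")
    print("media_rate:", rate_int, rate_frac)
```

With that timescale, the segment duration of 0x38FE80 (3735168) corresponds to 77.816 s and the media time of 0x800 (2048) to about 0.042667 s.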
Older E-AC-3 files in the wild may contain garbage at the end without an elst element. I observe a delay of 256 in Adobe Audition when producing an EC3 elementary stream. With old encoders the delay depended on whether the 90° phase filter was active.
comment:10 by , 11 months ago
> The padding from E-AC-3 tracks is still not being removed as of May 2025. The delay and padding can be variable
We know that. See comment 6.
> The current practice with music encoding is to prepend 1 or 2 syncframes shared with the end of the previous track to have the decoder prerolled with actual audio data and not silence.
That is only the case for gapless albums, where it is obviously needed because all tracks form one continuous piece of audio.
comment:11 by , 4 months ago
Even more different now, 4995 instead of 4994:
EB66 Time scale: 1000 (0x000003E8) - 1000 Hz
EB6A Duration: 4995 (0x00001383) - 4995 ms
EB6E Preferred rate: 65536 (0x00010000) - 1.000
EB72 Preferred volume: 256 (0x0100) - 1.000
EB74 Reserved: (10 bytes)
EB7E Matrix structure (36 bytes)
This is probably the bug: 4995 is the start of the last EAC3 frame, not the end of the last presented frame, which should be 5000.
I am guessing it is this one: https://code.ffmpeg.org/FFmpeg/FFmpeg/issues/20924
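For what it is worth, one piece of arithmetic lands on both reported values (4994 and 4995). The sketch below assumes the 5 s, 44100 Hz test input and the 256-sample delay from the description, plus a movie timescale of 1000. It is only an observation about the numbers, not a statement about what the muxer actually computes.

```python
import math

SAMPLE_RATE = 44100
INPUT_SAMPLES = 5 * SAMPLE_RATE   # the 5 s sine from the description
ENCODER_DELAY = 256               # EAC3 priming samples from the description
MOVIE_TIMESCALE = 1000            # mvhd time scale shown in the trace

def to_movie_units(samples):
    """Convert a sample count to (fractional) movie timescale units."""
    return samples * MOVIE_TIMESCALE / SAMPLE_RATE

# Input length with the delay subtracted once more than necessary.
short_by_delay = to_movie_units(INPUT_SAMPLES - ENCODER_DELAY)
# What a gapless presentation of the input should last.
expected = to_movie_units(INPUT_SAMPLES)

print(math.floor(short_by_delay), math.ceil(short_by_delay), round(expected))
```

The first two numbers differ only in rounding direction, which would be consistent with the value changing from 4994 to 4995 between builds; the third is the 5000 that the presentation should have.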



Who knows what the spec says about this? I looked into TS 103 420 and found nothing there.
Oh, found it in TS 102 366 (only media time part though)!
J.1.3.2 Priming and delay
The codec uses audio blocks of a fixed length of 256 samples, and a transform which applies over two audio blocks. To obtain the correct audio from a block, both blocks in the transform are needed, and hence both the prior encoded block and the current encoded block need to be decoded to output the first frame. This is sometimes called "priming" and may be signaled using the 'roll' sample group. Thus, a full reconstruction of the first 256 audio samples is sometimes not possible since there is no previous access unit. If it is desired to achieve full reconstruction of these samples, it is possible to add silence to the beginning of the audio signal. In practice, an encoder might prepend an arbitrary amount of silent audio waveform samples to the signal. This portion of the audio signal is sometimes called "encoder delay" and varies depending on the implementation. This can be compensated using one of the following delay compensation approaches.
So roll is not written, wow!! (In sgpd.)
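To make clear what is missing, here is a minimal sketch of how a 'roll' sample group could be serialized as an sgpd/sbgp pair. The field layout follows my reading of the ISO base media file format (version-1 sgpd, version-0 sbgp) and is not taken from FFmpeg. The helper names, the roll distance of -1 and the sample count of 144 are all assumptions for illustration.

```python
import struct

def box(box_type, payload):
    """Plain box: 32-bit size, 4-byte type, payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

def full_box(box_type, version, flags, payload):
    """FullBox: box with an 8-bit version and 24-bit flags prefix."""
    return box(box_type, struct.pack(">I", (version << 24) | flags) + payload)

def roll_sgpd(roll_distance):
    """Sample group description holding one roll entry (signed 16-bit)."""
    entry = struct.pack(">h", roll_distance)
    payload = (b"roll"                                # grouping_type
               + struct.pack(">II", len(entry), 1)    # default_length, entry_count
               + entry)
    return full_box(b"sgpd", 1, 0, payload)

def roll_sbgp(sample_count):
    """Sample-to-group box mapping every sample to description index 1."""
    payload = (b"roll"                                # grouping_type
               + struct.pack(">I", 1)                 # entry_count
               + struct.pack(">II", sample_count, 1)) # sample_count, group index
    return full_box(b"sbgp", 0, 0, payload)

# Assumed values: one preceding sample needed (-1), 144 samples in the track.
sgpd = roll_sgpd(-1)
sbgp = roll_sbgp(144)
print(sgpd.hex())
print(sbgp.hex())
```

Whether -1 is the right roll distance for EAC3 is exactly the open question in this ticket; the sketch only shows where that value would be stored.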