gapless playback (probably) doesn't work with AAC
|Reported by:||Christoph Anton Mitterer||Owned by:|
|Blocking:||Reproduced by developer:||no|
|Analyzed by developer:||no|
With current git master ffmpeg 1125277:
In the following, a series of test files based on https://commons.wikimedia.org/w/index.php?title=File%3ATelemann_-_2violin_Sonata_1-1.ogg is used.
Each filename starts with a number, where the same number indicates the files belong together.
PCM WAV shortened version of the above Wikipedia Demo
00.test.wav split into two halves (these are the actual base test files used with encoders)
LAME encoded versions of 01.split-track01.wav and 01.split-track02.wav
opusenc encoded versions of 01.split-track01.wav and 01.split-track02.wav
and so on.
1) How the two base test files (01.split-track01.wav and 01.split-track02.wav) were created
The Wikipedia demo file was first decoded to PCM WAV with opusdec, and split into two halves with
$ shnsplit 00.test.wav
enter split points:
shnsplit: warning: rounding 0:05.317 (offset: 937919) to nearest sector boundary (offset: 938448)
shnsplit: warning: file 2 will not be cut on a sector boundary
Splitting [test.wav] (0:15.66) --> [01.split-track01.wav] (0:05.24) : 100% OK
Splitting [test.wav] (0:15.66) --> [01.split-track02.wav] (0:10.42) : 100% OK
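The offsets in the shnsplit warning can be reproduced by hand. This is a minimal sketch, assuming the WAV is CD-format audio (44.1 kHz, 16 bit, stereo, i.e. 4 bytes per sample frame) and that "sector" means a 2352-byte CD sector. Neither is stated in the output, but both are consistent with the numbers shown.

```python
# Assumed (not stated in the shnsplit output): CD-format PCM.
SAMPLE_RATE = 44100        # sample frames per second
BYTES_PER_FRAME = 4        # 2 channels * 16 bit
SECTOR_BYTES = 2352        # one CD sector, 1/75 of a second


def byte_offset(seconds):
    """Byte offset into the PCM data for a split point given in seconds."""
    return round(seconds * SAMPLE_RATE * BYTES_PER_FRAME)


def nearest_sector_boundary(offset):
    """Round a byte offset to the nearest multiple of the sector size."""
    return round(offset / SECTOR_BYTES) * SECTOR_BYTES


requested = byte_offset(5.317)                 # split point entered by hand
rounded = nearest_sector_boundary(requested)   # what shnsplit actually used
sectors = rounded // SECTOR_BYTES
print(requested, rounded, sectors)
```

Under these assumptions the first file is 399 sectors long, i.e. 5 seconds plus 24 sectors, which would match the reported length of 0:05.24 if shnsplit displays lengths as m:ss.ff.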
For cross checking, the resulting files were joined again:
$ sox 01.split-track01.wav 01.split-track02.wav joined.wav
The concatenation is binary identical to the original file:
$ diff 00.test.wav joined.wav
which can also be seen (visually) in e.g. audacity or sonic-visualizer, i.e. there are no gaps or other distortions between 01.split-track01.wav and 01.split-track02.wav.
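The same cross check can be done on the audio data alone, ignoring the WAV headers. This is a minimal sketch using Python's standard wave module. The helper names and the synthetic demo data are illustrative and stand in for the real test files.

```python
import os
import tempfile
import wave


def read_pcm(path):
    """Return (format parameters, raw PCM bytes) of a WAV file."""
    with wave.open(path, "rb") as w:
        fmt = (w.getnchannels(), w.getsampwidth(), w.getframerate())
        return fmt, w.readframes(w.getnframes())


def parts_match_original(original, parts):
    """True if the parts, concatenated in order, equal the original's audio."""
    fmt, data = read_pcm(original)
    joined = b""
    for part in parts:
        part_fmt, part_data = read_pcm(part)
        if part_fmt != fmt:
            return False
        joined += part_data
    return joined == data


def write_pcm(path, data, rate=44100):
    """Demo helper: write raw bytes as 16-bit stereo PCM."""
    with wave.open(path, "wb") as w:
        w.setnchannels(2)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(data)


# Demo with synthetic data instead of the real test files.
with tempfile.TemporaryDirectory() as d:
    pcm = bytes(range(256)) * 64        # 16384 bytes = 4096 frames
    cut = 1000 * 4                      # split after 1000 frames
    orig = os.path.join(d, "orig.wav")
    a = os.path.join(d, "a.wav")
    b = os.path.join(d, "b.wav")
    write_pcm(orig, pcm)
    write_pcm(a, pcm[:cut])
    write_pcm(b, pcm[cut:])
    clean_split = parts_match_original(orig, [a, b])   # proper split
    broken_split = parts_match_original(orig, [a, a])  # wrong second half
print(clean_split, broken_split)
```

With the real files, the call would be parts_match_original("00.test.wav", ["01.split-track01.wav", "01.split-track02.wav"]).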
2) What is tested?
The split files are now encoded with some reference encoders and afterwards played back, respectively decoded to PCM WAV again, checking for the following:
- Does the "gapless" playback even work for the plain PCM WAV?
- At playback, can any gap, crack, pop, etc. be heard between the two files (i.e. does "gapless playback" work)?
- At decoding to PCM WAV, is there any shift at the start of the 1st file respectively end of the 2nd file?
- At decoding to PCM WAV, is there any gap/shift/other distortion at the end of the 1st file and start of the 2nd file when these two are concatenated, in other words at the joining position?
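The last two checks can also be done numerically on the decoded PCM instead of by eye. This is a minimal sketch, assuming the samples of one channel are already available as a list of signed integers. The threshold and the synthetic signal are illustrative and are not measurements from the test files. A long run of near-silent samples around the joining position would correspond to a gap, and a difference in total length to samples that were added and not trimmed again.

```python
def longest_quiet_run(samples, threshold):
    """Return (length, start index) of the longest run with |sample| <= threshold."""
    best_len, best_start = 0, 0
    run_len, run_start = 0, 0
    for i, s in enumerate(samples):
        if abs(s) <= threshold:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_len, best_start = run_len, run_start
        else:
            run_len = 0
    return best_len, best_start


# Synthetic stand-in for decoded audio: a loud square wave, so that no
# sample of the real signal is anywhere near zero.
tone = [10000 if i % 2 == 0 else -10000 for i in range(1000)]

gapless = tone + tone                 # ideal join of two decoded files
gapped = tone + [0] * 500 + tone      # join with 500 samples of silence

extra = len(gapped) - len(gapless)    # length difference caused by the gap
clean = longest_quiet_run(gapless, threshold=50)
gap = longest_quiet_run(gapped, threshold=50)
print(extra, clean, gap)
```

On real music the threshold needs care, since quiet passages would also be reported. Restricting the search to a window around the known joining position would be one way to deal with that.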
Hearing tests were repeated multiple times, so the files were already in the OS cache and one should expect basically no delay from a slow storage medium (which was one of the fastest SSDs anyway).
Unless otherwise noted, all programs and libraries were from Debian unstable.
Encoders with these options were used:
- lame --verbose -q 0 -v -V 4 --noreplaygain --id3v2-utf16 --add-id3v2 --id3v1-only LAME 64bits version 3.100
- opusenc --vbr --bitrate 96 split-track01.wav opus-tools 0.1.10
- fdkaac -p 29 -m 4 <gapless modes, part of the filename> 0.6.3
  gapless modes:
  0: iTunSMPB
  1: ISO standard (edts and sgpd)
  2: Both
- aac-enc -t 29 -v 4 0.1.6 => may not even set any gapless information, so can possibly be completely ignored
- faac -q 100 -w 18.104.22.168 => may not even set any gapless information, so can possibly be completely ignored
And for playback respectively decoding:
ffmpeg version N-93527-g1125277 Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.11) 20160609
libavutil 56. 26.100 / 56. 26.100
libavcodec 58. 48.100 / 58. 48.100
libavformat 58. 26.101 / 58. 26.101
libavdevice 58. 7.100 / 58. 7.100
libavfilter 7. 48.100 / 7. 48.100
libswscale 5. 4.100 / 5. 4.100
libswresample 3. 4.100 / 3. 4.100
a) Hearing Tests
I couldn't use ffplay, because I had to run everything from git-master ffmpeg on another node.
What I did instead was to use mpv to play back the files decoded with git-master ffmpeg (see (b) below).
mpv 01.split-track0*.wav => OK, no gap/pop/click/etc. (original files, no ffmpeg here)
mpv ffmpeg1125277.02.split-track0*.mp3.wav => OK, no gap/pop/click/etc.
mpv ffmpeg1125277.03.split-track0*.opus.wav => OK, no gap/pop/click/etc.
but all with AAC:
mpv ffmpeg1125277.08.split-track0*.faac.m4a.wav => BAD, clearly audible gap
b) Visible tests
For these, all the encoded files were again decoded with:
ffmpeg -vn -i input.file output.wav
(this, and only this, was done with the ffmpeg from git master on some *buntu machine)
to PCM WAV files like:
These are the decoded files used in the visual tests (with audacity/sonic-visualizer); e.g. ffmpeg1125277.02.split-track01.mp3.wav was decoded with ffmpeg1125277 from 02.split-track01.mp3.
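The naming scheme and the decode command can be written down as a small helper. This is a minimal sketch: the function names are made up for illustration, and the command is only built here, not executed.

```python
FFMPEG_TAG = "ffmpeg1125277"   # prefix used in the decoded file names


def decoded_name(encoded):
    """Name of the WAV file that an encoded test file is decoded to."""
    return f"{FFMPEG_TAG}.{encoded}.wav"


def decode_command(encoded):
    """Argument list for the decode step: ffmpeg -vn -i input.file output.wav"""
    return ["ffmpeg", "-vn", "-i", encoded, decoded_name(encoded)]


cmd = decode_command("02.split-track01.mp3")
print(" ".join(cmd))

# To actually run it (requires the ffmpeg binary under test on PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```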
For each such pair an image is attached, e.g.:
Comparing the intersection point:
top: the original 00.test.wav
middle: the joined ffmpeg1125277.02.split-track01.mp3.wav and ffmpeg1125277.02.split-track02.mp3.wav (named ffmpeg1125277.02.joined.mp3.wav)
(joins were made with sox 1.wav 2.wav 1-joined-with-2.wav)
bottom: ffmpeg1125277.02.split-track01.mp3.wav alone, serving just as reference as to where the intersection is
Opus seems to always sample at 48 kHz; sonic-visualizer has an option to resample automatically on opening, which I've enabled.
ffmpeg1125277.02.mp3.wav.png => OK, mostly, there might be a small distortion (red circle), but I guess nothing that anyone will be able to hear
ffmpeg1125277.03.opus.wav.png => OK, seems perfect
all the AAC ones:
ffmpeg1125277.*.*.m4a.wav.png => BAD, not only huge gaps, but it also seems as if the end and start of the joined files were "faded out/in" (no idea whether encoder or decoder error)
fdkaac+iTunSMPB: gap + fade in AND out
fdkaac+ISO: gap + fade out
fdkaac+Both: gap + fade out
(mpv seems to always have gap + fade in AND out)
aac-enc: gap + fade out AND in
faac: gap + fade out
(same for mpv)
So while we can probably toss aac-enc and faac, one sees that something is already different with fdkaac depending on the gap detection method (though both still have big gaps).
Long story short:
I would guess that *somewhere* there's a bug with respect to gapless encoding and/or decoding of AAC.
Since fdkaac claims it would support gapless playback, one might assume the error is on ffmpeg's side.
Problem is, I have no encoder/decoder pair for which I know that it works... maybe one could try it with itunes?
I'd be happy to evaluate further if any developer has an idea how to move on (i.e. how/where to get AAC files which are definitely considered to be correctly encoded for gapless playback and which one can test with ffmpeg). Until then I'd assume that the fdkaac-created files are correctly created for gapless playback.
The test files and images can be found at:
FYI: I did the same tests with mpv (however with 4.1.1 ffmpeg):