Opened 8 years ago

Last modified 7 months ago

#5910 new defect

AAC to PCM conversion inserts extra silence in the beginning

Reported by: jwilhelmsson
Owned by:
Priority: normal
Component: undetermined
Version: git-master
Keywords: aac mov
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:
When converting AAC audio files/streams to PCM, extra silence is inserted at the beginning of the output file.

Note:
This may very well be the same issue as ticket #2325, but since I believe I have more information I elected to create a new one.

The long version:
My company dubs cartoons, so we receive many kinds of video formats from our various clients. Recently one of them complained that our final delivery was out of sync with the original material, and that is how this issue was discovered. The reference files from the client were MP4 files with AAC audio, and when I converted that audio into WAV files (for use in our recording software, Steinberg Nuendo), extra silence was inserted at the beginning, which made us record everything out of sync.

I concluded that ffmpeg was at fault by comparing its output with files converted by ProTools, Nuendo, and QuickTime, which all match each other and differ from the ffmpeg output.

After a lot of testing I concluded that the AAC to PCM conversion is the culprit (i.e. the video container format is mostly irrelevant), and also that the length of the inserted silence varies between files. I haven't been able to pinpoint exactly what causes the difference.

Attached are five aac files, plus wav files converted by ffmpeg (3.1.4) and QuickTime Pro (7.7.9) clearly showing the difference. Since the files come from commercial productions I've only included 7 to 10 seconds from each, but it's enough to see the error.

Two of the files insert approximately 44 milliseconds (or about 2100 samples) of silence, two insert 108 milliseconds (about 5200 samples), and one oddly enough gets only 32 milliseconds of silence even though the audio is shifted 44 ms (this is easy to see since it starts with a test tone).
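For reference, the millisecond figures above can be turned into sample counts at the 48 kHz rate used in this report. This is only a small arithmetic sketch; the function names are made up for illustration, and the inputs are the approximate offsets measured by hand.

```python
def ms_to_samples(ms: float, rate_hz: int = 48000) -> int:
    """Convert a duration in milliseconds to a whole number of samples."""
    return round(ms * rate_hz / 1000)


def samples_to_ms(samples: int, rate_hz: int = 48000) -> float:
    """Convert a sample count back to milliseconds."""
    return samples * 1000 / rate_hz


# The three approximate offsets reported above.
for ms in (32, 44, 108):
    print(f"{ms} ms at 48 kHz = {ms_to_samples(ms)} samples")
```

At 48 kHz this gives 1536, 2112 and 5184 samples, which agrees with the "about 2100" and "about 5200" estimates.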

How to reproduce:
The aac files were converted by ffmpeg with the command (I'll attach outputs in separate messages below):

ffmpeg -i input -c:a pcm_s24le -ar 48k output

They were also converted with QuickTime Pro with the same settings (24 bits, 48kHz). I then compared the waveforms in both Nuendo and Audacity. The offset values were measured by manually marking an area in Audacity, so they are very approximate.
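Since the offsets were marked by hand, a less approximate number could be obtained by searching for the lag at which one decoded file best matches the other. The following is a minimal sketch of that idea, not what was actually done for this report: `estimate_lag` is a hypothetical helper that takes two sequences of already decoded PCM samples (loading the WAV files is left out) and uses a plain, unnormalised cross-correlation.

```python
def estimate_lag(reference, delayed, max_lag):
    """Return the lag (in samples) at which `delayed` best matches `reference`.

    A positive result means `delayed` starts that many samples late.
    Only lags from 0 to max_lag are tried.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Number of overlapping samples when `delayed` is shifted back by `lag`.
        overlap = min(len(reference), len(delayed) - lag)
        if overlap <= 0:
            break
        score = sum(reference[i] * delayed[i + lag] for i in range(overlap))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic check: a short signal and a copy with five samples of leading silence.
signal = [((i * 37) % 101) - 50 for i in range(200)]
print(estimate_lag(signal, [0] * 5 + signal, max_lag=20))
```

With real files, `reference` would be the QuickTime output and `delayed` the ffmpeg output, and `max_lag` would need to be around 6000 samples to cover the largest offset reported here. The pure-Python loop is slow at that size; it is meant as an illustration only.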

The files:
The attached files come from one movie and two tv series (two episodes each). The movie files are called "g", and the series "tj" and "td". The aac files were extracted from the original mp4 files by stream copying:

ffmpeg -i input -c:a copy -t 10 output

The movie file starts with a test tone. It is also the one that is offset by 32 ms at the beginning of the test tone but by 44 ms at the end of it.

Random notes:
The error is the same when converting directly from the mp4 file and when converting from an extracted aac.

Extracting PCM wav from a mov container produces no errors.

If I convert the aac stream to a new aac file there's still an error, but only half as long. I've only tested this on one file, but it produced a 22 ms gap instead of 44 ms.

Repeated conversion does not compound the error, i.e. converting from aac to aac and then converting that output file to aac again does not increase the error.

Converting to a different bitrate/sample rate does not affect the result.

Final words:
I've done a lot of testing, but it's very possible that I've forgotten some vital information in this report, so please ask if you need more details.

Attachments (17)

g_7s.aac (274.2 KB ) - added by jwilhelmsson 8 years ago.
Clip from movie. 32 ms error in the beginning of the test tone, 44 at the end.
tj101_10s.aac (81.4 KB ) - added by jwilhelmsson 8 years ago.
Clip from tv series, episode 101. 44 ms error.
tj103_10s.aac (160.5 KB ) - added by jwilhelmsson 8 years ago.
Clip from tv series, episode 103. 44 ms error.
td101_10s.aac (78.8 KB ) - added by jwilhelmsson 8 years ago.
Clip from tv series, episode 101. 108 ms error.
td103_10s.aac (78.8 KB ) - added by jwilhelmsson 8 years ago.
Clip from tv series, episode 103. 108 ms error.
g_ffmpeg.wav (1.9 MB ) - added by jwilhelmsson 8 years ago.
Clip from movie, converted with ffmpeg.
g_quicktime.wav (1.9 MB ) - added by jwilhelmsson 8 years ago.
Clip from movie, converted with QuickTime Pro.
tj101_10s_ffmpeg.wav (1.4 MB ) - added by jwilhelmsson 8 years ago.
Clip from tv series, episode 101. 44 ms error. Converted with ffmpeg.
tj101_10s_quicktime.wav (1.4 MB ) - added by jwilhelmsson 8 years ago.
Clip from tv series, episode 101. 44 ms error. Converted with QuickTime Pro.
tj103_10s_ffmpeg.zip (526.0 KB ) - added by jwilhelmsson 8 years ago.
Clip from tv series, episode 103. 44 ms error. Converted with ffmpeg. Zipped for size.
tj103_10s_quicktime.zip (526.0 KB ) - added by jwilhelmsson 8 years ago.
Clip from tv series, episode 103. 44 ms error. Converted with QuickTime Pro. Zipped for size.
td101_10s_ffmpeg.zip (2.4 MB ) - added by jwilhelmsson 8 years ago.
Clip from tv series, episode 101. 108 ms error. Converted with ffmpeg. Zipped for size.
td101_10s_quicktime.zip (2.4 MB ) - added by jwilhelmsson 8 years ago.
Clip from tv series, episode 101. 108 ms error. Converted with ffmpeg. Zipped for size.
td103_10s_ffmpeg.zip (2.3 MB ) - added by jwilhelmsson 8 years ago.
Clip from tv series, episode 103. 108 ms error. Converted with ffmpeg. Zipped for size.
td103_10s_quicktime.zip (2.3 MB ) - added by jwilhelmsson 8 years ago.
Clip from tv series, episode 103. 108 ms error. Converted with QuickTime Pro. Zipped for size.
cut_movie_files.zip (884.1 KB ) - added by jwilhelmsson 8 years ago.
cut_playable_movie_files.zip (2.2 MB ) - added by jwilhelmsson 8 years ago.

Change History (36)

by jwilhelmsson, 8 years ago

Attachment: g_7s.aac added

Clip from movie. 32 ms error in the beginning of the test tone, 44 at the end.

by jwilhelmsson, 8 years ago

Attachment: tj101_10s.aac added

Clip from tv series, episode 101. 44 ms error.

by jwilhelmsson, 8 years ago

Attachment: tj103_10s.aac added

Clip from tv series, episode 103. 44 ms error.

by jwilhelmsson, 8 years ago

Attachment: td101_10s.aac added

Clip from tv series, episode 101. 108 ms error.

by jwilhelmsson, 8 years ago

Attachment: td103_10s.aac added

Clip from tv series, episode 103. 108 ms error.

by jwilhelmsson, 8 years ago

Attachment: g_ffmpeg.wav added

Clip from movie, converted with ffmpeg.

by jwilhelmsson, 8 years ago

Attachment: g_quicktime.wav added

Clip from movie, converted with QuickTime Pro.

by jwilhelmsson, 8 years ago

Attachment: tj101_10s_ffmpeg.wav added

Clip from tv series, episode 101. 44 ms error. Converted with ffmpeg.

by jwilhelmsson, 8 years ago

Attachment: tj101_10s_quicktime.wav added

Clip from tv series, episode 101. 44 ms error. Converted with QuickTime Pro.

by jwilhelmsson, 8 years ago

Attachment: tj103_10s_ffmpeg.zip added

Clip from tv series, episode 103. 44 ms error. Converted with ffmpeg. Zipped for size.

by jwilhelmsson, 8 years ago

Attachment: tj103_10s_quicktime.zip added

Clip from tv series, episode 103. 44 ms error. Converted with QuickTime Pro. Zipped for size.

by jwilhelmsson, 8 years ago

Attachment: td101_10s_ffmpeg.zip added

Clip from tv series, episode 101. 108 ms error. Converted with ffmpeg. Zipped for size.

by jwilhelmsson, 8 years ago

Attachment: td101_10s_quicktime.zip added

Clip from tv series, episode 101. 108 ms error. Converted with ffmpeg. Zipped for size.

by jwilhelmsson, 8 years ago

Attachment: td103_10s_ffmpeg.zip added

Clip from tv series, episode 103. 108 ms error. Converted with ffmpeg. Zipped for size.

by jwilhelmsson, 8 years ago

Attachment: td103_10s_quicktime.zip added

Clip from tv series, episode 103. 108 ms error. Converted with QuickTime Pro. Zipped for size.

comment:1 by jwilhelmsson, 8 years ago

The comment on the file td101_10s_quicktime.zip is wrong; it was of course converted with QuickTime Pro, as the filename indicates.

Also please note that I've included the error length in the QT file descriptions even though those files are okay. This is to have something to connect the files with.

comment:2 by Carl Eugen Hoyos, 8 years ago

What is the difference between this ticket and ticket #2325?

comment:3 by jwilhelmsson, 8 years ago

Mostly the level of detail: that the issue is not directly related to the container, and that different error values are possible instead of just 2 milliseconds. I also thought it would be too much to put in a comment, but I could of course be wrong.

comment:4 by jwilhelmsson, 8 years ago

I almost forgot to add the ffmpeg outputs. Here is the one for converting the movie clip:

ffmpeg version 3.1.4 Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 4.7.3 (Gentoo 4.7.3-r1 p1.4, pie-0.5.5)
  configuration: --prefix=/usr --libdir=/usr/lib64 --shlibdir=/usr/lib64 --docdir=/usr/share/doc/ffmpeg-3.1.4/html --mandir=/usr/share/man --enable-shared --cc=x86_64-pc-linux-gnu-gcc --cxx=x86_64-pc-linux-gnu-g++ --ar=x86_64-pc-linux-gnu-ar --optflags='-march=core2 -mtune=generic -O2 -pipe' --disable-static --enable-avfilter --enable-avresample --disable-stripping --enable-nonfree --disable-indev=v4l2 --disable-outdev=v4l2 --disable-indev=alsa --disable-indev=oss --disable-indev=jack --disable-outdev=alsa --disable-outdev=oss --disable-outdev=sdl --enable-version3 --enable-nonfree --enable-bzlib --disable-runtime-cpudetect --disable-debug --disable-gcrypt --disable-gnutls --disable-gmp --enable-gpl --enable-hardcoded-tables --enable-iconv --disable-lzma --enable-network --disable-openssl --enable-postproc --disable-libsmbclient --disable-ffplay --disable-sdl --disable-vaapi --disable-vdpau --disable-xlib --disable-libxcb --disable-libxcb-shm --disable-libxcb-xfixes --enable-zlib --disable-libcdio --disable-libiec61883 --disable-libdc1394 --disable-libcaca --disable-openal --disable-opengl --disable-libv4l2 --disable-libpulse --enable-libopencore-amrwb --enable-libopencore-amrnb --enable-libfdk-aac --enable-libopenjpeg --disable-libbluray --disable-libcelt --disable-libgme --enable-libgsm --disable-mmal --disable-libmodplug --disable-libopus --disable-libilbc --disable-librtmp --disable-libssh --enable-libschroedinger --enable-libspeex --enable-libvorbis --disable-libvpx --disable-libzvbi --disable-libbs2b --disable-chromaprint --disable-libebur128 --disable-libflite --disable-frei0r --disable-libfribidi --disable-fontconfig --disable-ladspa --enable-libass --enable-libfreetype --disable-librubberband --disable-libzimg --disable-libsoxr --enable-pthreads --disable-libvo-amrwbenc --enable-libmp3lame --enable-libfaac --disable-libkvazaar --disable-nvenc --disable-libopenh264 --disable-libsnappy --enable-libtheora --disable-libtwolame --disable-libwavpack 
--disable-libwebp --enable-libx264 --disable-libx265 --enable-libxvid --disable-x11grab --disable-amd3dnow --disable-amd3dnowext --disable-aesni --disable-avx --disable-avx2 --disable-fma3 --disable-fma4 --disable-sse3 --disable-ssse3 --disable-sse4 --disable-sse42 --disable-xop --cpu=core2 --disable-doc --disable-htmlpages --enable-manpages
  libavutil      55. 28.100 / 55. 28.100
  libavcodec     57. 48.101 / 57. 48.101
  libavformat    57. 41.100 / 57. 41.100
  libavdevice    57.  0.101 / 57.  0.101
  libavfilter     6. 47.100 /  6. 47.100
  libavresample   3.  0.  0 /  3.  0.  0
  libswscale      4.  1.100 /  4.  1.100
  libswresample   2.  1.100 /  2.  1.100
  libpostproc    54.  0.100 / 54.  0.100
[aac @ 0x8d12b0] Estimating duration from bitrate, this may be inaccurate
Input #0, aac, from 'g_7s.aac':
  Duration: 00:00:07.01, bitrate: 320 kb/s
    Stream #0:0: Audio: aac (LC), 48000 Hz, stereo, fltp, 320 kb/s
[wav @ 0x8ebf20] Using AVStream.codec to pass codec parameters to muxers is deprecated, use AVStream.codecpar instead.
Output #0, wav, to 'g_ffmpeg.wav':
  Metadata:
    ISFT            : Lavf57.41.100
    Stream #0:0: Audio: pcm_s24le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s32, 2304 kb/s
    Metadata:
      encoder         : Lavc57.48.101 pcm_s24le
Stream mapping:
  Stream #0:0 -> #0:0 (aac (native) -> pcm_s24le (native))
Press [q] to stop, [?] for help
size=    1974kB time=00:00:07.01 bitrate=2304.1kbits/s speed= 197x
video:0kB audio:1974kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.005046%

comment:5 by jwilhelmsson, 8 years ago

The command line for the previous output was: ffmpeg -i g_7s.aac -c:a pcm_s24le -ar 48k g_ffmpeg.wav

In order not to spam with comments, here are all the rest:

tj101

ffmpeg -i tj101_10s.aac -c:a pcm_s24le -ar 48k tj101_10s_ffmpeg.wav
ffmpeg version 3.1.4 Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 4.7.3 (Gentoo 4.7.3-r1 p1.4, pie-0.5.5)
  configuration: --prefix=/usr --libdir=/usr/lib64 --shlibdir=/usr/lib64 --docdir=/usr/share/doc/ffmpeg-3.1.4/html --mandir=/usr/share/man --enable-shared --cc=x86_64-pc-linux-gnu-gcc --cxx=x86_64-pc-linux-gnu-g++ --ar=x86_64-pc-linux-gnu-ar --optflags='-march=core2 -mtune=generic -O2 -pipe' --disable-static --enable-avfilter --enable-avresample --disable-stripping --enable-nonfree --disable-indev=v4l2 --disable-outdev=v4l2 --disable-indev=alsa --disable-indev=oss --disable-indev=jack --disable-outdev=alsa --disable-outdev=oss --disable-outdev=sdl --enable-version3 --enable-nonfree --enable-bzlib --disable-runtime-cpudetect --disable-debug --disable-gcrypt --disable-gnutls --disable-gmp --enable-gpl --enable-hardcoded-tables --enable-iconv --disable-lzma --enable-network --disable-openssl --enable-postproc --disable-libsmbclient --disable-ffplay --disable-sdl --disable-vaapi --disable-vdpau --disable-xlib --disable-libxcb --disable-libxcb-shm --disable-libxcb-xfixes --enable-zlib --disable-libcdio --disable-libiec61883 --disable-libdc1394 --disable-libcaca --disable-openal --disable-opengl --disable-libv4l2 --disable-libpulse --enable-libopencore-amrwb --enable-libopencore-amrnb --enable-libfdk-aac --enable-libopenjpeg --disable-libbluray --disable-libcelt --disable-libgme --enable-libgsm --disable-mmal --disable-libmodplug --disable-libopus --disable-libilbc --disable-librtmp --disable-libssh --enable-libschroedinger --enable-libspeex --enable-libvorbis --disable-libvpx --disable-libzvbi --disable-libbs2b --disable-chromaprint --disable-libebur128 --disable-libflite --disable-frei0r --disable-libfribidi --disable-fontconfig --disable-ladspa --enable-libass --enable-libfreetype --disable-librubberband --disable-libzimg --disable-libsoxr --enable-pthreads --disable-libvo-amrwbenc --enable-libmp3lame --enable-libfaac --disable-libkvazaar --disable-nvenc --disable-libopenh264 --disable-libsnappy --enable-libtheora --disable-libtwolame --disable-libwavpack 
--disable-libwebp --enable-libx264 --disable-libx265 --enable-libxvid --disable-x11grab --disable-amd3dnow --disable-amd3dnowext --disable-aesni --disable-avx --disable-avx2 --disable-fma3 --disable-fma4 --disable-sse3 --disable-ssse3 --disable-sse4 --disable-sse42 --disable-xop --cpu=core2 --disable-doc --disable-htmlpages --enable-manpages
  libavutil      55. 28.100 / 55. 28.100
  libavcodec     57. 48.101 / 57. 48.101
  libavformat    57. 41.100 / 57. 41.100
  libavdevice    57.  0.101 / 57.  0.101
  libavfilter     6. 47.100 /  6. 47.100
  libavresample   3.  0.  0 /  3.  0.  0
  libswscale      4.  1.100 /  4.  1.100
  libswresample   2.  1.100 /  2.  1.100
  libpostproc    54.  0.100 / 54.  0.100
[aac @ 0x13032b0] Estimating duration from bitrate, this may be inaccurate
Input #0, aac, from 'tj101_10s.aac':
  Duration: 00:00:05.02, bitrate: 132 kb/s
    Stream #0:0: Audio: aac (LC), 48000 Hz, mono, fltp, 132 kb/s
[wav @ 0x1304de0] Using AVStream.codec to pass codec parameters to muxers is deprecated, use AVStream.codecpar instead.
Output #0, wav, to 'tj101_10s_ffmpeg.wav':
  Metadata:
    ISFT            : Lavf57.41.100
    Stream #0:0: Audio: pcm_s24le ([1][0][0][0] / 0x0001), 48000 Hz, mono, s32, 1152 kb/s
    Metadata:
      encoder         : Lavc57.48.101 pcm_s24le
Stream mapping:
  Stream #0:0 -> #0:0 (aac (native) -> pcm_s24le (native))
Press [q] to stop, [?] for help
size=    1407kB time=00:00:10.00 bitrate=1152.1kbits/s speed= 505x
video:0kB audio:1407kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.007080%

tj103

ffmpeg -i tj103_10s.aac -c:a pcm_s24le -ar 48k tj103_10s_ffmpeg.wav
ffmpeg version 3.1.4 Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 4.7.3 (Gentoo 4.7.3-r1 p1.4, pie-0.5.5)
  configuration: --prefix=/usr --libdir=/usr/lib64 --shlibdir=/usr/lib64 --docdir=/usr/share/doc/ffmpeg-3.1.4/html --mandir=/usr/share/man --enable-shared --cc=x86_64-pc-linux-gnu-gcc --cxx=x86_64-pc-linux-gnu-g++ --ar=x86_64-pc-linux-gnu-ar --optflags='-march=core2 -mtune=generic -O2 -pipe' --disable-static --enable-avfilter --enable-avresample --disable-stripping --enable-nonfree --disable-indev=v4l2 --disable-outdev=v4l2 --disable-indev=alsa --disable-indev=oss --disable-indev=jack --disable-outdev=alsa --disable-outdev=oss --disable-outdev=sdl --enable-version3 --enable-nonfree --enable-bzlib --disable-runtime-cpudetect --disable-debug --disable-gcrypt --disable-gnutls --disable-gmp --enable-gpl --enable-hardcoded-tables --enable-iconv --disable-lzma --enable-network --disable-openssl --enable-postproc --disable-libsmbclient --disable-ffplay --disable-sdl --disable-vaapi --disable-vdpau --disable-xlib --disable-libxcb --disable-libxcb-shm --disable-libxcb-xfixes --enable-zlib --disable-libcdio --disable-libiec61883 --disable-libdc1394 --disable-libcaca --disable-openal --disable-opengl --disable-libv4l2 --disable-libpulse --enable-libopencore-amrwb --enable-libopencore-amrnb --enable-libfdk-aac --enable-libopenjpeg --disable-libbluray --disable-libcelt --disable-libgme --enable-libgsm --disable-mmal --disable-libmodplug --disable-libopus --disable-libilbc --disable-librtmp --disable-libssh --enable-libschroedinger --enable-libspeex --enable-libvorbis --disable-libvpx --disable-libzvbi --disable-libbs2b --disable-chromaprint --disable-libebur128 --disable-libflite --disable-frei0r --disable-libfribidi --disable-fontconfig --disable-ladspa --enable-libass --enable-libfreetype --disable-librubberband --disable-libzimg --disable-libsoxr --enable-pthreads --disable-libvo-amrwbenc --enable-libmp3lame --enable-libfaac --disable-libkvazaar --disable-nvenc --disable-libopenh264 --disable-libsnappy --enable-libtheora --disable-libtwolame --disable-libwavpack 
--disable-libwebp --enable-libx264 --disable-libx265 --enable-libxvid --disable-x11grab --disable-amd3dnow --disable-amd3dnowext --disable-aesni --disable-avx --disable-avx2 --disable-fma3 --disable-fma4 --disable-sse3 --disable-ssse3 --disable-sse4 --disable-sse42 --disable-xop --cpu=core2 --disable-doc --disable-htmlpages --enable-manpages
  libavutil      55. 28.100 / 55. 28.100
  libavcodec     57. 48.101 / 57. 48.101
  libavformat    57. 41.100 / 57. 41.100
  libavdevice    57.  0.101 / 57.  0.101
  libavfilter     6. 47.100 /  6. 47.100
  libavresample   3.  0.  0 /  3.  0.  0
  libswscale      4.  1.100 /  4.  1.100
  libswresample   2.  1.100 /  2.  1.100
  libpostproc    54.  0.100 / 54.  0.100
[aac @ 0x12472b0] Estimating duration from bitrate, this may be inaccurate
Input #0, aac, from 'tj103_10s.aac':
  Duration: 00:00:10.07, bitrate: 130 kb/s
    Stream #0:0: Audio: aac (LC), 48000 Hz, stereo, fltp, 130 kb/s
[wav @ 0x1261de0] Using AVStream.codec to pass codec parameters to muxers is deprecated, use AVStream.codecpar instead.
Output #0, wav, to 'tj103_10s_ffmpeg.wav':
  Metadata:
    ISFT            : Lavf57.41.100
    Stream #0:0: Audio: pcm_s24le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s32, 2304 kb/s
    Metadata:
      encoder         : Lavc57.48.101 pcm_s24le
Stream mapping:
  Stream #0:0 -> #0:0 (aac (native) -> pcm_s24le (native))
Press [q] to stop, [?] for help
size=    2826kB time=00:00:10.04 bitrate=2304.1kbits/s speed= 392x
video:0kB audio:2826kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.003525%

td101

ffmpeg -i td101_10s.aac -c:a pcm_s24le -ar 48k td101_10s_ffmpeg.wav
ffmpeg version 3.1.4 Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 4.7.3 (Gentoo 4.7.3-r1 p1.4, pie-0.5.5)
  configuration: --prefix=/usr --libdir=/usr/lib64 --shlibdir=/usr/lib64 --docdir=/usr/share/doc/ffmpeg-3.1.4/html --mandir=/usr/share/man --enable-shared --cc=x86_64-pc-linux-gnu-gcc --cxx=x86_64-pc-linux-gnu-g++ --ar=x86_64-pc-linux-gnu-ar --optflags='-march=core2 -mtune=generic -O2 -pipe' --disable-static --enable-avfilter --enable-avresample --disable-stripping --enable-nonfree --disable-indev=v4l2 --disable-outdev=v4l2 --disable-indev=alsa --disable-indev=oss --disable-indev=jack --disable-outdev=alsa --disable-outdev=oss --disable-outdev=sdl --enable-version3 --enable-nonfree --enable-bzlib --disable-runtime-cpudetect --disable-debug --disable-gcrypt --disable-gnutls --disable-gmp --enable-gpl --enable-hardcoded-tables --enable-iconv --disable-lzma --enable-network --disable-openssl --enable-postproc --disable-libsmbclient --disable-ffplay --disable-sdl --disable-vaapi --disable-vdpau --disable-xlib --disable-libxcb --disable-libxcb-shm --disable-libxcb-xfixes --enable-zlib --disable-libcdio --disable-libiec61883 --disable-libdc1394 --disable-libcaca --disable-openal --disable-opengl --disable-libv4l2 --disable-libpulse --enable-libopencore-amrwb --enable-libopencore-amrnb --enable-libfdk-aac --enable-libopenjpeg --disable-libbluray --disable-libcelt --disable-libgme --enable-libgsm --disable-mmal --disable-libmodplug --disable-libopus --disable-libilbc --disable-librtmp --disable-libssh --enable-libschroedinger --enable-libspeex --enable-libvorbis --disable-libvpx --disable-libzvbi --disable-libbs2b --disable-chromaprint --disable-libebur128 --disable-libflite --disable-frei0r --disable-libfribidi --disable-fontconfig --disable-ladspa --enable-libass --enable-libfreetype --disable-librubberband --disable-libzimg --disable-libsoxr --enable-pthreads --disable-libvo-amrwbenc --enable-libmp3lame --enable-libfaac --disable-libkvazaar --disable-nvenc --disable-libopenh264 --disable-libsnappy --enable-libtheora --disable-libtwolame --disable-libwavpack 
--disable-libwebp --enable-libx264 --disable-libx265 --enable-libxvid --disable-x11grab --disable-amd3dnow --disable-amd3dnowext --disable-aesni --disable-avx --disable-avx2 --disable-fma3 --disable-fma4 --disable-sse3 --disable-ssse3 --disable-sse4 --disable-sse42 --disable-xop --cpu=core2 --disable-doc --disable-htmlpages --enable-manpages
  libavutil      55. 28.100 / 55. 28.100
  libavcodec     57. 48.101 / 57. 48.101
  libavformat    57. 41.100 / 57. 41.100
  libavdevice    57.  0.101 / 57.  0.101
  libavfilter     6. 47.100 /  6. 47.100
  libavresample   3.  0.  0 /  3.  0.  0
  libswscale      4.  1.100 /  4.  1.100
  libswresample   2.  1.100 /  2.  1.100
  libpostproc    54.  0.100 / 54.  0.100
[aac @ 0x1d8d2b0] Estimating duration from bitrate, this may be inaccurate
Input #0, aac, from 'td101_10s.aac':
  Duration: 00:00:09.51, bitrate: 67 kb/s
    Stream #0:0: Audio: aac (HE-AAC), 48000 Hz, stereo, fltp, 67 kb/s
[wav @ 0x1da7e60] Using AVStream.codec to pass codec parameters to muxers is deprecated, use AVStream.codecpar instead.
Output #0, wav, to 'td101_10s_ffmpeg.wav':
  Metadata:
    ISFT            : Lavf57.41.100
    Stream #0:0: Audio: pcm_s24le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s32, 2304 kb/s
    Metadata:
      encoder         : Lavc57.48.101 pcm_s24le
Stream mapping:
  Stream #0:0 -> #0:0 (aac (native) -> pcm_s24le (native))
Press [q] to stop, [?] for help
size=    2820kB time=00:00:10.02 bitrate=2304.1kbits/s speed= 236x
video:0kB audio:2820kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.003532%

and finally td103

ffmpeg -i td103_10s.aac -c:a pcm_s24le -ar 48k td103_10s_ffmpeg.wav
ffmpeg version 3.1.4 Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 4.7.3 (Gentoo 4.7.3-r1 p1.4, pie-0.5.5)
  configuration: --prefix=/usr --libdir=/usr/lib64 --shlibdir=/usr/lib64 --docdir=/usr/share/doc/ffmpeg-3.1.4/html --mandir=/usr/share/man --enable-shared --cc=x86_64-pc-linux-gnu-gcc --cxx=x86_64-pc-linux-gnu-g++ --ar=x86_64-pc-linux-gnu-ar --optflags='-march=core2 -mtune=generic -O2 -pipe' --disable-static --enable-avfilter --enable-avresample --disable-stripping --enable-nonfree --disable-indev=v4l2 --disable-outdev=v4l2 --disable-indev=alsa --disable-indev=oss --disable-indev=jack --disable-outdev=alsa --disable-outdev=oss --disable-outdev=sdl --enable-version3 --enable-nonfree --enable-bzlib --disable-runtime-cpudetect --disable-debug --disable-gcrypt --disable-gnutls --disable-gmp --enable-gpl --enable-hardcoded-tables --enable-iconv --disable-lzma --enable-network --disable-openssl --enable-postproc --disable-libsmbclient --disable-ffplay --disable-sdl --disable-vaapi --disable-vdpau --disable-xlib --disable-libxcb --disable-libxcb-shm --disable-libxcb-xfixes --enable-zlib --disable-libcdio --disable-libiec61883 --disable-libdc1394 --disable-libcaca --disable-openal --disable-opengl --disable-libv4l2 --disable-libpulse --enable-libopencore-amrwb --enable-libopencore-amrnb --enable-libfdk-aac --enable-libopenjpeg --disable-libbluray --disable-libcelt --disable-libgme --enable-libgsm --disable-mmal --disable-libmodplug --disable-libopus --disable-libilbc --disable-librtmp --disable-libssh --enable-libschroedinger --enable-libspeex --enable-libvorbis --disable-libvpx --disable-libzvbi --disable-libbs2b --disable-chromaprint --disable-libebur128 --disable-libflite --disable-frei0r --disable-libfribidi --disable-fontconfig --disable-ladspa --enable-libass --enable-libfreetype --disable-librubberband --disable-libzimg --disable-libsoxr --enable-pthreads --disable-libvo-amrwbenc --enable-libmp3lame --enable-libfaac --disable-libkvazaar --disable-nvenc --disable-libopenh264 --disable-libsnappy --enable-libtheora --disable-libtwolame --disable-libwavpack 
--disable-libwebp --enable-libx264 --disable-libx265 --enable-libxvid --disable-x11grab --disable-amd3dnow --disable-amd3dnowext --disable-aesni --disable-avx --disable-avx2 --disable-fma3 --disable-fma4 --disable-sse3 --disable-ssse3 --disable-sse4 --disable-sse42 --disable-xop --cpu=core2 --disable-doc --disable-htmlpages --enable-manpages
  libavutil      55. 28.100 / 55. 28.100
  libavcodec     57. 48.101 / 57. 48.101
  libavformat    57. 41.100 / 57. 41.100
  libavdevice    57.  0.101 / 57.  0.101
  libavfilter     6. 47.100 /  6. 47.100
  libavresample   3.  0.  0 /  3.  0.  0
  libswscale      4.  1.100 /  4.  1.100
  libswresample   2.  1.100 /  2.  1.100
  libpostproc    54.  0.100 / 54.  0.100
[aac @ 0xbc42b0] Estimating duration from bitrate, this may be inaccurate
Input #0, aac, from 'td103_10s.aac':
  Duration: 00:00:10.93, bitrate: 59 kb/s
    Stream #0:0: Audio: aac (HE-AAC), 48000 Hz, stereo, fltp, 59 kb/s
[wav @ 0xbdee60] Using AVStream.codec to pass codec parameters to muxers is deprecated, use AVStream.codecpar instead.
Output #0, wav, to 'td103_10s_ffmpeg.wav':
  Metadata:
    ISFT            : Lavf57.41.100
    Stream #0:0: Audio: pcm_s24le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s32, 2304 kb/s
    Metadata:
      encoder         : Lavc57.48.101 pcm_s24le
Stream mapping:
  Stream #0:0 -> #0:0 (aac (native) -> pcm_s24le (native))
Press [q] to stop, [?] for help
size=    2820kB time=00:00:10.02 bitrate=2304.1kbits/s speed= 219x
video:0kB audio:2820kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.003532%

comment:6 by Carl Eugen Hoyos, 8 years ago

Is the issue reproducible with current FFmpeg git head?

comment:7 by jwilhelmsson, 8 years ago

It seems so, yes. I asked our server guy to install the latest, and all files convert exactly the same way. Here's the output for the seven second movie clip:

ffmpeg -i g_7s.aac -c:a pcm_s24le -ar 48k g_7s_ffmpeg.wav
ffmpeg version N-82143-gbf14393 Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 4.7.3 (Gentoo 4.7.3-r1 p1.4, pie-0.5.5)
  configuration: --prefix=/usr --libdir=/usr/lib64 --shlibdir=/usr/lib64 --docdir=/usr/share/doc/ffmpeg-9999/html --mandir=/usr/share/man --enable-shared --cc=x86_64-pc-linux-gnu-gcc --cxx=x86_64-pc-linux-gnu-g++ --ar=x86_64-pc-linux-gnu-ar --optflags='-march=core2 -mtune=generic -O2 -pipe' --disable-static --enable-avfilter --enable-avresample --disable-stripping --disable-indev=v4l2 --disable-outdev=v4l2 --disable-indev=alsa --disable-indev=oss --disable-indev=jack --disable-outdev=alsa --disable-outdev=oss --disable-outdev=sdl --enable-version3 --enable-nonfree --enable-bzlib --disable-runtime-cpudetect --disable-debug --disable-gcrypt --disable-gnutls --disable-gmp --enable-gpl --enable-hardcoded-tables --enable-iconv --disable-lzma --enable-network --disable-openssl --enable-postproc --disable-libsmbclient --disable-ffplay --disable-sdl2 --disable-vaapi --disable-vdpau --disable-xlib --disable-libxcb --disable-libxcb-shm --disable-libxcb-xfixes --enable-zlib --disable-libcdio --disable-libiec61883 --disable-libdc1394 --disable-libcaca --disable-openal --disable-opengl --disable-libv4l2 --disable-libpulse --enable-libopencore-amrwb --enable-libopencore-amrnb --enable-libfdk-aac --enable-libopenjpeg --disable-libbluray --disable-libcelt --disable-libgme --enable-libgsm --disable-mmal --disable-libmodplug --disable-libopus --disable-libilbc --disable-librtmp --disable-libssh --enable-libschroedinger --enable-libspeex --enable-libvorbis --disable-libvpx --disable-libzvbi --disable-libbs2b --disable-chromaprint --disable-libebur128 --disable-libflite --disable-frei0r --disable-libfribidi --disable-fontconfig --disable-ladspa --enable-libass --enable-libfreetype --disable-librubberband --disable-libzimg --disable-libsoxr --enable-pthreads --disable-libvo-amrwbenc --enable-libmp3lame --disable-libkvazaar --disable-nvenc --disable-libopenh264 --disable-libsnappy --enable-libtheora --disable-libtwolame --disable-libwavpack --disable-libwebp --enable-libx264 
--disable-libx265 --enable-libxvid --disable-x11grab --disable-amd3dnow --disable-amd3dnowext --disable-aesni --disable-avx --disable-avx2 --disable-fma3 --disable-fma4 --disable-sse3 --disable-ssse3 --disable-sse4 --disable-sse42 --disable-xop --cpu=core2 --disable-doc --disable-htmlpages --enable-manpages
  libavutil      55. 35.100 / 55. 35.100
  libavcodec     57. 65.100 / 57. 65.100
  libavformat    57. 57.100 / 57. 57.100
  libavdevice    57.  2.100 / 57.  2.100
  libavfilter     6. 66.100 /  6. 66.100
  libavresample   3.  2.  0 /  3.  2.  0
  libswscale      4.  3.100 /  4.  3.100
  libswresample   2.  4.100 /  2.  4.100
  libpostproc    54.  2.100 / 54.  2.100
[aac @ 0x88b2b0] Estimating duration from bitrate, this may be inaccurate
Input #0, aac, from 'g_7s.aac':
  Duration: 00:00:07.01, bitrate: 320 kb/s
    Stream #0:0: Audio: aac (LC), 48000 Hz, stereo, fltp, 320 kb/s
Output #0, wav, to 'g_7s_ffmpeg.wav':
  Metadata:
    ISFT            : Lavf57.57.100
    Stream #0:0: Audio: pcm_s24le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s32, 2304 kb/s
    Metadata:
      encoder         : Lavc57.65.100 pcm_s24le
Stream mapping:
  Stream #0:0 -> #0:0 (aac (native) -> pcm_s24le (native))
Press [q] to stop, [?] for help
size=    1974kB time=00:00:07.01 bitrate=2304.1kbits/s speed= 297x
video:0kB audio:1974kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.005046%

comment:8 by Marton Balint, 8 years ago

This is normal for raw AAC bitstreams. If an AAC bitstream is in a proper container which correctly describes the encoder delay, ffmpeg should be able to skip the unneeded samples from the start.

https://developer.apple.com/library/content/documentation/QuickTime/QTFF/QTFFAppenG/QTFFAppenG.html
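To illustrate what skipping the unneeded samples amounts to: once the encoder delay (priming) is known, the decoder output is simply trimmed at the start. The sketch below assumes a priming value of 2112 samples, a commonly cited figure for AAC-LC; that value is an assumption here and was not read from any file attached to this ticket. `trim_priming` and its parameters are hypothetical names.

```python
# Assumed priming length; a commonly cited figure for AAC-LC, not a value
# taken from the files attached to this ticket.
AAC_LC_PRIMING = 2112


def trim_priming(samples, priming=AAC_LC_PRIMING, valid_count=None):
    """Drop `priming` leading samples, then keep at most `valid_count` samples."""
    trimmed = samples[priming:]
    if valid_count is not None:
        # Also drop any padding the encoder appended at the end.
        trimmed = trimmed[:valid_count]
    return trimmed


# Synthetic decoder output: priming silence, three real samples, trailing padding.
decoded = [0] * AAC_LC_PRIMING + [1, 2, 3] + [0] * 5
print(trim_priming(decoded, valid_count=3))
```

Note that 2112 samples at 48 kHz is 44 ms, which coincides with the offset reported for the LC files. That is suggestive, but nothing in this ticket confirms it as the cause.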

comment:9 by jwilhelmsson, 8 years ago

I may be misunderstanding the intent (and target) of your information, but as I wrote in my summary, the error appears even when converting the audio from the original MP4 file, which I feel must contain the needed delay information since both ProTools and Nuendo manage to produce correctly converted audio from it. Would it help if I attached a snippet of a complete container file?

in reply to:  9 comment:10 by Carl Eugen Hoyos, 8 years ago

Replying to jwilhelmsson:

I may be misunderstanding the intent (and target) of your information, but as I wrote in my summary, the error appears even when converting the audio from the original MP4 file

Then why did you provide a sample file and a command line with console output for another case?

comment:11 by jwilhelmsson, 8 years ago

My reasoning was that I thought I had narrowed the issue down to the aac decoding, since I got the exact same error when converting aac audio from an mp4 container as when converting the extracted audio. Perhaps this is a case of "premature optimization" on my part, but I honestly thought I was making it easier to pinpoint the error. If not, I apologize.

comment:12 by Marton Balint, 8 years ago

Please attach a small mp4 sample, so it can be checked whether it contains audio priming metadata or not. If it does not, then it is possible that ProTools/Nuendo are using some default (implicit) priming, which happens to work by luck.

comment:13 by jwilhelmsson, 8 years ago

Sorry for the delay.

I have to say that it sounds unlikely that there would be some fixed value since I'm getting different values for different files, and identical results for extracting from mp4/mov containers and converting from standalone aac files. But if I knew the reason for the differences I wouldn't have opened this ticket in the first place, so...

I am attaching a zip with six movie files. There's one from each of the different offset errors (the movie, "g", and the two TV series "td" and "tj"). Since I noticed that ffmpeg rewrites some of the information in the beginning of the file I've also cut a piece of each original file with dd, matching the number of bytes in the ffmpeg-cut files. The dd files are not playable, but I figured it might be worth something to have the original untouched data. The zip file also contains a text file with the ffmpeg command lines and outputs, and the dd command lines (no useful output).

by jwilhelmsson, 8 years ago

Attachment: cut_movie_files.zip added

in reply to:  13 comment:14 by Carl Eugen Hoyos, 8 years ago

Replying to jwilhelmsson:

The dd files are not playable

Use tools/qt-faststart before using dd to make them playable, and try to avoid uploading files made with FFmpeg.

comment:15 by jwilhelmsson, 8 years ago

Weird, for two of the files qt-faststart fails (see below). But I tried extracting larger chunks and managed to get playable files from all three without including too much of the actual content (to avoid the wrath of our client).

qt-faststart g_org.mp4 g_fs.mp4
ftyp          0 24
moov         24 2593869
skip    2593893 10248
mdat    2604141 1974368143
last atom in file was not a moov atom

qt-faststart td101_org.mp4 td101_fs.mp4
ftyp          0 24
moov         24 292864
mdat     292888 92353661
free   92646549 59
last atom in file was not a moov atom

qt-faststart tj103_org.mov tj103_fs.mov
ftyp          0 20
wide         20 8
mdat         28 204441012
moov  204441040 874408
 patching stco atom...
 patching stco atom...
 patching stco atom...
 writing ftyp atom...
 writing moov atom...
 copying rest of file...

by jwilhelmsson, 8 years ago

in reply to:  15 comment:16 by Carl Eugen Hoyos, 8 years ago

Replying to jwilhelmsson:

Weird, for two of the files qt-faststart fails (see below).

They already allowed faststart before.
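The qt-faststart listings above show why: in g_org.mp4 and td101_org.mp4 the moov atom already comes before mdat, while tj103_org.mov has moov last and is therefore rewritten. The check can be sketched in Python as below. This is a minimal illustration, not qt-faststart's own code; the function names `list_top_level_atoms` and `moov_precedes_mdat` are hypothetical, and the sketch assumes the usual atom header layout of a 32-bit big-endian size followed by a four-character type.

```python
import io
import struct


def list_top_level_atoms(stream):
    """Yield (type, offset, size) for each top-level atom of an MP4/MOV stream."""
    offset = 0
    while True:
        header = stream.read(8)
        if len(header) < 8:
            break  # end of data (or a truncated header)
        size, kind = struct.unpack(">I4s", header)
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            extended = stream.read(8)
            if len(extended) < 8:
                break
            size = struct.unpack(">Q", extended)[0]
        elif size == 0:
            # A size of 0 means the atom runs to the end of the file.
            size = stream.seek(0, io.SEEK_END) - offset
        if size < 8:
            break  # malformed atom; stop instead of looping forever
        yield kind.decode("latin-1"), offset, size
        offset += size
        stream.seek(offset)


def moov_precedes_mdat(atoms):
    """Return True if the file is already laid out for fast start."""
    names = [name for name, _, _ in atoms]
    return ("moov" in names and "mdat" in names
            and names.index("moov") < names.index("mdat"))
```

To try it on a real file, open the file in binary mode and pass `list(list_top_level_atoms(f))` to `moov_precedes_mdat`.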

What is the delay (is this the correct word?) that you observe with FFmpeg and what is the expected delay for the three files?

comment:17 by jwilhelmsson, 8 years ago

I figured.

I don't know what the expected delay is, or if there is any. I only know the difference between what ffmpeg produces when extracting and converting and what Nuendo/ProTools/QuickTime produce.

As I wrote, the source material we receive is mp4 and/or mov files, but when we work with it we need to have the audio (in wav format) on a separate track so I need to extract and convert the aac audio streams. During this conversion, ffmpeg seems to insert silence in the beginning of the files, as follows (all numbers are approximate and measured by marking areas in Audacity):

The "tj" files get 44 ms delay (about 2100 samples, which sounds close to the 2112 value).

The "td" files get 108 ms delay.

The "g" file gets 32 ms delay *at the start*, but 44 ms overall. This is easy to see since it starts with color bars and a tone which should start at the very beginning and has a clean end. Comparing the end points you get 44 ms, but the start is for some reason shorter.
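For reference, the millisecond figures above can be converted to sample counts with a short Python sketch. It assumes a 48000 Hz output rate, as shown in the console output for g_7s.aac; the helper names are made up for illustration, and the inputs are the approximate Audacity measurements, not exact values.

```python
SAMPLE_RATE_HZ = 48000  # assumption: taken from the "48000 Hz" console output


def ms_to_samples(ms, rate=SAMPLE_RATE_HZ):
    """Convert a duration in milliseconds to a whole number of samples."""
    return round(ms * rate / 1000)


def samples_to_ms(samples, rate=SAMPLE_RATE_HZ):
    """Convert a sample count to milliseconds."""
    return samples * 1000 / rate


# Approximate delays reported in this ticket.
for label, delay_ms in [("tj", 44), ("td", 108), ("g (start only)", 32)]:
    print(f"{label}: {delay_ms} ms = {ms_to_samples(delay_ms)} samples")
```

At 48 kHz, 44 ms corresponds to exactly 2112 samples and 108 ms to 5184 samples, which is consistent with the "about 2100" and "about 5200" estimates in the description.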

And as I've written, these delays appear regardless of whether the audio is extracted and converted from the original file in one step, with ffmpeg -i source.mp4 -c:a pcm_s24le -ar 48k out.wav, or if the audio is first extracted (ffmpeg -i source.mp4 -c:a copy source.aac) and then converted (ffmpeg -i source.aac -c:a pcm_s24le -ar 48k out.wav).
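As an alternative to marking areas by hand in Audacity, the leading silence of a converted WAV file could be counted with a sketch like the one below, using Python's standard wave module. The function name `leading_silent_frames` is hypothetical. It assumes signed PCM (such as pcm_s24le), where digital silence is all-zero bytes, and it only counts exactly-zero frames, so a noisy or dithered lead-in would not be detected. Older Python versions may also reject WAV files whose header uses the extensible format.

```python
import wave


def leading_silent_frames(path, chunk_frames=4096):
    """Return (silent_frames, sample_rate) for the start of a signed-PCM WAV file.

    A frame counts as silent only if every byte in it is zero.
    """
    with wave.open(path, "rb") as wav:
        frame_bytes = wav.getsampwidth() * wav.getnchannels()
        rate = wav.getframerate()
        count = 0
        while True:
            data = wav.readframes(chunk_frames)
            if not data:
                break  # reached the end: the whole file is silent
            for i in range(0, len(data), frame_bytes):
                if any(data[i:i + frame_bytes]):
                    return count, rate  # first non-silent frame found
                count += 1
        return count, rate
```

Running it on both the ffmpeg output and the QuickTime output of the same source and subtracting the two counts would give the offset in samples; dividing by the returned rate gives seconds.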

comment:18 by Carl Eugen Hoyos, 8 years ago

Keywords: mov added; pcm removed
Version: unspecified → git-master

comment:19 by Balling, 7 months ago

The other file is HE-AAC. Apple uses 5186 samples of priming delay in that case, not 2112. Out of the 5186 you have 2 * 2112 samples of normal delay and then 2 * 481 samples of SBR decoder delay. Together they add up to 5186. There was a related decoding bug in ffmpeg, see commit 5114ce1e2a4c71ddf4971ad3cf9bd43ae16571c3.

Even then, FFmpeg assumes an SBR delay of 960 samples, not 481. Because of that it still removes too much: 6144 samples instead of 5186. Indeed, 6144 - 2 * 2112 = 2 * 960.

So this is a bug about SBR delay.
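The arithmetic in this comment can be checked with a few lines of Python. This is only a restatement of the numbers quoted above, not FFmpeg code. The function name `he_aac_priming` is hypothetical, and the factor of two is taken from the comment's own "2 * 2112" and "2 * 481" terms.

```python
AAC_CORE_PRIMING = 2112  # core AAC priming delay in samples, as quoted above


def he_aac_priming(sbr_delay, core_priming=AAC_CORE_PRIMING):
    """Total HE-AAC priming in output samples: both terms are doubled."""
    return 2 * core_priming + 2 * sbr_delay


apple_priming = he_aac_priming(481)   # value attributed to Apple
ffmpeg_priming = he_aac_priming(960)  # value attributed to FFmpeg
excess = ffmpeg_priming - apple_priming

print(apple_priming, ffmpeg_priming, excess)
```

With these inputs the two totals are 5186 and 6144 samples, so by this reckoning FFmpeg would trim 958 samples more than Apple's figure.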
