Opened 6 months ago
Last modified 7 weeks ago
#11018 new defect
Bug: Exporting AAC from a video file with a specific duration doesn't give the right duration
Reported by: | Imprevisible | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | ffmpeg |
Version: | unspecified | Keywords: | segment aac seek seeking |
Cc: | Imprevisible, MasterQuestionable | Blocked By: | |
Blocking: | Reproduced by developer: | no | |
Analyzed by developer: | no |
Description
Summary of the bug:
I want to extract a specific range of seconds from a video file to AAC, to create custom HLS files. When I export 5 s, the output file is never 5 s: maybe less, maybe more, but never exactly 5 s.
How to reproduce:
% ffmpeg -hide_banner -loglevel error -i "/media/Dockarr/downloads-vpn/media/Séries/The IT Crowd/Season 1/The It Crowd S01e01 - Yesterday's Jam.m4v" -ss 0:00:00 -to 0:00:05 -c:a aac -map 0:a:0 -ac 2 -vn -report test.aac
ffmpeg version 4.4.2-0ubuntu0.22.04.1
built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
Attachments (3)
Change History (34)
by , 6 months ago
Attachment: | ffmpeg-20240516-201050.log added |
---|
comment:1 by , 6 months ago
For information, I tried both -to and -t to set the end time/duration. I tried with and without accurate seeking, so with -ss as an input and as an output parameter, and it happens for all my video files, whatever the codec. Maybe it's my ffmpeg version?
comment:2 by , 6 months ago
͏ Lossy audio codecs tend to impose limit on the number of samples:
͏ Which must be a multiple of the Samples Per Frame.
͏ For AAC it's constant 1,024 SPF. [ Mostly. https://gitlab.com/mbunkus/mkvtoolnix/-/issues/2031 ]
͏ See also:
͏ https://stackoverflow.com/questions/59173435#59332275
͏ https://xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-260001.3.2
͏ https://ffmpeg.org/ffmpeg-codecs.html#libopus-1
͏ https://opus-codec.org/docs/html_api/group__opusencoder.html#ga88621a963b809ebfc27887f13518c966
͏ Also you might be doing pointless re-encoding with your command: ͏"-c copy" preferable.
͏ (and seemingly confused ͏"-map" directive)
comment:3 by , 6 months ago
Cc: | added |
---|
comment:4 by , 6 months ago
It is impossible to get a perfect 5 seconds unless you use the MP4 container with an edit list, which records the size of the initial encoder delay and the size of the remainder at the end of the audio. Even then, ffmpeg does not support applying the remainder, so typically you cannot get 5 seconds with AAC anyway. See #10477 and many others.
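To make the constraint concrete, here is a minimal sketch of the frame arithmetic. It assumes the figures quoted elsewhere in this ticket (1,024 samples per AAC frame, 48,000 Hz) and ignores encoder delay entirely; the helper name `nearest_frame_durations` is made up for illustration.

```python
from fractions import Fraction

AAC_SAMPLES_PER_FRAME = 1024  # AAC frame size quoted in comment:2 (assumed constant here)
SAMPLE_RATE = 48_000          # sampling rate of the reporter's file


def nearest_frame_durations(target_seconds, spf=AAC_SAMPLES_PER_FRAME, rate=SAMPLE_RATE):
    """Return (frames, seconds) just below and just above the target duration.

    A raw ADTS stream holds whole frames only, so the achievable
    durations are multiples of spf / rate.
    """
    target_samples = Fraction(target_seconds) * rate
    frames_below = int(target_samples // spf)
    below = (frames_below, float(Fraction(frames_below * spf, rate)))
    if frames_below * spf == target_samples:
        # The target happens to fall exactly on a frame boundary.
        return below, below
    frames_above = frames_below + 1
    above = (frames_above, float(Fraction(frames_above * spf, rate)))
    return below, above


below, above = nearest_frame_durations(5)
print(below)  # 234 frames, slightly under 5 s
print(above)  # 235 frames, slightly over 5 s
```

Under these assumptions a 5 s request falls between two frame boundaries, so a bare AAC stream can only land a few milliseconds short or long.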
comment:5 by , 6 months ago
Isn't it possible to use some alternative approach? I can extract precise video segments, so maybe I can cut the video and extract the audio from this pre-segmented file? Also, I use AAC because it's for HLS streaming; maybe another compatible audio codec is precise enough?
comment:6 by , 6 months ago
Based on this: https://datatracker.ietf.org/doc/html/draft-pantos-http-live-streaming-20#section-3.4
I can't use a lot of codecs:
Supported Packed Audio formats are AAC with ADTS framing [ISO_13818_7]; MP3 [ISO_13818_3]; AC-3 [AC_3]; and Enhanced AC-3 [AC_3].
As far as I can see on the Apple website, FLAC is also possible, but I don't see the SPF of each of them. For information, this ffmpeg command is generated by a Python script, which selects the right audio index and the right parameters.
comment:7 by , 6 months ago
͏ For reference:
͏ 1,024 * 1/48,000 ≈ 0.021333 s
͏ .
͏ Such difference mostly doesn't matter.
͏ The limitation is bound to the format: incorrigible.
͏ SPF may be variable, as previously indicated.
͏ And probably doesn't matter for FLAC: as it's lossless.
comment:8 by , 6 months ago
1024 is for our ffmpeg encoder. Apple encoder does not use 1024 samples for priming. It uses 2112. HE uses even more samples for priming.
Remainder is not related to this once again.
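As a rough illustration of the figures above, the priming (encoder delay) lengths translate into the following durations at 48,000 Hz, the sampling rate used in comment:7. This is only arithmetic on the numbers quoted here, not a description of how any encoder signals its delay; the labels and the helper `priming_ms` are illustrative.

```python
SAMPLE_RATE = 48_000  # Hz, as used in comment:7

# Priming sample counts as stated in this comment.
PRIMING_SAMPLES = {
    "ffmpeg AAC encoder": 1024,
    "Apple AAC encoder": 2112,
}


def priming_ms(samples, rate=SAMPLE_RATE):
    """Duration of the priming samples in milliseconds."""
    return samples * 1000 / rate


for name, samples in PRIMING_SAMPLES.items():
    print(f"{name}: {samples} samples = {priming_ms(samples):.3f} ms")
```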
And probably doesn't matter for FLAC: as it's lossless
Alas, I only saw like one anime using it. Original uses Dolby TrueHD, anyway. And BTW, if you can use TrueHD, do so. Our decoder is finally lossless in all 24 bits mandated by TrueHD.
comment:9 by , 6 months ago
͏ 1,024 refers to the SPF.
͏ I prefer FLAC and don't quite understand the necessity of things like Dolby TrueHD.
͏ And recently started to consider lossy and lossless codecs essentially the same thing. [ See <denoise>. ]
comment:11 by , 6 months ago
comment:12 by , 6 months ago
I do not think you need to use Google. I described this issue ad nauseam: #7828.
comment:13 by , 6 months ago
MasterQuestionable, can you help review https://patchwork.ffmpeg.org/project/ffmpeg/patch/20220128052107.1678032-1-kode54@gmail.com/
Without this fix, the AAC fix is kind of useless, since previous-gen audio was not fixed either.
comment:14 by , 6 months ago
͏ Many info I post there is much for self reference. (merely publicized in addition, believing which helpful)
͏ Hardly specifically targeting anybody.
͏ I shall review later.
͏ Related: https://trac.ffmpeg.org/ticket/2325#comment:20
͏ I also noted some anomaly (quality-wise) with FFmpeg's AAC implementation: regardless I don't much use.
͏ (and probably hardly sensitive enough to really notice...)
comment:15 by , 6 months ago
I did some tests, and I can see that only AAC and MP3 don't seek properly. I'm doing HLS streaming, and I need to generate precise audio files so the audio doesn't get out of sync. I'm going to try to play all these audio files instead, but I'm not sure it will be possible.

AAC:
command: ffmpeg -y -ss 00:00:00 -to 00:00:05 -i "../Téléchargements/The It Crowd S01e01 - Yesterday's Jam.m4v" -vn test.aac
mediainfo:
General
Complete name : ../ffmpeg_test/test.aac
Format : ADTS
Format/Info : Audio Data Transport Stream
File size : 83.6 KiB
Overall bit rate mode : Variable
Audio
Format : AAC LC
Format/Info : Advanced Audio Codec Low Complexity
Format version : Version 4
Codec ID : 2
Bit rate mode : Variable
Channel(s) : 2 channels
Channel layout : L R
Sampling rate : 48.0 kHz
Frame rate : 46.875 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 83.6 KiB (100%)

MP3:
command: ffmpeg -y -ss 00:00:00 -to 00:00:05 -i "../Téléchargements/The It Crowd S01e01 - Yesterday's Jam.m4v" -vn test.mp3
mediainfo:
General
Complete name : ../ffmpeg_test/test.mp3
Format : MPEG Audio
File size : 79.1 KiB
Duration : 5 s 40 ms
Overall bit rate mode : Constant
Overall bit rate : 128 kb/s
Writing library : Lavf61.3.103
major_brand : mp42
minor_version : 512
compatible_brands : isomiso2mp41
Audio
Format : MPEG Audio
Format version : Version 1
Format profile : Layer 3
Format settings : Joint stereo
Duration : 5 s 40 ms
Bit rate mode : Constant
Bit rate : 128 kb/s
Channel(s) : 2 channels
Sampling rate : 48.0 kHz
Frame rate : 41.667 FPS (1152 SPF)
Compression mode : Lossy
Stream size : 78.8 KiB (100%)

OGG:
command: ffmpeg -y -ss 00:00:00 -to 00:00:05 -i "../Téléchargements/The It Crowd S01e01 - Yesterday's Jam.m4v" -vn test.ogg
mediainfo:
General
Complete name : ../ffmpeg_test/test.ogg
Format : Ogg
File size : 62.3 KiB
Duration : 5 s 0 ms
Overall bit rate mode : Variable
Overall bit rate : 102 kb/s
Writing application : Lavc61.5.104 libvorbis
creation_time : 2020-12-08T23:31:48.000000Z
handler_name : Stereo
vendor_id : [0][0][0][0]
major_brand : mp42
minor_version : 512
compatible_brands : isomiso2mp41
Audio
ID : 3624949711 (0xD81057CF)
Format : Vorbis
Format settings, Floor : 1
Duration : 5 s 0 ms
Bit rate mode : Variable
Bit rate : 112 kb/s
Channel(s) : 2 channels
Sampling rate : 48.0 kHz
Compression mode : Lossy
Stream size : 68.4 KiB
Writing library : Lavf61.3.103
Language : English

WAV:
command: ffmpeg -y -ss 00:00:00 -to 00:00:05 -i "../Téléchargements/The It Crowd S01e01 - Yesterday's Jam.m4v" -vn test.wav
mediainfo:
General
Complete name : ../ffmpeg_test/test.wav
Format : Wave
Format settings : PcmWaveformat
File size : 938 KiB
Duration : 5 s 0 ms
Overall bit rate mode : Constant
Overall bit rate : 1 536 kb/s
Writing application : Lavf61.3.103
Audio
Format : PCM
Format settings : Little / Signed
Codec ID : 1
Duration : 5 s 0 ms
Bit rate mode : Constant
Bit rate : 1 536 kb/s
Channel(s) : 2 channels
Sampling rate : 48.0 kHz
Bit depth : 16 bits
Stream size : 938 KiB (100%)

FLAC:
command: ffmpeg -y -ss 00:00:00 -to 00:00:05 -i "../Téléchargements/The It Crowd S01e01 - Yesterday's Jam.m4v" -vn test.flac
mediainfo:
General
Complete name : ../ffmpeg_test/test.flac
Format : FLAC
Format/Info : Free Lossless Audio Codec
File size : 916 KiB
Duration : 5 s 0 ms
Overall bit rate mode : Variable
Overall bit rate : 1 501 kb/s
Writing application : Lavf61.3.103
major_brand : mp42
minor_version : 512
compatible_brands : isomiso2mp41
Audio
Format : FLAC
Format/Info : Free Lossless Audio Codec
Duration : 5 s 0 ms
Bit rate mode : Variable
Bit rate : 1 487 kb/s
Channel(s) : 2 channels
Channel layout : L R
Sampling rate : 48.0 kHz
Bit depth : 24 bits
Compression mode : Lossless
Stream size : 908 KiB (99%)
Writing library : Lavf61.3.103
MD5 of the unencoded content : 44ADFEF5FD32EBC45E5E9B71D8637F2B
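The frame rates reported by mediainfo follow directly from the sampling rate and the samples per frame, and the 5 s 40 ms MP3 duration turns out to be a whole number of 1,152-sample frames. A small sketch of that arithmetic, using only numbers from the listings above (the helper names are illustrative):

```python
from fractions import Fraction

SAMPLE_RATE = 48_000  # "Sampling rate : 48.0 kHz" in every listing


def frame_rate(spf, rate=SAMPLE_RATE):
    """Frames per second for a codec with `spf` samples per frame."""
    return rate / spf


def frames_in(duration_ms, spf, rate=SAMPLE_RATE):
    """Exact frame count (as a Fraction) for a duration in milliseconds."""
    return Fraction(duration_ms * rate, 1000 * spf)


print(frame_rate(1024))       # AAC: matches "46.875 FPS (1024 SPF)"
print(frame_rate(1152))       # MP3: rounds to "41.667 FPS (1152 SPF)"
print(frames_in(5040, 1152))  # reported MP3 duration, in frames
print(frames_in(5000, 1152))  # requested 5 s, in frames (not a whole number)
```

This does not explain why that particular number of MP3 frames was written; it only shows that the reported duration is frame-aligned while the requested 5 s is not.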
comment:16 by , 6 months ago
͏ Such segments shall be rejoined client-side.
͏ Try just slice using "-c copy" and let the downstream handle.
comment:17 by , 6 months ago
I tried things, and HLS.js won't stream the audio, if I don't force the format to adts... So I guess I'm fucked
follow-up: 19 comment:18 by , 6 months ago
͏ Similar library JS typically won't interfere the browser's media handling.
͏ The format support is usually decided by the browser.
͏ I believe MP3, Vorbis, FLAC, Opus shall be all supported by mainstream browsers.
͏ Probably you just have to put them into the appropriate container for the context:
͏ WebM: Vorbis, Opus
͏ MP4: MP3, FLAC
͏ .
͏ More info: https://github.com/richtr/NoSleep.js/issues/157#issuecomment-1529149759
͏ Regardless, first transcoding to (if not already in) AAC then slice with "-c copy" should work.
͏ Worth notice:
͏ The various media streams contained don't have to be of equal duration: in particular for HLS alike.
͏ .
͏ Though having the primary streams of unequal duration (after joining) tends to cause unspecified behavior.
comment:19 by , 6 months ago
Replying to MasterQuestionable:
͏ Worth notice:
͏ The various media streams contained don't have to be of equal duration: in particular for HLS alike.
͏ .
͏ Though having the primary streams of unequal duration (after joining) tends to cause unspecified behavior.
Sadly I have to. I'm generating all the segments dynamically with Python, so when I generate segment 2, I don't know the duration of segment 1. So if the first segment is 4.88 s, the second one will still start at 5 s, leaving a 0.12 s gap. That's for only two segments, but a segment is 5 s and a video file can be up to something like 3 hours, so that's hundreds of segments.
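One possible way around not knowing the earlier segments' durations is to derive every boundary from the segment index alone, in whole frames. The sketch below assumes the audio is encoded once with 1,024-sample frames at 48,000 Hz and then cut on frame boundaries; whether a given ffmpeg invocation cuts exactly on these frames is not verified here, and `segment_frame_range` and `segment_times` are hypothetical helpers.

```python
import math
from fractions import Fraction


def segment_frame_range(index, segment_seconds=5, spf=1024, rate=48_000):
    """Half-open frame range [start, end) for segment `index`.

    Boundaries depend only on the index, so segment N can be computed
    without measuring segments 0..N-1.
    """
    frames_per_segment = Fraction(segment_seconds * rate, spf)
    start = math.floor(index * frames_per_segment)
    end = math.floor((index + 1) * frames_per_segment)
    return start, end


def segment_times(index, segment_seconds=5, spf=1024, rate=48_000):
    """Start time and duration (seconds) of segment `index`."""
    start, end = segment_frame_range(index, segment_seconds, spf, rate)
    return float(Fraction(start * spf, rate)), float(Fraction((end - start) * spf, rate))


for i in range(3):
    print(i, segment_frame_range(i), segment_times(i))
```

With this scheme individual segments are 234 or 235 frames long rather than exactly 5 s, but each one starts where the previous one ends, so the per-segment error does not accumulate. The real durations could then be written into the playlist instead of a nominal 5 s.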
follow-up: 21 comment:20 by , 6 months ago
͏ The implication is minor duration variation between different ~ 5 s slices mostly doesn't matter.
͏ Not that you shall deliberately make such uneven streams without careful consideration.
comment:21 by , 6 months ago
Replying to MasterQuestionable:
͏ The implication is minor duration variation between different ~ 5 s slices mostly doesn't matter.
͏ Not that you shall deliberately make such uneven streams without careful consideration.
As I said, a gap of 0.12 s for only two segments is okay, but for 1440 of them, one after another, it's not the same: it adds up to a 172.8 s gap over the whole duration, because a .ts file can be exactly 5 s long.
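The accumulation described here is simple multiplication; a minimal sketch, assuming every segment is short by the same amount and the player places each segment at its nominal 5 s start (the function name is illustrative):

```python
from decimal import Decimal


def accumulated_gap(segment_count, nominal_s, actual_s):
    """Total shortfall in seconds when every segment is `actual_s` long
    but is scheduled as if it were `nominal_s` long."""
    return segment_count * (Decimal(nominal_s) - Decimal(actual_s))


# The example from this comment: 1440 segments of 4.88 s instead of 5 s.
print(accumulated_gap(1440, "5", "4.88"))

# For comparison: 234 whole 1,024-sample frames at 48,000 Hz is 4.992 s.
print(accumulated_gap(1440, "5", "4.992"))
```

Decimal is used only to keep the printed figures free of binary floating-point noise.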
follow-up: 23 comment:22 by , 6 months ago
͏ The misalignment of seeking by timestamp tends to vary with different timestamps.
͏ And for consecutive ones: tends to self-recover.
͏ Also, per aforementioned (web streaming) sticking to M2TS isn't necessary.
͏ And the 5 s assertion doesn't have much solid base.
comment:23 by , 6 months ago
Replying to MasterQuestionable:
͏ The misalignment of seeking by timestamp tends to vary with different timestamps.
͏ And for consecutive ones: tends to self-recover.
͏ Also, per aforementioned (web streaming) sticking to M2TS isn't necessary.
͏ And the 5 s assertion doesn't have much solid base.
Sadly, what you're saying is wrong, because all of the .aac files I generated are less than 5 s. Also, there's still either missing audio, or duplicated audio if the duration is more than 5 s (but that's never the case), so no, it's not self-recovering.
follow-up: 25 comment:24 by , 6 months ago
͏ So "-t 5 -ss $( 0, 5, 10, ... )" alike always result in < 5 s duration?
͏ "-c copy" here won't do re-encoding: merely slicing the packets/frames independently representable.
͏ You might have to "-map" only the interested audio to make things work properly.
͏ (other streams may interfere the slicing)
comment:25 by , 6 months ago
Replying to MasterQuestionable:
͏ So "-t 5 -ss $( 0, 5, 10, ... )" alike always result in < 5 s duration?
͏ "-c copy" here won't do re-encoding: merely slicing the packets/frames independently representable.
͏ You might have to "-map" only the interested audio to make things work properly.
͏ (other streams may interfere the slicing)
I map the right audio in my other commands; it's not shown here for readability, but I map the audio and I even set the number of audio channels. I tried -c copy: same issue. And yes, if I set -t to 5, it's always less than 5 s for AAC, always more for MP3, and exactly 5 s for FLAC, OGG and WAV.
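For reference, the kind of invocation discussed in this exchange could be generated from Python roughly as follows. This is a sketch, not the reporter's actual script: `build_slice_command` and the file names are hypothetical, and the options are only the ones already mentioned in this ticket (-ss, -t, -map, -vn, -c copy, -c:a aac, -ac).

```python
import shlex


def build_slice_command(source, output, start_s, duration_s, audio_index=0, copy=True):
    """Build an ffmpeg argument list that slices one audio stream.

    -ss/-t are placed before -i, i.e. used as input options.
    With copy=True the packets are copied, so the source must already
    be in the target codec; otherwise the audio is re-encoded to
    stereo AAC as in the original report.
    """
    cmd = [
        "ffmpeg", "-hide_banner", "-y",
        "-ss", str(start_s), "-t", str(duration_s),
        "-i", source,
        "-map", f"0:a:{audio_index}", "-vn",
    ]
    cmd += ["-c:a", "copy"] if copy else ["-c:a", "aac", "-ac", "2"]
    cmd.append(output)
    return cmd


# Segment 1 of a hypothetical pre-extracted AAC file, 5 s starting at 5 s.
cmd = build_slice_command("in.aac", "segment_001.aac", 5, 5)
print(shlex.join(cmd))
```

Passing the list to `subprocess.run` avoids shell quoting problems with file names that contain spaces or apostrophes, such as the one in this report.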
follow-up: 27 comment:26 by , 6 months ago
͏ Last resort:
͏ "-c copy" the entire audio stream into 1 independent file.
͏ Try the "-ss" "-t" etc. as either input/output options. [ See also: <Seeking> ]
comment:27 by , 6 months ago
Replying to MasterQuestionable:
͏ Last resort:
͏ "-c copy" the entire audio stream into 1 independent file.
͏ Try the "-ss" "-t" etc. as either input/output options.
As I said, same issue: the segment is less than 5 s...
follow-up: 29 comment:28 by , 6 months ago
͏ Likely there are issues with the seeking mechanism then.
͏ I may do further testing later.
͏ Would you upload a minimal sample AAC alleged?
comment:29 by , 6 months ago
Replying to MasterQuestionable:
͏ Likely there are issues with the seeking mechanism then.
͏ I may do further testing later.
͏ Would you upload a minimal sample AAC alleged?
Do you want the first minute of the audio in AAC?
comment:30 by , 6 months ago
͏ Preferably, that just enough to reproduce the problem. (10 or 15 s?)
͏ Upload to the ticket's attachment. (as "in.aac")
comment:31 by , 6 months ago
͏ https://trac.ffmpeg.org/raw-attachment/ticket/11018/in.aac
͏ (~ 304.3 KiB; AAC: 15.018 s, 48,000 Hz)
ffmpeg log