Opened 10 years ago
Closed 9 years ago
#3842 closed enhancement (fixed)
AAC (mp4a) DASH takes abnormally long to detect (~4 - 6s)
Reported by: | viperfx | Owned by: | |
---|---|---|---|
Priority: | wish | Component: | avformat |
Version: | git-master | Keywords: | mov |
Cc: | nfxjfg@googlemail.com | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no |
Description
Summary of the bug:
It takes roughly 4s for FFmpeg to detect the codec of an mp4a audio stream in a DASH container when the audio is about 3 minutes long. The audio stream is from YouTube. I have checked the location of the moov atom and it appears to be at the front of the file. I have spoken with devs on #ffmpeg-devel; they confirmed the same issue and agreed that detection should not take this long.
Here is the output using boxdumper: http://tny.cz/571eb4af
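The moov position check mentioned in the description can be approximated without boxdumper. The following is a minimal sketch, assuming a seekable local copy of the file and the standard ISO base media box header layout (32-bit size, 4-byte type, optional 64-bit size). The names `list_top_level_boxes` and `make_box` are hypothetical, and the sample bytes are synthetic rather than taken from the attached file.

```python
import io
import struct


def list_top_level_boxes(stream):
    """Return (type, offset, size) for each top-level ISO BMFF box in a seekable stream."""
    boxes = []
    offset = 0
    while True:
        stream.seek(offset)
        header = stream.read(8)
        if len(header) < 8:
            break  # end of file (or truncated header)
        size, box_type = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type field.
            large = stream.read(8)
            if len(large) < 8:
                break
            size = struct.unpack(">Q", large)[0]
            header_len = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the file.
            size = stream.seek(0, io.SEEK_END) - offset
        if size < header_len:
            break  # corrupt size field; stop rather than loop forever
        boxes.append((box_type.decode("latin-1"), offset, size))
        offset += size
    return boxes


def make_box(box_type, payload=b""):
    """Build a synthetic box (hypothetical helper used only for this demo)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic stand-in for a real file: ftyp, then moov, then mdat.
sample = (
    make_box(b"ftyp", b"dash" + b"\x00" * 4)
    + make_box(b"moov", b"\x00" * 4)
    + make_box(b"mdat", b"\x00" * 10)
)
boxes = list_top_level_boxes(io.BytesIO(sample))
order = [box_type for box_type, _offset, _size in boxes]
moov_first = order.index("moov") < order.index("mdat")
```

For a real file, opening it with `open(path, "rb")` and passing the handle to `list_top_level_boxes` should show whether moov precedes the media data, and whether a sidx box is present.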
How to reproduce:
Obtain the youtube audio stream link using youtube-dl (https://github.com/rg3/youtube-dl)
% youtube-dl -f 140 -g YOUTUBE_URL
where YOUTUBE_URL is the url of any youtube video. Pick a music video of length 3-4mins so you can notice the ~4s delay. The youtube-dl command will print out a long URL string that you can input to FFmpeg or ffplay to notice the issue
% ffmpeg -i "STREAM_URL"
or
% ffplay "STREAM_URL"
My FFmpeg info:
ffmpeg version 2.2.1 Copyright (c) 2000-2014 the FFmpeg developers
  built on Aug 9 2014 10:03:55 with Apple LLVM version 5.1 (clang-503.0.40) (based on LLVM 3.4svn)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/2.2.1 --enable-shared --enable-pthreads --enable-gpl --enable-version3 --enable-nonfree --enable-hardcoded-tables --enable-avresample --enable-vda --cc=clang --host-cflags= --host-ldflags= --enable-libx264 --enable-libfaac --enable-libmp3lame --enable-libxvid --enable-ffplay --enable-libfdk-aac --enable-openssl
  libavutil 52. 66.100 / 52. 66.100
  libavcodec 55. 52.102 / 55. 52.102
  libavformat 55. 33.100 / 55. 33.100
  libavdevice 55. 10.100 / 55. 10.100
  libavfilter 4. 2.100 / 4. 2.100
  libavresample 1. 2. 0 / 1. 2. 0
  libswscale 2. 5.102 / 2. 5.102
  libswresample 0. 18.100 / 0. 18.100
  libpostproc 52. 3.100 / 52. 3.100
Output of a typical audio stream:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'STREAM_URL':
  Metadata:
    major_brand : dash
    minor_version : 0
    compatible_brands: iso6mp41
    creation_time : 2014-03-08 01:30:29
  Duration: 00:04:00.70, start: 0.000000, bitrate: 127 kb/s
    Stream #0:0(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 125 kb/s (default)
    Metadata:
      creation_time : 2014-03-08 01:30:29
      handler_name : SoundHandler
Change History (16)
comment:1 by , 10 years ago
Keywords: | mov added; DASH mp4a aac removed |
---|
comment:2 by , 10 years ago
Here is a sample audio file (it was too large for the attachment): https://www.sendspace.com/file/nmzcbi
Yes, problem is reproducible with the current FFmpeg master.
comment:3 by , 10 years ago
I tested the following:
$ time ffmpeg -i videoplayback.mp4
ffmpeg version N-65479-g7117547 Copyright (c) 2000-2014 the FFmpeg developers
  built on Aug 10 2014 13:36:46 with gcc 4.7 (SUSE Linux)
  configuration: --enable-gpl
  libavutil 54. 0.100 / 54. 0.100
  libavcodec 56. 0.100 / 56. 0.100
  libavformat 56. 0.100 / 56. 0.100
  libavdevice 56. 0.100 / 56. 0.100
  libavfilter 5. 0.100 / 5. 0.100
  libswscale 3. 0.100 / 3. 0.100
  libswresample 1. 0.100 / 1. 0.100
  libpostproc 53. 0.100 / 53. 0.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'videoplayback.mp4':
  Metadata:
    major_brand : dash
    minor_version : 0
    compatible_brands: iso6mp41
    creation_time : 2014-04-25 14:50:56
  Duration: 00:03:35.03, start: 0.000000, bitrate: 128 kb/s
    Stream #0:0(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 125 kb/s (default)
    Metadata:
      creation_time : 2014-04-25 14:50:56
      handler_name : SoundHandler
At least one output file must be specified
real 0m0.011s
user 0m0.007s
sys 0m0.003s
The time spent on analyzing the mov file seems reasonable to me.
Playback with ffplay starts immediately afaict (see how the process time matches the duration):
$ time ffplay -i videoplayback.mp4 -autoexit
ffplay version N-65479-g7117547 Copyright (c) 2003-2014 the FFmpeg developers
  built on Aug 10 2014 13:36:46 with gcc 4.7 (SUSE Linux)
  configuration: --enable-gpl
  libavutil 54. 0.100 / 54. 0.100
  libavcodec 56. 0.100 / 56. 0.100
  libavformat 56. 0.100 / 56. 0.100
  libavdevice 56. 0.100 / 56. 0.100
  libavfilter 5. 0.100 / 5. 0.100
  libswscale 3. 0.100 / 3. 0.100
  libswresample 1. 0.100 / 1. 0.100
  libpostproc 53. 0.100 / 53. 0.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'videoplayback.mp4':f=0/0
  Metadata:
    major_brand : dash
    minor_version : 0
    compatible_brands: iso6mp41
    creation_time : 2014-04-25 14:50:56
  Duration: 00:03:35.03, start: 0.000000, bitrate: 128 kb/s
    Stream #0:0(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 125 kb/s (default)
    Metadata:
      creation_time : 2014-04-25 14:50:56
      handler_name : SoundHandler
215.02 M-A: 0.000 fd= 0 aq= 0KB vq= 0KB sq= 0B f=0/0
real 3m35.104s
user 0m4.030s
sys 0m2.981s
How can I reproduce the issue?
comment:4 by , 10 years ago
Check how much data it reads when opening the file.
The issue is that it seems to read the whole file, which ruins network performance.
comment:5 by , 10 years ago
Replying to viperfx:
configuration: --prefix=/usr/local/Cellar/ffmpeg/2.2.1 --enable-shared --enable-pthreads --enable-gpl --enable-version3 --enable-nonfree --enable-hardcoded-tables --enable-avresample --enable-vda --cc=clang --host-cflags= --host-ldflags= --enable-libx264 --enable-libfaac --enable-libmp3lame --enable-libxvid --enable-ffplay --enable-libfdk-aac --enable-openssl
Unrelated: --enable-vda and --enable-ffplay do not work the way you seem to believe these options work, please remove them (configure will silently ignore those options if SDL or vda are not available, if they are available, ffplay and vda will be enabled automatically). I also believe it is better to remove --host-cflags= --host-ldflags= since they might lead to confusion.
Is --cc=clang necessary? It looks redundant to me.
Most users prefer libfdk over libfaac, is there a particular reason why you enable libfaac?
I am curious: Did you test --enable-hardcoded-tables? (I never did.) Does it have any measurable effect?
comment:6 by , 10 years ago
FFmpeg claims it did not read the whole file (less than one quarter), how can I reproduce that this is not correct?
$ ffmpeg -loglevel debug -i videoplayback.mp4
ffmpeg version N-65479-g7117547 Copyright (c) 2000-2014 the FFmpeg developers
  built on Aug 10 2014 13:36:46 with gcc 4.7 (SUSE Linux)
  configuration: --enable-gpl
  libavutil 54. 0.100 / 54. 0.100
  libavcodec 56. 0.100 / 56. 0.100
  libavformat 56. 0.100 / 56. 0.100
  libavdevice 56. 0.100 / 56. 0.100
  libavfilter 5. 0.100 / 5. 0.100
  libswscale 3. 0.100 / 3. 0.100
  libswresample 1. 0.100 / 1. 0.100
  libpostproc 53. 0.100 / 53. 0.100
Splitting the commandline.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument 'debug'.
Reading option '-i' ... matched as input file with argument 'videoplayback.mp4'.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option loglevel (set logging level) with argument debug.
Successfully parsed a group of options.
Parsing a group of options: input file videoplayback.mp4.
Successfully parsed a group of options.
Opening an input file: videoplayback.mp4.
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x33f4b20] Format mov,mp4,m4a,3gp,3g2,mj2 probed with size=2048 and score=100
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x33f4b20] ISO: File Type Major Brand: dash
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x33f4b20] Before avformat_find_stream_info() pos: 3374213 bytes read:720896 seeks:21
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x33f4b20] All info found
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x33f4b20] After avformat_find_stream_info() pos: 4796 bytes read:753664 seeks:22 frames:1
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'videoplayback.mp4':
  Metadata:
    major_brand : dash
    minor_version : 0
    compatible_brands: iso6mp41
    creation_time : 2014-04-25 14:50:56
  Duration: 00:03:35.03, start: 0.000000, bitrate: 128 kb/s
    Stream #0:0(und), 1, 1/44100: Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 125 kb/s (default)
    Metadata:
      creation_time : 2014-04-25 14:50:56
      handler_name : SoundHandler
Successfully opened the file.
At least one output file must be specified
[AVIOContext @ 0x33f3be0] Statistics: 753664 bytes read, 22 seeks
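To put the AVIOContext statistics above in proportion, the byte and seek counts can be pulled out of the debug log and compared with the file size. This is only a sketch: `parse_avio_stats` is a hypothetical helper, and the file size is an estimate derived from the reported duration and bitrate (215.03 s at 128 kb/s), not a measured value.

```python
import re

# Matches the summary line ffmpeg prints at -loglevel debug, as seen in the log above.
STATS_RE = re.compile(r"Statistics: (\d+) bytes read, (\d+) seeks")


def parse_avio_stats(log_text):
    """Return (bytes_read, seeks) from a debug log, or None if no summary line is found."""
    match = STATS_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


log_line = "[AVIOContext @ 0x33f3be0] Statistics: 753664 bytes read, 22 seeks"
bytes_read, seeks = parse_avio_stats(log_line)

# Assumption: file size estimated from duration * bitrate, since the real size is not in the log.
estimated_size = int(215.03 * 128_000 / 8)
fraction_read = bytes_read / estimated_size
```

Under that estimate the fraction read comes out below one quarter, consistent with the claim in this comment. The 22 seeks matter more than the byte count when the input is an HTTP URL, because each seek there may cost a new request.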
comment:7 by , 10 years ago
Note that loading gets fast with fflags=+ignidx.
According to Paranoialmaniac on IRC, we need to support the "sidx" atom, which apparently contains an index.
comment:8 by , 10 years ago
Or is the issue that FFmpeg seeks to the end when opening the file?
(Sorry but imo this isn't obvious at all from the original report.)
follow-up: 11 comment:10 by , 10 years ago
I should note that I am compiling FFmpeg for iOS, so the flags I posted are slightly different from what I have configured for the iOS app.
Snippet of the configure flags.
./configure --disable-programs --disable-shared --enable-static --enable-pic --enable-small --enable-openssl ${DEBUG_CONFIG_ARGS} \
  --disable-decoders --enable-decoder=aac --enable-decoder=h264 --enable-decoder=vorbis \
  --disable-encoders --enable-encoder=aac \
  --disable-demuxers --enable-demuxer=aac --enable-demuxer=mov --enable-demuxer=matroska --enable-demuxer=h264 \
  --disable-muxers --enable-muxer=mov --enable-muxer=mp4 --enable-muxer=hls --enable-muxer=h264 \
  --disable-filters --disable-doc
As gjdfgh said, I received a suggestion to disable indexing to make the seeking faster. It does indeed make the file open faster, almost instantly over HTTP in fact. However, the issue is that FFmpeg then thinks the duration of the file is 10s. It does not seem to detect the full duration. Therefore, from a UI perspective I cannot seek the audio track, since I do not know the full duration of the file. The good thing is that the audio plays till the end of the file.
cehoyos: The issue is that it does seek to the end of the file when opening. This is evident when using ffmpeg -i to test with an HTTP URL rather than a local file. The avformat_open_input call takes more than ~4s.
follow-ups: 12 13 comment:11 by , 10 years ago
Replying to viperfx:
./configure --disable-programs --disable-shared --enable-static --enable-pic
Unrelated: Do you know if --enable-pic has any advantages or disadvantages on iOS?
(We have discussed this internally and no clear consensus was reached iirc.)
--disable-shared --enable-static is the default, you may remove it to get a shorter configure line.
--disable-decoders --enable-decoder=aac --enable-decoder=h264 --enable-decoder=vorbis
You can use --enable-decoder=aac,h264,vorbis to make your configure line more readable (same for encoders and demuxers and muxers)
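The comma-list suggestion above can be illustrated with a small script that rewrites a repetitive configure line. This is a sketch only: `collapse_component_flags` is a hypothetical helper, not part of FFmpeg's build system, and it assumes the comma form applies to the decoder, encoder, demuxer and muxer options named in this comment.

```python
import shlex

# Component kinds for which this comment says the comma-separated form works.
COMPONENT_FLAGS = {
    "--enable-decoder",
    "--enable-encoder",
    "--enable-demuxer",
    "--enable-muxer",
}


def collapse_component_flags(args):
    """Merge repeated --enable-<kind>=X options into one comma-separated option each."""
    grouped = {}  # flag name -> values in first-seen order
    layout = []   # plain strings, or ("group", flag) markers at the first occurrence
    for arg in args:
        name, sep, value = arg.partition("=")
        if sep and name in COMPONENT_FLAGS:
            if name not in grouped:
                grouped[name] = []
                layout.append(("group", name))
            grouped[name].append(value)
        else:
            layout.append(arg)
    result = []
    for item in layout:
        if isinstance(item, tuple):
            flag = item[1]
            result.append(flag + "=" + ",".join(grouped[flag]))
        else:
            result.append(item)
    return result


original = shlex.split(
    "--disable-decoders --enable-decoder=aac --enable-decoder=h264 "
    "--enable-decoder=vorbis --disable-encoders --enable-encoder=aac"
)
shortened = collapse_component_flags(original)
```

Each merged option is emitted at the position of its first occurrence, so it still follows the matching --disable-decoders or --disable-encoders switch.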
comment:12 by , 10 years ago
Replying to cehoyos:
Replying to viperfx:
./configure --disable-programs --disable-shared --enable-static --enable-pic
Unrelated: Do you know if --enable-pic has any advantages or disadvantages on iOS?
(We have discussed this internally and no clear consensus was reached iirc.)
--disable-shared --enable-static is the default, you may remove it to get a shorter configure line.
--disable-decoders --enable-decoder=aac --enable-decoder=h264 --enable-decoder=vorbis
You can use --enable-decoder=aac,h264,vorbis to make your configure line more readable (same for encoders and demuxers and muxers)
Ah okay, thanks. No, I am afraid I am fairly new to FFmpeg and static compiling libs in general for iOS. I am the one who should be asking you for tips it seems :)
comment:13 by , 10 years ago
Replying to cehoyos:
Unrelated: Do you know if --enable-pic has any advantages or disadvantages on iOS?
(We have discussed this internally and no clear consensus was reached iirc.)
It is not completely impossible that --enable-pic has a huge performance impact without having any advantage at all. But I don't know if anybody tested this.
comment:14 by , 10 years ago
Priority: | normal → wish |
---|---|
Status: | new → open |
Type: | defect → enhancement |
Version: | 2.2.4 → git-master |
Afaiu, this is an enhancement request to support reading the sidx box as defined in 8.16.3 of ISO 14496-12.
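For readers unfamiliar with the sidx (segment index) box, the following sketch parses its payload and sums the subsegment durations, which is the kind of information that would let a demuxer know the total duration without seeking through every fragment. It illustrates the box layout only and is not the code from the eventual FFmpeg change; `parse_sidx` is a hypothetical name and the sample payload is synthetic.

```python
import struct


def parse_sidx(payload):
    """Parse a sidx box payload (the bytes after the 8-byte size/type header)."""
    version = payload[0]  # FullBox: 1 byte version, 3 bytes flags
    reference_id, timescale = struct.unpack_from(">II", payload, 4)
    if version == 0:
        earliest_pts, first_offset = struct.unpack_from(">II", payload, 12)
        pos = 20
    else:
        earliest_pts, first_offset = struct.unpack_from(">QQ", payload, 12)
        pos = 28
    _reserved, reference_count = struct.unpack_from(">HH", payload, pos)
    pos += 4
    references = []
    for _ in range(reference_count):
        type_and_size, duration, sap_info = struct.unpack_from(">III", payload, pos)
        pos += 12
        references.append({
            "reference_type": type_and_size >> 31,        # 1 = points to another sidx
            "referenced_size": type_and_size & 0x7FFFFFFF,  # bytes covered by this entry
            "subsegment_duration": duration,               # in timescale units
            "starts_with_sap": sap_info >> 31,
        })
    return {
        "reference_id": reference_id,
        "timescale": timescale,
        "earliest_presentation_time": earliest_pts,
        "first_offset": first_offset,
        "references": references,
    }


# Synthetic version-0 payload: timescale 44100, two subsegments of 10 s and 5 s.
payload = struct.pack(">IIIIIHH", 0, 1, 44100, 0, 0, 0, 2)
payload += struct.pack(">III", 1000, 441000, 0x90000000)
payload += struct.pack(">III", 2000, 220500, 0x90000000)

index = parse_sidx(payload)
total_seconds = sum(r["subsegment_duration"] for r in index["references"]) / index["timescale"]
```

The `referenced_size` values also give the byte range of each subsegment relative to `first_offset`, which is what makes seeking possible without scanning the fragments themselves.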
comment:15 by , 10 years ago
Cc: | added |
---|
Making this bug report a feature request is pushing it a bit...
comment:16 by , 9 years ago
Resolution: | → fixed |
---|---|
Status: | open → closed |
Implemented by Rodger Combs in 4ab56667594842283dc5ae07f0daba2a2cb4d3af
Did you upload a sample?
Is the problem reproducible with current FFmpeg git head?
Please do not use external resources (except for large sample files), always post all necessary information here in the ticket.