Opened 7 years ago
Closed 5 years ago
#3842 closed enhancement (fixed)
AAC (mp4a) DASH takes abnormally long to detect (~4 - 6s)
Reported by: | viperfx | Owned by: | |
---|---|---|---|
Priority: | wish | Component: | avformat |
Version: | git-master | Keywords: | mov |
Cc: | nfxjfg@googlemail.com | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no | | |
Description
Summary of the bug:
It takes roughly 4s for FFmpeg to detect the codec of an mp4a audio stream in the DASH container for an audio file about 3 minutes long. The audio stream is from YouTube. I have checked the location of the moov atom and it appears to be at the front of the file. I have spoken with devs on #ffmpeg-devel; they have confirmed the same issue and suggest that it should not take this long.
Here is the output using boxdumper: http://tny.cz/571eb4af
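For readers without boxdumper, the position of the moov atom can also be checked by walking the top-level boxes of the file. The following is a minimal sketch, not FFmpeg code: the helper names `iter_top_level_boxes` and `make_box` are hypothetical, and the sample bytes are a synthetic layout made up for illustration, not the file from this report.

```python
import io
import struct


def iter_top_level_boxes(stream):
    """Yield (type, offset, size) for each top-level ISO BMFF box in a seekable stream."""
    offset = 0
    while True:
        stream.seek(offset)
        header = stream.read(8)
        if len(header) < 8:
            return
        size, box_type = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:
            # A 64-bit "largesize" field follows the type field.
            ext = stream.read(8)
            if len(ext) < 8:
                return
            size = struct.unpack(">Q", ext)[0]
            header_len = 16
        elif size == 0:
            # The box extends to the end of the file.
            size = stream.seek(0, io.SEEK_END) - offset
        if size < header_len:
            return  # malformed box, stop instead of looping forever
        yield box_type.decode("latin-1"), offset, size
        offset += size


def make_box(box_type, payload):
    """Build a box with a 32-bit size field (helper for the synthetic sample only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic sample: ftyp, moov, sidx, mdat (assumed layout, for illustration).
sample = (
    make_box(b"ftyp", b"dash\x00\x00\x00\x00")
    + make_box(b"moov", b"")
    + make_box(b"sidx", b"\x00" * 12)
    + make_box(b"mdat", b"\x00" * 32)
)
boxes = list(iter_top_level_boxes(io.BytesIO(sample)))
for name, offset, size in boxes:
    print(f"{name} offset={offset} size={size}")
```

On a real file, open it with `open(path, "rb")` and pass the file object instead of the `BytesIO` sample. A moov box listed before the first mdat means the header is at the front of the file.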
How to reproduce:
Obtain the youtube audio stream link using youtube-dl (https://github.com/rg3/youtube-dl)
% youtube-dl -f 140 -g YOUTUBE_URL
where YOUTUBE_URL is the url of any youtube video. Pick a music video of length 3-4mins so you can notice the ~4s delay.
The youtube-dl command will print out a long URL string that you can input to FFmpeg or ffplay to notice the issue
% ffmpeg -i "STREAM_URL"
or
% ffplay "STREAM_URL"
My FFmpeg info:
ffmpeg version 2.2.1 Copyright (c) 2000-2014 the FFmpeg developers
  built on Aug 9 2014 10:03:55 with Apple LLVM version 5.1 (clang-503.0.40) (based on LLVM 3.4svn)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/2.2.1 --enable-shared --enable-pthreads --enable-gpl --enable-version3 --enable-nonfree --enable-hardcoded-tables --enable-avresample --enable-vda --cc=clang --host-cflags= --host-ldflags= --enable-libx264 --enable-libfaac --enable-libmp3lame --enable-libxvid --enable-ffplay --enable-libfdk-aac --enable-openssl
  libavutil 52. 66.100 / 52. 66.100
  libavcodec 55. 52.102 / 55. 52.102
  libavformat 55. 33.100 / 55. 33.100
  libavdevice 55. 10.100 / 55. 10.100
  libavfilter 4. 2.100 / 4. 2.100
  libavresample 1. 2. 0 / 1. 2. 0
  libswscale 2. 5.102 / 2. 5.102
  libswresample 0. 18.100 / 0. 18.100
  libpostproc 52. 3.100 / 52. 3.100
Output of a typical audio stream:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'STREAM_URL':
  Metadata:
    major_brand : dash
    minor_version : 0
    compatible_brands: iso6mp41
    creation_time : 2014-03-08 01:30:29
  Duration: 00:04:00.70, start: 0.000000, bitrate: 127 kb/s
    Stream #0:0(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 125 kb/s (default)
    Metadata:
      creation_time : 2014-03-08 01:30:29
      handler_name : SoundHandler
Change History (16)
comment:1 Changed 7 years ago by cehoyos
- Keywords mov added; DASH mp4a aac removed
comment:2 Changed 7 years ago by viperfx
Here is a sample audio file (it was too large for the attachment): https://www.sendspace.com/file/nmzcbi
Yes, problem is reproducible with the current FFmpeg master.
comment:3 Changed 7 years ago by cehoyos
I tested the following:
$ time ffmpeg -i videoplayback.mp4
ffmpeg version N-65479-g7117547 Copyright (c) 2000-2014 the FFmpeg developers
  built on Aug 10 2014 13:36:46 with gcc 4.7 (SUSE Linux)
  configuration: --enable-gpl
  libavutil 54. 0.100 / 54. 0.100
  libavcodec 56. 0.100 / 56. 0.100
  libavformat 56. 0.100 / 56. 0.100
  libavdevice 56. 0.100 / 56. 0.100
  libavfilter 5. 0.100 / 5. 0.100
  libswscale 3. 0.100 / 3. 0.100
  libswresample 1. 0.100 / 1. 0.100
  libpostproc 53. 0.100 / 53. 0.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'videoplayback.mp4':
  Metadata:
    major_brand : dash
    minor_version : 0
    compatible_brands: iso6mp41
    creation_time : 2014-04-25 14:50:56
  Duration: 00:03:35.03, start: 0.000000, bitrate: 128 kb/s
    Stream #0:0(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 125 kb/s (default)
    Metadata:
      creation_time : 2014-04-25 14:50:56
      handler_name : SoundHandler
At least one output file must be specified
real 0m0.011s
user 0m0.007s
sys 0m0.003s
The time spent on analyzing the mov file seems reasonable to me.
Playback with ffplay starts immediately afaict (see how the process time matches the duration):
$ time ffplay -i videoplayback.mp4 -autoexit
ffplay version N-65479-g7117547 Copyright (c) 2003-2014 the FFmpeg developers
  built on Aug 10 2014 13:36:46 with gcc 4.7 (SUSE Linux)
  configuration: --enable-gpl
  libavutil 54. 0.100 / 54. 0.100
  libavcodec 56. 0.100 / 56. 0.100
  libavformat 56. 0.100 / 56. 0.100
  libavdevice 56. 0.100 / 56. 0.100
  libavfilter 5. 0.100 / 5. 0.100
  libswscale 3. 0.100 / 3. 0.100
  libswresample 1. 0.100 / 1. 0.100
  libpostproc 53. 0.100 / 53. 0.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'videoplayback.mp4':f=0/0
  Metadata:
    major_brand : dash
    minor_version : 0
    compatible_brands: iso6mp41
    creation_time : 2014-04-25 14:50:56
  Duration: 00:03:35.03, start: 0.000000, bitrate: 128 kb/s
    Stream #0:0(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 125 kb/s (default)
    Metadata:
      creation_time : 2014-04-25 14:50:56
      handler_name : SoundHandler
215.02 M-A: 0.000 fd= 0 aq= 0KB vq= 0KB sq= 0B f=0/0
real 3m35.104s
user 0m4.030s
sys 0m2.981s
How can I reproduce the issue?
comment:4 Changed 7 years ago by gjdfgh
Check how much data it reads when opening the file.
The issue is that it seems to read the whole file, which ruins network performance.
comment:5 in reply to: ↑ description Changed 7 years ago by cehoyos
Replying to viperfx:
configuration: --prefix=/usr/local/Cellar/ffmpeg/2.2.1 --enable-shared --enable-pthreads --enable-gpl --enable-version3 --enable-nonfree --enable-hardcoded-tables --enable-avresample --enable-vda --cc=clang --host-cflags= --host-ldflags= --enable-libx264 --enable-libfaac --enable-libmp3lame --enable-libxvid --enable-ffplay --enable-libfdk-aac --enable-openssl
Unrelated: --enable-vda and --enable-ffplay do not work the way you seem to believe these options work; please remove them. (configure silently ignores those options if SDL or vda are not available; if they are available, ffplay and vda are enabled automatically.) I also believe it is better to remove --host-cflags= --host-ldflags= since they might lead to confusion.
Is --cc=clang necessary? It looks redundant to me.
Most users prefer libfdk over libfaac, is there a particular reason why you enable libfaac?
I am curious: Did you test --enable-hardcoded-tables? (I never did.) Does it have any measurable effect?
comment:6 Changed 7 years ago by cehoyos
FFmpeg claims it did not read the whole file (less than one quarter), how can I reproduce that this is not correct?
$ ffmpeg -loglevel debug -i videoplayback.mp4
ffmpeg version N-65479-g7117547 Copyright (c) 2000-2014 the FFmpeg developers
  built on Aug 10 2014 13:36:46 with gcc 4.7 (SUSE Linux)
  configuration: --enable-gpl
  libavutil 54. 0.100 / 54. 0.100
  libavcodec 56. 0.100 / 56. 0.100
  libavformat 56. 0.100 / 56. 0.100
  libavdevice 56. 0.100 / 56. 0.100
  libavfilter 5. 0.100 / 5. 0.100
  libswscale 3. 0.100 / 3. 0.100
  libswresample 1. 0.100 / 1. 0.100
  libpostproc 53. 0.100 / 53. 0.100
Splitting the commandline.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument 'debug'.
Reading option '-i' ... matched as input file with argument 'videoplayback.mp4'.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option loglevel (set logging level) with argument debug.
Successfully parsed a group of options.
Parsing a group of options: input file videoplayback.mp4.
Successfully parsed a group of options.
Opening an input file: videoplayback.mp4.
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x33f4b20] Format mov,mp4,m4a,3gp,3g2,mj2 probed with size=2048 and score=100
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x33f4b20] ISO: File Type Major Brand: dash
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x33f4b20] Before avformat_find_stream_info() pos: 3374213 bytes read:720896 seeks:21
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x33f4b20] All info found
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x33f4b20] After avformat_find_stream_info() pos: 4796 bytes read:753664 seeks:22 frames:1
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'videoplayback.mp4':
  Metadata:
    major_brand : dash
    minor_version : 0
    compatible_brands: iso6mp41
    creation_time : 2014-04-25 14:50:56
  Duration: 00:03:35.03, start: 0.000000, bitrate: 128 kb/s
    Stream #0:0(und), 1, 1/44100: Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 125 kb/s (default)
    Metadata:
      creation_time : 2014-04-25 14:50:56
      handler_name : SoundHandler
Successfully opened the file.
At least one output file must be specified
[AVIOContext @ 0x33f3be0] Statistics: 753664 bytes read, 22 seeks
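The read and seek counts can be pulled out of such a debug log mechanically, which makes it easier to compare runs. This is a small sketch with hypothetical helper names; it assumes the log contains the AVIOContext statistics line in the format shown above. The file size is not stated in this ticket, so the largest `pos` value from the log (3374213) is used as a lower bound on it.

```python
import re

# Matches the AVIOContext summary line printed at debug log level.
STATS_RE = re.compile(r"Statistics: (\d+) bytes read, (\d+) seeks")


def parse_avio_statistics(log_text):
    """Return (bytes_read, seeks) from an ffmpeg debug log, or None if absent."""
    match = STATS_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


log_line = "[AVIOContext @ 0x33f3be0] Statistics: 753664 bytes read, 22 seeks"
bytes_read, seeks = parse_avio_statistics(log_line)

# Assumption: the file is at least as large as the furthest position seen in the log.
file_size_lower_bound = 3374213
fraction = bytes_read / file_size_lower_bound
print(f"read {bytes_read} bytes in {seeks} seeks, at most {fraction:.0%} of the file")
```

The byte count alone understates the cost over HTTP: on a local file a seek is nearly free, while over the network each one may mean a new request.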
comment:7 Changed 7 years ago by gjdfgh
Note that loading gets fast with fflags=+ignidx.
According to Paranoialmaniac on IRC, we need to support the "sidx" atom, which apparently contains an index.
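For reference, the layout of the sidx (Segment Index) box in ISO/IEC 14496-12 is simple enough to parse by hand. The sketch below illustrates the box format only and is not FFmpeg's implementation; the names `SidxReference` and `parse_sidx_payload` are hypothetical, and the sample payload is synthetic.

```python
import struct
from dataclasses import dataclass


@dataclass
class SidxReference:
    reference_type: int       # 1 = points to another sidx, 0 = points to media
    referenced_size: int      # size in bytes of the referenced material
    subsegment_duration: int  # duration in timescale units
    starts_with_sap: bool


def parse_sidx_payload(payload):
    """Parse the bytes that follow the 8-byte box header (size + 'sidx')."""
    version = payload[0]  # 1 byte version, then 3 bytes of flags
    reference_id, timescale = struct.unpack_from(">II", payload, 4)
    pos = 12
    if version == 0:
        earliest, first_offset = struct.unpack_from(">II", payload, pos)
        pos += 8
    else:
        earliest, first_offset = struct.unpack_from(">QQ", payload, pos)
        pos += 16
    _reserved, count = struct.unpack_from(">HH", payload, pos)
    pos += 4
    references = []
    for _ in range(count):
        word0, duration, word2 = struct.unpack_from(">III", payload, pos)
        pos += 12
        references.append(
            SidxReference(word0 >> 31, word0 & 0x7FFFFFFF, duration, bool(word2 >> 31))
        )
    return {
        "reference_id": reference_id,
        "timescale": timescale,
        "earliest_presentation_time": earliest,
        "first_offset": first_offset,
        "references": references,
    }


# Synthetic version-0 payload: timescale 44100, two media references (10 s and 5 s).
payload = (
    b"\x00\x00\x00\x00"
    + struct.pack(">IIII", 1, 44100, 0, 0)
    + struct.pack(">HH", 0, 2)
    + struct.pack(">III", 1000, 441000, 0x90000000)
    + struct.pack(">III", 2000, 220500, 0x90000000)
)
info = parse_sidx_payload(payload)
total_seconds = sum(r.subsegment_duration for r in info["references"]) / info["timescale"]
print(info["timescale"], len(info["references"]), total_seconds)
```

Because every reference carries a byte size and a duration, a demuxer that reads this box can learn the timeline of the fragments without visiting each of them.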
comment:8 Changed 7 years ago by cehoyos
Or is the issue that FFmpeg seeks to the end when opening the file?
(Sorry but imo this isn't obvious at all from the original report.)
comment:9 Changed 7 years ago by gjdfgh
22 seeks
There's your answer. These translate to 23 http connects.
comment:10 follow-up: ↓ 11 Changed 7 years ago by viperfx
I should note that I am compiling FFmpeg for iOS, so the flags I posted are slightly different from what I have configured for the iOS app.
Snippet of the configure flags.
./configure --disable-programs --disable-shared --enable-static --enable-pic --enable-small --enable-openssl ${DEBUG_CONFIG_ARGS} \
  --disable-decoders --enable-decoder=aac --enable-decoder=h264 --enable-decoder=vorbis \
  --disable-encoders --enable-encoder=aac \
  --disable-demuxers --enable-demuxer=aac --enable-demuxer=mov --enable-demuxer=matroska --enable-demuxer=h264 \
  --disable-muxers --enable-muxer=mov --enable-muxer=mp4 --enable-muxer=hls --enable-muxer=h264 \
  --disable-filters --disable-doc
As gjdfgh said, I received a suggestion to disable indexing to make the seeking faster. It does indeed make the file open faster, almost instantly over HTTP in fact. However, FFmpeg then reports the duration of the file as 10s; it does not seem to detect the full duration. Therefore, from a UI perspective, I cannot seek the audio track since I do not know the full duration of the file. The good thing is that the audio plays until the end of the file.
cehoyos: The issue is that it does seek to the end of the file when opening. This is evident when using ffmpeg -i to test with an HTTP URL rather than a local file. The avformat_open_input call takes more than ~4s.
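To illustrate why an index box would help with both the duration and seeking, here is a rough sketch of how per-subsegment sizes and durations (as a sidx box provides them) yield the total duration and the byte range for a seek target. All values are made up, `build_seek_table` and `locate` are hypothetical helpers, and this is not how FFmpeg implements it.

```python
from bisect import bisect_right


def build_seek_table(references, timescale, first_byte):
    """references: list of (referenced_size, subsegment_duration) pairs.

    Returns (table, total_seconds), where table holds one
    (start_seconds, byte_offset, size) tuple per subsegment.
    """
    table = []
    elapsed = 0
    offset = first_byte
    for size, duration in references:
        table.append((elapsed / timescale, offset, size))
        elapsed += duration
        offset += size
    return table, elapsed / timescale


def locate(table, seconds):
    """Return the inclusive byte range of the subsegment containing `seconds`."""
    starts = [entry[0] for entry in table]
    index = max(bisect_right(starts, seconds) - 1, 0)
    _, offset, size = table[index]
    return offset, offset + size - 1


# Made-up index: three subsegments at timescale 44100; media starts at byte 1000.
references = [(160000, 441000), (158000, 441000), (80000, 220500)]
table, total_seconds = build_seek_table(references, 44100, 1000)
byte_range = locate(table, 12.5)
print(total_seconds, byte_range)
```

With such a table, the full duration is known after reading the front of the file, and a seek needs one ranged request instead of a scan through the fragments.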
comment:11 in reply to: ↑ 10 ; follow-ups: ↓ 12 ↓ 13 Changed 7 years ago by cehoyos
Replying to viperfx:
./configure --disable-programs --disable-shared --enable-static --enable-pic
Unrelated: Do you know if --enable-pic has any advantages or disadvantages on iOS?
(We have discussed this internally and no clear consensus was reached iirc.)
--disable-shared --enable-static is the default, you may remove it to get a shorter configure line.
--disable-decoders --enable-decoder=aac --enable-decoder=h264 --enable-decoder=vorbis
You can use --enable-decoder=aac,h264,vorbis to make your configure line more readable (same for encoders and demuxers and muxers)
comment:12 in reply to: ↑ 11 Changed 7 years ago by viperfx
Replying to cehoyos:
Replying to viperfx:
./configure --disable-programs --disable-shared --enable-static --enable-pic
Unrelated: Do you know if --enable-pic has any advantages or disadvantages on iOS?
(We have discussed this internally and no clear consensus was reached iirc.)
--disable-shared --enable-static is the default, you may remove it to get a shorter configure line.
--disable-decoders --enable-decoder=aac --enable-decoder=h264 --enable-decoder=vorbis
You can use --enable-decoder=aac,h264,vorbis to make your configure line more readable (same for encoders and demuxers and muxers)
Ah okay, thanks. No, I am afraid I am fairly new to FFmpeg and static compiling libs in general for iOS. I am the one who should be asking you for tips it seems :)
comment:13 in reply to: ↑ 11 Changed 7 years ago by cehoyos
Replying to cehoyos:
Unrelated: Do you know if --enable-pic has any advantages or disadvantages on iOS?
(We have discussed this internally and no clear consensus was reached iirc.)
It is not completely impossible that --enable-pic has a huge performance impact without having any advantage at all. But I don't know if anybody tested this.
comment:14 Changed 7 years ago by cehoyos
- Priority changed from normal to wish
- Status changed from new to open
- Type changed from defect to enhancement
- Version changed from 2.2.4 to git-master
Afaiu, this is an enhancement request to support reading the sidx box as defined in 8.16.3 of ISO 14496-12.
comment:15 Changed 7 years ago by gjdfgh
- Cc nfxjfg@googlemail.com added
Making this bug report a feature request is pushing it a bit...
comment:16 Changed 5 years ago by cehoyos
- Resolution set to fixed
- Status changed from open to closed
Implemented by Rodger Combs in 4ab56667594842283dc5ae07f0daba2a2cb4d3af
Did you upload a sample?
Is the problem reproducible with current FFmpeg git head?
Please do not use external resources (except for large sample files), always post all necessary information here in the ticket.