Opened 7 years ago
Last modified 7 years ago
#7028 new defect
Improper rounding of output sample rate when using libopus
Reported by: | heicrd | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | avcodec |
Version: | git-master | Keywords: | libopus |
Cc: | | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no | | |
Description
Summary of the bug:
Because libopus supports only a limited set of sample rates, ffmpeg automatically inserts an aresample filter that converts the audio to the nearest supported rate. The nearest rate can be lower than the input rate, so the audio may be downsampled, incurring a significant and unexpected loss in fidelity.
For example, a 32 kHz input is downsampled to 24 kHz instead of being upsampled to 48 kHz, as opusenc seems to do.
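The difference between the two selection strategies can be sketched in a few lines of shell. This is only an illustration of the behaviour described above, not FFmpeg's actual format-negotiation code: the function names `nearest_rate` and `lossless_rate` are hypothetical, and the rate list is the one that appears in the `sample_rates` line of the log in this ticket.

```shell
#!/bin/sh
# Rates copied from the 'sample_rates' value in the ticket's log output.
SUPPORTED_RATES="48000 24000 16000 12000 8000"

# nearest_rate (hypothetical name): pick the supported rate with the
# smallest absolute difference from the input rate, as the ticket reports.
nearest_rate() {
    rate_in=$1
    best=
    best_diff=
    for r in $SUPPORTED_RATES; do
        diff=$((r - rate_in))
        if [ "$diff" -lt 0 ]; then
            diff=$((0 - diff))
        fi
        if [ -z "$best" ] || [ "$diff" -lt "$best_diff" ]; then
            best=$r
            best_diff=$diff
        fi
    done
    echo "$best"
}

# lossless_rate (hypothetical name): pick the smallest supported rate that
# is not below the input rate; fall back to the highest supported rate.
lossless_rate() {
    rate_in=$1
    best=
    max=0
    for r in $SUPPORTED_RATES; do
        if [ "$r" -gt "$max" ]; then
            max=$r
        fi
        if [ "$r" -ge "$rate_in" ]; then
            if [ -z "$best" ] || [ "$r" -lt "$best" ]; then
                best=$r
            fi
        fi
    done
    echo "${best:-$max}"
}

echo "nearest:  $(nearest_rate 32000)"
echo "lossless: $(lossless_rate 32000)"
```

For a 32000 Hz input the first function selects 24000 and the second selects 48000, which matches the two outcomes contrasted in the summary.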
How to reproduce:
Use ffmpeg to encode a 32 kHz file with libopus:
% ./ffmpeg -f lavfi -i "sine=frequency=1000:duration=5" -codec:a pcm_s16le -af aresample=32000 -f wav - | ./ffmpeg -v 9 -loglevel 99 -f wav -i - -codec:a libopus -f ogg -y /dev/null
ffmpeg version N-90069-gdd8351b118ffmpeg version N-90069-gdd8351b118 Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 7 (Debian 7.3.0-3)
 Copyright (c) 2000-2018 the FFmpeg developers
  configuration: --enable-libopus
  built with gcc 7 (Debian 7.3.0-3)
  configuration: --enable-libopus
  libavutil 56. 7.101 / 56. 7.101
  libavcodec 58. 11.101 / 58. 11.101
  libavformat 58. 9.100 / 58. 9.100
  libavdevice 58. 1.100 / 58. 1.100
  libavfilter 7. 12.100 / 7. 12.100
  libswscale 5. 0.101 / 5. 0.101
  libswresample 3. 0.101 / 3. 0.101
  libavutil 56. 7.101 / 56. 7.101
  libavcodec 58. 11.101 / 58. 11.101
  libavformat 58. 9.100 / 58. 9.100
  libavdevice 58. 1.100 / 58. 1.100
  libavfilter 7. 12.100 / 7. 12.100
  libswscale 5. 0.101 / 5. 0.101
Splitting the commandline.
  libswresample 3. 0.101 / 3. 0.101
Reading option '-v' ... matched as option 'v' (set logging level) with argument '9'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument '99'.
Reading option '-f' ... matched as option 'f' (force format) with argument 'wav'.
Reading option '-i' ... matched as input url with argument '-'.
Reading option '-codec:a' ... matched as option 'codec' (codec name) with argument 'libopus'.
Reading option '-f' ... matched as option 'f' (force format) with argument 'ogg'.
Reading option '-y' ... matched as option 'y' (overwrite output files) with argument '1'.
Reading option '/dev/null' ... matched as output url.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 9.
Applying option y (overwrite output files) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input url -.
Applying option f (force format) with argument wav.
Successfully parsed a group of options.
Opening an input file: -.
[wav @ 0x5647ff2d2340] Opening 'pipe:' for reading
[pipe @ 0x5647ff2d2ec0] Setting default whitelist 'crypto'
Input #0, lavfi, from 'sine=frequency=1000:duration=5':
  Duration: N/A, start: 0.000000, bitrate: 705 kb/s
    Stream #0:0: Audio: pcm_s16le, 44100 Hz, mono, s16, 705 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'pipe:':
  Metadata:
    ISFT : Lavf58.9.100
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 32000 Hz, mono, s16, 512 kb/s
    Metadata:
      encoder : Lavc58.11.101 pcm_s16le
[wav @ 0x5647ff2d2340] Ignoring maximum wav data size, file may be invalid
[wav @ 0x5647ff2d2340] Before avformat_find_stream_info() pos: 78 bytes read:66920 seeks:0 nb_streams:1
[wav @ 0x5647ff2d2340] probing stream 0 pp:32
[wav @ 0x5647ff2d2340] probing stream 0 pp:31
[wav @ 0x5647ff2d2340] probing stream 0 pp:30
[wav @ 0x5647ff2d2340] probing stream 0 pp:29
[wav @ 0x5647ff2d2340] probing stream 0 pp:28
[wav @ 0x5647ff2d2340] probing stream 0 pp:27
[wav @ 0x5647ff2d2340] probing stream 0 pp:26
[wav @ 0x5647ff2d2340] probing stream 0 pp:25
[wav @ 0x5647ff2d2340] probing stream 0 pp:24
[wav @ 0x5647ff2d2340] probing stream 0 pp:23
[wav @ 0x5647ff2d2340] probing stream 0 pp:22
[wav @ 0x5647ff2d2340] probing stream 0 pp:21
[wav @ 0x5647ff2d2340] probing stream 0 pp:20
[wav @ 0x5647ff2d2340] probing stream 0 pp:19
[wav @ 0x5647ff2d2340] probing stream 0 pp:18
[wav @ 0x5647ff2d2340] probing stream 0 pp:17
[wav @ 0x5647ff2d2340] probing stream 0 pp:16
[wav @ 0x5647ff2d2340] probing stream 0 pp:15
[wav @ 0x5647ff2d2340] probing stream 0 pp:14
[wav @ 0x5647ff2d2340] probing stream 0 pp:13
[wav @ 0x5647ff2d2340] probing stream 0 pp:12
[wav @ 0x5647ff2d2340] probing stream 0 pp:11
[wav @ 0x5647ff2d2340] probing stream 0 pp:10
[wav @ 0x5647ff2d2340] probing stream 0 pp:9
[wav @ 0x5647ff2d2340] probing stream 0 pp:8
[wav @ 0x5647ff2d2340] probing stream 0 pp:7
[wav @ 0x5647ff2d2340] probing stream 0 pp:6
[wav @ 0x5647ff2d2340] probing stream 0 pp:5
[wav @ 0x5647ff2d2340] probing stream 0 pp:4
[wav @ 0x5647ff2d2340] probing stream 0 pp:3
[wav @ 0x5647ff2d2340] probing stream 0 pp:2
[wav @ 0x5647ff2d2340] probing stream 0 pp:1
[wav @ 0x5647ff2d2340] probed stream 0
[wav @ 0x5647ff2d2340] parser not found for codec pcm_s16le, packets or times may be invalid.
[wav @ 0x5647ff2d2340] All info found
[wav @ 0x5647ff2d2340] stream 0: start_time: -288230376151711.750 duration: -288230376151711.750
[wav @ 0x5647ff2d2340] format: start_time: -9223372036854.775 duration: -9223372036854.775 bitrate=512 kb/s
[wav @ 0x5647ff2d2340] After avformat_find_stream_info() pos: 204878 bytes read:205124 seeks:0 frames:50
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from 'pipe:':
  Metadata:
    encoder : Lavf58.9.100
  Duration: N/A, bitrate: 512 kb/s
    Stream #0:0, 50, 1/32000: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 32000 Hz, mono, s16, 512 kb/s
Successfully opened the file.
Parsing a group of options: output url /dev/null.
Applying option codec:a (codec name) with argument libopus.
Applying option f (force format) with argument ogg.
Successfully parsed a group of options.
Opening an output file: /dev/null.
[file @ 0x5647ff2f5e40] Setting default whitelist 'file,crypto'
Successfully opened the file.
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> opus (libopus))
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
detected 8 logical cores
[graph_0_in_0_0 @ 0x5647ff320c80] Setting 'time_base' to value '1/32000'
[graph_0_in_0_0 @ 0x5647ff320c80] Setting 'sample_rate' to value '32000'
[graph_0_in_0_0 @ 0x5647ff320c80] Setting 'sample_fmt' to value 's16'
[graph_0_in_0_0 @ 0x5647ff320c80] Setting 'channel_layout' to value '0x4'
[graph_0_in_0_0 @ 0x5647ff320c80] tb:1/32000 samplefmt:s16 samplerate:32000 chlayout:0x4
[format_out_0_0 @ 0x5647ff320f40] Setting 'sample_fmts' to value 's16|flt'
[format_out_0_0 @ 0x5647ff320f40] Setting 'sample_rates' to value '48000|24000|16000|12000|8000'
[format_out_0_0 @ 0x5647ff320f40] auto-inserting filter 'auto_resampler_0' between the filter 'Parsed_anull_0' and the filter 'format_out_0_0'
[AVFilterGraph @ 0x5647ff2f7400] query_formats: 4 queried, 6 merged, 3 already done, 0 delayed
[auto_resampler_0 @ 0x5647ff324580] [SWR @ 0x5647ff324a80] Using s16p internally between filters
[auto_resampler_0 @ 0x5647ff324580] ch:1 chl:mono fmt:s16 r:32000Hz -> ch:1 chl:mono fmt:s16 r:24000Hz
[libopus @ 0x5647ff2f5700] No bit rate set. Defaulting to 64000 bps.
Output #0, ogg, to '/dev/null':
  Metadata:
    encoder : Lavf58.9.100
    Stream #0:0, 0, 1/48000: Audio: opus (libopus), 24000 Hz, mono, s16, delay 156, 64 kb/s
    Metadata:
      encoder : Lavc58.11.101 libopus
[Parsed_sine_0 @ 0x55a947218500] EOF timestamp not reliable
size= 313kB time=00:00:05.00 bitrate= 512.1kbits/s speed= 107x
video:0kB audio:312kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.024375%
[out_0_0 @ 0x5647ff321cc0] EOF on sink link out_0_0:default.
No more output streams to write to, finishing.
[libopus @ 0x5647ff2f5700] Trying to remove 324 more samples than there are in the queue
size= 60kB time=00:00:05.01 bitrate= 98.0kbits/s speed= 143x
video:0kB audio:59kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.050050%
Input file #0 (pipe:):
  Input stream #0:0 (audio): 79 packets read (320000 bytes); 79 frames decoded (160000 samples);
  Total: 79 packets (320000 bytes) demuxed
Output file #0 (/dev/null):
  Output stream #0:0 (audio): 250 frames encoded (120000 samples); 251 packets muxed (60759 bytes);
  Total: 251 packets (60759 bytes) muxed
79 frames successfully decoded, 0 decoding errors
[AVIOContext @ 0x5647ff2f60c0] Statistics: 0 seeks, 8 writeouts
[AVIOContext @ 0x5647ff2db340] Statistics: 320078 bytes read, 0 seeks
The output includes the following line:
[auto_resampler_0 @ 0x55e656ac6a80] ch:1 chl:mono fmt:s16 r:32000Hz -> ch:1 chl:mono fmt:s16 r:24000Hz
This indicates that the 32 kHz input was downsampled to 24 kHz.
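Since this line is easy to miss in a long log, a small filter can pull the two rates out of it. The following is a rough sketch only: the helper name `auto_resample_rates` is hypothetical, the pattern is written against the exact line format quoted in this ticket, and other ffmpeg versions may format the line differently.

```shell
#!/bin/sh
# auto_resample_rates (hypothetical helper): read an ffmpeg log on stdin and
# print "input_rate output_rate" for each auto-inserted resampler line that
# has the format quoted in this ticket.
auto_resample_rates() {
    sed -n 's/^\[auto_resampler_[^]]*\] .* r:\([0-9]*\)Hz -> .* r:\([0-9]*\)Hz.*$/\1 \2/p'
}

# Sample line copied verbatim from the ticket.
sample='[auto_resampler_0 @ 0x55e656ac6a80] ch:1 chl:mono fmt:s16 r:32000Hz -> ch:1 chl:mono fmt:s16 r:24000Hz'

# Warn whenever the output rate is lower than the input rate.
printf '%s\n' "$sample" | auto_resample_rates | while read -r in_rate out_rate; do
    if [ "$out_rate" -lt "$in_rate" ]; then
        echo "warning: downsampled ${in_rate} Hz -> ${out_rate} Hz"
    fi
done
```

In practice the function would be fed the stderr of an ffmpeg run whose log level is high enough to print the resampler line, for example by appending `2>&1 | auto_resample_rates` to the second ffmpeg command above.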
This can be worked around by manually specifying -af aresample=48000; however, ffmpeg gives no indication that any downsampling was performed unless verbose logging is enabled.
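A minimal sketch of the workaround as a script, assuming an ffmpeg build with libopus enabled. The file names `input_32k.wav` and `output.opus` are placeholders that do not come from the ticket, and the script only prints the command if ffmpeg or the input file is missing.

```shell
#!/bin/sh
# Placeholder file names (assumptions, not from the ticket).
INPUT=${INPUT:-input_32k.wav}
OUTPUT=${OUTPUT:-output.opus}

# Request 48000 Hz explicitly so that no lower rate is chosen automatically.
AF="aresample=48000"

if command -v ffmpeg >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    ffmpeg -i "$INPUT" -af "$AF" -codec:a libopus -y "$OUTPUT"
else
    echo "would run: ffmpeg -i $INPUT -af $AF -codec:a libopus -y $OUTPUT"
fi
```

Because 48000 Hz is already in the encoder's list of supported rates, the explicit filter leaves nothing for an automatically inserted resampler to change.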