Opened 3 months ago

Last modified 3 months ago

#7028 new defect

Improper rounding of output sample rate when using libopus

Reported by: heicrd
Owned by:
Priority: normal
Component: avcodec
Version: git-master
Keywords: libopus
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:
When the encoder is libopus, which supports only a limited set of sample rates, the automatically inserted aresample filter rounds the audio's sample rate to the nearest supported rate. This can round down, causing a significant and unexpected loss of fidelity.
For example, a 32 kHz input is downsampled to 24 kHz instead of being upsampled to 48 kHz, as opusenc seems to do.
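
To make the arithmetic concrete, here is a minimal Python sketch of the two selection strategies. It is not FFmpeg's code: the function names are hypothetical, and the rate list is copied from the 'sample_rates' line in the log below. It only assumes that "nearest" means smallest absolute difference, which matches the behaviour reported here.

```python
# Sample rates offered to the format filter for libopus, taken from the
# 'sample_rates' value in the log output of this ticket.
OPUS_RATES = [48000, 24000, 16000, 12000, 8000]

def nearest_rate(input_rate, supported=OPUS_RATES):
    """Pick the supported rate closest to the input (the reported behaviour)."""
    return min(supported, key=lambda r: abs(r - input_rate))

def lossless_rate(input_rate, supported=OPUS_RATES):
    """Pick the lowest supported rate that is >= the input rate.

    Falls back to the highest supported rate if the input exceeds them all.
    """
    candidates = [r for r in supported if r >= input_rate]
    return min(candidates) if candidates else max(supported)

print(nearest_rate(32000))   # 24000: |24000 - 32000| = 8000 beats |48000 - 32000| = 16000
print(lossless_rate(32000))  # 48000
```

With a 32 kHz input, the nearest-rate rule lands on 24 kHz, while a "never go below the input" rule would pick 48 kHz.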

How to reproduce:
Use ffmpeg to encode a 32 kHz file with libopus:

% ./ffmpeg -f lavfi -i "sine=frequency=1000:duration=5" -codec:a pcm_s16le -af aresample=32000 -f wav - | ./ffmpeg -v 9 -loglevel 99 -f wav -i - -codec:a libopus -f ogg -y /dev/null                                                                                    
ffmpeg version N-90069-gdd8351b118 Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 7 (Debian 7.3.0-3)
  configuration: --enable-libopus
  libavutil      56.  7.101 / 56.  7.101
  libavcodec     58. 11.101 / 58. 11.101
  libavformat    58.  9.100 / 58.  9.100
  libavdevice    58.  1.100 / 58.  1.100
  libavfilter     7. 12.100 /  7. 12.100
  libswscale      5.  0.101 /  5.  0.101
  libswresample   3.  0.101 /  3.  0.101
ffmpeg version N-90069-gdd8351b118 Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 7 (Debian 7.3.0-3)
  configuration: --enable-libopus
  libavutil      56.  7.101 / 56.  7.101
  libavcodec     58. 11.101 / 58. 11.101
  libavformat    58.  9.100 / 58.  9.100
  libavdevice    58.  1.100 / 58.  1.100
  libavfilter     7. 12.100 /  7. 12.100
  libswscale      5.  0.101 /  5.  0.101
  libswresample   3.  0.101 /  3.  0.101
Splitting the commandline.
Reading option '-v' ... matched as option 'v' (set logging level) with argument '9'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument '99'.
Reading option '-f' ... matched as option 'f' (force format) with argument 'wav'.
Reading option '-i' ... matched as input url with argument '-'.
Reading option '-codec:a' ... matched as option 'codec' (codec name) with argument 'libopus'.
Reading option '-f' ... matched as option 'f' (force format) with argument 'ogg'.
Reading option '-y' ... matched as option 'y' (overwrite output files) with argument '1'.
Reading option '/dev/null' ... matched as output url.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 9.
Applying option y (overwrite output files) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input url -.
Applying option f (force format) with argument wav.
Successfully parsed a group of options.
Opening an input file: -.
[wav @ 0x5647ff2d2340] Opening 'pipe:' for reading
[pipe @ 0x5647ff2d2ec0] Setting default whitelist 'crypto'
Input #0, lavfi, from 'sine=frequency=1000:duration=5':
  Duration: N/A, start: 0.000000, bitrate: 705 kb/s
    Stream #0:0: Audio: pcm_s16le, 44100 Hz, mono, s16, 705 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'pipe:':
  Metadata:
    ISFT            : Lavf58.9.100
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 32000 Hz, mono, s16, 512 kb/s
    Metadata:
      encoder         : Lavc58.11.101 pcm_s16le
[wav @ 0x5647ff2d2340] Ignoring maximum wav data size, file may be invalid
[wav @ 0x5647ff2d2340] Before avformat_find_stream_info() pos: 78 bytes read:66920 seeks:0 nb_streams:1
[wav @ 0x5647ff2d2340] probing stream 0 pp:32
[wav @ 0x5647ff2d2340] probing stream 0 pp:31
[wav @ 0x5647ff2d2340] probing stream 0 pp:30
[wav @ 0x5647ff2d2340] probing stream 0 pp:29
[wav @ 0x5647ff2d2340] probing stream 0 pp:28
[wav @ 0x5647ff2d2340] probing stream 0 pp:27
[wav @ 0x5647ff2d2340] probing stream 0 pp:26
[wav @ 0x5647ff2d2340] probing stream 0 pp:25
[wav @ 0x5647ff2d2340] probing stream 0 pp:24
[wav @ 0x5647ff2d2340] probing stream 0 pp:23
[wav @ 0x5647ff2d2340] probing stream 0 pp:22
[wav @ 0x5647ff2d2340] probing stream 0 pp:21
[wav @ 0x5647ff2d2340] probing stream 0 pp:20
[wav @ 0x5647ff2d2340] probing stream 0 pp:19
[wav @ 0x5647ff2d2340] probing stream 0 pp:18
[wav @ 0x5647ff2d2340] probing stream 0 pp:17
[wav @ 0x5647ff2d2340] probing stream 0 pp:16
[wav @ 0x5647ff2d2340] probing stream 0 pp:15
[wav @ 0x5647ff2d2340] probing stream 0 pp:14
[wav @ 0x5647ff2d2340] probing stream 0 pp:13
[wav @ 0x5647ff2d2340] probing stream 0 pp:12
[wav @ 0x5647ff2d2340] probing stream 0 pp:11
[wav @ 0x5647ff2d2340] probing stream 0 pp:10
[wav @ 0x5647ff2d2340] probing stream 0 pp:9
[wav @ 0x5647ff2d2340] probing stream 0 pp:8
[wav @ 0x5647ff2d2340] probing stream 0 pp:7
[wav @ 0x5647ff2d2340] probing stream 0 pp:6
[wav @ 0x5647ff2d2340] probing stream 0 pp:5
[wav @ 0x5647ff2d2340] probing stream 0 pp:4
[wav @ 0x5647ff2d2340] probing stream 0 pp:3
[wav @ 0x5647ff2d2340] probing stream 0 pp:2
[wav @ 0x5647ff2d2340] probing stream 0 pp:1
[wav @ 0x5647ff2d2340] probed stream 0
[wav @ 0x5647ff2d2340] parser not found for codec pcm_s16le, packets or times may be invalid.
[wav @ 0x5647ff2d2340] All info found
[wav @ 0x5647ff2d2340] stream 0: start_time: -288230376151711.750 duration: -288230376151711.750
[wav @ 0x5647ff2d2340] format: start_time: -9223372036854.775 duration: -9223372036854.775 bitrate=512 kb/s
[wav @ 0x5647ff2d2340] After avformat_find_stream_info() pos: 204878 bytes read:205124 seeks:0 frames:50
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from 'pipe:':
  Metadata:
    encoder         : Lavf58.9.100
  Duration: N/A, bitrate: 512 kb/s
    Stream #0:0, 50, 1/32000: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 32000 Hz, mono, s16, 512 kb/s
Successfully opened the file.
Parsing a group of options: output url /dev/null.
Applying option codec:a (codec name) with argument libopus.
Applying option f (force format) with argument ogg.
Successfully parsed a group of options.
Opening an output file: /dev/null.
[file @ 0x5647ff2f5e40] Setting default whitelist 'file,crypto'
Successfully opened the file.
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> opus (libopus))
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
detected 8 logical cores
[graph_0_in_0_0 @ 0x5647ff320c80] Setting 'time_base' to value '1/32000'
[graph_0_in_0_0 @ 0x5647ff320c80] Setting 'sample_rate' to value '32000'
[graph_0_in_0_0 @ 0x5647ff320c80] Setting 'sample_fmt' to value 's16'
[graph_0_in_0_0 @ 0x5647ff320c80] Setting 'channel_layout' to value '0x4'
[graph_0_in_0_0 @ 0x5647ff320c80] tb:1/32000 samplefmt:s16 samplerate:32000 chlayout:0x4
[format_out_0_0 @ 0x5647ff320f40] Setting 'sample_fmts' to value 's16|flt'
[format_out_0_0 @ 0x5647ff320f40] Setting 'sample_rates' to value '48000|24000|16000|12000|8000'
[format_out_0_0 @ 0x5647ff320f40] auto-inserting filter 'auto_resampler_0' between the filter 'Parsed_anull_0' and the filter 'format_out_0_0'
[AVFilterGraph @ 0x5647ff2f7400] query_formats: 4 queried, 6 merged, 3 already done, 0 delayed
[auto_resampler_0 @ 0x5647ff324580] [SWR @ 0x5647ff324a80] Using s16p internally between filters
[auto_resampler_0 @ 0x5647ff324580] ch:1 chl:mono fmt:s16 r:32000Hz -> ch:1 chl:mono fmt:s16 r:24000Hz
[libopus @ 0x5647ff2f5700] No bit rate set. Defaulting to 64000 bps.
Output #0, ogg, to '/dev/null':
  Metadata:
    encoder         : Lavf58.9.100
    Stream #0:0, 0, 1/48000: Audio: opus (libopus), 24000 Hz, mono, s16, delay 156, 64 kb/s
    Metadata:
      encoder         : Lavc58.11.101 libopus
[Parsed_sine_0 @ 0x55a947218500] EOF timestamp not reliable
size=     313kB time=00:00:05.00 bitrate= 512.1kbits/s speed= 107x    
video:0kB audio:312kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.024375%
[out_0_0 @ 0x5647ff321cc0] EOF on sink link out_0_0:default.
No more output streams to write to, finishing.
[libopus @ 0x5647ff2f5700] Trying to remove 324 more samples than there are in the queue
size=      60kB time=00:00:05.01 bitrate=  98.0kbits/s speed= 143x    
video:0kB audio:59kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.050050%
Input file #0 (pipe:):
  Input stream #0:0 (audio): 79 packets read (320000 bytes); 79 frames decoded (160000 samples); 
  Total: 79 packets (320000 bytes) demuxed
Output file #0 (/dev/null):
  Output stream #0:0 (audio): 250 frames encoded (120000 samples); 251 packets muxed (60759 bytes); 
  Total: 251 packets (60759 bytes) muxed
79 frames successfully decoded, 0 decoding errors
[AVIOContext @ 0x5647ff2f60c0] Statistics: 0 seeks, 8 writeouts
[AVIOContext @ 0x5647ff2db340] Statistics: 320078 bytes read, 0 seeks

Among the output is

[auto_resampler_0 @ 0x55e656ac6a80] ch:1 chl:mono fmt:s16 r:32000Hz -> ch:1 chl:mono fmt:s16 r:24000Hz

This indicates that the 32 kHz file was downsampled to 24 kHz.
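
Since this line only appears at verbose log levels, one way to catch the problem in a script is to scan the captured log for it. The following is a rough Python sketch; the function name is hypothetical, and the regular expression assumes the log line keeps the exact shape shown above, which is not a guaranteed format.

```python
import re

# Matches lines such as:
# [auto_resampler_0 @ 0x...] ch:1 chl:mono fmt:s16 r:32000Hz -> ch:1 chl:mono fmt:s16 r:24000Hz
RESAMPLE_RE = re.compile(
    r"\[auto_resampler_\d+ @ [^\]]+\].*\br:(\d+)Hz -> .*\br:(\d+)Hz"
)

def find_downsampling(log_text):
    """Return (source_rate, target_rate) pairs where an auto-inserted
    resampler lowered the sample rate."""
    hits = []
    for line in log_text.splitlines():
        match = RESAMPLE_RE.search(line)
        if match:
            src, dst = int(match.group(1)), int(match.group(2))
            if dst < src:
                hits.append((src, dst))
    return hits

sample = ("[auto_resampler_0 @ 0x55e656ac6a80] ch:1 chl:mono fmt:s16 r:32000Hz "
          "-> ch:1 chl:mono fmt:s16 r:24000Hz")
print(find_downsampling(sample))  # [(32000, 24000)]
```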

This can be worked around by manually specifying -af aresample=48000; however, ffmpeg gives no indication that any downsampling was performed unless verbose logging is enabled.
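
As an illustration of the workaround, here is a small Python sketch that builds an ffmpeg command line with the resample step made explicit. It assumes the intended filter is aresample, as used in the reproduction command above; the helper name and the file names are hypothetical.

```python
def opus_encode_args(src, dst, rate=48000):
    """Build an ffmpeg argument list that resamples explicitly to `rate`
    before encoding with libopus, so no automatic rate choice is needed."""
    return [
        "ffmpeg",
        "-i", src,
        "-af", f"aresample={rate}",  # explicit resample instead of the auto-inserted one
        "-codec:a", "libopus",
        dst,
    ]

# Hypothetical file names, for illustration only.
print(" ".join(opus_encode_args("input.wav", "output.opus")))
```

The resulting list can be passed to subprocess.run on a system where ffmpeg is built with libopus.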

Change History (1)

comment:1 Changed 3 months ago by cehoyos

  • Component changed from avfilter to avcodec
  • Keywords libopus added