Opened 2 years ago

Last modified 23 months ago

#7949 new defect

libopus ocl=downmix fails

Reported by: omniplex
Owned by:
Priority: normal
Component: undetermined
Version: unspecified
Keywords: OPUS
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug: unlike libtheora, the libopus default -mapping_family -1 does not permit ocl=downmix.

How to reproduce:

% ffmpeg -i input -vn -filter:a aresample=48000:resampler=soxr:precision=28:ocl=downmix:cheby=1:matrix_encoding=dolby:tsf=s32p -c:a libopus -b:a 25KiB output

ffmpeg version 4.1.3 built with gcc 8.3.1 (GCC) 20190414

The same command with -mapping_family 255 works.

Maybe it is only me, but while upgrading a script from VP8/libtheora to VP9/libopus I expected ocl=downmix to work for ordinary MP3 stereo input. It's not a bug, but it had a high astonishment factor.
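The workaround mentioned above (-mapping_family 255) and one alternative can be sketched as argument lists. This is a minimal sketch, not a tested recipe: build_command is a hypothetical helper, "input" and "output" are the placeholders from the report, and ocl=stereo is an assumption about what the script actually needs (a plain stereo layout, which the default mapping family should accept).

```python
import shlex


def build_command(ocl="downmix", mapping_family=None):
    """Return the reported ffmpeg command as an argument list.

    Hypothetical helper: 'input' and 'output' are placeholders taken
    from the report, not real file names.
    """
    audio_filter = (
        "aresample=48000:resampler=soxr:precision=28:"
        f"ocl={ocl}:cheby=1:matrix_encoding=dolby:tsf=s32p"
    )
    cmd = ["ffmpeg", "-i", "input", "-vn", "-filter:a", audio_filter,
           "-c:a", "libopus"]
    if mapping_family is not None:
        # libopus encoder option; 255 is the value reported to work.
        cmd += ["-mapping_family", str(mapping_family)]
    cmd += ["-b:a", "25KiB", "output"]
    return cmd


# Variant 1: keep ocl=downmix, but select mapping family 255 as in the report.
print(shlex.join(build_command(mapping_family=255)))

# Variant 2 (assumption): keep the default mapping family and ask for stereo.
print(shlex.join(build_command(ocl="stereo")))
```

The commands are only printed here; passing the list to subprocess.run would execute them on a system where ffmpeg is installed.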

Change History (5)

comment:1 by Carl Eugen Hoyos, 2 years ago

Type: enhancement → defect
Version: 4.1 → unspecified

Please test current FFmpeg git head and provide the command line together with the complete, uncut console output to make this a valid ticket.
If a specific input sample is required to reproduce the issue, please upload it.

comment:2 by omniplex, 2 years ago

It was an enhancement suggestion; I don't have anything fresher than 4.1.3 on Windows:

c:\Temp>ffmpeg -report -i test.mp3 -filter:a aresample=48000:resampler=soxr:precision=28:ocl=downmix:cheby=1:matrix_encoding=dolby:tsf=s32p -c:a libopus -b:a 25KiB test.webm
ffmpeg started on 2019-06-12 at 03:44:57
Report written to "ffmpeg-20190612-034457.log"
[mp3 @ 0000000000384040] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from 'test.mp3':
  Metadata:
    encoder         : Lame3.99
    ... (details redacted) ...
    date            : 2013
  Duration: 00:04:26.95, start: 0.000000, bitrate: 305 kb/s
    Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 256 kb/s
    Stream #0:1: Video: png, rgba(pc), 800x800 [SAR 3779:3779 DAR 1:1], 90k tbr, 90k tbn, 90k tbc
    Metadata:
      comment         : Other
Stream mapping:
  Stream #0:1 -> #0:0 (png (native) -> vp9 (libvpx-vp9))
  Stream #0:0 -> #0:1 (mp3 (mp3float) -> opus (libopus))
Press [q] to stop, [?] for help
[libvpx-vp9 @ 00000000003a1d40] v1.8.0-424-ge50f4e411
[libopus @ 00000000003a3740] Invalid channel layout downmix for specified mapping family -1.
Error initializing output stream 0:1 -- Error while opening encoder for output stream #0:1 - maybe incorrect parameters such as bit_rate, rate, width or height
Conversion failed!

ffmpeg-20190612-034457.log content:

ffmpeg started on 2019-06-12 at 03:44:57
Report written to "ffmpeg-20190612-034457.log"
Command line:
ffmpeg.exe -hide_banner -report -i test.mp3 -filter:a "aresample=48000:resampler=soxr:precision=28:ocl=downmix:cheby=1:matrix_encoding=dolby:tsf=s32p" -c:a libopus -b:a 25KiB test.webm
Splitting the commandline.
Reading option '-hide_banner' ... matched as option 'hide_banner' (do not show program banner) with argument '1'.
Reading option '-report' ... matched as option 'report' (generate a report) with argument '1'.
Reading option '-i' ... matched as input url with argument 'test.mp3'.
Reading option '-filter:a' ... matched as option 'filter' (set stream filtergraph) with argument 'aresample=48000:resampler=soxr:precision=28:ocl=downmix:cheby=1:matrix_encoding=dolby:tsf=s32p'.
Reading option '-c:a' ... matched as option 'c' (codec name) with argument 'libopus'.
Reading option '-b:a' ... matched as option 'b' (video bitrate (please use -b:v)) with argument '25KiB'.
Reading option 'test.webm' ... matched as output url.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option hide_banner (do not show program banner) with argument 1.
Applying option report (generate a report) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input url test.mp3.
Successfully parsed a group of options.
Opening an input file: test.mp3.
[NULL @ 0000000000384040] Opening 'test.mp3' for reading
[file @ 0000000000385940] Setting default whitelist 'file,crypto'
[mp3 @ 0000000000384040] Format mp3 probed with size=2048 and score=50
id3v2 ver:3 flags:00 len:1644027
[mp3 @ 0000000000384040] Skipping 0 bytes of junk at 1644037.
[mp3 @ 0000000000384040] Before avformat_find_stream_info() pos: 1644037 bytes read:1676805 seeks:0 nb_streams:2
[mp3 @ 0000000000384040] All info found
[mp3 @ 0000000000384040] Estimating duration from bitrate, this may be inaccurate
[mp3 @ 0000000000384040] After avformat_find_stream_info() pos: 1686021 bytes read:1709573 seeks:0 frames:51
Input #0, mp3, from 'test.mp3':
  Metadata:
    encoder         : Lame3.99
    comment         : www.NewAlbumReleases.net
    copyright       : 2013 Pendu Sound Recordings
    album_artist    : aTelecine
    genre           : Electronic
    disc            : 1/1
    track           : 15/15
    artist          : aTelecine
    title           : Der Baum Des Bosen
    album           : The Origin of the Obsolete Robot
    date            : 2013
  Duration: 00:04:26.95, start: 0.000000, bitrate: 305 kb/s
    Stream #0:0, 50, 1/14112000: Audio: mp3, 44100 Hz, stereo, fltp, 256 kb/s
    Stream #0:1, 1, 1/90000: Video: png, rgba(pc), 800x800 [SAR 3779:3779 DAR 1:1], 90k tbr, 90k tbn, 90k tbc
    Metadata:
      comment         : Other
Successfully opened the file.
Parsing a group of options: output url test.webm.
Applying option filter:a (set stream filtergraph) with argument aresample=48000:resampler=soxr:precision=28:ocl=downmix:cheby=1:matrix_encoding=dolby:tsf=s32p.
Applying option c:a (codec name) with argument libopus.
Applying option b:a (video bitrate (please use -b:v)) with argument 25KiB.
Successfully parsed a group of options.
Opening an output file: test.webm.
[file @ 0000000000396c40] Setting default whitelist 'file,crypto'
Successfully opened the file.
Stream mapping:
  Stream #0:1 -> #0:0 (png (native) -> vp9 (libvpx-vp9))
  Stream #0:0 -> #0:1 (mp3 (mp3float) -> opus (libopus))
Press [q] to stop, [?] for help
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
detected 4 logical cores
[graph 0 input from stream 0:1 @ 00000000003fab80] Setting 'video_size' to value '800x800'
[graph 0 input from stream 0:1 @ 00000000003fab80] Setting 'pix_fmt' to value '26'
[graph 0 input from stream 0:1 @ 00000000003fab80] Setting 'time_base' to value '1/90000'
[graph 0 input from stream 0:1 @ 00000000003fab80] Setting 'pixel_aspect' to value '3779/3779'
[graph 0 input from stream 0:1 @ 00000000003fab80] Setting 'sws_param' to value 'flags=2'
[graph 0 input from stream 0:1 @ 00000000003fab80] Setting 'frame_rate' to value '90000/1'
[graph 0 input from stream 0:1 @ 00000000003fab80] w:800 h:800 pixfmt:rgba tb:1/90000 fr:90000/1 sar:3779/3779 sws_param:flags=2
[format @ 00000000003fc2c0] Setting 'pix_fmts' to value 'yuv420p|yuva420p|yuv422p|yuv440p|yuv444p|yuv420p10le|yuv422p10le|yuv440p10le|yuv444p10le|yuv420p12le|yuv422p12le|yuv440p12le|yuv444p12le|gbrp|gbrp10le|gbrp12le'
[auto_scaler_0 @ 00000000003fbf00] Setting 'flags' to value 'bicubic'
[auto_scaler_0 @ 00000000003fbf00] w:iw h:ih flags:'bicubic' interl:0
[format @ 00000000003fc2c0] auto-inserting filter 'auto_scaler_0' between the filter 'Parsed_null_0' and the filter 'format'
[AVFilterGraph @ 00000000003898c0] query_formats: 4 queried, 2 merged, 1 already done, 0 delayed
[auto_scaler_0 @ 00000000003fbf00] picking yuva420p out of 16 ref:rgba alpha:1
[auto_scaler_0 @ 00000000003fbf00] w:800 h:800 fmt:rgba sar:3779/3779 -> w:800 h:800 fmt:yuva420p sar:1/1 flags:0x4
[libvpx-vp9 @ 00000000003a1d40] v1.8.0-424-ge50f4e411
[libvpx-vp9 @ 00000000003a1d40] --prefix=/Users/kyle/software/libvpx/win64/libvpx-20190425-e50f4e4-win64 --enable-vp9-highbitdepth --disable-avx512 --target=x86_64-win64-gcc
[libvpx-vp9 @ 00000000003a1d40] vpx_codec_enc_cfg
[libvpx-vp9 @ 00000000003a1d40] generic settings
  g_usage:                      0
  g_threads:                    8
  g_profile:                    0
  g_w:                          320
  g_h:                          240
  g_bit_depth:                  8
  g_input_bit_depth:            8
  g_timebase:                   {1/30}
  g_error_resilient:            0
  g_pass:                       0
  g_lag_in_frames:              25
[libvpx-vp9 @ 00000000003a1d40] rate control settings
  rc_dropframe_thresh:          0
  rc_resize_allowed:            0
  rc_resize_up_thresh:          60
  rc_resize_down_thresh:        30
  rc_end_usage:                 0
  rc_twopass_stats_in:          0000000000000000(0)
  rc_target_bitrate:            256
[libvpx-vp9 @ 00000000003a1d40] quantizer settings
  rc_min_quantizer:             0
  rc_max_quantizer:             63
[libvpx-vp9 @ 00000000003a1d40] bitrate tolerance
  rc_undershoot_pct:            25
  rc_overshoot_pct:             25
[libvpx-vp9 @ 00000000003a1d40] decoder buffer model
  rc_buf_sz:                    6000
  rc_buf_initial_sz:            4000
  rc_buf_optimal_sz:            5000
[libvpx-vp9 @ 00000000003a1d40] 2 pass rate control settings
  rc_2pass_vbr_bias_pct:        50
  rc_2pass_vbr_minsection_pct:  0
  rc_2pass_vbr_maxsection_pct:  2000
[libvpx-vp9 @ 00000000003a1d40]   rc_2pass_vbr_corpus_complexity:0
[libvpx-vp9 @ 00000000003a1d40] keyframing settings
  kf_mode:                      1
  kf_min_dist:                  0
  kf_max_dist:                  128
[libvpx-vp9 @ 00000000003a1d40]
[libvpx-vp9 @ 00000000003a1d40] vpx_codec_enc_cfg
[libvpx-vp9 @ 00000000003a1d40] generic settings
  g_usage:                      0
  g_threads:                    4
  g_profile:                    0
  g_w:                          800
  g_h:                          800
  g_bit_depth:                  8
  g_input_bit_depth:            8
  g_timebase:                   {1/90000}
  g_error_resilient:            0
  g_pass:                       0
  g_lag_in_frames:              25
[libvpx-vp9 @ 00000000003a1d40] rate control settings
  rc_dropframe_thresh:          0
  rc_resize_allowed:            0
  rc_resize_up_thresh:          60
  rc_resize_down_thresh:        30
  rc_end_usage:                 0
  rc_twopass_stats_in:          0000000000000000(0)
  rc_target_bitrate:            200
[libvpx-vp9 @ 00000000003a1d40] quantizer settings
  rc_min_quantizer:             0
  rc_max_quantizer:             63
[libvpx-vp9 @ 00000000003a1d40] bitrate tolerance
  rc_undershoot_pct:            25
  rc_overshoot_pct:             25
[libvpx-vp9 @ 00000000003a1d40] decoder buffer model
  rc_buf_sz:                    6000
  rc_buf_initial_sz:            4000
  rc_buf_optimal_sz:            5000
[libvpx-vp9 @ 00000000003a1d40] 2 pass rate control settings
  rc_2pass_vbr_bias_pct:        50
  rc_2pass_vbr_minsection_pct:  0
  rc_2pass_vbr_maxsection_pct:  2000
[libvpx-vp9 @ 00000000003a1d40]   rc_2pass_vbr_corpus_complexity:0
[libvpx-vp9 @ 00000000003a1d40] keyframing settings
  kf_mode:                      1
  kf_min_dist:                  0
  kf_max_dist:                  128
[libvpx-vp9 @ 00000000003a1d40]
[libvpx-vp9 @ 00000000003a1d40] vpx_codec_control
[libvpx-vp9 @ 00000000003a1d40]   VP8E_SET_CPUUSED:             1
[libvpx-vp9 @ 00000000003a1d40]   VP8E_SET_ARNR_MAXFRAMES:      0
[libvpx-vp9 @ 00000000003a1d40]   VP8E_SET_ARNR_STRENGTH:       3
[libvpx-vp9 @ 00000000003a1d40]   VP8E_SET_ARNR_TYPE:           3
[libvpx-vp9 @ 00000000003a1d40]   VP8E_SET_STATIC_THRESHOLD:    0
[libvpx-vp9 @ 00000000003a1d40]   VP9E_SET_COLOR_SPACE:         0
[libvpx-vp9 @ 00000000003a1d40]   VP9E_SET_COLOR_RANGE:         0
[libvpx-vp9 @ 00000000003a1d40]   VP9E_SET_TARGET_LEVEL:        255
[libvpx-vp9 @ 00000000003a1d40] Using deadline: 1000000
Clipping frame in rate conversion by 0.000008
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
[Parsed_aresample_0 @ 0000000000404180] Setting 'sample_rate' to value '48000'
[Parsed_aresample_0 @ 0000000000404180] Setting 'resampler' to value 'soxr'
[Parsed_aresample_0 @ 0000000000404180] Setting 'precision' to value '28'
[Parsed_aresample_0 @ 0000000000404180] Setting 'ocl' to value 'downmix'
[Parsed_aresample_0 @ 0000000000404180] Setting 'cheby' to value '1'
[Parsed_aresample_0 @ 0000000000404180] Setting 'matrix_encoding' to value 'dolby'
[Parsed_aresample_0 @ 0000000000404180] Setting 'tsf' to value 's32p'
[graph_1_in_0_0 @ 0000000000404280] Setting 'time_base' to value '1/44100'
[graph_1_in_0_0 @ 0000000000404280] Setting 'sample_rate' to value '44100'
[graph_1_in_0_0 @ 0000000000404280] Setting 'sample_fmt' to value 'fltp'
[graph_1_in_0_0 @ 0000000000404280] Setting 'channel_layout' to value '0x3'
[graph_1_in_0_0 @ 0000000000404280] tb:1/44100 samplefmt:fltp samplerate:44100 chlayout:0x3
[format_out_0_1 @ 0000000000404480] Setting 'sample_fmts' to value 's16|flt'
[format_out_0_1 @ 0000000000404480] Setting 'sample_rates' to value '48000|24000|16000|12000|8000'
[AVFilterGraph @ 0000000000389980] query_formats: 4 queried, 9 merged, 0 already done, 0 delayed
[Parsed_aresample_0 @ 0000000000404180] picking flt out of 2 ref:fltp
[Parsed_aresample_0 @ 0000000000404180] [SWR @ 0000000006e10ec0] Using s32p internally between filters
[Parsed_aresample_0 @ 0000000000404180] [SWR @ 0000000006e10ec0] Matrix coefficients:
[Parsed_aresample_0 @ 0000000000404180] [SWR @ 0000000006e10ec0] FL: FL:1.000000 FR:0.000000
[Parsed_aresample_0 @ 0000000000404180] [SWR @ 0000000006e10ec0] FR: FL:0.000000 FR:1.000000
[Parsed_aresample_0 @ 0000000000404180] ch:2 chl:stereo fmt:fltp r:44100Hz -> ch:2 chl:downmix fmt:flt r:48000Hz
[libopus @ 00000000003a3740] Invalid channel layout downmix for specified mapping family -1.
Error initializing output stream 0:1 -- Error while opening encoder for output stream #0:1 - maybe incorrect parameters such as bit_rate, rate, width or height
[AVIOContext @ 00000000003847c0] Statistics: 0 seeks, 0 writeouts
[AVIOContext @ 000000000038dbc0] Statistics: 1709573 bytes read, 0 seeks
Conversion failed!

comment:3 by omniplex, 2 years ago

Remotely related: #5718

comment:4 by omniplex, 23 months ago

RFC 7845 supports the observed FFmpeg behaviour; in other words, Opus is not Vorbis and has its own ideas about channel mapping families. But "cannot downmix stereo to stereo" still violates my idea of the principle of least astonishment. 😛
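Why "stereo to stereo" is rejected can be illustrated with a simplified model of the layout check. This is a sketch under stated assumptions, not FFmpeg source: layout_accepted is a hypothetical function, and only two-channel layouts are modelled. The masks are the libavutil channel bits; the log above shows 0x3 for the stereo input, while "downmix" uses the separate DL/DR bits.

```python
# Simplified model (an assumption, not FFmpeg source) of the check behind
# "Invalid channel layout downmix for specified mapping family -1".

# Channel bit masks as defined in libavutil/channel_layout.h.
FL, FR = 0x1, 0x2                  # front left / front right
DL, DR = 0x20000000, 0x40000000    # stereo downmix left / right

LAYOUTS = {"stereo": FL | FR, "downmix": DL | DR}

# Families -1, 0 and 1 expect the Vorbis channel order; only the
# two-channel entry is modelled here.
VORBIS_LAYOUT_BY_CHANNELS = {2: FL | FR}


def layout_accepted(layout_name, mapping_family=-1):
    """Return True if the modelled check would accept the layout."""
    mask = LAYOUTS[layout_name]
    channels = bin(mask).count("1")
    if mapping_family == 255:
        # Family 255 assigns no defined meaning to channels: no layout check.
        return True
    if mapping_family in (-1, 0, 1):
        return VORBIS_LAYOUT_BY_CHANNELS.get(channels) == mask
    raise ValueError(f"mapping family {mapping_family} is not modelled")


for name in LAYOUTS:
    for family in (-1, 255):
        print(name, family, layout_accepted(name, family))
```

In this model both layouts have two channels, but only the FL+FR mask matches the expected Vorbis-order layout, which is consistent with the error in the log and with -mapping_family 255 working.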

comment:5 by omniplex, 23 months ago

Also see RFC 8486: it updates RFC 7845 and defines the wonders of Ambisonics in an Ogg Opus Container, i.e., Channel Mapping Families 2 and 3.

Last edited 23 months ago by omniplex