Opened 5 years ago
Last modified 5 years ago
#7949 new defect
libopus ocl=downmix fails
Reported by: | omniplex | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | undetermined |
Version: | unspecified | Keywords: | OPUS |
Cc: | | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no | | |
Description
Summary of the bug: unlike libtheora, the libopus default -mapping_family -1 does not permit ocl=downmix.
How to reproduce:
% ffmpeg -i input -vn -filter:a aresample=48000:resampler=soxr:precision=28:ocl=downmix:cheby=1:matrix_encoding=dolby:tsf=s32p -c:a libopus -b:a 25KiB output
ffmpeg version 4.1.3
built with gcc 8.3.1 (GCC) 20190414
The same command with -mapping_family 255 works.
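For reference, a minimal sketch of the working variant. It only prints the command (a dry run) so it can be inspected before use. The function name `opus_downmix_cmd` and the file names are made up for illustration; the placement of `-mapping_family 255` right after `-c:a libopus` is an assumption about where the option was added, as the ticket does not show the exact working command line.

```shell
#!/bin/sh
# Dry run: print the ffmpeg command from this ticket with the workaround applied.
# opus_downmix_cmd is a hypothetical helper, not part of FFmpeg.
# Assumption: -mapping_family 255 is given as an output option next to -c:a libopus.
opus_downmix_cmd() {
    # $1 = input file, $2 = output file (names without spaces assumed in this sketch)
    echo ffmpeg -i "$1" -vn \
        -filter:a "aresample=48000:resampler=soxr:precision=28:ocl=downmix:cheby=1:matrix_encoding=dolby:tsf=s32p" \
        -c:a libopus -mapping_family 255 -b:a 25KiB "$2"
}

# Example file names, chosen for illustration only.
opus_downmix_cmd input.mp3 output.webm
```

To actually run the conversion, the printed line can be piped to `sh`. Mapping family 255 marks the channels as having no defined meaning, so a player may not treat the result as ordinary stereo.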
Maybe it is only me, but I expected ocl=downmix to work for ordinary MP3 stereo input while upgrading a script from VP8/libtheora to VP9/libopus. It is not a bug, but it had a high astonishment factor.
Change History (5)
comment:1 by , 5 years ago
Type: | enhancement → defect |
---|---|
Version: | 4.1 → unspecified |
comment:2 by , 5 years ago
It was an enhancement suggestion; I don't have anything fresher than 4.1.3 on Windows:
c:\Temp>ffmpeg -report -i test.mp3 -filter:a aresample=48000:resampler=soxr:precision=28:ocl=downmix:cheby=1:matrix_encoding=dolby:tsf=s32p -c:a libopus -b:a 25KiB test.webm
ffmpeg started on 2019-06-12 at 03:44:57
Report written to "ffmpeg-20190612-034457.log"
[mp3 @ 0000000000384040] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from 'test.mp3':
  Metadata:
    encoder : Lame3.99
    ... (details redacted) ...
    date : 2013
  Duration: 00:04:26.95, start: 0.000000, bitrate: 305 kb/s
    Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 256 kb/s
    Stream #0:1: Video: png, rgba(pc), 800x800 [SAR 3779:3779 DAR 1:1], 90k tbr, 90k tbn, 90k tbc
    Metadata:
      comment : Other
Stream mapping:
  Stream #0:1 -> #0:0 (png (native) -> vp9 (libvpx-vp9))
  Stream #0:0 -> #0:1 (mp3 (mp3float) -> opus (libopus))
Press [q] to stop, [?] for help
[libvpx-vp9 @ 00000000003a1d40] v1.8.0-424-ge50f4e411
[libopus @ 00000000003a3740] Invalid channel layout downmix for specified mapping family -1.
Error initializing output stream 0:1 -- Error while opening encoder for output stream #0:1 - maybe incorrect parameters such as bit_rate, rate, width or height
Conversion failed!
ffmpeg-20190612-034457.log content:
ffmpeg started on 2019-06-12 at 03:44:57
Report written to "ffmpeg-20190612-034457.log"
Command line:
ffmpeg.exe -hide_banner -report -i test.mp3 -filter:a "aresample=48000:resampler=soxr:precision=28:ocl=downmix:cheby=1:matrix_encoding=dolby:tsf=s32p" -c:a libopus -b:a 25KiB test.webm
Splitting the commandline.
Reading option '-hide_banner' ... matched as option 'hide_banner' (do not show program banner) with argument '1'.
Reading option '-report' ... matched as option 'report' (generate a report) with argument '1'.
Reading option '-i' ... matched as input url with argument 'test.mp3'.
Reading option '-filter:a' ... matched as option 'filter' (set stream filtergraph) with argument 'aresample=48000:resampler=soxr:precision=28:ocl=downmix:cheby=1:matrix_encoding=dolby:tsf=s32p'.
Reading option '-c:a' ... matched as option 'c' (codec name) with argument 'libopus'.
Reading option '-b:a' ... matched as option 'b' (video bitrate (please use -b:v)) with argument '25KiB'.
Reading option 'test.webm' ... matched as output url.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option hide_banner (do not show program banner) with argument 1.
Applying option report (generate a report) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input url test.mp3.
Successfully parsed a group of options.
Opening an input file: test.mp3.
[NULL @ 0000000000384040] Opening 'test.mp3' for reading
[file @ 0000000000385940] Setting default whitelist 'file,crypto'
[mp3 @ 0000000000384040] Format mp3 probed with size=2048 and score=50
id3v2 ver:3 flags:00 len:1644027
[mp3 @ 0000000000384040] Skipping 0 bytes of junk at 1644037.
[mp3 @ 0000000000384040] Before avformat_find_stream_info() pos: 1644037 bytes read:1676805 seeks:0 nb_streams:2
[mp3 @ 0000000000384040] All info found
[mp3 @ 0000000000384040] Estimating duration from bitrate, this may be inaccurate
[mp3 @ 0000000000384040] After avformat_find_stream_info() pos: 1686021 bytes read:1709573 seeks:0 frames:51
Input #0, mp3, from 'test.mp3':
  Metadata:
    encoder : Lame3.99
    comment : www.NewAlbumReleases.net
    copyright : 2013 Pendu Sound Recordings
    album_artist : aTelecine
    genre : Electronic
    disc : 1/1
    track : 15/15
    artist : aTelecine
    title : Der Baum Des Bosen
    album : The Origin of the Obsolete Robot
    date : 2013
  Duration: 00:04:26.95, start: 0.000000, bitrate: 305 kb/s
    Stream #0:0, 50, 1/14112000: Audio: mp3, 44100 Hz, stereo, fltp, 256 kb/s
    Stream #0:1, 1, 1/90000: Video: png, rgba(pc), 800x800 [SAR 3779:3779 DAR 1:1], 90k tbr, 90k tbn, 90k tbc
    Metadata:
      comment : Other
Successfully opened the file.
Parsing a group of options: output url test.webm.
Applying option filter:a (set stream filtergraph) with argument aresample=48000:resampler=soxr:precision=28:ocl=downmix:cheby=1:matrix_encoding=dolby:tsf=s32p.
Applying option c:a (codec name) with argument libopus.
Applying option b:a (video bitrate (please use -b:v)) with argument 25KiB.
Successfully parsed a group of options.
Opening an output file: test.webm.
[file @ 0000000000396c40] Setting default whitelist 'file,crypto'
Successfully opened the file.
Stream mapping:
  Stream #0:1 -> #0:0 (png (native) -> vp9 (libvpx-vp9))
  Stream #0:0 -> #0:1 (mp3 (mp3float) -> opus (libopus))
Press [q] to stop, [?] for help
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
detected 4 logical cores
[graph 0 input from stream 0:1 @ 00000000003fab80] Setting 'video_size' to value '800x800'
[graph 0 input from stream 0:1 @ 00000000003fab80] Setting 'pix_fmt' to value '26'
[graph 0 input from stream 0:1 @ 00000000003fab80] Setting 'time_base' to value '1/90000'
[graph 0 input from stream 0:1 @ 00000000003fab80] Setting 'pixel_aspect' to value '3779/3779'
[graph 0 input from stream 0:1 @ 00000000003fab80] Setting 'sws_param' to value 'flags=2'
[graph 0 input from stream 0:1 @ 00000000003fab80] Setting 'frame_rate' to value '90000/1'
[graph 0 input from stream 0:1 @ 00000000003fab80] w:800 h:800 pixfmt:rgba tb:1/90000 fr:90000/1 sar:3779/3779 sws_param:flags=2
[format @ 00000000003fc2c0] Setting 'pix_fmts' to value 'yuv420p|yuva420p|yuv422p|yuv440p|yuv444p|yuv420p10le|yuv422p10le|yuv440p10le|yuv444p10le|yuv420p12le|yuv422p12le|yuv440p12le|yuv444p12le|gbrp|gbrp10le|gbrp12le'
[auto_scaler_0 @ 00000000003fbf00] Setting 'flags' to value 'bicubic'
[auto_scaler_0 @ 00000000003fbf00] w:iw h:ih flags:'bicubic' interl:0
[format @ 00000000003fc2c0] auto-inserting filter 'auto_scaler_0' between the filter 'Parsed_null_0' and the filter 'format'
[AVFilterGraph @ 00000000003898c0] query_formats: 4 queried, 2 merged, 1 already done, 0 delayed
[auto_scaler_0 @ 00000000003fbf00] picking yuva420p out of 16 ref:rgba alpha:1
[auto_scaler_0 @ 00000000003fbf00] w:800 h:800 fmt:rgba sar:3779/3779 -> w:800 h:800 fmt:yuva420p sar:1/1 flags:0x4
[libvpx-vp9 @ 00000000003a1d40] v1.8.0-424-ge50f4e411
[libvpx-vp9 @ 00000000003a1d40] --prefix=/Users/kyle/software/libvpx/win64/libvpx-20190425-e50f4e4-win64 --enable-vp9-highbitdepth --disable-avx512 --target=x86_64-win64-gcc
[libvpx-vp9 @ 00000000003a1d40] vpx_codec_enc_cfg
[libvpx-vp9 @ 00000000003a1d40] generic settings
  g_usage: 0
  g_threads: 8
  g_profile: 0
  g_w: 320
  g_h: 240
  g_bit_depth: 8
  g_input_bit_depth: 8
  g_timebase: {1/30}
  g_error_resilient: 0
  g_pass: 0
  g_lag_in_frames: 25
[libvpx-vp9 @ 00000000003a1d40] rate control settings
  rc_dropframe_thresh: 0
  rc_resize_allowed: 0
  rc_resize_up_thresh: 60
  rc_resize_down_thresh: 30
  rc_end_usage: 0
  rc_twopass_stats_in: 0000000000000000(0)
  rc_target_bitrate: 256
[libvpx-vp9 @ 00000000003a1d40] quantizer settings
  rc_min_quantizer: 0
  rc_max_quantizer: 63
[libvpx-vp9 @ 00000000003a1d40] bitrate tolerance
  rc_undershoot_pct: 25
  rc_overshoot_pct: 25
[libvpx-vp9 @ 00000000003a1d40] decoder buffer model
  rc_buf_sz: 6000
  rc_buf_initial_sz: 4000
  rc_buf_optimal_sz: 5000
[libvpx-vp9 @ 00000000003a1d40] 2 pass rate control settings
  rc_2pass_vbr_bias_pct: 50
  rc_2pass_vbr_minsection_pct: 0
  rc_2pass_vbr_maxsection_pct: 2000
[libvpx-vp9 @ 00000000003a1d40] rc_2pass_vbr_corpus_complexity:0
[libvpx-vp9 @ 00000000003a1d40] keyframing settings
  kf_mode: 1
  kf_min_dist: 0
  kf_max_dist: 128
[libvpx-vp9 @ 00000000003a1d40]
[libvpx-vp9 @ 00000000003a1d40] vpx_codec_enc_cfg
[libvpx-vp9 @ 00000000003a1d40] generic settings
  g_usage: 0
  g_threads: 4
  g_profile: 0
  g_w: 800
  g_h: 800
  g_bit_depth: 8
  g_input_bit_depth: 8
  g_timebase: {1/90000}
  g_error_resilient: 0
  g_pass: 0
  g_lag_in_frames: 25
[libvpx-vp9 @ 00000000003a1d40] rate control settings
  rc_dropframe_thresh: 0
  rc_resize_allowed: 0
  rc_resize_up_thresh: 60
  rc_resize_down_thresh: 30
  rc_end_usage: 0
  rc_twopass_stats_in: 0000000000000000(0)
  rc_target_bitrate: 200
[libvpx-vp9 @ 00000000003a1d40] quantizer settings
  rc_min_quantizer: 0
  rc_max_quantizer: 63
[libvpx-vp9 @ 00000000003a1d40] bitrate tolerance
  rc_undershoot_pct: 25
  rc_overshoot_pct: 25
[libvpx-vp9 @ 00000000003a1d40] decoder buffer model
  rc_buf_sz: 6000
  rc_buf_initial_sz: 4000
  rc_buf_optimal_sz: 5000
[libvpx-vp9 @ 00000000003a1d40] 2 pass rate control settings
  rc_2pass_vbr_bias_pct: 50
  rc_2pass_vbr_minsection_pct: 0
  rc_2pass_vbr_maxsection_pct: 2000
[libvpx-vp9 @ 00000000003a1d40] rc_2pass_vbr_corpus_complexity:0
[libvpx-vp9 @ 00000000003a1d40] keyframing settings
  kf_mode: 1
  kf_min_dist: 0
  kf_max_dist: 128
[libvpx-vp9 @ 00000000003a1d40]
[libvpx-vp9 @ 00000000003a1d40] vpx_codec_control
[libvpx-vp9 @ 00000000003a1d40] VP8E_SET_CPUUSED: 1
[libvpx-vp9 @ 00000000003a1d40] VP8E_SET_ARNR_MAXFRAMES: 0
[libvpx-vp9 @ 00000000003a1d40] VP8E_SET_ARNR_STRENGTH: 3
[libvpx-vp9 @ 00000000003a1d40] VP8E_SET_ARNR_TYPE: 3
[libvpx-vp9 @ 00000000003a1d40] VP8E_SET_STATIC_THRESHOLD: 0
[libvpx-vp9 @ 00000000003a1d40] VP9E_SET_COLOR_SPACE: 0
[libvpx-vp9 @ 00000000003a1d40] VP9E_SET_COLOR_RANGE: 0
[libvpx-vp9 @ 00000000003a1d40] VP9E_SET_TARGET_LEVEL: 255
[libvpx-vp9 @ 00000000003a1d40] Using deadline: 1000000
Clipping frame in rate conversion by 0.000008
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
[Parsed_aresample_0 @ 0000000000404180] Setting 'sample_rate' to value '48000'
[Parsed_aresample_0 @ 0000000000404180] Setting 'resampler' to value 'soxr'
[Parsed_aresample_0 @ 0000000000404180] Setting 'precision' to value '28'
[Parsed_aresample_0 @ 0000000000404180] Setting 'ocl' to value 'downmix'
[Parsed_aresample_0 @ 0000000000404180] Setting 'cheby' to value '1'
[Parsed_aresample_0 @ 0000000000404180] Setting 'matrix_encoding' to value 'dolby'
[Parsed_aresample_0 @ 0000000000404180] Setting 'tsf' to value 's32p'
[graph_1_in_0_0 @ 0000000000404280] Setting 'time_base' to value '1/44100'
[graph_1_in_0_0 @ 0000000000404280] Setting 'sample_rate' to value '44100'
[graph_1_in_0_0 @ 0000000000404280] Setting 'sample_fmt' to value 'fltp'
[graph_1_in_0_0 @ 0000000000404280] Setting 'channel_layout' to value '0x3'
[graph_1_in_0_0 @ 0000000000404280] tb:1/44100 samplefmt:fltp samplerate:44100 chlayout:0x3
[format_out_0_1 @ 0000000000404480] Setting 'sample_fmts' to value 's16|flt'
[format_out_0_1 @ 0000000000404480] Setting 'sample_rates' to value '48000|24000|16000|12000|8000'
[AVFilterGraph @ 0000000000389980] query_formats: 4 queried, 9 merged, 0 already done, 0 delayed
[Parsed_aresample_0 @ 0000000000404180] picking flt out of 2 ref:fltp
[Parsed_aresample_0 @ 0000000000404180] [SWR @ 0000000006e10ec0] Using s32p internally between filters
[Parsed_aresample_0 @ 0000000000404180] [SWR @ 0000000006e10ec0] Matrix coefficients:
[Parsed_aresample_0 @ 0000000000404180] [SWR @ 0000000006e10ec0] FL: FL:1.000000 FR:0.000000
[Parsed_aresample_0 @ 0000000000404180] [SWR @ 0000000006e10ec0] FR: FL:0.000000 FR:1.000000
[Parsed_aresample_0 @ 0000000000404180] ch:2 chl:stereo fmt:fltp r:44100Hz -> ch:2 chl:downmix fmt:flt r:48000Hz
[libopus @ 00000000003a3740] Invalid channel layout downmix for specified mapping family -1.
Error initializing output stream 0:1 -- Error while opening encoder for output stream #0:1 - maybe incorrect parameters such as bit_rate, rate, width or height
[AVIOContext @ 00000000003847c0] Statistics: 0 seeks, 0 writeouts
[AVIOContext @ 000000000038dbc0] Statistics: 1709573 bytes read, 0 seeks
Conversion failed!
comment:4 by , 5 years ago
RFC 7845 supports the observed FFmpeg behaviour, or IOW, Opus is not Vorbis and has its own ideas about channel mapping families. But "cannot downmix stereo to stereo" still violates my idea of the principle of least astonishment.😛
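A rough model of why "stereo to stereo" is rejected, as a sketch only. It assumes that for mapping families -1, 0 and 1 the encoder wrapper compares the requested layout against the Vorbis-order layout for that channel count, and that family 255 skips the comparison; that is inferred from the error message and the workaround above, not quoted from the FFmpeg source. The helper name `libopus_layout_ok` is hypothetical, only 1 and 2 channels are modelled, and unspecified layouts are not handled. The masks are the libavutil channel bits written in decimal.

```shell
#!/bin/sh
# Channel layout masks in decimal (libavutil bits: FL=0x1, FR=0x2, FC=0x4,
# DL=0x20000000, DR=0x40000000).
MONO=4                      # FC
STEREO=3                    # FL|FR, matches chlayout:0x3 in the log
STEREO_DOWNMIX=1610612736   # DL|DR, the layout requested by ocl=downmix

# Hypothetical helper: returns 0 if <layout> would be accepted for <family>,
# 1 if rejected, 2 if the case is outside this sketch.
# Usage: libopus_layout_ok <family> <channels> <layout>
libopus_layout_ok() {
    lo_family=$1 lo_channels=$2 lo_layout=$3
    case "$lo_channels" in
        1) lo_expected=$MONO ;;
        2) lo_expected=$STEREO ;;
        *) return 2 ;;                          # more channels: not modelled
    esac
    case "$lo_family" in
        255)    return 0 ;;                     # channels carry no defined meaning
        -1|0|1) [ "$lo_layout" -eq "$lo_expected" ] ;;  # must equal the Vorbis-order layout
        *)      return 2 ;;                     # families 2 and 3: not modelled
    esac
}

for fam in -1 255; do
    if libopus_layout_ok "$fam" 2 "$STEREO_DOWNMIX"; then
        echo "mapping_family $fam: downmix accepted"
    else
        echo "mapping_family $fam: downmix rejected"
    fi
done
```

Under these assumptions the two-channel downmix layout (DL|DR) is simply a different bit mask than stereo (FL|FR), which is why the channel count alone does not make it acceptable.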
comment:5 by , 5 years ago
Also see RFC 8486, because it updated RFC 7845, and defines the wonders of Ambisonics in an Ogg Opus Container, i.e., Channel Mapping Family 2 and 3.
Please test current FFmpeg git head and provide the command line together with the complete, uncut console output to make this a valid ticket.
If a specific input sample is required to reproduce the issue, please upload it.