Opened 6 years ago
Last modified 4 years ago
#7384 reopened defect
FFmpeg 4.0 does not set 5.1 surround sound into audio stream when converting mkv video with 5.1 surround sound
Reported by: | itrdev | Owned by: | |
---|---|---|---|
Priority: | important | Component: | avcodec |
Version: | git-master | Keywords: | aac regression |
Cc: | | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no | | |
Description
Summary of the bug:
I upgraded ffmpeg to version 4.0 and found an issue that was not present in version 3.4.2. When I convert an MKV video with 5.1 surround sound audio, ffmpeg produces an mp4 whose audio stream has channel_layout=unknown. As a result, Windows and the IE/Edge/Firefox browsers cannot play the audio.
FFmpeg 3.4.2 produces a converted video with channel_layout=5.1, and the audio plays without errors.
How to reproduce:
% ffmpeg.exe -noautorotate -i MKV.mkv -c:v libx264 -crf 31 -preset veryfast -y Mkv.mp4
ffmpeg version 4.0.2 Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 7.3.1 (GCC) 20180722
  configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-bzlib --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth
  libavutil 56. 14.100 / 56. 14.100
  libavcodec 58. 18.100 / 58. 18.100
  libavformat 58. 12.100 / 58. 12.100
  libavdevice 58. 3.100 / 58. 3.100
  libavfilter 7. 16.100 / 7. 16.100
  libswscale 5. 1.100 / 5. 1.100
  libswresample 3. 1.100 / 3. 1.100
  libpostproc 55. 1.100 / 55. 1.100
[aac @ 000001229524e580] This stream seems to incorrectly report its last channel as SCE[1], mapping to LFE[0]
Input #0, matroska,webm, from 'MKV.mkv':
  Metadata:
    encoder : libebml v0.7.5 + libmatroska v0.7.7
    creation_time : 2005-08-22T17:01:27.000000Z
  Duration: 00:00:30.85, start: 0.000000, bitrate: 1128 kb/s
    Stream #0:0: Video: mpeg4 (Simple Profile) (XVID / 0x44495658), yuv420p, 360x240 [SAR 3:2 DAR 9:4], SAR 1:1 DAR 3:2, 15 fps, 15 tbr, 1k tbn, 15 tbc (default)
    Metadata:
      title : XviD Video Stream
    Stream #0:1(eng): Audio: ac3, 48000 Hz, 5.1(side), fltp, 192 kb/s (default)
    Metadata:
      title : AC3 5.1 Test Audio
    Stream #0:2(eng): Audio: aac (HE-AAC), 48000 Hz, 5.1, fltp
    Metadata:
      title : AAC 5.1 Test Audio
Stream mapping:
  Stream #0:0 -> #0:0 (mpeg4 (native) -> h264 (libx264))
  Stream #0:1 -> #0:1 (ac3 (native) -> aac (native))
Press [q] to stop, [?] for help
[aac @ 0000012295300e40] Using a PCE to encode channel layout
[mpeg4 @ 0000012295303b80] Video uses a non-standard and wasteful way to store B-frames ('packed B-frames'). Consider using the mpeg4_unpack_bframes bitstream filter without encoding but stream copy to fix it.
[libx264 @ 00000122956600c0] using SAR=1/1
[libx264 @ 00000122956600c0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 00000122956600c0] profile High, level 1.2
[libx264 @ 00000122956600c0] 264 - core 155 r2901 7d0ff22 - H.264/MPEG-4 AVC codec - Copyleft 2003-2018 - http://www.videolan.org/x264.html - options: cabac=1 ref=1 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=2 psy=1 psy_rd=1.00:0.00 mixed_ref=0 me_range=16 chroma_me=1 trellis=0 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=0 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=1 keyint=250 keyint_min=15 scenecut=40 intra_refresh=0 rc_lookahead=10 rc=crf mbtree=1 crf=31.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'Mkv.mp4':
  Metadata:
    encoder : Lavf58.12.100
    Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p, 360x240 [SAR 1:1 DAR 3:2], q=-1--1, 15 fps, 15360 tbn, 15 tbc (default)
    Metadata:
      title : XviD Video Stream
      encoder : Lavc58.18.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1(side), fltp, 394 kb/s (default)
    Metadata:
      title : AC3 5.1 Test Audio
      encoder : Lavc58.18.100 aac
frame= 462 fps=136 q=-1.0 Lsize= 1658kB time=00:00:30.72 bitrate= 442.0kbits/s dup=1 drop=0 speed=9.03x
video:343kB audio:1298kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.030490%
[libx264 @ 00000122956600c0] frame I:14 Avg QP:26.16 size: 1752
[libx264 @ 00000122956600c0] frame P:179 Avg QP:28.83 size: 1026
[libx264 @ 00000122956600c0] frame B:269 Avg QP:30.51 size: 530
[libx264 @ 00000122956600c0] consecutive B-frames: 17.5% 10.8% 11.0% 60.6%
[libx264 @ 00000122956600c0] mb I I16..4: 32.5% 56.3% 11.2%
[libx264 @ 00000122956600c0] mb P I16..4: 17.6% 24.2% 2.5% P16..4: 11.3% 5.7% 2.6% 0.0% 0.0% skip:36.0%
[libx264 @ 00000122956600c0] mb B I16..4: 2.0% 4.9% 0.3% B16..8: 13.6% 6.4% 0.8% direct: 3.3% skip:68.7% L0:40.2% L1:42.7% BI:17.1%
[libx264 @ 00000122956600c0] 8x8 transform intra:57.2% inter:34.3%
[libx264 @ 00000122956600c0] coded y,uvDC,uvAC intra: 29.5% 18.8% 2.1% inter: 4.5% 5.9% 0.1%
[libx264 @ 00000122956600c0] i16 v,h,dc,p: 38% 34% 9% 19%
[libx264 @ 00000122956600c0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 15% 39% 28% 3% 2% 3% 3% 4% 3%
[libx264 @ 00000122956600c0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 22% 23% 17% 7% 6% 6% 6% 7% 5%
[libx264 @ 00000122956600c0] i8c dc,h,v,p: 67% 23% 8% 2%
[libx264 @ 00000122956600c0] Weighted P-Frames: Y:7.8% UV:4.5%
[libx264 @ 00000122956600c0] kb/s:91.08
[aac @ 0000012295300e40] Qavg: 31339.592
ffprobe output for the file converted with FFmpeg 4.0:
% ffprobe -show_streams MKV.mp4
ffprobe version 4.0.2 Copyright (c) 2007-2018 the FFmpeg developers
  built with gcc 7.3.1 (GCC) 20180722
  configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-bzlib --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth
  libavutil 56. 14.100 / 56. 14.100
  libavcodec 58. 18.100 / 58. 18.100
  libavformat 58. 12.100 / 58. 12.100
  libavdevice 58. 3.100 / 58. 3.100
  libavfilter 7. 16.100 / 7. 16.100
  libswscale 5. 1.100 / 5. 1.100
  libswresample 3. 1.100 / 3. 1.100
  libpostproc 55. 1.100 / 55. 1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'MKV.mp4':
  Metadata:
    major_brand : isom
    minor_version : 512
    compatible_brands: isomiso2avc1mp41
    encoder : Lavf58.12.100
  Duration: 00:00:30.80, start: 0.000000, bitrate: 440 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 360x240 [SAR 1:1 DAR 3:2], 91 kb/s, 15 fps, 15 tbr, 15360 tbn, 30 tbc (default)
    Metadata:
      handler_name : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 6 channels, fltp, 345 kb/s (default)
    Metadata:
      handler_name : SoundHandler
[STREAM]
index=0
codec_name=h264
codec_long_name=H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
profile=High
codec_type=video
codec_time_base=1/30
codec_tag_string=avc1
codec_tag=0x31637661
width=360
height=240
coded_width=368
coded_height=240
has_b_frames=2
sample_aspect_ratio=1:1
display_aspect_ratio=3:2
pix_fmt=yuv420p
level=12
color_range=unknown
color_space=unknown
color_transfer=unknown
color_primaries=unknown
chroma_location=left
field_order=unknown
timecode=N/A
refs=1
is_avc=true
nal_length_size=4
id=N/A
r_frame_rate=15/1
avg_frame_rate=15/1
time_base=1/15360
start_pts=0
start_time=0.000000
duration_ts=473088
duration=30.800000
bit_rate=91261
max_bit_rate=N/A
bits_per_raw_sample=8
nb_frames=462
nb_read_frames=N/A
nb_read_packets=N/A
DISPOSITION:default=1
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
TAG:language=und
TAG:handler_name=VideoHandler
[/STREAM]
[STREAM]
index=1
codec_name=aac
codec_long_name=AAC (Advanced Audio Coding)
profile=LC
codec_type=audio
codec_time_base=1/48000
codec_tag_string=mp4a
codec_tag=0x6134706d
sample_fmt=fltp
sample_rate=48000
channels=6
channel_layout=unknown
bits_per_sample=0
id=N/A
r_frame_rate=0/0
avg_frame_rate=0/0
time_base=1/48000
start_pts=0
start_time=0.000000
duration_ts=1474560
duration=30.720000
bit_rate=345780
max_bit_rate=394000
bits_per_raw_sample=N/A
nb_frames=1441
nb_read_frames=N/A
nb_read_packets=N/A
DISPOSITION:default=1
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
TAG:language=eng
TAG:handler_name=SoundHandler
[/STREAM]
ffprobe output for the file converted with FFmpeg 3.4.2:
% ffprobe -show_streams MKV.mp4
ffprobe version 4.0.2 Copyright (c) 2007-2018 the FFmpeg developers
  built with gcc 7.3.1 (GCC) 20180722
  configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-bzlib --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth
  libavutil 56. 14.100 / 56. 14.100
  libavcodec 58. 18.100 / 58. 18.100
  libavformat 58. 12.100 / 58. 12.100
  libavdevice 58. 3.100 / 58. 3.100
  libavfilter 7. 16.100 / 7. 16.100
  libswscale 5. 1.100 / 5. 1.100
  libswresample 3. 1.100 / 3. 1.100
  libpostproc 55. 1.100 / 55. 1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'MKV.mp4':
  Metadata:
    major_brand : isom
    minor_version : 512
    compatible_brands: isomiso2avc1mp41
    encoder : Lavf57.83.100
  Duration: 00:00:30.80, start: 0.000000, bitrate: 436 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 360x240 [SAR 1:1 DAR 3:2], 91 kb/s, 15 fps, 15 tbr, 15360 tbn, 30 tbc (default)
    Metadata:
      handler_name : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 341 kb/s (default)
    Metadata:
      handler_name : SoundHandler
[STREAM]
index=0
codec_name=h264
codec_long_name=H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
profile=High
codec_type=video
codec_time_base=1/30
codec_tag_string=avc1
codec_tag=0x31637661
width=360
height=240
coded_width=368
coded_height=240
has_b_frames=2
sample_aspect_ratio=1:1
display_aspect_ratio=3:2
pix_fmt=yuv420p
level=12
color_range=unknown
color_space=unknown
color_transfer=unknown
color_primaries=unknown
chroma_location=left
field_order=unknown
timecode=N/A
refs=1
is_avc=true
nal_length_size=4
id=N/A
r_frame_rate=15/1
avg_frame_rate=15/1
time_base=1/15360
start_pts=0
start_time=0.000000
duration_ts=473088
duration=30.800000
bit_rate=91261
max_bit_rate=N/A
bits_per_raw_sample=8
nb_frames=462
nb_read_frames=N/A
nb_read_packets=N/A
DISPOSITION:default=1
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
TAG:language=und
TAG:handler_name=VideoHandler
[/STREAM]
[STREAM]
index=1
codec_name=aac
codec_long_name=AAC (Advanced Audio Coding)
profile=LC
codec_type=audio
codec_time_base=1/48000
codec_tag_string=mp4a
codec_tag=0x6134706d
sample_fmt=fltp
sample_rate=48000
channels=6
channel_layout=5.1
bits_per_sample=0
id=N/A
r_frame_rate=0/0
avg_frame_rate=0/0
time_base=1/48000
start_pts=0
start_time=0.000000
duration_ts=1474560
duration=30.720000
bit_rate=341279
max_bit_rate=341279
bits_per_raw_sample=N/A
nb_frames=1441
nb_read_frames=N/A
nb_read_packets=N/A
DISPOSITION:default=1
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
TAG:language=eng
TAG:handler_name=SoundHandler
[/STREAM]
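The only field that differs in a way that matters between the two ffprobe dumps is channel_layout on the audio stream. As a rough sketch (not part of the original report), the comparison could be automated by parsing saved `-show_streams` text; the function name `audio_channel_layouts` is hypothetical, and the parser assumes the default one-key-per-line output format.

```python
def audio_channel_layouts(show_streams_text):
    """Map audio stream index -> channel_layout from `ffprobe -show_streams` text.

    Assumes the default output format: [STREAM] ... [/STREAM] blocks with
    one key=value pair per line.
    """
    layouts = {}
    current = None  # dict of key/value pairs for the stream being read
    for raw in show_streams_text.splitlines():
        line = raw.strip()
        if line == "[STREAM]":
            current = {}
        elif line == "[/STREAM]":
            if current is not None and current.get("codec_type") == "audio":
                layouts[int(current["index"])] = current.get("channel_layout", "unknown")
            current = None
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key] = value
    return layouts


# Abbreviated sample modelled on the FFmpeg 4.0 dump above.
sample = """[STREAM]
index=0
codec_type=video
[/STREAM]
[STREAM]
index=1
codec_type=audio
channels=6
channel_layout=unknown
[/STREAM]
"""
print(audio_channel_layouts(sample))  # {1: 'unknown'}
```

Running the same function on the 3.4.2 dump would be expected to report 5.1 for stream 1, which is the difference this ticket is about.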
This looks like a defect because it worked in version 3.4.2. I'm waiting for a response. Thanks.
Change History (14)
comment:1 by , 6 years ago
comment:2 by , 6 years ago
Component: | ffmpeg → avcodec |
---|---|
Keywords: | aac added; ffmpeg 5.1 surround audio removed |
Priority: | normal → important |
Status: | new → open |
Version: | unspecified → git-master |
Seems to be a regression since fc9dcfe7d50d7f1b38fb287b4d92b5f3fc4bfb05
follow-up: 13 comment:3 by , 6 years ago
Resolution: | → invalid |
---|---|
Status: | open → closed |
The issue is that the file signals a layout of 5.1(side), but the AAC spec only supports 5.1(rear channels instead of side). Hence, to express that, the encoder uses a not-so-universally supported feature (the PCE mentioned in the log line "Using a PCE to encode channel layout") to signal the non-standard layout. Firefox 62.0 here supports decoding it just fine, so maybe whatever you're using is old.
You can convert between 5.1(side) and 5.1(rear) by specifying -channel_layout "5.1", which will make the encoder use the standard, well-understood path.
comment:4 by , 6 years ago
But why does it work fine in ffmpeg 3.4.2? Were there any breaking changes in ffmpeg 4.0 related to this?
comment:7 by , 6 years ago
Additional question: what should I do with ffmpeg 4 if the input can contain videos with different channel layouts? In 3.4.2 ffmpeg decided what to do, but in 4.0 this issue exists.
comment:8 by , 6 years ago
Resolution: | invalid |
---|---|
Status: | closed → reopened |
I reopened this issue because closed issues may not be monitored for further responses. Thanks!
comment:9 by , 6 years ago
My general question is how I should convert videos with different channel layouts with ffmpeg 4 so that their layouts are preserved.
comment:10 by , 6 years ago
The default channel layouts supported by the aac encoder are:
mono, stereo, 3.0, 4.0, 5.0 and 5.1
(FC, FL+FR, FL+FR+FC, FL+FR+FC+BC, FL+FR+FC+BL+BR, FL+FR+FC+BL+BR+LFE).
If your layout is one of those, it will be encoded as before.
If your layout is not one of them but is close, for instance 5.1(side), you can do as atomnuker said:
ffmpeg -channel_layout 5.1 -i input etc...
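For inputs with mixed layouts, the rule described in this comment could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper `build_ffmpeg_args` and the `CLOSE_LAYOUTS` table are hypothetical, only the 5.1(side) case comes from this ticket, and the option is placed before `-i` simply because that is how the command above is written.

```python
# Layout names listed above as the aac encoder's defaults.
AAC_DEFAULT_LAYOUTS = {"mono", "stereo", "3.0", "4.0", "5.0", "5.1"}

# Hypothetical table of "close" layouts and the default they are relabelled as.
# Only the 5.1(side) entry is taken from the ticket.
CLOSE_LAYOUTS = {"5.1(side)": "5.1"}


def build_ffmpeg_args(input_path, output_path, input_layout):
    """Build an ffmpeg argument list, adding -channel_layout only when needed."""
    args = ["ffmpeg"]
    target = CLOSE_LAYOUTS.get(input_layout)
    if input_layout not in AAC_DEFAULT_LAYOUTS and target is not None:
        # Placement before -i mirrors the command quoted in this comment.
        args += ["-channel_layout", target]
    args += ["-i", input_path, output_path]
    return args


print(build_ffmpeg_args("input.mkv", "output.mp4", "5.1(side)"))
```

Layouts that are neither defaults nor listed in the table are passed through unchanged, so the encoder's own handling still applies to them.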
comment:11 by , 6 years ago
Resolution: | → invalid |
---|---|
Status: | reopened → closed |
comment:12 by , 6 years ago
Keywords: | regression added |
---|---|
Resolution: | invalid |
Status: | closed → reopened |
The additional option should of course not be required, even more so if this is a regression.
comment:13 by , 4 years ago
Replying to Rostislav Pehlivanov:
The issue is that the file signals a layout of 5.1(side), but the AAC spec only supports 5.1(rear channels instead of side).
Yes, but it looks like 7.1 does support side? See https://git.1f0.de/gitweb/?p=ffmpeg.git;a=commit;h=71d762aaa1d96c1a6c4cc83b5a0c418b53028efe;js=1
comment:14 by , 4 years ago
I've only just bumped into this issue and found quite a few closed tickets on the subject, but as this ticket has been active recently I thought I'd add my two cents' worth here.
Sorry if I'm getting any of this wrong, but I can't imagine how it's not the problem.
I've always understood that for AAC using a centre channel, the golden rule is to make the first element a single channel element, and to end with a single channel element, if it exists. I think the LFE channel is a specially labelled SCE.
In other words, the channel order begins with the centre channel, then the stereo channels are encoded in pairs from front to back, and the final SCE is the back centre channel (if it exists). Whether the surround channels are "back" or "side" in wave file channel order shouldn't matter. They're encoded as surround channels. If there's only a pair of them it's 5.1ch audio, if there's two pairs it's 7.1ch.
Best as I can tell, ffmpeg is storing the AAC elements in wave file channel order, rather than the AAC order of FC, FL, FR, Ls, Rs, LFE.
https://wiki.multimedia.cx/index.php/MPEG-4_Audio#Channel_Configurations
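To make the difference between the two orders concrete, here is a minimal sketch of the reordering for 5.1 audio. It assumes the usual wave file order FL, FR, FC, LFE, BL, BR; the function name is hypothetical, and the surround pair is labelled BL/BR here even though, as argued above, it could equally be a side pair.

```python
# Wave file channel order for 5.1 (assumed: FL, FR, FC, LFE, then the surround pair).
WAV_ORDER_5_1 = ["FL", "FR", "FC", "LFE", "BL", "BR"]

# AAC order as described above: centre first, stereo pairs front to back, LFE last.
AAC_ORDER_5_1 = ["FC", "FL", "FR", "BL", "BR", "LFE"]


def wav_to_aac_order(channels, wav_order=WAV_ORDER_5_1, aac_order=AAC_ORDER_5_1):
    """Reorder a per-channel list from wave file order into AAC order."""
    if len(channels) != len(wav_order):
        raise ValueError("expected %d channels" % len(wav_order))
    by_label = dict(zip(wav_order, channels))
    return [by_label[label] for label in aac_order]


print(wav_to_aac_order(["fl", "fr", "fc", "lfe", "bl", "br"]))
# ['fc', 'fl', 'fr', 'bl', 'br', 'lfe']
```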
For AAC there's "num_front_channel_elements" for specifying the number of front speakers, but the specification seems fairly determined about the order of the elements.
https://web.archive.org/web/20110713115817/http://jongyeob.com/moniwiki/pds/upload/13818-7.pdf
8.5.1 Data Elements
num_front_channel_elements
The number of audio syntactic elements in the front channels, front center to back center, symmetrically by left and right, or alternating by left and right in the case of single channel elements (Table 25).
This is the other info I dug out of an AAC stream with MediaInfo, so in the order they appear to be stored (ffmpeg encoded).
channel_configuration: 0 (0x0) - (4 bits) -
num_front_channel_elements: 2 (0x2) - (4 bits) - Front: FL FR FC
num_side_channel_elements: 1 (0x1) - (4 bits) - LFE becomes Side: C
num_back_channel_elements: 1 (0x1) - (4 bits) - Back: L R
num_lfe_channel_elements: 0 (0x0) - (2 bits)
The worst part is, I don't think you need to use a PCE for 5.1ch audio if you follow the element order rules. Even if you don't, it'd probably be assumed to be the front centre channel if there's only one SCE.
8.5.2.3 Implicit channel mapping
1) Any number of SCE's may appear (as long as permitted by other constraints, for example profile). If this number of SCE's is odd, then the first SCE represents the front center channel, and the other SCE's represent L/R pairs of channels, proceeding from center front outwards and back to center rear.
If the number of SCE's is even, then the SCE's are assigned as pairs as center-front L/R, in pairs proceeding out and back from center front toward center back.
Or you can use Channel_Configuration to specify a channel layout, and not bother specifying the number of front or side channels etc, although from what I can tell, it has to be used with the elements in the correct order. That's all QAAC seems to do.
channel_configuration: 6 (0x6) - (4 bits) - Front: L C R, Side: L R, LFE
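As an illustration of what channel_configuration implies, the values 1 to 6 can be written as a lookup table, following the channel configuration table on the multimedia.cx page linked earlier. This is a sketch, not ffmpeg's or QAAC's code: the names are hypothetical, the surround pair is labelled BL/BR as in comment 10 (MediaInfo calls the same pair "Side" in the line above), and value 7 is left out because its 7.1 layout is exactly what is in dispute.

```python
# channel_configuration -> implied channel order, for values 1..6 only.
CHANNEL_CONFIGURATIONS = {
    1: ["FC"],
    2: ["FL", "FR"],
    3: ["FC", "FL", "FR"],
    4: ["FC", "FL", "FR", "BC"],
    5: ["FC", "FL", "FR", "BL", "BR"],
    6: ["FC", "FL", "FR", "BL", "BR", "LFE"],
}


def channels_for_configuration(channel_configuration):
    """Return the implied channel order, or None when a PCE must describe it.

    A value of 0 means the layout is not implied by the index and has to be
    signalled some other way (for example by a PCE, as in the dump above).
    """
    if channel_configuration == 0:
        return None
    if channel_configuration not in CHANNEL_CONFIGURATIONS:
        raise ValueError("configuration %r not covered by this sketch" % channel_configuration)
    return CHANNEL_CONFIGURATIONS[channel_configuration]


print(channels_for_configuration(6))  # ['FC', 'FL', 'FR', 'BL', 'BR', 'LFE']
print(channels_for_configuration(0))  # None
```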
Unfortunately, the implicit channel order for 7.1ch audio is the cinema layout (where there are extra stereo channels in the front rather than extra surround channels). Because ffmpeg encoded 7.1ch as 7.1ch (front) and then decoded it as 7.1ch (wide), and QAAC began using PCEs for 7.1ch a long time ago, I've not kept up with any changes to the way ffmpeg does it. I've no idea if it's using PCEs now, or doing so correctly, but it seems to me this is something it should be doing according to the spec.
MKV video to reproduce https://drive.google.com/open?id=1wLWnNY5HrxflHZq4QOhZaqu6S8QpP_tF