Opened 6 months ago

Last modified 6 months ago

#9637 open defect

Color matrix behaviour in colr box

Reported by: Ulysse Dansin
Owned by:
Priority: normal
Component: ffmpeg
Version: git-master
Keywords: x265 hdr color_matrix colr_box
Cc: Ulysse Dansin
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary:

When an HDR10 MP4 file whose color matrix is BT2020c (in both the colr box and the NAL_SPS) is re-encoded with x265 to another MP4 file with the BT2020nc color matrix, the color matrix written to the output colr box is not what we expect.

Here are details of the input:

>> mediainfo input.mp4
General
Complete name                            : input.mp4
Format                                   : MPEG-4
Format profile                           : Base Media
Codec ID                                 : isom (isom/iso2/mp41)
File size                                : 40.8 MiB
Duration                                 : 5 s 131 ms
Overall bit rate                         : 66.8 Mb/s
Writing application                      : Lavf59.4.101

Video
ID                                       : 1
Format                                   : HEVC
Format/Info                              : High Efficiency Video Coding
Format profile                           : Main 10@L5@High
HDR format                               : SMPTE ST 2086, HDR10 compatible
Codec ID                                 : hev1
Codec ID/Info                            : High Efficiency Video Coding
Duration                                 : 5 s 131 ms
Source duration                          : 6 s 798 ms
Bit rate                                 : 50.4 Mb/s
Width                                    : 3 840 pixels
Height                                   : 2 076 pixels
Display aspect ratio                     : 1.85:1
Frame rate mode                          : Constant
Frame rate                               : 23.976 (23976/1000) FPS
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 10 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.264
Stream size                              : 30.9 MiB (76%)
Source stream size                       : 40.8 MiB (100%)
Writing library                          : x265 3.1:[Linux][GCC 8.3.0][64 bit] 10bit
Encoding settings                        : cpuid=1111039 / frame-threads=2 / wpp / no-pmode / no-pme / no-psnr / no-ssim / log-level=2 / input-csp=1 / input-res=3840x2076 / interlace=0 / total-frames=0 / level-idc=0 / high-tier=1 / uhd-bd=0 / ref=2 / no-allow-non-conformance / repeat-headers / annexb / no-aud / no-hrd / info / hash=0 / no-temporal-layers / open-gop / min-keyint=23 / keyint=250 / gop-lookahead=0 / bframes=4 / b-adapt=0 / b-pyramid / bframe-bias=0 / rc-lookahead=15 / lookahead-slices=8 / scenecut=40 / radl=0 / no-splice / no-intra-refresh / ctu=64 / min-cu-size=8 / no-rect / no-amp / max-tu-size=32 / tu-inter-depth=1 / tu-intra-depth=1 / limit-tu=0 / rdoq-level=0 / dynamic-rd=0.00 / no-ssim-rd / signhide / no-tskip / nr-intra=0 / nr-inter=0 / no-constrained-intra / strong-intra-smoothing / max-merge=2 / limit-refs=3 / no-limit-modes / me=1 / subme=2 / merange=57 / temporal-mvp / weightp / no-weightb / no-analyze-src-pics / deblock=0:0 / sao / no-sao-non-deblock / rd=2 / no-early-skip / rskip / fast-intra / no-tskip-fast / no-cu-lossless / no-b-intra / no-splitrd-skip / rdpenalty=0 / psy-rd=2.00 / psy-rdoq=0.00 / no-rd-refine / no-lossless / cbqpoffs=0 / crqpoffs=0 / rc=crf / crf=15.0 / qcomp=0.60 / qpstep=4 / stats-write=0 / stats-read=0 / vbv-maxrate=50000 / vbv-bufsize=250000 / vbv-init=0.9 / crf-max=0.0 / crf-min=0.0 / ipratio=1.40 / pbratio=1.30 / aq-mode=2 / aq-strength=1.00 / cutree / zone-count=0 / no-strict-cbr / qg-size=32 / no-rc-grain / qpmax=69 / qpmin=0 / no-const-vbv / sar=1 / overscan=0 / videoformat=5 / range=0 / colorprim=9 / transfer=16 / colormatrix=10 / chromaloc=0 / display-window=0 / master-display=G(13250,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(10000000,50)cll=987,137 / min-luma=0 / max-luma=1023 / log2-max-poc-lsb=8 / vui-timing-info / vui-hrd-info / slices=1 / no-opt-qp-pps / no-opt-ref-list-length-pps / no-multi-pass-opt-rps / scenecut-bias=0.05 / no-opt-cu-delta-qp / no-aq-motion / hdr / no-hdr-opt / no-dhdr10-opt 
/ no-idr-recovery-sei / analysis-reuse-level=5 / scale-factor=0 / refine-intra=0 / refine-inter=0 / refine-mv=0 / refine-ctu-distortion=0 / no-limit-sao / ctu-info=0 / no-lowpass-dct / refine-analysis-type=-1627389952 / copy-pic=1 / max-ausize-factor=1.0 / no-dynamic-refine / no-single-sei / no-hevc-aq / no-svt / no-field / qp-adaptation-range=1.00
Color range                              : Limited
Color primaries                          : BT.2020
Transfer characteristics                 : PQ
Matrix coefficients                      : BT.2020 constant
Mastering display color primaries        : Display P3
Mastering display luminance              : min: 0.0050 cd/m2, max: 1000 cd/m2
Maximum Content Light Level              : 987 cd/m2
Maximum Frame-Average Light Level        : 137 cd/m2
mdhd_Duration                            : 5130
Codec configuration box                  : hvcC

If we inspect the colr box with hexedit:

	63  6F 6C 72 6E  63 6C 78 00  09 00 10 00  0A   .....fiel......colrnclx.

The last field of the colr box is 0x0A = 10, so the input color matrix is BT2020c.
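For readers who want to decode the dump without counting bytes by hand, here is a minimal Python sketch. It assumes the dump starts at the `colr` fourcc and that the `nclx` payload holds three big-endian 16-bit codes (primaries, transfer, matrix). The helper name `decode_nclx` and the small name table are ours, for illustration only.

```python
import struct

# Names for the two matrix codes discussed in this ticket (illustrative subset).
MATRIX_NAMES = {9: "BT.2020 non-constant", 10: "BT.2020 constant"}

def decode_nclx(dump: str) -> dict:
    """Decode a hex dump that starts at the 'colr' fourcc of an nclx colr box."""
    raw = bytes.fromhex("".join(dump.split()))  # tolerate irregular spacing
    if raw[0:4] != b"colr" or raw[4:8] != b"nclx":
        raise ValueError("dump does not start with 'colr' + 'nclx'")
    if len(raw) < 14:
        raise ValueError("dump is too short for three 16-bit codes")
    # Three big-endian unsigned 16-bit fields follow the two fourccs.
    primaries, transfer, matrix = struct.unpack(">HHH", raw[8:14])
    return {
        "primaries": primaries,
        "transfer": transfer,
        "matrix": matrix,
        "matrix_name": MATRIX_NAMES.get(matrix, "unknown"),
    }

# Bytes taken from the hexedit dump of input.mp4 above.
print(decode_nclx("63 6F 6C 72 6E 63 6C 78 00 09 00 10 00 0A"))
```

The same call on a dump ending in `00 09` reports matrix code 9 instead.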

We encode with x265 and set colormatrix=bt2020nc in the x265 params.

% ffmpeg -i input.mp4 -filter_complex "scale=3840x1604,setsar=1/1" -c:v libx265 -preset fast -x265-params "crf=15:vbv-maxrate=50000:vbv-bufsize=250000:colorprim=bt2020:transfer=smpte2084:colormatrix=bt2020nc:master-display=G(13250,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(40000000,50):max-cll=1261,512:ref=2" -pix_fmt yuv420p10le -vsync 1 -an -map_chapters -1 -map_metadata:g -1:g -map_metadata:s:v -1:g -map_metadata:s:a -1:g -movflags +faststart -f mp4 out.mp4

The output is the following:

>> mediainfo out.mp4
General
Complete name                            : out.mp4
Format                                   : MPEG-4
Format profile                           : Base Media
Codec ID                                 : isom (isom/iso2/mp41)
File size                                : 12.1 MiB
Duration                                 : 5 s 131 ms
Overall bit rate                         : 19.7 Mb/s
Writing application                      : Lavf59.4.101

Video
ID                                       : 1
Format                                   : HEVC
Format/Info                              : High Efficiency Video Coding
Format profile                           : Main 10@L5@High
HDR format                               : SMPTE ST 2086, HDR10 compatible
Codec ID                                 : hev1
Codec ID/Info                            : High Efficiency Video Coding
Duration                                 : 5 s 131 ms
Bit rate                                 : 19.7 Mb/s
Width                                    : 3 840 pixels
Height                                   : 1 604 pixels
Display aspect ratio                     : 2.40:1
Frame rate mode                          : Constant
Frame rate                               : 23.976 (23976/1000) FPS
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 10 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.134
Stream size                              : 12.1 MiB (100%)
Writing library                          : x265 3.4+31-6722fce1f:[Mac OS X][clang 12.0.0][64 bit] 10bit
Encoding settings                        : cpuid=1111039 / frame-threads=3 / wpp / no-pmode / no-pme / no-psnr / no-ssim / log-level=2 / input-csp=1 / input-res=3840x1604 / interlace=0 / total-frames=0 / level-idc=0 / high-tier=1 / uhd-bd=0 / ref=2 / no-allow-non-conformance / repeat-headers / annexb / no-aud / no-hrd / info / hash=0 / no-temporal-layers / open-gop / min-keyint=23 / keyint=250 / gop-lookahead=0 / bframes=4 / b-adapt=0 / b-pyramid / bframe-bias=0 / rc-lookahead=15 / lookahead-slices=8 / scenecut=40 / hist-scenecut=0 / radl=0 / no-splice / no-intra-refresh / ctu=64 / min-cu-size=8 / no-rect / no-amp / max-tu-size=32 / tu-inter-depth=1 / tu-intra-depth=1 / limit-tu=0 / rdoq-level=0 / dynamic-rd=0.00 / no-ssim-rd / signhide / no-tskip / nr-intra=0 / nr-inter=0 / no-constrained-intra / strong-intra-smoothing / max-merge=2 / limit-refs=3 / no-limit-modes / me=1 / subme=2 / merange=57 / temporal-mvp / no-frame-dup / no-hme / weightp / no-weightb / no-analyze-src-pics / deblock=0:0 / sao / no-sao-non-deblock / rd=2 / selective-sao=4 / no-early-skip / rskip / fast-intra / no-tskip-fast / no-cu-lossless / no-b-intra / no-splitrd-skip / rdpenalty=0 / psy-rd=2.00 / psy-rdoq=0.00 / no-rd-refine / no-lossless / cbqpoffs=0 / crqpoffs=0 / rc=crf / crf=15.0 / qcomp=0.60 / qpstep=4 / stats-write=0 / stats-read=0 / vbv-maxrate=50000 / vbv-bufsize=250000 / vbv-init=0.9 / min-vbv-fullness=50.0 / max-vbv-fullness=80.0 / crf-max=0.0 / crf-min=0.0 / ipratio=1.40 / pbratio=1.30 / aq-mode=2 / aq-strength=1.00 / cutree / zone-count=0 / no-strict-cbr / qg-size=32 / no-rc-grain / qpmax=69 / qpmin=0 / no-const-vbv / sar=1 / overscan=0 / videoformat=5 / range=0 / colorprim=9 / transfer=16 / colormatrix=9 / chromaloc=0 / display-window=0 / master-display=G(13250,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(40000000,50) / cll=1261,512 / min-luma=0 / max-luma=1023 / log2-max-poc-lsb=8 / vui-timing-info / vui-hrd-info / slices=1 / no-opt-qp-pps / no-opt-ref-list-length-pps / 
no-multi-pass-opt-rps / scenecut-bias=0.05 / hist-threshold=0.03 / no-opt-cu-delta-qp / no-aq-motion / hdr10 / no-hdr10-opt / no-dhdr10-opt / no-idr-recovery-sei / analysis-reuse-level=0 / analysis-save-reuse-level=0 / analysis-load-reuse-level=0 / scale-factor=0 / refine-intra=0 / refine-inter=0 / refine-mv=1 / refine-ctu-distortion=0 / no-limit-sao / ctu-info=0 / no-lowpass-dct / refine-analysis-type=0 / copy-pic=1 / max-ausize-factor=1.0 / no-dynamic-refine / no-single-sei / no-hevc-aq / no-svt / no-field / qp-adaptation-range=1.00 / scenecut-aware-qp=0conformance-window-offsets / right=0 / bottom=0 / decoder-max-rate=0 / no-vbv-live-multi-pass
Color range                              : Limited
Color primaries                          : BT.2020
Transfer characteristics                 : PQ
Matrix coefficients                      : BT.2020 constant
matrix_coefficients_Original             : BT.2020 non-constant
Mastering display color primaries        : Display P3
Mastering display luminance              : min: 0.0050 cd/m2, max: 4000 cd/m2
Maximum Content Light Level              : 1261 cd/m2
Maximum Frame-Average Light Level        : 512 cd/m2
Codec configuration box                  : hvcC

If we check the codec NAL_SPS, we can see that matrix coeffs is 9 as expected (see the screenshot https://rak.box.com/s/5sg0oq5d6llub5iv2etuzjv4chmikv0l).

However, the mp4 colr box is still 10 (bt2020c):

	63 6F 6C 72  6E 63 6C 78  00 09 00  10 00 0A     .....colrnclx....

We were expecting the colr box to have been modified with the same information as the codec.
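A check of this kind can be scripted. The following is a rough Python sketch, not a real MP4 parser: it simply searches the file bytes for the first `colrnclx` sequence instead of walking the box tree, and compares the matrix code found there with the value expected from the NAL_SPS (which has to be obtained separately). The function names `find_nclx` and `matrix_mismatch` are hypothetical.

```python
import struct
from typing import Optional, Tuple

def find_nclx(data: bytes) -> Optional[Tuple[int, int, int]]:
    """Return (primaries, transfer, matrix) of the first colr/nclx box, or None."""
    pos = data.find(b"colrnclx")  # brute-force search, assumes no false match
    if pos < 0 or pos + 14 > len(data):
        return None
    return struct.unpack(">HHH", data[pos + 8:pos + 14])

def matrix_mismatch(data: bytes, sps_matrix: int) -> bool:
    """True when a colr/nclx box exists and its matrix code differs from the SPS one."""
    nclx = find_nclx(data)
    return nclx is not None and nclx[2] != sps_matrix

# Synthetic colr box for demonstration: size, 'colr', 'nclx', three codes, range byte.
sample = struct.pack(">I4s4sHHHB", 19, b"colr", b"nclx", 9, 16, 10, 0)
print(find_nclx(sample))
print(matrix_mismatch(sample, sps_matrix=9))
```

On a real file one would pass `open("out.mp4", "rb").read()` as `data`; for large files a proper box walker would be a better choice than reading everything into memory.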

In addition, we verified that a direct codec copy does give the behaviour we expect.
For example, if we take out.mp4, which has bt2020nc in the NAL_SPS and bt2020c in the colr box, and perform a codec copy:

ffmpeg -i out.mp4 -c copy out_1.mp4

We are getting the following mediainfo and colr box:

General
Complete name                            : out_1.mp4
Format                                   : MPEG-4
Format profile                           : Base Media
Codec ID                                 : isom (isom/iso2/mp41)
File size                                : 12.1 MiB
Duration                                 : 5 s 131 ms
Overall bit rate                         : 19.7 Mb/s
Writing application                      : Lavf59.4.101

Video
ID                                       : 1
Format                                   : HEVC
Format/Info                              : High Efficiency Video Coding
Format profile                           : Main 10@L5@High
HDR format                               : SMPTE ST 2086, HDR10 compatible
Codec ID                                 : hev1
Codec ID/Info                            : High Efficiency Video Coding
Duration                                 : 5 s 131 ms
Bit rate                                 : 19.7 Mb/s
Width                                    : 3 840 pixels
Height                                   : 1 604 pixels
Display aspect ratio                     : 2.40:1
Frame rate mode                          : Constant
Frame rate                               : 23.976 (23976/1000) FPS
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 10 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.134
Stream size                              : 12.1 MiB (100%)
Writing library                          : x265 3.4+31-6722fce1f:[Mac OS X][clang 12.0.0][64 bit] 10bit
Encoding settings                        : cpuid=1111039 / frame-threads=3 / wpp / no-pmode / no-pme / no-psnr / no-ssim / log-level=2 / input-csp=1 / input-res=3840x1604 / interlace=0 / total-frames=0 / level-idc=0 / high-tier=1 / uhd-bd=0 / ref=2 / no-allow-non-conformance / repeat-headers / annexb / no-aud / no-hrd / info / hash=0 / no-temporal-layers / open-gop / min-keyint=23 / keyint=250 / gop-lookahead=0 / bframes=4 / b-adapt=0 / b-pyramid / bframe-bias=0 / rc-lookahead=15 / lookahead-slices=8 / scenecut=40 / hist-scenecut=0 / radl=0 / no-splice / no-intra-refresh / ctu=64 / min-cu-size=8 / no-rect / no-amp / max-tu-size=32 / tu-inter-depth=1 / tu-intra-depth=1 / limit-tu=0 / rdoq-level=0 / dynamic-rd=0.00 / no-ssim-rd / signhide / no-tskip / nr-intra=0 / nr-inter=0 / no-constrained-intra / strong-intra-smoothing / max-merge=2 / limit-refs=3 / no-limit-modes / me=1 / subme=2 / merange=57 / temporal-mvp / no-frame-dup / no-hme / weightp / no-weightb / no-analyze-src-pics / deblock=0:0 / sao / no-sao-non-deblock / rd=2 / selective-sao=4 / no-early-skip / rskip / fast-intra / no-tskip-fast / no-cu-lossless / no-b-intra / no-splitrd-skip / rdpenalty=0 / psy-rd=2.00 / psy-rdoq=0.00 / no-rd-refine / no-lossless / cbqpoffs=0 / crqpoffs=0 / rc=crf / crf=15.0 / qcomp=0.60 / qpstep=4 / stats-write=0 / stats-read=0 / vbv-maxrate=50000 / vbv-bufsize=250000 / vbv-init=0.9 / min-vbv-fullness=50.0 / max-vbv-fullness=80.0 / crf-max=0.0 / crf-min=0.0 / ipratio=1.40 / pbratio=1.30 / aq-mode=2 / aq-strength=1.00 / cutree / zone-count=0 / no-strict-cbr / qg-size=32 / no-rc-grain / qpmax=69 / qpmin=0 / no-const-vbv / sar=1 / overscan=0 / videoformat=5 / range=0 / colorprim=9 / transfer=16 / colormatrix=9 / chromaloc=0 / display-window=0 / master-display=G(13250,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(40000000,50) / cll=1261,512 / min-luma=0 / max-luma=1023 / log2-max-poc-lsb=8 / vui-timing-info / vui-hrd-info / slices=1 / no-opt-qp-pps / no-opt-ref-list-length-pps / 
no-multi-pass-opt-rps / scenecut-bias=0.05 / hist-threshold=0.03 / no-opt-cu-delta-qp / no-aq-motion / hdr10 / no-hdr10-opt / no-dhdr10-opt / no-idr-recovery-sei / analysis-reuse-level=0 / analysis-save-reuse-level=0 / analysis-load-reuse-level=0 / scale-factor=0 / refine-intra=0 / refine-inter=0 / refine-mv=1 / refine-ctu-distortion=0 / no-limit-sao / ctu-info=0 / no-lowpass-dct / refine-analysis-type=0 / copy-pic=1 / max-ausize-factor=1.0 / no-dynamic-refine / no-single-sei / no-hevc-aq / no-svt / no-field / qp-adaptation-range=1.00 / scenecut-aware-qp=0conformance-window-offsets / right=0 / bottom=0 / decoder-max-rate=0 / no-vbv-live-multi-pass
Color range                              : Limited
Color primaries                          : BT.2020
Transfer characteristics                 : PQ
Matrix coefficients                      : BT.2020 non-constant
Mastering display color primaries        : Display P3
Mastering display luminance              : min: 0.0050 cd/m2, max: 4000 cd/m2
Maximum Content Light Level              : 1261 cd/m2
Maximum Frame-Average Light Level        : 512 cd/m2
Codec configuration box                  : hvcC
63 6F 6C  72 6E 63 6C  78 00 09 00  10 00 09          colrnclx....

Now the colr box is 9, meaning BT2020nc.

As you can see, a codec copy corrects the colr box using the NAL_SPS matrix coefficients, whereas an encode does not. Is this the expected behavior?

Thanks so much for your help!

Ulysse DANSIN & Jordi SOLSONA

Files:
https://rak.box.com/s/p14wrweghbg6ar9ttz4pgpexakbv4r9u

Change History (7)

comment:1 by Balling, 6 months ago

Status: new → open

Your original file is not actually encoded with BT.2020 CL. That is why it happens. No, PQ (PQ is HDR and HDR is PQ) does not support CL. Use ICtCp, that has both CL and CI.

Of course I agree that the nclx atom should not use CL and should not preserve the "original" values like this. There is a prefix SEI for that, after all.

Also, mediainfo assumes that the container values should override the VUI in the SPS data (AVIF mandates it); that is not actually the case in ffmpeg.

comment:2 by Balling, 6 months ago

But the whole idea here is wrong. You cannot use x265 options for this. You should actually use ffmpeg's global options; then it will work. Also, of course, ffmpeg does not support CL, not without zimg at least.

comment:3 by Balling, 6 months ago

Exiftool will show Color Representation : nclx 9 16 9

ffmpeg -i "20200905-003746_TURKSAT_8K1.mkv" -filter_complex "scale=3840x1604,setsar=1/1" -c:v libx265 -preset fast -x265-params "crf=15:vbv-maxrate=50000:vbv-bufsize=250000:master-display=G(13250,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(40000000,50):max-cll=1261,512:ref=2" -pix_fmt yuv420p10le -vsync 1 -an -map_chapters -1 -map_metadata:g -1:g -map_metadata:s:v -1:g -map_metadata:s:a -1:g -color_primaries bt2020 -color_trc smpte2084 -colorspace bt2020nc -movflags +faststart -f mp4 .\out12121212121_nostuff.mp4

Okay? You are using it wrong.

in reply to:  3 comment:4 by Ulysse Dansin, 6 months ago

Hello Balling,
Thank you for the quick response.

We agree that setting -colorspace will fix the problem. Our question is why this information cannot be read directly from the NAL SPS after encoding.

This would simplify the command while at the same time avoiding color matrix information mismatch between codec and container...
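In the meantime, one way to keep the two in sync is to generate both sets of options from a single place. A small Python sketch of that idea follows. The helper `color_args` is hypothetical, and it assumes the same value names are accepted by the x265 params and by the ffmpeg global options, which holds for the three values used in this ticket (bt2020, smpte2084, bt2020nc) but is not guaranteed for every value.

```python
from typing import List

def color_args(primaries: str, transfer: str, matrix: str) -> List[str]:
    """Build matching x265 and ffmpeg colour options from one set of values."""
    # Values written by the encoder into the bitstream (VUI in the SPS).
    x265 = f"colorprim={primaries}:transfer={transfer}:colormatrix={matrix}"
    return [
        "-x265-params", x265,
        # Global options, which the mp4 muxer uses for the colr box.
        "-color_primaries", primaries,
        "-color_trc", transfer,
        "-colorspace", matrix,
    ]

args = color_args("bt2020", "smpte2084", "bt2020nc")
print(" ".join(args))
```

The returned list can be spliced into a larger argument list, for example one passed to `subprocess.run`, with any further x265 params appended to the `-x265-params` string.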

comment:5 by Balling, 6 months ago

"cannot be read directly from the NAL SPS after encoding."

Right, that is why I opened the issue! Even though it is more or less a duplicate of #9167.

comment:6 by Balling, 6 months ago

BTW, who produces such bad files? It looks like you use ffmpeg too. There is only one player that supports constant luminance like that (mpv), and even then bt.2020-cl is not supported with the PQ transfer. I saw it on 8K satellites, but even there an SDR [transfer] was used.

in reply to:  6 comment:7 by Ulysse Dansin, 6 months ago

Us :) We were investigating the behavior of colr box with different inputs when we saw the mismatch between colr box and NAL SPS for one of our QA tests... Our current encodes are using bt.2020-ncl with ST2084. Thanks so much for your help!
