Opened 10 months ago

Closed 10 months ago

Last modified 10 months ago

#10892 closed defect (invalid)

10-bit video shows up as gray using vulkan hardware acceleration

Reported by: Sam Lantinga
Owned by:
Priority: normal
Component: undetermined
Version: unspecified
Keywords:
Cc: Sam Lantinga
Blocked By:
Blocking:
Reproduced by developer: yes
Analyzed by developer: yes

Description

Summary of the bug: 10-bit video shows up as gray using vulkan hardware acceleration
How to reproduce:

% ./ffmpeg -v 9 -loglevel 99 -ss 10 -hwaccel vulkan -i https://www.libsdl.org/tmp/hdr.mp4 -frames:v 1 frame.bmp
ffmpeg version 2024-03-04-git-e30369bc1c

I first discovered this when trying to use SDL's testffmpeg to decode 10-bit video, and found that it was an ffmpeg bug, not an issue with the Vulkan interop.

It looks oddly like the CbCr plane is being used instead of the Y plane.

It works fine with 8-bit video; the problem only occurs with 10-bit video.
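
For readers unfamiliar with the layout involved: the log in this ticket shows the Vulkan decoder choosing p010le, a semi-planar format with a full-resolution Y plane followed by a half-resolution plane of interleaved Cb/Cr samples, each 10-bit sample stored in the upper bits of a 16-bit little-endian word. The sketch below is only an illustration of that layout, not code from ffmpeg or SDL. The helper names `split_p010le` and `make_test_frame` are hypothetical, and the buffer is assumed to be tightly packed (no row padding) with even width and height. It shows why a frame built from the chroma plane would look flat gray: neutral chroma sits at 512, the midpoint of the 10-bit range.

```python
import struct


def split_p010le(buf: bytes, width: int, height: int):
    """Split a tightly packed P010LE frame into 10-bit Y and CbCr samples.

    Assumes even width/height and no row padding (an assumption of this
    sketch; real hardware frames usually have a row pitch).
    """
    y_count = width * height
    # 4:2:0 chroma: (width/2 * height/2) positions, two samples (Cb, Cr) each.
    c_count = width * height // 2
    expected = 2 * (y_count + c_count)
    if len(buf) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(buf)}")
    words = struct.unpack(f"<{y_count + c_count}H", buf)
    # P010 keeps the 10 significant bits in the top of each 16-bit word.
    samples = [w >> 6 for w in words]
    return samples[:y_count], samples[y_count:]


def make_test_frame(width: int, height: int) -> bytes:
    """Build a synthetic frame: a luma ramp with neutral (512) chroma."""
    y = [64 + i for i in range(width * height)]
    cbcr = [512] * (width * height // 2)
    return struct.pack(f"<{len(y) + len(cbcr)}H", *[s << 6 for s in y + cbcr])


if __name__ == "__main__":
    frame = make_test_frame(4, 2)
    y, cbcr = split_p010le(frame, 4, 2)
    print("Y samples:   ", y)
    print("CbCr samples:", cbcr)
```

If the chroma samples were read as luma, every pixel would land near the middle of the 10-bit range, which is consistent with the gray output described above. It does not by itself establish where the fault lies.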

The full output of the command is too large for this description field, but here are some relevant snippets:

ffmpeg version 2024-03-04-git-e30369bc1c-full_build-www.gyan.dev Copyright (c) 2000-2024 the FFmpeg developers
  built with gcc 13.2.0 (Rev5, Built by MSYS2 project)
  configuration: --enable-gpl --enable-version3 --enable-static --pkg-config=pkgconf --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libaribcaption --enable-libdav1d --enable-libdavs2 --enable-libuavs3d --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libaom --enable-libjxl --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-dxva2 --enable-d3d11va --enable-libvpl --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libcodec2 --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
  libavutil      58. 40.100 / 58. 40.100
  libavcodec     60. 41.100 / 60. 41.100
  libavformat    60. 23.100 / 60. 23.100
  libavdevice    60.  4.100 / 60.  4.100
  libavfilter     9. 17.100 /  9. 17.100
  libswscale      7.  6.100 /  7.  6.100
  libswresample   4. 14.100 /  4. 14.100
  libpostproc    57.  4.100 / 57.  4.100
Splitting the commandline.
Reading option '-v' ... matched as option 'v' (set logging level) with argument '9'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument '99'.
Reading option '-ss' ... matched as option 'ss' (start transcoding at specified time) with argument '10'.
Reading option '-hwaccel' ... matched as option 'hwaccel' (use HW accelerated decoding) with argument 'vulkan'.
Reading option '-i' ... matched as output url with argument '/tmp/hdr.mp4'.
Reading option '-frames:v' ... matched as option 'frames' (set the number of frames to output) with argument '1'.
Reading option 'frame.bmp' ... matched as output url.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 9.
Successfully parsed a group of options.
Parsing a group of options: input url /tmp/hdr.mp4.
Applying option ss (start transcoding at specified time) with argument 10.
Applying option hwaccel (use HW accelerated decoding) with argument vulkan.
Successfully parsed a group of options.
Opening an input file: /tmp/hdr.mp4.
[AVFormatContext @ 00000253b6252700] Opening '/tmp/hdr.mp4' for reading
[file @ 00000253b6252d40] Setting default whitelist 'file,crypto,data'
Probing mov,mp4,m4a,3gp,3g2,mj2 score:100 size:2048
...
Selecting decoder 'hevc' because of requested hwaccel method vulkan
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/tmp/hdr.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 1
    compatible_brands: isom
    creation_time   : 2016-02-03T08:01:30.000000Z
  Duration: 00:02:07.15, start: 0.000000, bitrate: 75806 kb/s
  Stream #0:0[0x1](und), 30, 1/60000: Video: hevc (Main 10), 1 reference frame (hvc1 / 0x31637668), yuv420p10le(tv, bt2020nc/bt2020/smpte2084, topleft), 3840x2160 [SAR 1:1 DAR 16:9], 0/1, 75620 kb/s, 59.94 fps, 59.94 tbr, 60k tbn (default)
      Metadata:
        creation_time   : 2016-02-03T07:59:49.000000Z
        handler_name    : Video Media Handler
        vendor_id       : [0][0][0][0]
        encoder         : HEVC Coding
  Stream #0:1[0x2](eng), 1, 1/48000: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 192 kb/s (default)
      Metadata:
        creation_time   : 2016-02-03T07:59:49.000000Z
        handler_name    : Sound Media Handler
        vendor_id       : [0][0][0][0]
Successfully opened the file.
Parsing a group of options: output url frame.bmp.
Applying option frames:v (set the number of frames to output) with argument 1.
Successfully parsed a group of options.
Opening an output file: frame.bmp.
[out#0/image2 @ 00000253b62d8b00] No explicit maps, mapping streams automatically...
[vost#0:0/bmp @ 00000253b66a4900] Created video stream from input stream 0:0
[AVHWDeviceContext @ 00000253b62dc640] Supported validation layers:
[AVHWDeviceContext @ 00000253b62dc640]  VK_LAYER_NV_optimus
[AVHWDeviceContext @ 00000253b62dc640]  VK_LAYER_AMD_switchable_graphics
[AVHWDeviceContext @ 00000253b62dc640]  VK_LAYER_VALVE_steam_overlay
[AVHWDeviceContext @ 00000253b62dc640]  VK_LAYER_VALVE_steam_fossilize
[AVHWDeviceContext @ 00000253b62dc640]  VK_LAYER_RENDERDOC_Capture
[AVHWDeviceContext @ 00000253b62dc640]  VK_LAYER_LUNARG_api_dump
[AVHWDeviceContext @ 00000253b62dc640]  VK_LAYER_LUNARG_gfxreconstruct
[AVHWDeviceContext @ 00000253b62dc640]  VK_LAYER_KHRONOS_synchronization2
[AVHWDeviceContext @ 00000253b62dc640]  VK_LAYER_KHRONOS_validation
[AVHWDeviceContext @ 00000253b62dc640]  VK_LAYER_LUNARG_monitor
[AVHWDeviceContext @ 00000253b62dc640]  VK_LAYER_LUNARG_screenshot
[AVHWDeviceContext @ 00000253b62dc640]  VK_LAYER_KHRONOS_profiles
[AVHWDeviceContext @ 00000253b62dc640]  VK_LAYER_KHRONOS_shader_object
[AVHWDeviceContext @ 00000253b62dc640] Using instance extension VK_KHR_portability_enumeration
[AVHWDeviceContext @ 00000253b62dc640] GPU listing:
[AVHWDeviceContext @ 00000253b62dc640]     0: NVIDIA GeForce RTX 4080 (discrete) (0x2704)
[AVHWDeviceContext @ 00000253b62dc640]     1: AMD Radeon(TM) Graphics (integrated) (0x164e)
[AVHWDeviceContext @ 00000253b62dc640] Device 0 selected: NVIDIA GeForce RTX 4080 (discrete) (0x2704)
[AVHWDeviceContext @ 00000253b62dc640] Queue families:
[AVHWDeviceContext @ 00000253b62dc640]     0: graphics compute transfer sparse (queues: 16)
[AVHWDeviceContext @ 00000253b62dc640]     1: transfer sparse (queues: 2)
[AVHWDeviceContext @ 00000253b62dc640]     2: compute transfer sparse (queues: 8)
[AVHWDeviceContext @ 00000253b62dc640]     3: transfer decode sparse (queues: 1)
[AVHWDeviceContext @ 00000253b62dc640]     4: transfer encode sparse (queues: 2)
[AVHWDeviceContext @ 00000253b62dc640]     5: transfer sparse (queues: 1)
[AVHWDeviceContext @ 00000253b62dc640] Using device extension VK_KHR_push_descriptor
[AVHWDeviceContext @ 00000253b62dc640] Using device extension VK_KHR_sampler_ycbcr_conversion
[AVHWDeviceContext @ 00000253b62dc640] Using device extension VK_EXT_descriptor_buffer
[AVHWDeviceContext @ 00000253b62dc640] Using device extension VK_EXT_shader_atomic_float
[AVHWDeviceContext @ 00000253b62dc640] Using device extension VK_KHR_cooperative_matrix
[AVHWDeviceContext @ 00000253b62dc640] Using device extension VK_EXT_external_memory_host
[AVHWDeviceContext @ 00000253b62dc640] Using device extension VK_KHR_external_memory_win32
[AVHWDeviceContext @ 00000253b62dc640] Using device extension VK_KHR_external_semaphore_win32
[AVHWDeviceContext @ 00000253b62dc640] Using device extension VK_KHR_video_queue
[AVHWDeviceContext @ 00000253b62dc640] Using device extension VK_KHR_video_decode_queue
[AVHWDeviceContext @ 00000253b62dc640] Using device extension VK_KHR_video_decode_h264
[AVHWDeviceContext @ 00000253b62dc640] Using device extension VK_KHR_video_decode_h265
[AVHWDeviceContext @ 00000253b62dc640] Using device: NVIDIA GeForce RTX 4080
[AVHWDeviceContext @ 00000253b62dc640] Alignments:
[AVHWDeviceContext @ 00000253b62dc640]     optimalBufferCopyRowPitchAlignment: 1
[AVHWDeviceContext @ 00000253b62dc640]     minMemoryMapAlignment:              64
[AVHWDeviceContext @ 00000253b62dc640]     nonCoherentAtomSize:                64
[AVHWDeviceContext @ 00000253b62dc640]     minImportedHostPointerAlignment:    4096
[AVHWDeviceContext @ 00000253b62dc640] Using queue family 0 (queues: 16) for graphics
[AVHWDeviceContext @ 00000253b62dc640] Using queue family 2 (queues: 8) for compute
[AVHWDeviceContext @ 00000253b62dc640] Using queue family 1 (queues: 2) for transfers
[AVHWDeviceContext @ 00000253b62dc640] Using queue family 3 (queues: 1) for decode
[AVHWDeviceContext @ 00000253b62dc640] Using queue family 4 (queues: 2) for encode
detected 32 logical cores
[hevc @ 00000253b6300dc0] nal_unit_type: 32(VPS), nuh_layer_id: 0, temporal_id: 0
[hevc @ 00000253b6300dc0] Decoding VPS
[hevc @ 00000253b6300dc0] Main 10 profile bitstream
[hevc @ 00000253b6300dc0] nal_unit_type: 33(SPS), nuh_layer_id: 0, temporal_id: 0
[hevc @ 00000253b6300dc0] Decoding SPS
[hevc @ 00000253b6300dc0] Main 10 profile bitstream
[hevc @ 00000253b6300dc0] Decoding VUI
[hevc @ 00000253b6300dc0] nal_unit_type: 34(PPS), nuh_layer_id: 0, temporal_id: 0
[hevc @ 00000253b6300dc0] Decoding PPS
Successfully opened the file.
...
[hevc @ 00000253b6300dc0] Format vulkan chosen by get_format().
[hevc @ 00000253b6300dc0] Format vulkan requires hwaccel initialisation.
[hevc @ 00000253b6300dc0] Decoder capabilities for hevc profile "Main 10":
[hevc @ 00000253b6300dc0]     Maximum level: 61 (stream 153)
[hevc @ 00000253b6300dc0]     Width: from 144 to 8192
[hevc @ 00000253b6300dc0]     Height: from 144 to 8192
[hevc @ 00000253b6300dc0]     Width alignment: 16
[hevc @ 00000253b6300dc0]     Height alignment: 16
[hevc @ 00000253b6300dc0]     Bitstream offset alignment: 256
[hevc @ 00000253b6300dc0]     Bitstream size alignment: 256
[hevc @ 00000253b6300dc0]     Maximum references: 16
[hevc @ 00000253b6300dc0]     Maximum active references: 16
[hevc @ 00000253b6300dc0]     Codec header name: 'VK_STD_vulkan_video_codec_h265_decode' (driver), 'VK_STD_vulkan_video_codec_h265_decode' (compiled)
[hevc @ 00000253b6300dc0]     Codec header version: 1.0.0 (driver), 1.0.0 (compiled)
[hevc @ 00000253b6300dc0]     Decode modes: reuse_dst_dpb
[hevc @ 00000253b6300dc0]     Capability flags: separate_references
[hevc @ 00000253b6300dc0] Choosing best pixel format for decoding from 1:
[hevc @ 00000253b6300dc0]     p010le* (Vulkan ID: 1000156013)
[hevc @ 00000253b6300dc0] Chosen frame pixfmt: p010le (Vulkan ID: 1000156013)
[hevc @ 00000253b6300dc0] Allocating 4096 bytes in bind index 0 for video session
[hevc @ 00000253b6300dc0] Allocating 16384 bytes in bind index 1 for video session
[hevc @ 00000253b6300dc0] Allocating 71303168 bytes in bind index 2 for video session
[hevc @ 00000253b6300dc0] Allocating 247595008 bytes in bind index 3 for video session
[hevc @ 00000253b6300dc0] Vulkan decoder initialization sucessful
Mastering Display Metadata:
r(1.0000,1.0000) g(1.0000,1.0000) b(1.0000 1.0000) wp(1.0000, 1.0000)
min_luminance=0.100000, max_luminance=0.500000
Content Light Level Metadata:
MaxCLL=0, MaxFALL=0

Change History (1)

comment:1 by Lynne, 10 months ago

Analyzed by developer: set
Reproduced by developer: set
Resolution: invalid
Status: new → closed

Works fine on radv. Works-ish on anv.

Regular, non-HDR, 10-bit samples do work flawlessly on Nvidia and anv.

This only happens on HDR samples, which leads me to conclude it's a scaling list issue in Nvidia's drivers. We had to fix radv's implementation, particularly the scaling lists, many, many times until they were correct.

I do remember this working last year, so possibly they made the same mistakes we did, and we broke them while we fixed ours to comply with the spec, letter by letter.

I'll ask them to investigate again, and I'll ping the anv maintainer.

Last edited 10 months ago by Lynne