Opened 2 years ago
Closed 2 years ago
#10152 closed defect (worksforme)
hwupload from cuda to vulkan doesn't work on Linux
Reported by: | serql | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | undetermined |
Version: | git-master | Keywords: | |
Cc: | serql | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no | | |
Description
Summary of the bug:
How to reproduce:
% ffmpeg -hwaccel nvdec -hwaccel_output_format cuda -i 1.ts -init_hw_device cuda=cuda:0 -filter_hw_device cuda -vf hwupload=derive_device=vulkan,scale_vulkan=w=1920:h=1440 -f null -
ffmpeg version N-109662-g2c3107c3e9 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 11 (GCC)
  configuration: --enable-nonfree --enable-cuda-nvcc --enable-libnpp --prefix=/root/ffmpeg_build --pkg-config-flags=--static --extra-cflags='-I/root/ffmpeg_build/include -I/usr/local/cuda/include' --extra-ldflags='-L/root/ffmpeg_build/lib -L/usr/local/cuda/lib64 -L/root/ffmpeg_sources/1.3.236.0/x86_64/lib' --extra-libs=-lpthread --extra-libs=-lm --bindir=/root/bin --enable-gpl --enable-libfdk_aac --enable-libfreetype --enable-libx264 --enable-libx265 --enable-cuvid --enable-hwaccel=hevc_nvdec --enable-hwaccel=h264_nvdec --enable-opencl --enable-libzimg --enable-vulkan --enable-libshaderc --enable-libplacebo
  libavutil 57. 44.100 / 57. 44.100
  libavcodec 59. 57.100 / 59. 57.100
  libavformat 59. 36.100 / 59. 36.100
  libavdevice 59. 8.101 / 59. 8.101
  libavfilter 8. 54.100 / 8. 54.100
  libswscale 6. 8.112 / 6. 8.112
  libswresample 4. 9.100 / 4. 9.100
  libpostproc 56. 7.100 / 56. 7.100
Input #0, mpegts, from '1.ts':
  Duration: 00:01:00.06, start: 27664.427600, bitrate: 3665 kb/s
  Program 3
  Stream #0:0[0x31]: Video: hevc (Main 10) ([36][0][0][0] / 0x0024), yuv420p10le(tv, bt2020nc/bt2020/smpte2084), 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 59.94 fps, 59.94 tbr, 90k tbn
Stream mapping:
  Stream #0:0 -> #0:0 (hevc (native) -> wrapped_avframe (native))
Press [q] to stop, [?] for help
[AVHWFramesContext @ 0x9cd0ac0] There are no supported modifiers for the given sw_format
[Parsed_hwupload_0 @ 0xd038a80] Failed to configure output pad on Parsed_hwupload_0
Error reinitializing filters!
Failed to inject frame into filter network: Invalid argument
Error while processing the decoded data for stream #0:0
Conversion failed!

ffmpeg version N-109662-g2c3107c3e9 built on Linux (Centos 7)
Ref: https://trac.ffmpeg.org/ticket/8512
What I'm trying to do: I want to decode HEVC on an NVIDIA GPU, upload the frames to Vulkan, use libplacebo to convert HDR to SDR, and then download the result back to CPU memory. The first step (hwupload to Vulkan) always fails. nvdec and nvenc work fine as long as I don't use any Vulkan filter.
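For reference, the intended pipeline can be written out stage by stage. This is only a sketch of the goal described above, not a verified fix: the libplacebo option values, the `nv12` output format, and the final `hwdownload,format=nv12` stage are assumptions, and the command is printed rather than executed. On the build from this report, the chain would presumably still stop at the first stage.

```shell
#!/bin/sh
# Sketch only: assembles the intended filter chain and prints the command
# instead of running it, so nothing here needs a GPU to be present.

INPUT="1.ts"   # sample file name taken from the report

# Stage 1: move decoded CUDA frames into Vulkan (the step that fails in this ticket).
UPLOAD="hwupload=derive_device=vulkan"
# Stage 2: HDR to SDR conversion with libplacebo; these option values are assumptions.
TONEMAP="libplacebo=colorspace=bt709:color_primaries=bt709:color_trc=bt709:format=nv12"
# Stage 3: copy the converted frames back to system memory (assumed output format).
DOWNLOAD="hwdownload,format=nv12"

FILTERS="$UPLOAD,$TONEMAP,$DOWNLOAD"

# Same decoder and device options as in the reproduction command above.
CMD="ffmpeg -hwaccel nvdec -hwaccel_output_format cuda -i $INPUT -init_hw_device cuda=cuda:0 -filter_hw_device cuda -vf $FILTERS -f null -"
echo "$CMD"
```

Keeping each stage in its own variable makes it easy to drop stages 2 and 3 and test the upload step in isolation.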
# nvidia-smi
Thu Jan 26 16:13:33 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.05 Driver Version: 525.85.05 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA T1000 8GB Off | 00000000:17:00.0 Off | N/A |
...

Vulkan info:
Device Properties and Extensions:
=================================
GPU0:
VkPhysicalDeviceProperties:
---------------------------
apiVersion = 1.3.224 (4206816)
driverVersion = 525.85.5.320 (2203402560)
vendorID = 0x10de
deviceID = 0x1ff0
deviceType = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
deviceName = NVIDIA T1000 8GB
pipelineCacheUUID = 5e84b6ed-4e70-35e3-ab42-c0b2fad24793
Messages from trace level:
[Parsed_hwupload_0 @ 0x64a1b40] Setting 'derive_device' to value 'vulkan'
[Parsed_scale_vulkan_1 @ 0x64a1c40] Setting 'w' to value '1920'
[Parsed_scale_vulkan_1 @ 0x64a1c40] Setting 'h' to value '1440'
[graph 0 input from stream 0:0 @ 0x9f51c00] Setting 'video_size' to value '1920x1080'
[graph 0 input from stream 0:0 @ 0x9f51c00] Setting 'pix_fmt' to value '117'
[graph 0 input from stream 0:0 @ 0x9f51c00] Setting 'time_base' to value '1/90000'
[graph 0 input from stream 0:0 @ 0x9f51c00] Setting 'pixel_aspect' to value '1/1'
[graph 0 input from stream 0:0 @ 0x9f51c00] Setting 'frame_rate' to value '60000/1001'
[graph 0 input from stream 0:0 @ 0x9f51c00] w:1920 h:1080 pixfmt:cuda tb:1/90000 fr:60000/1001 sar:1/1
[AVHWDeviceContext @ 0x46f43c0] Calling cu->cuDeviceGetUuid((CUuuid *)&dev_select.uuid, cu_internal->cuda_device)
[AVHWDeviceContext @ 0xb49cd40] Supported validation layers:
[AVHWDeviceContext @ 0xb49cd40] VK_LAYER_NV_optimus
[AVHWDeviceContext @ 0xb49cd40] GPU listing:
[AVHWDeviceContext @ 0xb49cd40] 0: NVIDIA T1000 8GB (discrete) (0x1ff0)
[AVHWDeviceContext @ 0xb49cd40] 1: NVIDIA T1000 8GB (discrete) (0x1ff0)
[AVHWDeviceContext @ 0xb49cd40] Device 0 selected: NVIDIA T1000 8GB (discrete) (0x1ff0)
[AVHWDeviceContext @ 0xb49cd40] Queue families:
[AVHWDeviceContext @ 0xb49cd40] 0: graphics compute transfer sparse (queues: 16)
[AVHWDeviceContext @ 0xb49cd40] 1: transfer sparse (queues: 2)
[AVHWDeviceContext @ 0xb49cd40] 2: compute transfer sparse (queues: 8)
[AVHWDeviceContext @ 0xb49cd40] Using device extension VK_KHR_push_descriptor
[AVHWDeviceContext @ 0xb49cd40] Using device extension VK_KHR_sampler_ycbcr_conversion
[AVHWDeviceContext @ 0xb49cd40] Using device extension VK_KHR_synchronization2
[AVHWDeviceContext @ 0xb49cd40] Using device extension VK_KHR_external_memory_fd
[AVHWDeviceContext @ 0xb49cd40] Using device extension VK_EXT_external_memory_dma_buf
[AVHWDeviceContext @ 0xb49cd40] Using device extension VK_EXT_image_drm_format_modifier
[AVHWDeviceContext @ 0xb49cd40] Using device extension VK_KHR_external_semaphore_fd
[AVHWDeviceContext @ 0xb49cd40] Using device extension VK_EXT_external_memory_host
[AVHWDeviceContext @ 0xb49cd40] Using device: NVIDIA T1000 8GB
[AVHWDeviceContext @ 0xb49cd40] Alignments:
[AVHWDeviceContext @ 0xb49cd40] optimalBufferCopyRowPitchAlignment: 1
[AVHWDeviceContext @ 0xb49cd40] minMemoryMapAlignment: 64
[AVHWDeviceContext @ 0xb49cd40] minImportedHostPointerAlignment: 4096
[AVHWDeviceContext @ 0xb49cd40] Using queue family 0 (queues: 16) for graphics
[AVHWDeviceContext @ 0xb49cd40] Using queue family 1 (queues: 2) for transfers
[AVHWDeviceContext @ 0xb49cd40] Using queue family 2 (queues: 8) for compute
[AVFilterGraph @ 0xaf30c80] query_formats: 4 queried, 3 merged, 0 already done, 0 delayed
[hwupload @ 0xee8ccc0] Surface format is cuda.
[AVHWFramesContext @ 0xa720b00] There are no supported modifiers for the given sw_format
[Parsed_hwupload_0 @ 0x64a1b40] Failed to configure output pad on Parsed_hwupload_0
Error reinitializing filters!
Failed to inject frame into filter network: Invalid argument
Error while processing the decoded data for stream #0:0
Terminating demuxer thread 0
[AVHWDeviceContext @ 0x46f43c0] Calling decoder->cudl->cuCtxPushCurrent(decoder->cuda_ctx)
[AVHWDeviceContext @ 0x46f43c0] Calling decoder->cvdl->cuvidDestroyDecoder(decoder->decoder)
[AVHWDeviceContext @ 0x46f43c0] Calling decoder->cudl->cuCtxPopCurrent(&dummy)
[AVHWDeviceContext @ 0x46f43c0] Calling cu->cuCtxPushCurrent(hwctx->cuda_ctx)
[AVHWDeviceContext @ 0x46f43c0] Calling cu->cuMemFree((CUdeviceptr)data)
[AVHWDeviceContext @ 0x46f43c0] Calling cu->cuCtxPopCurrent(&dummy)
The issue has been solved. Useful details: https://github.com/jellyfin/jellyfin-ffmpeg/issues/215#issuecomment-1410146318