
#10152 closed defect (worksforme)

hwupload from cuda to vulkan doesn't work on Linux

Reported by:	serql	Owned by:
Priority:	normal	Component:	undetermined
Version:	git-master	Keywords:
Cc:	serql	Blocked By:
Blocking:		Reproduced by developer:	no
Analyzed by developer:	no

Description

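Summary of the bug: hwupload with derive_device=vulkan fails on CUDA frames with "There are no supported modifiers for the given sw_format".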
How to reproduce:

% ffmpeg -hwaccel nvdec -hwaccel_output_format cuda -i 1.ts -init_hw_device cuda=cuda:0 -filter_hw_device cuda -vf hwupload=derive_device=vulkan,scale_vulkan=w=1920:h=1440 -f null -
ffmpeg version N-109662-g2c3107c3e9 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 11 (GCC)
  configuration: --enable-nonfree --enable-cuda-nvcc --enable-libnpp --prefix=/root/ffmpeg_build --pkg-config-flags=--static --extra-cflags='-I/root/ffmpeg_build/include -I/usr/local/cuda/include' --extra-ldflags='-L/root/ffmpeg_build/lib -L/usr/local/cuda/lib64 -L/root/ffmpeg_sources/1.3.236.0/x86_64/lib' --extra-libs=-lpthread --extra-libs=-lm --bindir=/root/bin --enable-gpl --enable-libfdk_aac --enable-libfreetype --enable-libx264 --enable-libx265 --enable-cuvid --enable-hwaccel=hevc_nvdec --enable-hwaccel=h264_nvdec --enable-opencl --enable-libzimg --enable-vulkan --enable-libshaderc --enable-libplacebo
  libavutil      57. 44.100 / 57. 44.100
  libavcodec     59. 57.100 / 59. 57.100
  libavformat    59. 36.100 / 59. 36.100
  libavdevice    59.  8.101 / 59.  8.101
  libavfilter     8. 54.100 /  8. 54.100
  libswscale      6.  8.112 /  6.  8.112
  libswresample   4.  9.100 /  4.  9.100
  libpostproc    56.  7.100 / 56.  7.100
Input #0, mpegts, from '1.ts':
  Duration: 00:01:00.06, start: 27664.427600, bitrate: 3665 kb/s
  Program 3 
  Stream #0:0[0x31]: Video: hevc (Main 10) ([36][0][0][0] / 0x0024), yuv420p10le(tv, bt2020nc/bt2020/smpte2084), 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 59.94 fps, 59.94 tbr, 90k tbn
Stream mapping:
  Stream #0:0 -> #0:0 (hevc (native) -> wrapped_avframe (native))
Press [q] to stop, [?] for help
[AVHWFramesContext @ 0x9cd0ac0] There are no supported modifiers for the given sw_format
[Parsed_hwupload_0 @ 0xd038a80] Failed to configure output pad on Parsed_hwupload_0
Error reinitializing filters!
Failed to inject frame into filter network: Invalid argument
Error while processing the decoded data for stream #0:0
Conversion failed!

ffmpeg version N-109662-g2c3107c3e9
built on Linux (Centos 7)

Ref: https://trac.ffmpeg.org/ticket/8512

What I'm trying to do: decode HEVC on an NVIDIA GPU, upload the frames to Vulkan, use libplacebo to convert HDR to SDR, then download the result back to CPU memory. The first step (hwupload to Vulkan) always fails. nvdec and nvenc work fine as long as I don't use a Vulkan filter.
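
For anyone trying to reproduce this, the full pipeline described above could be assembled roughly as follows. This is only a sketch of the intended filter chain, not a command known to work on this build: the hwupload step is the one that fails here. The libplacebo options (colorspace, color_primaries, color_trc, format) and the yuv420p output format are assumptions for illustration and should be checked against `ffmpeg -h filter=libplacebo`. The helper name `build_command` is hypothetical.

```python
import shlex


def build_command(src):
    """Assemble the ffmpeg argv for the decode -> Vulkan -> tonemap -> CPU pipeline."""
    filters = [
        # CUDA frames -> Vulkan frames (this is the step that fails in the report)
        "hwupload=derive_device=vulkan",
        # HDR -> SDR conversion; option values here are assumed, not taken from the report
        "libplacebo=colorspace=bt709:color_primaries=bt709:color_trc=bt709:format=yuv420p",
        # Vulkan frames -> system memory
        "hwdownload",
        "format=yuv420p",
    ]
    return [
        "ffmpeg",
        "-hwaccel", "nvdec",
        "-hwaccel_output_format", "cuda",
        "-i", src,
        # Device setup copied from the command in the report
        "-init_hw_device", "cuda=cuda:0",
        "-filter_hw_device", "cuda",
        "-vf", ",".join(filters),
        "-f", "null", "-",
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line for copy and paste
    print(shlex.join(build_command("1.ts")))
```

Building the argument list in a script makes it easy to drop or swap one filter at a time when narrowing down which step fails.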

# nvidia-smi 
Thu Jan 26 16:13:33 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.05    Driver Version: 525.85.05    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA T1000 8GB    Off  | 00000000:17:00.0 Off |                  N/A |
...

Vulkan info:
Device Properties and Extensions:
=================================
GPU0:
VkPhysicalDeviceProperties:
---------------------------
        apiVersion        = 1.3.224 (4206816)
        driverVersion     = 525.85.5.320 (2203402560)
        vendorID          = 0x10de
        deviceID          = 0x1ff0
        deviceType        = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
        deviceName        = NVIDIA T1000 8GB
        pipelineCacheUUID = 5e84b6ed-4e70-35e3-ab42-c0b2fad24793

Messages from trace level:

[Parsed_hwupload_0 @ 0x64a1b40] Setting 'derive_device' to value 'vulkan'
[Parsed_scale_vulkan_1 @ 0x64a1c40] Setting 'w' to value '1920'
[Parsed_scale_vulkan_1 @ 0x64a1c40] Setting 'h' to value '1440'
[graph 0 input from stream 0:0 @ 0x9f51c00] Setting 'video_size' to value '1920x1080'
[graph 0 input from stream 0:0 @ 0x9f51c00] Setting 'pix_fmt' to value '117'
[graph 0 input from stream 0:0 @ 0x9f51c00] Setting 'time_base' to value '1/90000'
[graph 0 input from stream 0:0 @ 0x9f51c00] Setting 'pixel_aspect' to value '1/1'
[graph 0 input from stream 0:0 @ 0x9f51c00] Setting 'frame_rate' to value '60000/1001'
[graph 0 input from stream 0:0 @ 0x9f51c00] w:1920 h:1080 pixfmt:cuda tb:1/90000 fr:60000/1001 sar:1/1
[AVHWDeviceContext @ 0x46f43c0] Calling cu->cuDeviceGetUuid((CUuuid *)&dev_select.uuid, cu_internal->cuda_device)
[AVHWDeviceContext @ 0xb49cd40] Supported validation layers:
[AVHWDeviceContext @ 0xb49cd40]  VK_LAYER_NV_optimus
[AVHWDeviceContext @ 0xb49cd40] GPU listing:
[AVHWDeviceContext @ 0xb49cd40]     0: NVIDIA T1000 8GB (discrete) (0x1ff0)
[AVHWDeviceContext @ 0xb49cd40]     1: NVIDIA T1000 8GB (discrete) (0x1ff0)
[AVHWDeviceContext @ 0xb49cd40] Device 0 selected: NVIDIA T1000 8GB (discrete) (0x1ff0)
[AVHWDeviceContext @ 0xb49cd40] Queue families:
[AVHWDeviceContext @ 0xb49cd40]     0: graphics compute transfer sparse (queues: 16)
[AVHWDeviceContext @ 0xb49cd40]     1: transfer sparse (queues: 2)
[AVHWDeviceContext @ 0xb49cd40]     2: compute transfer sparse (queues: 8)
[AVHWDeviceContext @ 0xb49cd40] Using device extension VK_KHR_push_descriptor
[AVHWDeviceContext @ 0xb49cd40] Using device extension VK_KHR_sampler_ycbcr_conversion
[AVHWDeviceContext @ 0xb49cd40] Using device extension VK_KHR_synchronization2
[AVHWDeviceContext @ 0xb49cd40] Using device extension VK_KHR_external_memory_fd
[AVHWDeviceContext @ 0xb49cd40] Using device extension VK_EXT_external_memory_dma_buf
[AVHWDeviceContext @ 0xb49cd40] Using device extension VK_EXT_image_drm_format_modifier
[AVHWDeviceContext @ 0xb49cd40] Using device extension VK_KHR_external_semaphore_fd
[AVHWDeviceContext @ 0xb49cd40] Using device extension VK_EXT_external_memory_host
[AVHWDeviceContext @ 0xb49cd40] Using device: NVIDIA T1000 8GB
[AVHWDeviceContext @ 0xb49cd40] Alignments:
[AVHWDeviceContext @ 0xb49cd40]     optimalBufferCopyRowPitchAlignment: 1
[AVHWDeviceContext @ 0xb49cd40]     minMemoryMapAlignment:              64
[AVHWDeviceContext @ 0xb49cd40]     minImportedHostPointerAlignment:    4096
[AVHWDeviceContext @ 0xb49cd40] Using queue family 0 (queues: 16) for graphics
[AVHWDeviceContext @ 0xb49cd40] Using queue family 1 (queues: 2) for transfers
[AVHWDeviceContext @ 0xb49cd40] Using queue family 2 (queues: 8) for compute
[AVFilterGraph @ 0xaf30c80] query_formats: 4 queried, 3 merged, 0 already done, 0 delayed
[hwupload @ 0xee8ccc0] Surface format is cuda.
[AVHWFramesContext @ 0xa720b00] There are no supported modifiers for the given sw_format
[Parsed_hwupload_0 @ 0x64a1b40] Failed to configure output pad on Parsed_hwupload_0
Error reinitializing filters!
Failed to inject frame into filter network: Invalid argument
Error while processing the decoded data for stream #0:0
Terminating demuxer thread 0
[AVHWDeviceContext @ 0x46f43c0] Calling decoder->cudl->cuCtxPushCurrent(decoder->cuda_ctx)
[AVHWDeviceContext @ 0x46f43c0] Calling decoder->cvdl->cuvidDestroyDecoder(decoder->decoder)
[AVHWDeviceContext @ 0x46f43c0] Calling decoder->cudl->cuCtxPopCurrent(&dummy)
[AVHWDeviceContext @ 0x46f43c0] Calling cu->cuCtxPushCurrent(hwctx->cuda_ctx)
[AVHWDeviceContext @ 0x46f43c0] Calling cu->cuMemFree((CUdeviceptr)data)
[AVHWDeviceContext @ 0x46f43c0] Calling cu->cuCtxPopCurrent(&dummy)
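
Trace output like the above is long, and the first real failure is easy to miss. A small helper along these lines can pull it out. It assumes the `[component @ 0xaddr] message` line layout seen in this log and uses a guessed keyword list ("no supported", "failed"); the name `first_failure` is hypothetical.

```python
import re

# Matches lines such as "[AVHWFramesContext @ 0xa720b00] message text"
LOG_LINE = re.compile(
    r"^\[(?P<component>[^\]@]+?)\s*@\s*(?P<addr>0x[0-9a-fA-F]+)\]\s*(?P<msg>.*)$"
)
# Keywords treated as a failure; an assumption, extend as needed
FAILURE = re.compile(r"no supported|failed", re.IGNORECASE)


def first_failure(lines):
    """Return (component, message) for the first failing log line, or None."""
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and FAILURE.search(match["msg"]):
            return match["component"], match["msg"]
    return None


if __name__ == "__main__":
    sample = [
        "[hwupload @ 0xee8ccc0] Surface format is cuda.",
        "[AVHWFramesContext @ 0xa720b00] There are no supported modifiers for the given sw_format",
        "[Parsed_hwupload_0 @ 0x64a1b40] Failed to configure output pad on Parsed_hwupload_0",
    ]
    print(first_failure(sample))
```

On the three sample lines, which are copied from this log, the helper reports the AVHWFramesContext message rather than the later "Failed to configure output pad" line, which is only a consequence of it.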

Change History (1)

comment:1 by serql, 15 months ago

Resolution:	→ worksforme
Status:	new → closed

The issue has been solved. Useful details: https://github.com/jellyfin/jellyfin-ffmpeg/issues/215#issuecomment-1410146318
