Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#8512 closed defect (invalid)

The Vulkan-based filter scale_vulkan does not work on NVIDIA hardware.

Reported by: Dennis E. Mungai
Owned by: Philip Langdale
Priority: normal
Component: undetermined
Version: git-master
Keywords:
Cc: dev@lynne.ee, philipl@overt.org
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Hello there,

I'm trying to get the Vulkan hardware filters, specifically scale_vulkan, working on NVIDIA GPUs, but without success.

With the hwupload filter, using the derive_device option as documented:

   ffmpeg -threads 1 -loglevel info -nostdin -y \
   -fflags +genpts-fastseek \
   -init_hw_device cuda=cuda:0 -filter_hw_device cuda -hwaccel_device cuda -hwaccel nvdec \
   -reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \
   -i feeds.mp4 -filter_complex \
  "[0:v]hwupload=derive_device=vulkan,split=2[s0][s1]; \
   [s0]scale_vulkan=w=1920:h=1080:scaler=0[v0]; \
   [s1]scale_vulkan=w=1280:h=720:scaler=0[v1]" \
  -map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \
  -profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \
  -map "[v1]"  -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \
  -profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \
  -map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \
  -flags +global_header+cgop \
  -max_muxing_queue_size 9000000 -threads 2 -f tee  \
  "[select=\'v:0,a\'f=mp4]'hq.mp4'| \
   [select=\'v:1,a\'f=mp4]'med.mp4'"

Console output:

ffmpeg version N-96640-gb4f300f8ea Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 8 (Ubuntu 8.3.0-6ubuntu1~18.04.1)
  configuration: --pkg-config-flags=--static --enable-static --disable-shared --prefix=/home/brainiarc7 --bindir=/home/brainiarc7/bin --extra-cflags=-I/home/brainiarc7/include --extra-ldflags=-L/home/brainiarc7/lib --enable-cuda-nvcc --enable-cuvid --enable-libnpp --extra-cflags=-I/usr/local/cuda/include/ --extra-ldflags=-L/usr/local/cuda/lib64/ --enable-nvenc --enable-libass --disable-debug --enable-libvorbis --enable-libvpx --enable-libdrm --enable-opencl --enable-libxml2 --enable-gpl --cpu=native --enable-opengl --enable-libfdk-aac --enable-libx264 --enable-libx265 --enable-openssl --enable-librtmp --enable-libpulse --enable-libzvbi --enable-librav1e --enable-libvmaf --enable-libglslang --enable-vulkan --enable-version3 --enable-pic --nvccflags='-gencode arch=compute_75,code=sm_75 -O2' --extra-libs='-lpthread -lm -lz -ldl' --enable-nonfree
  libavutil      56. 39.100 / 56. 39.100
  libavcodec     58. 67.101 / 58. 67.101
  libavformat    58. 37.100 / 58. 37.100
  libavdevice    58.  9.103 / 58.  9.103
  libavfilter     7. 74.100 /  7. 74.100
  libswscale      5.  6.100 /  5.  6.100
  libswresample   3.  6.100 /  3.  6.100
  libpostproc    55.  6.100 / 55.  6.100
Guessed Channel Layout for Input Stream #0.1 : 5.1
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'feeds.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.37.100
  Duration: 01:00:02.13, start: 0.000000, bitrate: 14468 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 14070 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 389 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Stream mapping:
  Stream #0:0 (h264) -> hwupload (graph 0)
  scale_vulkan (graph 0) -> Stream #0:0 (h264_nvenc)
  scale_vulkan (graph 0) -> Stream #0:1 (h264_nvenc)
  Stream #0:1 -> #0:2 (aac (native) -> aac (libfdk_aac))
Impossible to convert between the formats supported by the filter 'Parsed_scale_vulkan_3' and the filter 'auto_scaler_0'
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
Conversion failed!
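
The decisive line in a log like this is the "Impossible to convert between the formats" message, which names the two filters whose link could not be negotiated. The following is a minimal sketch for pulling that pair out of a saved log. The helper name `failing_link` is hypothetical, and the sample text is copied from the output above. The `auto_scaler_0` name suggests a scaler that libavfilter inserted automatically after scale_vulkan, but that reading is an assumption, not something the log states.

```python
import re

# Tail of the console output shown above, copied verbatim.
LOG_TAIL = """\
Stream mapping:
  Stream #0:0 (h264) -> hwupload (graph 0)
  scale_vulkan (graph 0) -> Stream #0:0 (h264_nvenc)
  scale_vulkan (graph 0) -> Stream #0:1 (h264_nvenc)
  Stream #0:1 -> #0:2 (aac (native) -> aac (libfdk_aac))
Impossible to convert between the formats supported by the filter 'Parsed_scale_vulkan_3' and the filter 'auto_scaler_0'
Error reinitializing filters!
"""

# Matches the format-negotiation error and captures both filter names.
PATTERN = re.compile(
    r"Impossible to convert between the formats supported by "
    r"the filter '(?P<src>[^']+)' and the filter '(?P<dst>[^']+)'"
)


def failing_link(log_text):
    """Return (source_filter, destination_filter), or None if no such error."""
    match = PATTERN.search(log_text)
    if match is None:
        return None
    return match.group("src"), match.group("dst")


print(failing_link(LOG_TAIL))  # ('Parsed_scale_vulkan_3', 'auto_scaler_0')
```

Running the same helper over each log in this report makes it easy to see which runs fail at the same link and which fail elsewhere.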

If I attempt to use the hwmap filter instead of hwupload to derive the necessary Vulkan context:

   ffmpeg -threads 1 -loglevel info -nostdin -y \
   -fflags +genpts-fastseek \
   -init_hw_device cuda=cuda:0 -filter_hw_device cuda -hwaccel_device cuda -hwaccel nvdec \
   -reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \
   -i feeds.mp4 -filter_complex \
  "[0:v]hwmap=derive_device=vulkan,split=2[s0][s1]; \
   [s0]scale_vulkan=w=1920:h=1080:scaler=0[v0]; \
   [s1]scale_vulkan=w=1280:h=720:scaler=0[v1]" \
  -map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \
  -profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \
  -map "[v1]"  -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \
  -profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \
  -map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \
  -flags +global_header+cgop \
  -max_muxing_queue_size 9000000 -threads 2 -f tee  \
  "[select=\'v:0,a\'f=mp4]'hq.mp4'| \
   [select=\'v:1,a\'f=mp4]'med.mp4'"

The same error occurs:

ffmpeg version N-96640-gb4f300f8ea Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 8 (Ubuntu 8.3.0-6ubuntu1~18.04.1)
  configuration: --pkg-config-flags=--static --enable-static --disable-shared --prefix=/home/brainiarc7 --bindir=/home/brainiarc7/bin --extra-cflags=-I/home/brainiarc7/include --extra-ldflags=-L/home/brainiarc7/lib --enable-cuda-nvcc --enable-cuvid --enable-libnpp --extra-cflags=-I/usr/local/cuda/include/ --extra-ldflags=-L/usr/local/cuda/lib64/ --enable-nvenc --enable-libass --disable-debug --enable-libvorbis --enable-libvpx --enable-libdrm --enable-opencl --enable-libxml2 --enable-gpl --cpu=native --enable-opengl --enable-libfdk-aac --enable-libx264 --enable-libx265 --enable-openssl --enable-librtmp --enable-libpulse --enable-libzvbi --enable-librav1e --enable-libvmaf --enable-libglslang --enable-vulkan --enable-version3 --enable-pic --nvccflags='-gencode arch=compute_75,code=sm_75 -O2' --extra-libs='-lpthread -lm -lz -ldl' --enable-nonfree
  libavutil      56. 39.100 / 56. 39.100
  libavcodec     58. 67.101 / 58. 67.101
  libavformat    58. 37.100 / 58. 37.100
  libavdevice    58.  9.103 / 58.  9.103
  libavfilter     7. 74.100 /  7. 74.100
  libswscale      5.  6.100 /  5.  6.100
  libswresample   3.  6.100 /  3.  6.100
  libpostproc    55.  6.100 / 55.  6.100
Guessed Channel Layout for Input Stream #0.1 : 5.1
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'feeds.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.37.100
  Duration: 01:00:02.13, start: 0.000000, bitrate: 14468 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 14070 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 389 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Stream mapping:
  Stream #0:0 (h264) -> hwmap (graph 0)
  scale_vulkan (graph 0) -> Stream #0:0 (h264_nvenc)
  scale_vulkan (graph 0) -> Stream #0:1 (h264_nvenc)
  Stream #0:1 -> #0:2 (aac (native) -> aac (libfdk_aac))
Impossible to convert between the formats supported by the filter 'Parsed_scale_vulkan_3' and the filter 'auto_scaler_0'
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
Conversion failed!

At this stage, I suspect the failure is related to the supported pixel formats, since the scale_vulkan filter does not currently support texture conversion, as noted in commit https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/d95c509cc64372f8b37d89310250785224751a90.

However, requesting a specific pixel format from the decoder, such as nv12, in combination with the hwmap filter also fails:

   ffmpeg -threads 1 -loglevel info -nostdin -y \
   -fflags +genpts-fastseek \
   -init_hw_device cuda=cuda:0 -filter_hw_device cuda -hwaccel_device cuda -hwaccel nvdec -hwaccel_output_format nv12 \
   -reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \
   -i feeds.mp4 -filter_complex \
  "[0:v]hwmap=derive_device=vulkan,split=2[s0][s1]; \
   [s0]scale_vulkan=w=1920:h=1080:scaler=0[v0]; \
   [s1]scale_vulkan=w=1280:h=720:scaler=0[v1]" \
  -map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \
  -profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \
  -map "[v1]"  -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \
  -profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \
  -map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \
  -flags +global_header+cgop \
  -max_muxing_queue_size 9000000 -threads 2 -f tee  \
  "[select=\'v:0,a\'f=mp4]'hq.mp4'| \
   [select=\'v:1,a\'f=mp4]'med.mp4'"

Results in:

ffmpeg version N-96640-gb4f300f8ea Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 8 (Ubuntu 8.3.0-6ubuntu1~18.04.1)
  configuration: --pkg-config-flags=--static --enable-static --disable-shared --prefix=/home/brainiarc7 --bindir=/home/brainiarc7/bin --extra-cflags=-I/home/brainiarc7/include --extra-ldflags=-L/home/brainiarc7/lib --enable-cuda-nvcc --enable-cuvid --enable-libnpp --extra-cflags=-I/usr/local/cuda/include/ --extra-ldflags=-L/usr/local/cuda/lib64/ --enable-nvenc --enable-libass --disable-debug --enable-libvorbis --enable-libvpx --enable-libdrm --enable-opencl --enable-libxml2 --enable-gpl --cpu=native --enable-opengl --enable-libfdk-aac --enable-libx264 --enable-libx265 --enable-openssl --enable-librtmp --enable-libpulse --enable-libzvbi --enable-librav1e --enable-libvmaf --enable-libglslang --enable-vulkan --enable-version3 --enable-pic --nvccflags='-gencode arch=compute_75,code=sm_75 -O2' --extra-libs='-lpthread -lm -lz -ldl' --enable-nonfree
  libavutil      56. 39.100 / 56. 39.100
  libavcodec     58. 67.101 / 58. 67.101
  libavformat    58. 37.100 / 58. 37.100
  libavdevice    58.  9.103 / 58.  9.103
  libavfilter     7. 74.100 /  7. 74.100
  libswscale      5.  6.100 /  5.  6.100
  libswresample   3.  6.100 /  3.  6.100
  libpostproc    55.  6.100 / 55.  6.100
Guessed Channel Layout for Input Stream #0.1 : 5.1
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'feeds.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.37.100
  Duration: 01:00:02.13, start: 0.000000, bitrate: 14468 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 14070 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 389 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Stream mapping:
  Stream #0:0 (h264) -> hwmap (graph 0)
  scale_vulkan (graph 0) -> Stream #0:0 (h264_nvenc)
  scale_vulkan (graph 0) -> Stream #0:1 (h264_nvenc)
  Stream #0:1 -> #0:2 (aac (native) -> aac (libfdk_aac))
Impossible to convert between the formats supported by the filter 'Parsed_scale_vulkan_3' and the filter 'auto_scaler_0'
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
Conversion failed!

Repeating the same tests without hwaccel (i.e. without NVDEC in this case):

(a). With the hwmap filter:

   ffmpeg -threads 1 -loglevel info -nostdin -y \
   -fflags +genpts-fastseek \
   -init_hw_device cuda=cuda:0 -filter_hw_device cuda \
   -reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \
   -i feeds.mp4 -filter_complex \
  "[0:v]hwmap=derive_device=vulkan,split=2[s0][s1]; \
   [s0]scale_vulkan=w=1920:h=1080:scaler=0[v0]; \
   [s1]scale_vulkan=w=1280:h=720:scaler=0[v1]" \
  -map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \
  -profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \
  -map "[v1]"  -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \
  -profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \
  -map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \
  -flags +global_header+cgop \
  -max_muxing_queue_size 9000000 -threads 2 -f tee  \
  "[select=\'v:0,a\'f=mp4]'hq.mp4'| \
   [select=\'v:1,a\'f=mp4]'med.mp4'"

Results in the same error:

ffmpeg version N-96640-gb4f300f8ea Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 8 (Ubuntu 8.3.0-6ubuntu1~18.04.1)
  configuration: --pkg-config-flags=--static --enable-static --disable-shared --prefix=/home/brainiarc7 --bindir=/home/brainiarc7/bin --extra-cflags=-I/home/brainiarc7/include --extra-ldflags=-L/home/brainiarc7/lib --enable-cuda-nvcc --enable-cuvid --enable-libnpp --extra-cflags=-I/usr/local/cuda/include/ --extra-ldflags=-L/usr/local/cuda/lib64/ --enable-nvenc --enable-libass --disable-debug --enable-libvorbis --enable-libvpx --enable-libdrm --enable-opencl --enable-libxml2 --enable-gpl --cpu=native --enable-opengl --enable-libfdk-aac --enable-libx264 --enable-libx265 --enable-openssl --enable-librtmp --enable-libpulse --enable-libzvbi --enable-librav1e --enable-libvmaf --enable-libglslang --enable-vulkan --enable-version3 --enable-pic --nvccflags='-gencode arch=compute_75,code=sm_75 -O2' --extra-libs='-lpthread -lm -lz -ldl' --enable-nonfree
  libavutil      56. 39.100 / 56. 39.100
  libavcodec     58. 67.101 / 58. 67.101
  libavformat    58. 37.100 / 58. 37.100
  libavdevice    58.  9.103 / 58.  9.103
  libavfilter     7. 74.100 /  7. 74.100
  libswscale      5.  6.100 /  5.  6.100
  libswresample   3.  6.100 /  3.  6.100
  libpostproc    55.  6.100 / 55.  6.100
Guessed Channel Layout for Input Stream #0.1 : 5.1
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'feeds.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.37.100
  Duration: 01:00:02.13, start: 0.000000, bitrate: 14468 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 14070 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 389 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Stream mapping:
  Stream #0:0 (h264) -> hwmap (graph 0)
  scale_vulkan (graph 0) -> Stream #0:0 (h264_nvenc)
  scale_vulkan (graph 0) -> Stream #0:1 (h264_nvenc)
  Stream #0:1 -> #0:2 (aac (native) -> aac (libfdk_aac))
Impossible to convert between the formats supported by the filter 'Parsed_scale_vulkan_3' and the filter 'auto_scaler_0'
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
Conversion failed!

(b). With the hwupload filter and the derive_device=vulkan option:

   ffmpeg -threads 1 -loglevel info -nostdin -y \
   -fflags +genpts-fastseek \
   -init_hw_device cuda=cuda:0 -filter_hw_device cuda \
   -reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \
   -i feeds.mp4 -filter_complex \
  "[0:v]hwupload=derive_device=vulkan,split=2[s0][s1]; \
   [s0]scale_vulkan=w=1920:h=1080:scaler=0[v0]; \
   [s1]scale_vulkan=w=1280:h=720:scaler=0[v1]" \
  -map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \
  -profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \
  -map "[v1]"  -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \
  -profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \
  -map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \
  -flags +global_header+cgop \
  -max_muxing_queue_size 9000000 -threads 2 -f tee  \
  "[select=\'v:0,a\'f=mp4]'hq.mp4'| \
   [select=\'v:1,a\'f=mp4]'med.mp4'"

Issues the same error:

ffmpeg version N-96640-gb4f300f8ea Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 8 (Ubuntu 8.3.0-6ubuntu1~18.04.1)
  configuration: --pkg-config-flags=--static --enable-static --disable-shared --prefix=/home/brainiarc7 --bindir=/home/brainiarc7/bin --extra-cflags=-I/home/brainiarc7/include --extra-ldflags=-L/home/brainiarc7/lib --enable-cuda-nvcc --enable-cuvid --enable-libnpp --extra-cflags=-I/usr/local/cuda/include/ --extra-ldflags=-L/usr/local/cuda/lib64/ --enable-nvenc --enable-libass --disable-debug --enable-libvorbis --enable-libvpx --enable-libdrm --enable-opencl --enable-libxml2 --enable-gpl --cpu=native --enable-opengl --enable-libfdk-aac --enable-libx264 --enable-libx265 --enable-openssl --enable-librtmp --enable-libpulse --enable-libzvbi --enable-librav1e --enable-libvmaf --enable-libglslang --enable-vulkan --enable-version3 --enable-pic --nvccflags='-gencode arch=compute_75,code=sm_75 -O2' --extra-libs='-lpthread -lm -lz -ldl' --enable-nonfree
  libavutil      56. 39.100 / 56. 39.100
  libavcodec     58. 67.101 / 58. 67.101
  libavformat    58. 37.100 / 58. 37.100
  libavdevice    58.  9.103 / 58.  9.103
  libavfilter     7. 74.100 /  7. 74.100
  libswscale      5.  6.100 /  5.  6.100
  libswresample   3.  6.100 /  3.  6.100
  libpostproc    55.  6.100 / 55.  6.100
Guessed Channel Layout for Input Stream #0.1 : 5.1
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'feeds.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.37.100
  Duration: 01:00:02.13, start: 0.000000, bitrate: 14468 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 14070 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 389 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Stream mapping:
  Stream #0:0 (h264) -> hwupload (graph 0)
  scale_vulkan (graph 0) -> Stream #0:0 (h264_nvenc)
  scale_vulkan (graph 0) -> Stream #0:1 (h264_nvenc)
  Stream #0:1 -> #0:2 (aac (native) -> aac (libfdk_aac))
Impossible to convert between the formats supported by the filter 'Parsed_scale_vulkan_3' and the filter 'auto_scaler_0'
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
Conversion failed!

(c). Using an OpenCL device to derive a Vulkan context (on the same underlying hardware):

   ffmpeg -threads 1 -loglevel info -nostdin -y \
   -fflags +genpts-fastseek \
   -init_hw_device opencl=cl:0.0 -filter_hw_device cl \
   -reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \
   -i feeds.mp4 -filter_complex \
  "[0:v]hwupload=derive_device=vulkan,split=2[s0][s1]; \
   [s0]scale_vulkan=w=1920:h=1080:scaler=0[v0]; \
   [s1]scale_vulkan=w=1280:h=720:scaler=0[v1]" \
  -map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \
  -profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \
  -map "[v1]"  -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \
  -profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \
  -map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \
  -flags +global_header+cgop \
  -max_muxing_queue_size 9000000 -threads 2 -f tee  \
  "[select=\'v:0,a\'f=mp4]'hq.mp4'| \
   [select=\'v:1,a\'f=mp4]'med.mp4'"

Returns a different error:

ffmpeg version N-96640-gb4f300f8ea Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 8 (Ubuntu 8.3.0-6ubuntu1~18.04.1)
  configuration: --pkg-config-flags=--static --enable-static --disable-shared --prefix=/home/brainiarc7 --bindir=/home/brainiarc7/bin --extra-cflags=-I/home/brainiarc7/include --extra-ldflags=-L/home/brainiarc7/lib --enable-cuda-nvcc --enable-cuvid --enable-libnpp --extra-cflags=-I/usr/local/cuda/include/ --extra-ldflags=-L/usr/local/cuda/lib64/ --enable-nvenc --enable-libass --disable-debug --enable-libvorbis --enable-libvpx --enable-libdrm --enable-opencl --enable-libxml2 --enable-gpl --cpu=native --enable-opengl --enable-libfdk-aac --enable-libx264 --enable-libx265 --enable-openssl --enable-librtmp --enable-libpulse --enable-libzvbi --enable-librav1e --enable-libvmaf --enable-libglslang --enable-vulkan --enable-version3 --enable-pic --nvccflags='-gencode arch=compute_75,code=sm_75 -O2' --extra-libs='-lpthread -lm -lz -ldl' --enable-nonfree
  libavutil      56. 39.100 / 56. 39.100
  libavcodec     58. 67.101 / 58. 67.101
  libavformat    58. 37.100 / 58. 37.100
  libavdevice    58.  9.103 / 58.  9.103
  libavfilter     7. 74.100 /  7. 74.100
  libswscale      5.  6.100 /  5.  6.100
  libswresample   3.  6.100 /  3.  6.100
  libpostproc    55.  6.100 / 55.  6.100
Guessed Channel Layout for Input Stream #0.1 : 5.1
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'feeds.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.37.100
  Duration: 01:00:02.13, start: 0.000000, bitrate: 14468 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 14070 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 389 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Stream mapping:
  Stream #0:0 (h264) -> hwupload (graph 0)
  scale_vulkan (graph 0) -> Stream #0:0 (h264_nvenc)
  scale_vulkan (graph 0) -> Stream #0:1 (h264_nvenc)
  Stream #0:1 -> #0:2 (aac (native) -> aac (libfdk_aac))
[Parsed_hwupload_0 @ 0x5556532b2e00] Query format failed for 'Parsed_hwupload_0': Function not implemented
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
Conversion failed!

Attempting an explicit pixel format conversion (format=nv12) ahead of hwupload, with or without NVDEC:

   ffmpeg -threads 1 -loglevel info -nostdin -y \
   -fflags +genpts-fastseek \
   -init_hw_device cuda=cuda:0 -filter_hw_device cuda \
   -reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \
   -i feeds.mp4 -filter_complex \
  "[0:v]format=nv12,hwupload=derive_device=vulkan,split=2[s0][s1]; \
   [s0]scale_vulkan=w=1920:h=1080:scaler=0[v0]; \
   [s1]scale_vulkan=w=1280:h=720:scaler=0[v1]" \
  -map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \
  -profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \
  -map "[v1]"  -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \
  -profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \
  -map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \
  -flags +global_header+cgop \
  -max_muxing_queue_size 9000000 -threads 2 -f tee  \
  "[select=\'v:0,a\'f=mp4]'hq.mp4'| \
   [select=\'v:1,a\'f=mp4]'med.mp4'"

Results in the same format-negotiation error, now reported against 'Parsed_scale_vulkan_4' and 'auto_scaler_1':

ffmpeg version N-96640-gb4f300f8ea Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 8 (Ubuntu 8.3.0-6ubuntu1~18.04.1)
  configuration: --pkg-config-flags=--static --enable-static --disable-shared --prefix=/home/brainiarc7 --bindir=/home/brainiarc7/bin --extra-cflags=-I/home/brainiarc7/include --extra-ldflags=-L/home/brainiarc7/lib --enable-cuda-nvcc --enable-cuvid --enable-libnpp --extra-cflags=-I/usr/local/cuda/include/ --extra-ldflags=-L/usr/local/cuda/lib64/ --enable-nvenc --enable-libass --disable-debug --enable-libvorbis --enable-libvpx --enable-libdrm --enable-opencl --enable-libxml2 --enable-gpl --cpu=native --enable-opengl --enable-libfdk-aac --enable-libx264 --enable-libx265 --enable-openssl --enable-librtmp --enable-libpulse --enable-libzvbi --enable-librav1e --enable-libvmaf --enable-libglslang --enable-vulkan --enable-version3 --enable-pic --nvccflags='-gencode arch=compute_75,code=sm_75 -O2' --extra-libs='-lpthread -lm -lz -ldl' --enable-nonfree
  libavutil      56. 39.100 / 56. 39.100
  libavcodec     58. 67.101 / 58. 67.101
  libavformat    58. 37.100 / 58. 37.100
  libavdevice    58.  9.103 / 58.  9.103
  libavfilter     7. 74.100 /  7. 74.100
  libswscale      5.  6.100 /  5.  6.100
  libswresample   3.  6.100 /  3.  6.100
  libpostproc    55.  6.100 / 55.  6.100
Guessed Channel Layout for Input Stream #0.1 : 5.1
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'feeds.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.37.100
  Duration: 01:00:02.13, start: 0.000000, bitrate: 14468 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 14070 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 389 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Stream mapping:
  Stream #0:0 (h264) -> format (graph 0)
  scale_vulkan (graph 0) -> Stream #0:0 (h264_nvenc)
  scale_vulkan (graph 0) -> Stream #0:1 (h264_nvenc)
  Stream #0:1 -> #0:2 (aac (native) -> aac (libfdk_aac))
Impossible to convert between the formats supported by the filter 'Parsed_scale_vulkan_4' and the filter 'auto_scaler_1'
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
Conversion failed!
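
The filter graphs tried above differ only in the first part of the chain: hwupload or hwmap, with an optional format filter in front. When re-testing, those variants can be regenerated from a small helper instead of being edited by hand. This is only a sketch: `build_graph` is a hypothetical name, and the strings it produces are the graphs from this report with the line continuations removed.

```python
def build_graph(bridge, pre_format=None):
    """Build the -filter_complex string for one variant tested in this report.

    bridge     -- "hwupload" or "hwmap", the filter given derive_device=vulkan
    pre_format -- optional software pixel format placed in front, e.g. "nv12"
    """
    if bridge not in ("hwupload", "hwmap"):
        raise ValueError(f"unexpected bridge filter: {bridge!r}")
    head = f"format={pre_format}," if pre_format else ""
    return (
        f"[0:v]{head}{bridge}=derive_device=vulkan,split=2[s0][s1];"
        "[s0]scale_vulkan=w=1920:h=1080:scaler=0[v0];"
        "[s1]scale_vulkan=w=1280:h=720:scaler=0[v1]"
    )


# The three graph shapes used in the commands above.
for variant in (("hwupload", None), ("hwmap", None), ("hwupload", "nv12")):
    print(build_graph(*variant))
```

The helper deliberately leaves the input, hardware device, and encoder options out, since those are the parts that were varied separately on the command line.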

For reference, here is the platform information:

ffmpeg -hide_banner -v verbose -init_hw_device list

Supported hardware device types:
vdpau
cuda
drm
opencl
vulkan

And on Vulkan in particular:

ffmpeg -hide_banner -v verbose -init_hw_device vulkan
[AVHWDeviceContext @ 0x5569f5d90440] GPU listing:
[AVHWDeviceContext @ 0x5569f5d90440]     0: GeForce RTX 2080 (discrete) (0x1ed0)
[AVHWDeviceContext @ 0x5569f5d90440] Using device: GeForce RTX 2080
[AVHWDeviceContext @ 0x5569f5d90440] Alignments:
[AVHWDeviceContext @ 0x5569f5d90440]     optimalBufferCopyOffsetAlignment:   1
[AVHWDeviceContext @ 0x5569f5d90440]     optimalBufferCopyRowPitchAlignment: 1
[AVHWDeviceContext @ 0x5569f5d90440]     minMemoryMapAlignment:              64
[AVHWDeviceContext @ 0x5569f5d90440] Using queue family 0 for graphics, flags: (graphics) (compute) (transfer) (sparse) 
[AVHWDeviceContext @ 0x5569f5d90440] Using queue family 2 for compute, flags: (compute) (transfer) (sparse) 
[AVHWDeviceContext @ 0x5569f5d90440] Using queue family 1 for transfers, flags: (transfer) (sparse) 
[AVHWDeviceContext @ 0x5569f5d90440] Using device extension "VK_KHR_external_memory_fd"
[AVHWDeviceContext @ 0x5569f5d90440] Extension "VK_EXT_external_memory_dma_buf" not found!
[AVHWDeviceContext @ 0x5569f5d90440] Extension "VK_EXT_image_drm_format_modifier" not found!
[AVHWDeviceContext @ 0x5569f5d90440] Using device extension "VK_KHR_external_semaphore_fd"

I assume the messages about the two Vulkan extensions that were not found, i.e. "VK_EXT_external_memory_dma_buf" and "VK_EXT_image_drm_format_modifier", may point to a limitation of the NVIDIA driver.
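
To compare this against other drivers or machines, the used and missing device extensions can be pulled out of the verbose device log mechanically. The following is a minimal sketch, assuming the log keeps the wording shown above. `extension_status` is a hypothetical helper name, and the sample lines are copied from the output with the `[AVHWDeviceContext @ ...]` prefix left in place.

```python
import re

# Extension-related lines copied from the -init_hw_device vulkan output above.
DEVICE_LOG = """\
[AVHWDeviceContext @ 0x5569f5d90440] Using device extension "VK_KHR_external_memory_fd"
[AVHWDeviceContext @ 0x5569f5d90440] Extension "VK_EXT_external_memory_dma_buf" not found!
[AVHWDeviceContext @ 0x5569f5d90440] Extension "VK_EXT_image_drm_format_modifier" not found!
[AVHWDeviceContext @ 0x5569f5d90440] Using device extension "VK_KHR_external_semaphore_fd"
"""


def extension_status(log_text):
    """Return (used, missing) lists of extension names found in the log."""
    used = re.findall(r'Using device extension "([^"]+)"', log_text)
    missing = re.findall(r'Extension "([^"]+)" not found!', log_text)
    return used, missing


used, missing = extension_status(DEVICE_LOG)
print("used:   ", used)
print("missing:", missing)
```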

Here's the output from vulkaninfo:

==========
VULKANINFO
==========

Vulkan Instance Version: 1.1.126


Instance Extensions: count = 18
====================
	VK_EXT_acquire_xlib_display            : extension revision 1
	VK_EXT_debug_report                    : extension revision 8
	VK_EXT_debug_utils                     : extension revision 1
	VK_EXT_direct_mode_display             : extension revision 1
	VK_EXT_display_surface_counter         : extension revision 1
	VK_KHR_device_group_creation           : extension revision 1
	VK_KHR_display                         : extension revision 23
	VK_KHR_external_fence_capabilities     : extension revision 1
	VK_KHR_external_memory_capabilities    : extension revision 1
	VK_KHR_external_semaphore_capabilities : extension revision 1
	VK_KHR_get_display_properties2         : extension revision 1
	VK_KHR_get_physical_device_properties2 : extension revision 1
	VK_KHR_get_surface_capabilities2       : extension revision 1
	VK_KHR_surface                         : extension revision 25
	VK_KHR_surface_protected_capabilities  : extension revision 1
	VK_KHR_wayland_surface                 : extension revision 6
	VK_KHR_xcb_surface                     : extension revision 6
	VK_KHR_xlib_surface                    : extension revision 6

Layers: count = 0
=======
Presentable Surfaces:
=====================
GPU id : 0 (GeForce RTX 2080):
	Surface types: count = 2
		VK_KHR_xcb_surface
		VK_KHR_xlib_surface
	Formats: count = 2
		SurfaceFormat[0]:
			format = FORMAT_B8G8R8A8_UNORM
			colorSpace = COLOR_SPACE_SRGB_NONLINEAR_KHR
		SurfaceFormat[1]:
			format = FORMAT_B8G8R8A8_SRGB
			colorSpace = COLOR_SPACE_SRGB_NONLINEAR_KHR
	Present Modes: count = 3
		PRESENT_MODE_FIFO_KHR
		PRESENT_MODE_FIFO_RELAXED_KHR
		PRESENT_MODE_IMMEDIATE_KHR
	VkSurfaceCapabilitiesKHR:
	-------------------------
		minImageCount       = 2
		maxImageCount       = 8
		currentExtent:
			width  = 256
			height = 256
		minImageExtent:
			width  = 256
			height = 256
		maxImageExtent:
			width  = 256
			height = 256
		maxImageArrayLayers = 1
		supportedTransforms:
			SURFACE_TRANSFORM_IDENTITY_BIT_KHR
		currentTransform:
			SURFACE_TRANSFORM_IDENTITY_BIT_KHR
		supportedCompositeAlpha:
			COMPOSITE_ALPHA_OPAQUE_BIT_KHR
		supportedUsageFlags:
			IMAGE_USAGE_TRANSFER_SRC_BIT
			IMAGE_USAGE_TRANSFER_DST_BIT
			IMAGE_USAGE_SAMPLED_BIT
			IMAGE_USAGE_STORAGE_BIT
			IMAGE_USAGE_COLOR_ATTACHMENT_BIT
			IMAGE_USAGE_INPUT_ATTACHMENT_BIT
	VkSurfaceCapabilities2EXT:
	--------------------------
		supportedSurfaceCounters:
			None
	VkSurfaceProtectedCapabilitiesKHR:
	----------------------------------
		supportsProtected = false



Groups:
=======
	Device Group Properties (Group 0):
		physicalDeviceCount: count = 1
			GeForce RTX 2080 (ID: 0)
		subsetAllocation = 0

	Device Group Present Capabilities (Group 0):
		GeForce RTX 2080 (ID: 0)
		Can present images from the following devices:
			GeForce RTX 2080 (ID: 0)
		Present modes:
			DEVICE_GROUP_PRESENT_MODE_LOCAL_BIT_KHR


Device Properties and Extensions:
=================================
GPU0:
VkPhysicalDeviceProperties:
---------------------------
	apiVersion     = 4198519 (1.1.119)
	driverVersion  = 1846034496 (0x6e084040)
	vendorID       = 0x10de
	deviceID       = 0x1ed0
	deviceType     = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
	deviceName     = GeForce RTX 2080
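
The packed `apiVersion` and `driverVersion` integers above can be decoded by hand, which helps when matching this report to a specific driver release. The sketch below uses the standard Vulkan 1.0/1.1 version layout for `apiVersion` (10-bit major, 10-bit minor, 12-bit patch). The `driverVersion` layout is vendor-defined. The 10/8/8/6-bit split used here is the convention commonly applied to NVIDIA drivers and should be treated as an assumption. Both function names are hypothetical.

```python
def decode_vulkan_version(value):
    """Decode a packed Vulkan version as (major, minor, patch)."""
    return value >> 22, (value >> 12) & 0x3FF, value & 0xFFF


def decode_nvidia_driver_version(value):
    """Decode driverVersion assuming NVIDIA's 10/8/8/6-bit packing."""
    return (
        (value >> 22) & 0x3FF,  # major
        (value >> 14) & 0xFF,   # minor
        (value >> 6) & 0xFF,    # patch
        value & 0x3F,           # build
    )


print(decode_vulkan_version(4198519))            # (1, 1, 119)
print(decode_nvidia_driver_version(1846034496))  # (440, 33, 1, 0)
```

The first result agrees with the `(1.1.119)` that vulkaninfo prints next to `apiVersion`. Under the stated assumption, the second corresponds to an NVIDIA 440.33.01 driver.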

VkPhysicalDeviceLimits:
-----------------------
	maxImageDimension1D                             = 32768
	maxImageDimension2D                             = 32768
	maxImageDimension3D                             = 16384
	maxImageDimensionCube                           = 32768
	maxImageArrayLayers                             = 2048
	maxTexelBufferElements                          = 134217728
	maxUniformBufferRange                           = 65536
	maxStorageBufferRange                           = 4294967295
	maxPushConstantsSize                            = 256
	maxMemoryAllocationCount                        = 4294967295
	maxSamplerAllocationCount                       = 4000
	bufferImageGranularity                          = 0x00000400
	sparseAddressSpaceSize                          = 0xffffffffffffffff
	maxBoundDescriptorSets                          = 32
	maxPerStageDescriptorSamplers                   = 1048576
	maxPerStageDescriptorUniformBuffers             = 1048576
	maxPerStageDescriptorStorageBuffers             = 1048576
	maxPerStageDescriptorSampledImages              = 1048576
	maxPerStageDescriptorStorageImages              = 1048576
	maxPerStageDescriptorInputAttachments           = 1048576
	maxPerStageResources                            = 4294967295
	maxDescriptorSetSamplers                        = 1048576
	maxDescriptorSetUniformBuffers                  = 1048576
	maxDescriptorSetUniformBuffersDynamic           = 15
	maxDescriptorSetStorageBuffers                  = 1048576
	maxDescriptorSetStorageBuffersDynamic           = 16
	maxDescriptorSetSampledImages                   = 1048576
	maxDescriptorSetStorageImages                   = 1048576
	maxDescriptorSetInputAttachments                = 1048576
	maxVertexInputAttributes                        = 32
	maxVertexInputBindings                          = 32
	maxVertexInputAttributeOffset                   = 2047
	maxVertexInputBindingStride                     = 2048
	maxVertexOutputComponents                       = 128
	maxTessellationGenerationLevel                  = 64
	maxTessellationPatchSize                        = 32
	maxTessellationControlPerVertexInputComponents  = 128
	maxTessellationControlPerVertexOutputComponents = 128
	maxTessellationControlPerPatchOutputComponents  = 120
	maxTessellationControlTotalOutputComponents     = 4216
	maxTessellationEvaluationInputComponents        = 128
	maxTessellationEvaluationOutputComponents       = 128
	maxGeometryShaderInvocations                    = 32
	maxGeometryInputComponents                      = 128
	maxGeometryOutputComponents                     = 128
	maxGeometryOutputVertices                       = 1024
	maxGeometryTotalOutputComponents                = 1024
	maxFragmentInputComponents                      = 128
	maxFragmentOutputAttachments                    = 8
	maxFragmentDualSrcAttachments                   = 1
	maxFragmentCombinedOutputResources              = 16
	maxComputeSharedMemorySize                      = 49152
	maxComputeWorkGroupCount: count = 3
		2147483647
		65535
		65535
	maxComputeWorkGroupInvocations                  = 1024
	maxComputeWorkGroupSize: count = 3
		1024
		1024
		64
	subPixelPrecisionBits                           = 8
	subTexelPrecisionBits                           = 8
	mipmapPrecisionBits                             = 8
	maxDrawIndexedIndexValue                        = 4294967295
	maxDrawIndirectCount                            = 4294967295
	maxSamplerLodBias                               = 15
	maxSamplerAnisotropy                            = 16
	maxViewports                                    = 16
	maxViewportDimensions: count = 2
		32768
		32768
	viewportBoundsRange: count = 2
		-65536
		65536
	viewportSubPixelBits                            = 8
	minMemoryMapAlignment                           = 64
	minTexelBufferOffsetAlignment                   = 0x00000010
	minUniformBufferOffsetAlignment                 = 0x00000040
	minStorageBufferOffsetAlignment                 = 0x00000010
	minTexelOffset                                  = -8
	maxTexelOffset                                  = 7
	minTexelGatherOffset                            = -32
	maxTexelGatherOffset                            = 31
	minInterpolationOffset                          = -0.5
	maxInterpolationOffset                          = 0.4375
	subPixelInterpolationOffsetBits                 = 4
	maxFramebufferWidth                             = 32768
	maxFramebufferHeight                            = 32768
	maxFramebufferLayers                            = 2048
	framebufferColorSampleCounts:
		SAMPLE_COUNT_1_BIT
		SAMPLE_COUNT_2_BIT
		SAMPLE_COUNT_4_BIT
		SAMPLE_COUNT_8_BIT
	framebufferDepthSampleCounts:
		SAMPLE_COUNT_1_BIT
		SAMPLE_COUNT_2_BIT
		SAMPLE_COUNT_4_BIT
		SAMPLE_COUNT_8_BIT
	framebufferStencilSampleCounts:
		SAMPLE_COUNT_1_BIT
		SAMPLE_COUNT_2_BIT
		SAMPLE_COUNT_4_BIT
		SAMPLE_COUNT_8_BIT
		SAMPLE_COUNT_16_BIT
	framebufferNoAttachmentsSampleCounts:
		SAMPLE_COUNT_1_BIT
		SAMPLE_COUNT_2_BIT
		SAMPLE_COUNT_4_BIT
		SAMPLE_COUNT_8_BIT
		SAMPLE_COUNT_16_BIT
	maxColorAttachments                             = 8
	sampledImageColorSampleCounts:
		SAMPLE_COUNT_1_BIT
		SAMPLE_COUNT_2_BIT
		SAMPLE_COUNT_4_BIT
		SAMPLE_COUNT_8_BIT
	sampledImageIntegerSampleCounts:
		SAMPLE_COUNT_1_BIT
		SAMPLE_COUNT_2_BIT
		SAMPLE_COUNT_4_BIT
		SAMPLE_COUNT_8_BIT
	sampledImageDepthSampleCounts:
		SAMPLE_COUNT_1_BIT
		SAMPLE_COUNT_2_BIT
		SAMPLE_COUNT_4_BIT
		SAMPLE_COUNT_8_BIT
	sampledImageStencilSampleCounts:
		SAMPLE_COUNT_1_BIT
		SAMPLE_COUNT_2_BIT
		SAMPLE_COUNT_4_BIT
		SAMPLE_COUNT_8_BIT
		SAMPLE_COUNT_16_BIT
	storageImageSampleCounts:
		SAMPLE_COUNT_1_BIT
		SAMPLE_COUNT_2_BIT
		SAMPLE_COUNT_4_BIT
		SAMPLE_COUNT_8_BIT
	maxSampleMaskWords                              = 1
	timestampComputeAndGraphics                     = true
	timestampPeriod                                 = 1
	maxClipDistances                                = 8
	maxCullDistances                                = 8
	maxCombinedClipAndCullDistances                 = 8
	discreteQueuePriorities                         = 2
	pointSizeRange: count = 2
		1
		2047.94
	lineWidthRange: count = 2
		1
		64
	pointSizeGranularity                            = 0.0625
	lineWidthGranularity                            = 0.0625
	strictLines                                     = true
	standardSampleLocations                         = true
	optimalBufferCopyOffsetAlignment                = 0x00000001
	optimalBufferCopyRowPitchAlignment              = 0x00000001
	nonCoherentAtomSize                             = 0x00000040

VkPhysicalDeviceSparseProperties:
---------------------------------
	residencyStandard2DBlockShape            = true
	residencyStandard2DMultisampleBlockShape = true
	residencyStandard3DBlockShape            = true
	residencyAlignedMipSize                  = false
	residencyNonResidentStrict               = true

VkPhysicalDeviceBlendOperationAdvancedPropertiesEXT:
----------------------------------------------------
	advancedBlendMaxColorAttachments      = 8
	advancedBlendIndependentBlend         = false
	advancedBlendNonPremultipliedSrcColor = true
	advancedBlendNonPremultipliedDstColor = true
	advancedBlendCorrelatedOverlap        = true
	advancedBlendAllOperations            = true

VkPhysicalDeviceConservativeRasterizationPropertiesEXT:
-------------------------------------------------------
	primitiveOverestimationSize                 = 0.00195312
	maxExtraPrimitiveOverestimationSize         = 0.75
	extraPrimitiveOverestimationSizeGranularity = 0.25
	primitiveUnderestimation                    = true
	conservativePointAndLineRasterization       = true
	degenerateTrianglesRasterized               = true
	degenerateLinesRasterized                   = true
	fullyCoveredFragmentShaderInputVariable     = true
	conservativeRasterizationPostDepthCoverage  = true

VkPhysicalDeviceDepthStencilResolvePropertiesKHR:
-------------------------------------------------
	supportedDepthResolveModes:
		RESOLVE_MODE_SAMPLE_ZERO_BIT_KHR
		RESOLVE_MODE_AVERAGE_BIT_KHR
		RESOLVE_MODE_MIN_BIT_KHR
		RESOLVE_MODE_MAX_BIT_KHR
	supportedStencilResolveModes:
		RESOLVE_MODE_SAMPLE_ZERO_BIT_KHR
		RESOLVE_MODE_MIN_BIT_KHR
		RESOLVE_MODE_MAX_BIT_KHR
	independentResolveNone = true
	independentResolve     = true

VkPhysicalDeviceDescriptorIndexingPropertiesEXT:
------------------------------------------------
	maxUpdateAfterBindDescriptorsInAllPools              = 4294967295
	shaderUniformBufferArrayNonUniformIndexingNative     = true
	shaderSampledImageArrayNonUniformIndexingNative      = true
	shaderStorageBufferArrayNonUniformIndexingNative     = true
	shaderStorageImageArrayNonUniformIndexingNative      = true
	shaderInputAttachmentArrayNonUniformIndexingNative   = true
	robustBufferAccessUpdateAfterBind                    = true
	quadDivergentImplicitLod                             = true
	maxPerStageDescriptorUpdateAfterBindSamplers         = 1048576
	maxPerStageDescriptorUpdateAfterBindUniformBuffers   = 1048576
	maxPerStageDescriptorUpdateAfterBindStorageBuffers   = 1048576
	maxPerStageDescriptorUpdateAfterBindSampledImages    = 1048576
	maxPerStageDescriptorUpdateAfterBindStorageImages    = 1048576
	maxPerStageDescriptorUpdateAfterBindInputAttachments = 1048576
	maxPerStageUpdateAfterBindResources                  = 4294967295
	maxDescriptorSetUpdateAfterBindSamplers              = 1048576
	maxDescriptorSetUpdateAfterBindUniformBuffers        = 1048576
	maxDescriptorSetUpdateAfterBindUniformBuffersDynamic = 15
	maxDescriptorSetUpdateAfterBindStorageBuffers        = 1048576
	maxDescriptorSetUpdateAfterBindStorageBuffersDynamic = 16
	maxDescriptorSetUpdateAfterBindSampledImages         = 1048576
	maxDescriptorSetUpdateAfterBindStorageImages         = 1048576
	maxDescriptorSetUpdateAfterBindInputAttachments      = 1048576

VkPhysicalDeviceDiscardRectanglePropertiesEXT:
----------------------------------------------
	maxDiscardRectangles = 8

VkPhysicalDeviceDriverPropertiesKHR:
------------------------------------
	driverID           = DRIVER_ID_NVIDIA_PROPRIETARY_KHR
	driverName         = NVIDIA
	driverInfo         = 440.33.01
	conformanceVersion = 1.1.6.0

VkPhysicalDeviceFloatControlsPropertiesKHR:
-------------------------------------------
	denormBehaviorIndependence            = SHADER_FLOAT_CONTROLS_INDEPENDENCE_ALL_KHR
	roundingModeIndependence              = SHADER_FLOAT_CONTROLS_INDEPENDENCE_ALL_KHR
	shaderSignedZeroInfNanPreserveFloat16 = true
	shaderSignedZeroInfNanPreserveFloat32 = true
	shaderSignedZeroInfNanPreserveFloat64 = true
	shaderDenormPreserveFloat16           = true
	shaderDenormPreserveFloat32           = false
	shaderDenormPreserveFloat64           = false
	shaderDenormFlushToZeroFloat16        = false
	shaderDenormFlushToZeroFloat32        = false
	shaderDenormFlushToZeroFloat64        = false
	shaderRoundingModeRTEFloat16          = true
	shaderRoundingModeRTEFloat32          = true
	shaderRoundingModeRTEFloat64          = true
	shaderRoundingModeRTZFloat16          = false
	shaderRoundingModeRTZFloat32          = true
	shaderRoundingModeRTZFloat64          = true

VkPhysicalDeviceIDProperties:
-----------------------------
	deviceUUID      = e437e085-c68a-481d-f32-2acf2d506f69
	driverUUID      = c1b695f8-3620-ed2a-a1eb-5e40ab29d4e
	deviceNodeMask  = 1
	deviceLUIDValid = false

VkPhysicalDeviceInlineUniformBlockPropertiesEXT:
------------------------------------------------
	maxInlineUniformBlockSize                               = 256
	maxPerStageDescriptorInlineUniformBlocks                = 32
	maxPerStageDescriptorUpdateAfterBindInlineUniformBlocks = 32
	maxDescriptorSetInlineUniformBlocks                     = 32
	maxDescriptorSetUpdateAfterBindInlineUniformBlocks      = 32

VkPhysicalDeviceLineRasterizationPropertiesEXT:
-----------------------------------------------
	lineSubPixelPrecisionBits = 8

VkPhysicalDeviceMaintenance3Properties:
---------------------------------------
	maxPerSetDescriptors    = 4294967295
	maxMemoryAllocationSize = 0xffe00000

VkPhysicalDeviceMultiviewProperties:
------------------------------------
	maxMultiviewViewCount     = 32
	maxMultiviewInstanceIndex = 134217727

VkPhysicalDevicePCIBusInfoPropertiesEXT:
----------------------------------------
	pciDomain   = 0
	pciBus      = 1
	pciDevice   = 0
	pciFunction = 0

VkPhysicalDevicePointClippingProperties:
----------------------------------------
	pointClippingBehavior = POINT_CLIPPING_BEHAVIOR_USER_CLIP_PLANES_ONLY

VkPhysicalDeviceProtectedMemoryProperties:
------------------------------------------
	protectedNoFault = false

VkPhysicalDevicePushDescriptorPropertiesKHR:
--------------------------------------------
	maxPushDescriptors = 32

VkPhysicalDeviceSampleLocationsPropertiesEXT:
---------------------------------------------
	sampleLocationSampleCounts:
		SAMPLE_COUNT_1_BIT
		SAMPLE_COUNT_2_BIT
		SAMPLE_COUNT_4_BIT
		SAMPLE_COUNT_8_BIT
		SAMPLE_COUNT_16_BIT
	maxSampleLocationGridSize:
		width  = 1
		height = 1
	sampleLocationCoordinateRange: count = 2
		0
		0.9375
	sampleLocationSubPixelBits       = 4
	variableSampleLocations          = true

VkPhysicalDeviceSamplerFilterMinmaxPropertiesEXT:
-------------------------------------------------
	filterMinmaxSingleComponentFormats = true
	filterMinmaxImageComponentMapping  = true

VkPhysicalDeviceSubgroupProperties:
-----------------------------------
	subgroupSize              = 32
	supportedStages:
		SHADER_STAGE_VERTEX_BIT
		SHADER_STAGE_TESSELLATION_CONTROL_BIT
		SHADER_STAGE_TESSELLATION_EVALUATION_BIT
		SHADER_STAGE_GEOMETRY_BIT
		SHADER_STAGE_FRAGMENT_BIT
		SHADER_STAGE_COMPUTE_BIT
		SHADER_STAGE_ALL_GRAPHICS
		SHADER_STAGE_ALL
		SHADER_STAGE_RAYGEN_BIT_NV
		SHADER_STAGE_ANY_HIT_BIT_NV
		SHADER_STAGE_CLOSEST_HIT_BIT_NV
		SHADER_STAGE_MISS_BIT_NV
		SHADER_STAGE_INTERSECTION_BIT_NV
		SHADER_STAGE_CALLABLE_BIT_NV
		SHADER_STAGE_TASK_BIT_NV
		SHADER_STAGE_MESH_BIT_NV
	supportedOperations:
		SUBGROUP_FEATURE_BASIC_BIT
		SUBGROUP_FEATURE_VOTE_BIT
		SUBGROUP_FEATURE_ARITHMETIC_BIT
		SUBGROUP_FEATURE_BALLOT_BIT
		SUBGROUP_FEATURE_SHUFFLE_BIT
		SUBGROUP_FEATURE_SHUFFLE_RELATIVE_BIT
		SUBGROUP_FEATURE_CLUSTERED_BIT
		SUBGROUP_FEATURE_QUAD_BIT
		SUBGROUP_FEATURE_PARTITIONED_BIT_NV
	quadOperationsInAllStages = true

VkPhysicalDeviceSubgroupSizeControlPropertiesEXT:
-------------------------------------------------
	minSubgroupSize              = 32
	maxSubgroupSize              = 32
	maxComputeWorkgroupSubgroups = 2097152
	requiredSubgroupSizeStages:
		SHADER_STAGE_VERTEX_BIT
		SHADER_STAGE_TESSELLATION_CONTROL_BIT
		SHADER_STAGE_TESSELLATION_EVALUATION_BIT
		SHADER_STAGE_GEOMETRY_BIT
		SHADER_STAGE_FRAGMENT_BIT
		SHADER_STAGE_COMPUTE_BIT
		SHADER_STAGE_ALL_GRAPHICS
		SHADER_STAGE_ALL
		SHADER_STAGE_RAYGEN_BIT_NV
		SHADER_STAGE_ANY_HIT_BIT_NV
		SHADER_STAGE_CLOSEST_HIT_BIT_NV
		SHADER_STAGE_MISS_BIT_NV
		SHADER_STAGE_INTERSECTION_BIT_NV
		SHADER_STAGE_CALLABLE_BIT_NV
		SHADER_STAGE_TASK_BIT_NV
		SHADER_STAGE_MESH_BIT_NV

VkPhysicalDeviceTexelBufferAlignmentPropertiesEXT:
--------------------------------------------------
	storageTexelBufferOffsetAlignmentBytes       = 0x00000010
	storageTexelBufferOffsetSingleTexelAlignment = true
	uniformTexelBufferOffsetAlignmentBytes       = 0x00000010
	uniformTexelBufferOffsetSingleTexelAlignment = true

VkPhysicalDeviceTimelineSemaphorePropertiesKHR:
-----------------------------------------------
	maxTimelineSemaphoreValueDifference = 18446744073709551615

VkPhysicalDeviceTransformFeedbackPropertiesEXT:
-----------------------------------------------
	maxTransformFeedbackStreams                = 4
	maxTransformFeedbackBuffers                = 4
	maxTransformFeedbackBufferSize             = 0xffffffffffffffff
	maxTransformFeedbackStreamDataSize         = 2048
	maxTransformFeedbackBufferDataSize         = 512
	maxTransformFeedbackBufferDataStride       = 2048
	transformFeedbackQueries                   = true
	transformFeedbackStreamsLinesTriangles     = false
	transformFeedbackRasterizationStreamSelect = true
	transformFeedbackDraw                      = true

VkPhysicalDeviceVertexAttributeDivisorPropertiesEXT:
----------------------------------------------------
	maxVertexAttribDivisor = 4294967295


Device Extensions: count = 100
	VK_EXT_blend_operation_advanced           : extension revision 2
	VK_EXT_buffer_device_address              : extension revision 2
	VK_EXT_calibrated_timestamps              : extension revision 1
	VK_EXT_conditional_rendering              : extension revision 1
	VK_EXT_conservative_rasterization         : extension revision 1
	VK_EXT_depth_clip_enable                  : extension revision 1
	VK_EXT_depth_range_unrestricted           : extension revision 1
	VK_EXT_descriptor_indexing                : extension revision 2
	VK_EXT_discard_rectangles                 : extension revision 1
	VK_EXT_display_control                    : extension revision 1
	VK_EXT_fragment_shader_interlock          : extension revision 1
	VK_EXT_global_priority                    : extension revision 2
	VK_EXT_host_query_reset                   : extension revision 1
	VK_EXT_index_type_uint8                   : extension revision 1
	VK_EXT_inline_uniform_block               : extension revision 1
	VK_EXT_line_rasterization                 : extension revision 1
	VK_EXT_memory_budget                      : extension revision 1
	VK_EXT_pci_bus_info                       : extension revision 2
	VK_EXT_pipeline_creation_feedback         : extension revision 1
	VK_EXT_post_depth_coverage                : extension revision 1
	VK_EXT_sample_locations                   : extension revision 1
	VK_EXT_sampler_filter_minmax              : extension revision 1
	VK_EXT_scalar_block_layout                : extension revision 1
	VK_EXT_separate_stencil_usage             : extension revision 1
	VK_EXT_shader_demote_to_helper_invocation : extension revision 1
	VK_EXT_shader_subgroup_ballot             : extension revision 1
	VK_EXT_shader_subgroup_vote               : extension revision 1
	VK_EXT_shader_viewport_index_layer        : extension revision 1
	VK_EXT_subgroup_size_control              : extension revision 2
	VK_EXT_texel_buffer_alignment             : extension revision 1
	VK_EXT_transform_feedback                 : extension revision 1
	VK_EXT_vertex_attribute_divisor           : extension revision 3
	VK_EXT_ycbcr_image_arrays                 : extension revision 1
	VK_KHR_16bit_storage                      : extension revision 1
	VK_KHR_8bit_storage                       : extension revision 1
	VK_KHR_bind_memory2                       : extension revision 1
	VK_KHR_create_renderpass2                 : extension revision 1
	VK_KHR_dedicated_allocation               : extension revision 3
	VK_KHR_depth_stencil_resolve              : extension revision 1
	VK_KHR_descriptor_update_template         : extension revision 1
	VK_KHR_device_group                       : extension revision 3
	VK_KHR_draw_indirect_count                : extension revision 1
	VK_KHR_driver_properties                  : extension revision 1
	VK_KHR_external_fence                     : extension revision 1
	VK_KHR_external_fence_fd                  : extension revision 1
	VK_KHR_external_memory                    : extension revision 1
	VK_KHR_external_memory_fd                 : extension revision 1
	VK_KHR_external_semaphore                 : extension revision 1
	VK_KHR_external_semaphore_fd              : extension revision 1
	VK_KHR_get_memory_requirements2           : extension revision 1
	VK_KHR_image_format_list                  : extension revision 1
	VK_KHR_imageless_framebuffer              : extension revision 1
	VK_KHR_maintenance1                       : extension revision 2
	VK_KHR_maintenance2                       : extension revision 1
	VK_KHR_maintenance3                       : extension revision 1
	VK_KHR_multiview                          : extension revision 1
	VK_KHR_pipeline_executable_properties     : extension revision 1
	VK_KHR_push_descriptor                    : extension revision 2
	VK_KHR_relaxed_block_layout               : extension revision 1
	VK_KHR_sampler_mirror_clamp_to_edge       : extension revision 1
	VK_KHR_sampler_ycbcr_conversion           : extension revision 1
	VK_KHR_shader_atomic_int64                : extension revision 1
	VK_KHR_shader_draw_parameters             : extension revision 1
	VK_KHR_shader_float16_int8                : extension revision 1
	VK_KHR_shader_float_controls              : extension revision 4
	VK_KHR_storage_buffer_storage_class       : extension revision 1
	VK_KHR_swapchain                          : extension revision 70
	VK_KHR_swapchain_mutable_format           : extension revision 1
	VK_KHR_timeline_semaphore                 : extension revision 2
	VK_KHR_uniform_buffer_standard_layout     : extension revision 1
	VK_KHR_variable_pointers                  : extension revision 1
	VK_KHR_vulkan_memory_model                : extension revision 3
	VK_NVX_binary_import                      : extension revision 1
	VK_NVX_device_generated_commands          : extension revision 3
	VK_NVX_image_view_handle                  : extension revision 1
	VK_NVX_multiview_per_view_attributes      : extension revision 1
	VK_NV_clip_space_w_scaling                : extension revision 1
	VK_NV_compute_shader_derivatives          : extension revision 1
	VK_NV_cooperative_matrix                  : extension revision 1
	VK_NV_corner_sampled_image                : extension revision 2
	VK_NV_coverage_reduction_mode             : extension revision 1
	VK_NV_dedicated_allocation                : extension revision 1
	VK_NV_dedicated_allocation_image_aliasing : extension revision 1
	VK_NV_device_diagnostic_checkpoints       : extension revision 2
	VK_NV_fill_rectangle                      : extension revision 1
	VK_NV_fragment_coverage_to_color          : extension revision 1
	VK_NV_fragment_shader_barycentric         : extension revision 1
	VK_NV_framebuffer_mixed_samples           : extension revision 1
	VK_NV_geometry_shader_passthrough         : extension revision 1
	VK_NV_mesh_shader                         : extension revision 1
	VK_NV_ray_tracing                         : extension revision 3
	VK_NV_representative_fragment_test        : extension revision 1
	VK_NV_sample_mask_override_coverage       : extension revision 1
	VK_NV_scissor_exclusive                   : extension revision 1
	VK_NV_shader_image_footprint              : extension revision 1
	VK_NV_shader_sm_builtins                  : extension revision 1
	VK_NV_shader_subgroup_partitioned         : extension revision 1
	VK_NV_shading_rate_image                  : extension revision 3
	VK_NV_viewport_array2                     : extension revision 1
	VK_NV_viewport_swizzle                    : extension revision 1

VkQueueFamilyProperties[0]:
==========================
	minImageTransferGranularity = (1, 1, 1)
	queueCount                  = 16
	queueFlags                  = QUEUE_GRAPHICS | QUEUE_COMPUTE | QUEUE_TRANSFER | QUEUE_SPARSE_BINDING
	timestampValidBits          = 64
	present support:
		VK_KHR_xcb_surface  = true
		VK_KHR_xlib_surface = true

VkQueueFamilyProperties[1]:
==========================
	minImageTransferGranularity = (1, 1, 1)
	queueCount                  = 2
	queueFlags                  = QUEUE_TRANSFER | QUEUE_SPARSE_BINDING
	timestampValidBits          = 64
	present support = false

VkQueueFamilyProperties[2]:
==========================
	minImageTransferGranularity = (1, 1, 1)
	queueCount                  = 8
	queueFlags                  = QUEUE_COMPUTE | QUEUE_TRANSFER | QUEUE_SPARSE_BINDING
	timestampValidBits          = 64
	present support:
		VK_KHR_xcb_surface  = false
		VK_KHR_xlib_surface = true

VkPhysicalDeviceMemoryProperties:
=================================
memoryHeaps: count = 2
	memoryHeaps[0]:
		size   = 8589934592 (0x200000000) (8.00 GiB)
		budget = 8033009664
		usage  = 0
		flags:
			MEMORY_HEAP_DEVICE_LOCAL_BIT
	memoryHeaps[1]:
		size   = 50584952832 (0xbc7192000) (47.11 GiB)
		budget = 50584952832
		usage  = 0
		flags:
			None
memoryTypes: count = 11
	memoryTypes[0]:
		heapIndex     = 1
		propertyFlags = 0x0000:
			None
		usable for:
			IMAGE_TILING_OPTIMAL: None
			IMAGE_TILING_LINEAR: None
	memoryTypes[1]:
		heapIndex     = 1
		propertyFlags = 0x0000:
			None
		usable for:
			IMAGE_TILING_OPTIMAL: color images
			IMAGE_TILING_LINEAR: None
	memoryTypes[2]:
		heapIndex     = 1
		propertyFlags = 0x0000:
			None
		usable for:
			IMAGE_TILING_OPTIMAL: FORMAT_D16_UNORM
			IMAGE_TILING_LINEAR: None
	memoryTypes[3]:
		heapIndex     = 1
		propertyFlags = 0x0000:
			None
		usable for:
			IMAGE_TILING_OPTIMAL: FORMAT_X8_D24_UNORM_PACK32, FORMAT_D24_UNORM_S8_UINT
			IMAGE_TILING_LINEAR: None
	memoryTypes[4]:
		heapIndex     = 1
		propertyFlags = 0x0000:
			None
		usable for:
			IMAGE_TILING_OPTIMAL: FORMAT_D32_SFLOAT
			IMAGE_TILING_LINEAR: None
	memoryTypes[5]:
		heapIndex     = 1
		propertyFlags = 0x0000:
			None
		usable for:
			IMAGE_TILING_OPTIMAL: FORMAT_D32_SFLOAT_S8_UINT
			IMAGE_TILING_LINEAR: None
	memoryTypes[6]:
		heapIndex     = 1
		propertyFlags = 0x0000:
			None
		usable for:
			IMAGE_TILING_OPTIMAL: FORMAT_S8_UINT
			IMAGE_TILING_LINEAR: None
	memoryTypes[7]:
		heapIndex     = 0
		propertyFlags = 0x0001:
			MEMORY_PROPERTY_DEVICE_LOCAL_BIT
		usable for:
			IMAGE_TILING_OPTIMAL: color images, FORMAT_D16_UNORM, FORMAT_X8_D24_UNORM_PACK32, FORMAT_D32_SFLOAT, FORMAT_S8_UINT, FORMAT_D24_UNORM_S8_UINT, FORMAT_D32_SFLOAT_S8_UINT
			IMAGE_TILING_LINEAR: None
	memoryTypes[8]:
		heapIndex     = 0
		propertyFlags = 0x0001:
			MEMORY_PROPERTY_DEVICE_LOCAL_BIT
		usable for:
			IMAGE_TILING_OPTIMAL: None
			IMAGE_TILING_LINEAR: None
	memoryTypes[9]:
		heapIndex     = 1
		propertyFlags = 0x0006:
			MEMORY_PROPERTY_HOST_VISIBLE_BIT
			MEMORY_PROPERTY_HOST_COHERENT_BIT
		usable for:
			IMAGE_TILING_OPTIMAL: None
			IMAGE_TILING_LINEAR: None
	memoryTypes[10]:
		heapIndex     = 1
		propertyFlags = 0x000e:
			MEMORY_PROPERTY_HOST_VISIBLE_BIT
			MEMORY_PROPERTY_HOST_COHERENT_BIT
			MEMORY_PROPERTY_HOST_CACHED_BIT
		usable for:
			IMAGE_TILING_OPTIMAL: None
			IMAGE_TILING_LINEAR: None

VkPhysicalDeviceFeatures:
=========================
	robustBufferAccess                      = true
	fullDrawIndexUint32                     = true
	imageCubeArray                          = true
	independentBlend                        = true
	geometryShader                          = true
	tessellationShader                      = true
	sampleRateShading                       = true
	dualSrcBlend                            = true
	logicOp                                 = true
	multiDrawIndirect                       = true
	drawIndirectFirstInstance               = true
	depthClamp                              = true
	depthBiasClamp                          = true
	fillModeNonSolid                        = true
	depthBounds                             = true
	wideLines                               = true
	largePoints                             = true
	alphaToOne                              = true
	multiViewport                           = true
	samplerAnisotropy                       = true
	textureCompressionETC2                  = false
	textureCompressionASTC_LDR              = false
	textureCompressionBC                    = true
	occlusionQueryPrecise                   = true
	pipelineStatisticsQuery                 = true
	vertexPipelineStoresAndAtomics          = true
	fragmentStoresAndAtomics                = true
	shaderTessellationAndGeometryPointSize  = true
	shaderImageGatherExtended               = true
	shaderStorageImageExtendedFormats       = true
	shaderStorageImageMultisample           = true
	shaderStorageImageReadWithoutFormat     = true
	shaderStorageImageWriteWithoutFormat    = true
	shaderUniformBufferArrayDynamicIndexing = true
	shaderSampledImageArrayDynamicIndexing  = true
	shaderStorageBufferArrayDynamicIndexing = true
	shaderStorageImageArrayDynamicIndexing  = true
	shaderClipDistance                      = true
	shaderCullDistance                      = true
	shaderFloat64                           = true
	shaderInt64                             = true
	shaderInt16                             = true
	shaderResourceResidency                 = true
	shaderResourceMinLod                    = true
	sparseBinding                           = true
	sparseResidencyBuffer                   = true
	sparseResidencyImage2D                  = true
	sparseResidencyImage3D                  = true
	sparseResidency2Samples                 = true
	sparseResidency4Samples                 = true
	sparseResidency8Samples                 = true
	sparseResidency16Samples                = true
	sparseResidencyAliased                  = true
	variableMultisampleRate                 = true
	inheritedQueries                        = true

VkPhysicalDevice16BitStorageFeatures:
-------------------------------------
	storageBuffer16BitAccess           = true
	uniformAndStorageBuffer16BitAccess = true
	storagePushConstant16              = true
	storageInputOutput16               = false

VkPhysicalDevice8BitStorageFeaturesKHR:
---------------------------------------
	storageBuffer8BitAccess           = true
	uniformAndStorageBuffer8BitAccess = true
	storagePushConstant8              = true

VkPhysicalDeviceBlendOperationAdvancedFeaturesEXT:
--------------------------------------------------
	advancedBlendCoherentOperations = true

VkPhysicalDeviceBufferDeviceAddressFeaturesEXT:
-----------------------------------------------
	bufferDeviceAddress              = true
	bufferDeviceAddressCaptureReplay = false
	bufferDeviceAddressMultiDevice   = true

VkPhysicalDeviceConditionalRenderingFeaturesEXT:
------------------------------------------------
	conditionalRendering          = true
	inheritedConditionalRendering = true

VkPhysicalDeviceDepthClipEnableFeaturesEXT:
-------------------------------------------
	depthClipEnable = true

VkPhysicalDeviceDescriptorIndexingFeaturesEXT:
----------------------------------------------
	shaderInputAttachmentArrayDynamicIndexing          = true
	shaderUniformTexelBufferArrayDynamicIndexing       = true
	shaderStorageTexelBufferArrayDynamicIndexing       = true
	shaderUniformBufferArrayNonUniformIndexing         = true
	shaderSampledImageArrayNonUniformIndexing          = true
	shaderStorageBufferArrayNonUniformIndexing         = true
	shaderStorageImageArrayNonUniformIndexing          = true
	shaderInputAttachmentArrayNonUniformIndexing       = true
	shaderUniformTexelBufferArrayNonUniformIndexing    = true
	shaderStorageTexelBufferArrayNonUniformIndexing    = true
	descriptorBindingUniformBufferUpdateAfterBind      = true
	descriptorBindingSampledImageUpdateAfterBind       = true
	descriptorBindingStorageImageUpdateAfterBind       = true
	descriptorBindingStorageBufferUpdateAfterBind      = true
	descriptorBindingUniformTexelBufferUpdateAfterBind = true
	descriptorBindingStorageTexelBufferUpdateAfterBind = true
	descriptorBindingUpdateUnusedWhilePending          = true
	descriptorBindingPartiallyBound                    = true
	descriptorBindingVariableDescriptorCount           = true
	runtimeDescriptorArray                             = true

VkPhysicalDeviceFragmentShaderInterlockFeaturesEXT:
---------------------------------------------------
	fragmentShaderSampleInterlock      = true
	fragmentShaderPixelInterlock       = true
	fragmentShaderShadingRateInterlock = true

VkPhysicalDeviceHostQueryResetFeaturesEXT:
------------------------------------------
	hostQueryReset = true

VkPhysicalDeviceImagelessFramebufferFeaturesKHR:
------------------------------------------------
	imagelessFramebuffer = true

VkPhysicalDeviceIndexTypeUint8FeaturesEXT:
------------------------------------------
	indexTypeUint8 = true

VkPhysicalDeviceInlineUniformBlockFeaturesEXT:
----------------------------------------------
	inlineUniformBlock                                 = true
	descriptorBindingInlineUniformBlockUpdateAfterBind = true

VkPhysicalDeviceLineRasterizationFeaturesEXT:
---------------------------------------------
	rectangularLines         = true
	bresenhamLines           = true
	smoothLines              = true
	stippledRectangularLines = true
	stippledBresenhamLines   = true
	stippledSmoothLines      = true

VkPhysicalDeviceMultiviewFeatures:
----------------------------------
	multiview                   = true
	multiviewGeometryShader     = true
	multiviewTessellationShader = true

VkPhysicalDevicePipelineExecutablePropertiesFeaturesKHR:
--------------------------------------------------------
	pipelineExecutableInfo = true

VkPhysicalDeviceProtectedMemoryFeatures:
----------------------------------------
	protectedMemory = false

VkPhysicalDeviceSamplerYcbcrConversionFeatures:
-----------------------------------------------
	samplerYcbcrConversion = true

VkPhysicalDeviceScalarBlockLayoutFeaturesEXT:
---------------------------------------------
	scalarBlockLayout = true

VkPhysicalDeviceShaderAtomicInt64FeaturesKHR:
---------------------------------------------
	shaderBufferInt64Atomics = true
	shaderSharedInt64Atomics = true

VkPhysicalDeviceShaderDemoteToHelperInvocationFeaturesEXT:
----------------------------------------------------------
	shaderDemoteToHelperInvocation = true

VkPhysicalDeviceShaderDrawParametersFeatures:
---------------------------------------------
	shaderDrawParameters = true

VkPhysicalDeviceShaderFloat16Int8FeaturesKHR:
---------------------------------------------
	shaderFloat16 = true
	shaderInt8    = true

VkPhysicalDeviceSubgroupSizeControlFeaturesEXT:
-----------------------------------------------
	subgroupSizeControl  = true
	computeFullSubgroups = true

VkPhysicalDeviceTexelBufferAlignmentFeaturesEXT:
------------------------------------------------
	texelBufferAlignment = true

VkPhysicalDeviceTimelineSemaphoreFeaturesKHR:
---------------------------------------------
	timelineSemaphore = true

VkPhysicalDeviceTransformFeedbackFeaturesEXT:
---------------------------------------------
	transformFeedback = true
	geometryStreams   = true

VkPhysicalDeviceUniformBufferStandardLayoutFeaturesKHR:
-------------------------------------------------------
	uniformBufferStandardLayout = true

VkPhysicalDeviceVariablePointersFeatures:
-----------------------------------------
	variablePointersStorageBuffer = true
	variablePointers              = true

VkPhysicalDeviceVertexAttributeDivisorFeaturesEXT:
--------------------------------------------------
	vertexAttributeInstanceRateDivisor     = true
	vertexAttributeInstanceRateZeroDivisor = true

VkPhysicalDeviceVulkanMemoryModelFeaturesKHR:
---------------------------------------------
	vulkanMemoryModel                             = true
	vulkanMemoryModelDeviceScope                  = true
	vulkanMemoryModelAvailabilityVisibilityChains = true

VkPhysicalDeviceYcbcrImageArraysFeaturesEXT:
--------------------------------------------
	ycbcrImageArrays = true

And on OpenCL, I get:

ffmpeg -hide_banner -v verbose -init_hw_device opencl
[AVHWDeviceContext @ 0x560de098d440] 0.0: NVIDIA CUDA / GeForce RTX 2080

In conclusion:

  1. What pixel formats are supported by the scale_vulkan filter? So far, no such information is provided in the documentation. See the output of:
ffmpeg -h filter=scale_vulkan

Filter scale_vulkan
  Scale Vulkan frames
    Inputs:
       #0: default (video)
    Outputs:
       #0: default (video)
scale_vulkan AVOptions:
  w                 <string>     ..FV...... Output video width (default "iw")
  h                 <string>     ..FV...... Output video height (default "ih")
  scaler            <int>        ..FV...... Scaler function (from 0 to 2) (default bilinear)
     bilinear        0            ..FV...... Bilinear interpolation (fastest)
     nearest         1            ..FV...... Nearest (useful for pixel art)
  2. With the new code path for HW-HW transfers, as documented in this patch work: https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/d7210ce7f5418508d6f8eec6e90d978e06a2d49e

What CLI options, apart from passing the hardware name to initialize via ffmpeg as shown above, are required to get the scale_vulkan filter working on NVIDIA hardware? As shown above, neither hwupload nor hwmap with device derivation for Vulkan yields usable results.

Change History (7)

comment:1 by Elon Musk, 4 years ago

-init_hw_device vulkan=gpu:0.0 -filter_hw_device gpu

with usual hwdownload/hwupload filters.

comment:2 by Dennis E. Mungai, 4 years ago

Hello @richardpl,

Your advice above works very well.

Here are two samples that work, one with NVDEC and the other without:

(a). With nvdec:

   ffmpeg -threads 1 -loglevel info -nostdin -y \
   -fflags +genpts-fastseek \
   -init_hw_device vulkan=gpu:0.0 -filter_hw_device gpu \
   -hwaccel nvdec -hwaccel_device 0 -extra_hw_frames 2 \
   -reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \
   -i feeds.mp4 -filter_complex \
  "[0:v]format=nv12,hwupload,split=2[s0][s1]; \
   [s0]scale_vulkan=w=1920:h=1080:scaler=0,hwdownload,format=nv12[v0]; \
   [s1]scale_vulkan=w=1280:h=720:scaler=0,hwdownload,format=nv12[v1]" \
  -map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \
  -profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \
  -map "[v1]"  -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \
  -profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \
  -map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \
  -flags +global_header+cgop \
  -max_muxing_queue_size 9000000 -threads 2 -f tee  \
  "[select=\'v:0,a\':f=mp4]'hq.mp4'| \
   [select=\'v:1,a\':f=mp4]'med.mp4'"

(b). No hwaccel decode:

   ffmpeg -threads 1 -loglevel info -nostdin -y \
   -fflags +genpts-fastseek \
   -init_hw_device vulkan=gpu:0.0 -filter_hw_device gpu \
   -reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \
   -i feeds.mp4 -filter_complex \
  "[0:v]format=nv12,hwupload,split=2[s0][s1]; \
   [s0]scale_vulkan=w=1920:h=1080:scaler=0,hwdownload,format=nv12[v0]; \
   [s1]scale_vulkan=w=1280:h=720:scaler=0,hwdownload,format=nv12[v1]" \
  -map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \
  -profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \
  -map "[v1]"  -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \
  -profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \
  -map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \
  -flags +global_header+cgop \
  -max_muxing_queue_size 9000000 -threads 2 -f tee  \
  "[select=\'v:0,a\':f=mp4]'hq.mp4'| \
   [select=\'v:1,a\':f=mp4]'med.mp4'"

Note the hwdownload filter trailing each scale_vulkan instance.

Where it all falls apart is if you attempt device derivation via hwupload.

The example above with NVDEC will fail if you attempt device derivation in hwupload so as to re-use the same context for hwaccel via nvdec:

   ffmpeg -threads 1 -loglevel info -nostdin -y \
   -fflags +genpts-fastseek \
   -init_hw_device vulkan=gpu:0.0 -filter_hw_device gpu \
   -hwaccel nvdec -hwaccel_device cuda -extra_hw_frames 2 \
   -reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \
   -i feeds.mp4 -filter_complex \
  "[0:v]format=nv12,hwupload=derive_device=cuda,split=2[s0][s1]; \
   [s0]scale_vulkan=w=1920:h=1080:scaler=0,hwdownload,format=nv12[v0]; \
   [s1]scale_vulkan=w=1280:h=720:scaler=0,hwdownload,format=nv12[v1]" \
  -map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \
  -profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \
  -map "[v1]"  -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \
  -profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \
  -map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \
  -flags +global_header+cgop \
  -max_muxing_queue_size 9000000 -threads 2 -f tee  \
  "[select=\'v:0,a\':f=mp4]'hq.mp4'| \
   [select=\'v:1,a\':f=mp4]'med.mp4'"

Error message:

Impossible to convert between the formats supported by the filter 'Parsed_split_2' and the filter 'auto_scaler_0'
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
Conversion failed!

comment:3 by Elon Musk, 4 years ago

I think you should use hwmap somehow in that case.

comment:4 by Philip Langdale, 4 years ago

Owner: set to Philip Langdale
Status: new → open

You need to tell the decoder to actually output to GPU memory - by default hwaccel always copies back to system memory.

So here's an example that decodes with nvdec, scales with vulkan and encodes with nvenc, all in GPU memory with no copy back to system memory. If you want to encode with a software encoder, then you need the hwdownload,format=nv12 at the end instead of the second hwupload.

ffmpeg -hwaccel nvdec -hwaccel_output_format cuda -i in.mp4 -init_hw_device cuda=cuda:0 -filter_hw_device cuda -vf hwupload=derive_device=vulkan,scale_vulkan=w=1920:h=1440,hwupload=derive_device=cuda -c:v hevc_nvenc out.mp4
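
For the software-encoder case described above, the command might look like the following dry-run sketch. It is untested here: libx264 is an assumed encoder, in.mp4 and out.mp4 are placeholder names, and the only change from the command above is that the trailing hwupload=derive_device=cuda is replaced by hwdownload,format=nv12.

```shell
# Sketch only: libx264 is an assumed software encoder; in.mp4/out.mp4 are placeholders.
# Frames stay in GPU memory until hwdownload copies the scaled result
# back to system memory as nv12 for the software encoder.
VF='hwupload=derive_device=vulkan,scale_vulkan=w=1920:h=1440,hwdownload,format=nv12'

# Dry run: print the command for review. Remove the leading "echo" to execute it.
echo ffmpeg -hwaccel nvdec -hwaccel_output_format cuda -i in.mp4 \
  -init_hw_device cuda=cuda:0 -filter_hw_device cuda \
  -vf "$VF" -c:v libx264 out.mp4
```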

comment:5 by Dennis E. Mungai, 4 years ago

Hello @philipl,

With your recommendation:

   ffmpeg -threads 1 -loglevel info -nostdin -y \
   -fflags +genpts-fastseek \
   -init_hw_device cuda=cuda:0 -filter_hw_device cuda \
   -hwaccel nvdec -hwaccel_output_format cuda -extra_hw_frames 2 \
   -reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \
   -i feeds.mp4 -filter_complex \
  "[0:v]hwupload=derive_device=vulkan,split=2[s0][s1]; \
   [s0]scale_vulkan=w=1920:h=1080:scaler=0,hwupload=derive_device=cuda[v0]; \
   [s1]scale_vulkan=w=1280:h=720:scaler=0,hwupload=derive_device=cuda[v1]" \
  -map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \
  -profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \
  -map "[v1]"  -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \
  -profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \
  -map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \
  -flags +global_header+cgop \
  -max_muxing_queue_size 9000000 -threads 2 -f tee  \
  "[select=\'v:0,a\':f=mp4]'hq.mp4'| \
   [select=\'v:1,a\':f=mp4]'med.mp4'"

That command works, and it's considerably *faster* than the prior variant above.

The encoder in this case runs at about ~10x on an RTX 2080.
The previous command (with the repeated hwdownload instances) ran at a paltry ~1.5x.

That's a ~7x speed up!

Thank you so much :-)

I can now (safely) close this ticket.

comment:6 by Philip Langdale, 4 years ago

Resolution: fixed
Status: open → closed

Great. Mind you, I would assume scale_cuda would be a better choice in your scenario, if that's the only filter you're actually using, and would probably be even faster.
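
As a rough illustration of the scale_cuda suggestion, a single-output dry-run sketch follows. It is untested here and assumes a build with the scale_cuda filter enabled; in.mp4, out.mp4 and the 1920x1080 target are placeholders. Because the decoded frames are already CUDA frames, no hwupload or device derivation should be needed.

```shell
# Sketch only: assumes scale_cuda is available in the build; file names are placeholders.
# Decode (nvdec), scale (scale_cuda) and encode (nvenc) all operate on CUDA frames.
VF='scale_cuda=w=1920:h=1080'

# Dry run: print the command for review. Remove the leading "echo" to execute it.
echo ffmpeg -hwaccel nvdec -hwaccel_output_format cuda -i in.mp4 \
  -vf "$VF" -c:v hevc_nvenc out.mp4
```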

comment:7 by Carl Eugen Hoyos, 4 years ago

Resolution: fixed → invalid