#8512 closed defect (invalid)
The Vulkan based filter scale_vulkan does not work on NVIDIA hardware.
Reported by: | Dennis E. Mungai | Owned by: | Philip Langdale |
---|---|---|---|
Priority: | normal | Component: | undetermined |
Version: | git-master | Keywords: | |
Cc: | dev@lynne.ee, philipl@overt.org | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no | | |
Description
Hello there,
I'm trying to get the Vulkan hardware filters, specifically scale_vulkan,
working with NVIDIA GPUs, but to no avail.
With the hwupload filter, using the derive_device option as documented:
ffmpeg -threads 1 -loglevel info -nostdin -y \
-fflags +genpts-fastseek \
-init_hw_device cuda=cuda:0 -filter_hw_device cuda -hwaccel_device cuda -hwaccel nvdec \
-reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \
-i feeds.mp4 -filter_complex \
"[0:v]hwupload=derive_device=vulkan,split=2[s0][s1]; \
[s0]scale_vulkan=w=1920:h=1080:scaler=0[v0]; \
[s1]scale_vulkan=w=1280:h=720:scaler=0[v1]" \
-map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \
-profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \
-map "[v1]" -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \
-profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \
-map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \
-flags +global_header+cgop \
-max_muxing_queue_size 9000000 -threads 2 -f tee \
"[select=\'v:0,a\'f=mp4]'hq.mp4'| \
[select=\'v:1,a\'f=mp4]'med.mp4'"
Console output:
ffmpeg version N-96640-gb4f300f8ea Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 8 (Ubuntu 8.3.0-6ubuntu1~18.04.1)
  configuration: --pkg-config-flags=--static --enable-static --disable-shared --prefix=/home/brainiarc7 --bindir=/home/brainiarc7/bin --extra-cflags=-I/home/brainiarc7/include --extra-ldflags=-L/home/brainiarc7/lib --enable-cuda-nvcc --enable-cuvid --enable-libnpp --extra-cflags=-I/usr/local/cuda/include/ --extra-ldflags=-L/usr/local/cuda/lib64/ --enable-nvenc --enable-libass --disable-debug --enable-libvorbis --enable-libvpx --enable-libdrm --enable-opencl --enable-libxml2 --enable-gpl --cpu=native --enable-opengl --enable-libfdk-aac --enable-libx264 --enable-libx265 --enable-openssl --enable-librtmp --enable-libpulse --enable-libzvbi --enable-librav1e --enable-libvmaf --enable-libglslang --enable-vulkan --enable-version3 --enable-pic --nvccflags='-gencode arch=compute_75,code=sm_75 -O2' --extra-libs='-lpthread -lm -lz -ldl' --enable-nonfree
  libavutil 56. 39.100 / 56. 39.100
  libavcodec 58. 67.101 / 58. 67.101
  libavformat 58. 37.100 / 58. 37.100
  libavdevice 58. 9.103 / 58. 9.103
  libavfilter 7. 74.100 / 7. 74.100
  libswscale 5. 6.100 / 5. 6.100
  libswresample 3. 6.100 / 3. 6.100
  libpostproc 55. 6.100 / 55. 6.100
Guessed Channel Layout for Input Stream #0.1 : 5.1
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'feeds.mp4':
  Metadata:
    major_brand : isom
    minor_version : 512
    compatible_brands: isomiso2avc1mp41
    encoder : Lavf58.37.100
  Duration: 01:00:02.13, start: 0.000000, bitrate: 14468 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 14070 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 389 kb/s (default)
    Metadata:
      handler_name : SoundHandler
Stream mapping:
  Stream #0:0 (h264) -> hwupload (graph 0)
  scale_vulkan (graph 0) -> Stream #0:0 (h264_nvenc)
  scale_vulkan (graph 0) -> Stream #0:1 (h264_nvenc)
  Stream #0:1 -> #0:2 (aac (native) -> aac (libfdk_aac))
Impossible to convert between the formats supported by the filter 'Parsed_scale_vulkan_3' and the filter 'auto_scaler_0'
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
Conversion failed!
If I attempt to use the hwmap filter instead of hwupload to derive the necessary Vulkan context:
ffmpeg -threads 1 -loglevel info -nostdin -y \
-fflags +genpts-fastseek \
-init_hw_device cuda=cuda:0 -filter_hw_device cuda -hwaccel_device cuda -hwaccel nvdec \
-reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \
-i feeds.mp4 -filter_complex \
"[0:v]hwmap=derive_device=vulkan,split=2[s0][s1]; \
[s0]scale_vulkan=w=1920:h=1080:scaler=0[v0]; \
[s1]scale_vulkan=w=1280:h=720:scaler=0[v1]" \
-map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \
-profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \
-map "[v1]" -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \
-profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \
-map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \
-flags +global_header+cgop \
-max_muxing_queue_size 9000000 -threads 2 -f tee \
"[select=\'v:0,a\'f=mp4]'hq.mp4'| \
[select=\'v:1,a\'f=mp4]'med.mp4'"
The same error occurs:
ffmpeg version N-96640-gb4f300f8ea Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 8 (Ubuntu 8.3.0-6ubuntu1~18.04.1)
  configuration: --pkg-config-flags=--static --enable-static --disable-shared --prefix=/home/brainiarc7 --bindir=/home/brainiarc7/bin --extra-cflags=-I/home/brainiarc7/include --extra-ldflags=-L/home/brainiarc7/lib --enable-cuda-nvcc --enable-cuvid --enable-libnpp --extra-cflags=-I/usr/local/cuda/include/ --extra-ldflags=-L/usr/local/cuda/lib64/ --enable-nvenc --enable-libass --disable-debug --enable-libvorbis --enable-libvpx --enable-libdrm --enable-opencl --enable-libxml2 --enable-gpl --cpu=native --enable-opengl --enable-libfdk-aac --enable-libx264 --enable-libx265 --enable-openssl --enable-librtmp --enable-libpulse --enable-libzvbi --enable-librav1e --enable-libvmaf --enable-libglslang --enable-vulkan --enable-version3 --enable-pic --nvccflags='-gencode arch=compute_75,code=sm_75 -O2' --extra-libs='-lpthread -lm -lz -ldl' --enable-nonfree
  libavutil 56. 39.100 / 56. 39.100
  libavcodec 58. 67.101 / 58. 67.101
  libavformat 58. 37.100 / 58. 37.100
  libavdevice 58. 9.103 / 58. 9.103
  libavfilter 7. 74.100 / 7. 74.100
  libswscale 5. 6.100 / 5. 6.100
  libswresample 3. 6.100 / 3. 6.100
  libpostproc 55. 6.100 / 55. 6.100
Guessed Channel Layout for Input Stream #0.1 : 5.1
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'feeds.mp4':
  Metadata:
    major_brand : isom
    minor_version : 512
    compatible_brands: isomiso2avc1mp41
    encoder : Lavf58.37.100
  Duration: 01:00:02.13, start: 0.000000, bitrate: 14468 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 14070 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 389 kb/s (default)
    Metadata:
      handler_name : SoundHandler
Stream mapping:
  Stream #0:0 (h264) -> hwmap (graph 0)
  scale_vulkan (graph 0) -> Stream #0:0 (h264_nvenc)
  scale_vulkan (graph 0) -> Stream #0:1 (h264_nvenc)
  Stream #0:1 -> #0:2 (aac (native) -> aac (libfdk_aac))
Impossible to convert between the formats supported by the filter 'Parsed_scale_vulkan_3' and the filter 'auto_scaler_0'
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
Conversion failed!
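The runs above differ only in their final lines, so when comparing variants it can help to reduce each saved log to its diagnostics. What follows is a minimal sketch, assuming each run's stderr was captured to a file. The ffmpeg_diag helper and the temporary sample file are hypothetical names, and the pattern list covers only the messages quoted in this report.

```shell
# Hypothetical helper (not part of FFmpeg): print only the diagnostic
# lines of a saved ffmpeg log. The patterns are the messages seen in this report.
ffmpeg_diag() {
    grep -E 'Impossible to convert|Query format failed|Error reinitializing filters|Failed to inject frame|Conversion failed' "$1"
}

# Sample log tail copied from the hwmap run above, written to a temporary file.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
Stream mapping:
  Stream #0:0 (h264) -> hwmap (graph 0)
Impossible to convert between the formats supported by the filter 'Parsed_scale_vulkan_3' and the filter 'auto_scaler_0'
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
Conversion failed!
EOF

ffmpeg_diag "$log"
```

For a real run the log would come from something like `ffmpeg ... 2> run.log`, followed by `ffmpeg_diag run.log`.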
At this stage, I suspect the failure is related to the supported pixel format(s), since the scale_vulkan filter does not (at present) support texture conversion, as noted in the commit https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/d95c509cc64372f8b37d89310250785224751a90.
However, requesting a specific pixel format from the decoder, such as nv12, and pairing it with the hwmap filter fails too:
ffmpeg -threads 1 -loglevel info -nostdin -y \
-fflags +genpts-fastseek \
-init_hw_device cuda=cuda:0 -filter_hw_device cuda -hwaccel_device cuda -hwaccel nvdec -hwaccel_output_format nv12 \
-reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \
-i feeds.mp4 -filter_complex \
"[0:v]hwmap=derive_device=vulkan,split=2[s0][s1]; \
[s0]scale_vulkan=w=1920:h=1080:scaler=0[v0]; \
[s1]scale_vulkan=w=1280:h=720:scaler=0[v1]" \
-map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \
-profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \
-map "[v1]" -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \
-profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \
-map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \
-flags +global_header+cgop \
-max_muxing_queue_size 9000000 -threads 2 -f tee \
"[select=\'v:0,a\'f=mp4]'hq.mp4'| \
[select=\'v:1,a\'f=mp4]'med.mp4'"
Results in:
ffmpeg version N-96640-gb4f300f8ea Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 8 (Ubuntu 8.3.0-6ubuntu1~18.04.1)
  configuration: --pkg-config-flags=--static --enable-static --disable-shared --prefix=/home/brainiarc7 --bindir=/home/brainiarc7/bin --extra-cflags=-I/home/brainiarc7/include --extra-ldflags=-L/home/brainiarc7/lib --enable-cuda-nvcc --enable-cuvid --enable-libnpp --extra-cflags=-I/usr/local/cuda/include/ --extra-ldflags=-L/usr/local/cuda/lib64/ --enable-nvenc --enable-libass --disable-debug --enable-libvorbis --enable-libvpx --enable-libdrm --enable-opencl --enable-libxml2 --enable-gpl --cpu=native --enable-opengl --enable-libfdk-aac --enable-libx264 --enable-libx265 --enable-openssl --enable-librtmp --enable-libpulse --enable-libzvbi --enable-librav1e --enable-libvmaf --enable-libglslang --enable-vulkan --enable-version3 --enable-pic --nvccflags='-gencode arch=compute_75,code=sm_75 -O2' --extra-libs='-lpthread -lm -lz -ldl' --enable-nonfree
  libavutil 56. 39.100 / 56. 39.100
  libavcodec 58. 67.101 / 58. 67.101
  libavformat 58. 37.100 / 58. 37.100
  libavdevice 58. 9.103 / 58. 9.103
  libavfilter 7. 74.100 / 7. 74.100
  libswscale 5. 6.100 / 5. 6.100
  libswresample 3. 6.100 / 3. 6.100
  libpostproc 55. 6.100 / 55. 6.100
Guessed Channel Layout for Input Stream #0.1 : 5.1
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'feeds.mp4':
  Metadata:
    major_brand : isom
    minor_version : 512
    compatible_brands: isomiso2avc1mp41
    encoder : Lavf58.37.100
  Duration: 01:00:02.13, start: 0.000000, bitrate: 14468 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 14070 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 389 kb/s (default)
    Metadata:
      handler_name : SoundHandler
Stream mapping:
  Stream #0:0 (h264) -> hwmap (graph 0)
  scale_vulkan (graph 0) -> Stream #0:0 (h264_nvenc)
  scale_vulkan (graph 0) -> Stream #0:1 (h264_nvenc)
  Stream #0:1 -> #0:2 (aac (native) -> aac (libfdk_aac))
Impossible to convert between the formats supported by the filter 'Parsed_scale_vulkan_3' and the filter 'auto_scaler_0'
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
Conversion failed!
Repeating the same tests without hwaccel, i.e. without NVDEC in this case:
(a). With the hwmap filter:
ffmpeg -threads 1 -loglevel info -nostdin -y \
-fflags +genpts-fastseek \
-init_hw_device cuda=cuda:0 -filter_hw_device cuda \
-reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \
-i feeds.mp4 -filter_complex \
"[0:v]hwmap=derive_device=vulkan,split=2[s0][s1]; \
[s0]scale_vulkan=w=1920:h=1080:scaler=0[v0]; \
[s1]scale_vulkan=w=1280:h=720:scaler=0[v1]" \
-map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \
-profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \
-map "[v1]" -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \
-profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \
-map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \
-flags +global_header+cgop \
-max_muxing_queue_size 9000000 -threads 2 -f tee \
"[select=\'v:0,a\'f=mp4]'hq.mp4'| \
[select=\'v:1,a\'f=mp4]'med.mp4'"
Results in the same error:
ffmpeg version N-96640-gb4f300f8ea Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 8 (Ubuntu 8.3.0-6ubuntu1~18.04.1)
  configuration: --pkg-config-flags=--static --enable-static --disable-shared --prefix=/home/brainiarc7 --bindir=/home/brainiarc7/bin --extra-cflags=-I/home/brainiarc7/include --extra-ldflags=-L/home/brainiarc7/lib --enable-cuda-nvcc --enable-cuvid --enable-libnpp --extra-cflags=-I/usr/local/cuda/include/ --extra-ldflags=-L/usr/local/cuda/lib64/ --enable-nvenc --enable-libass --disable-debug --enable-libvorbis --enable-libvpx --enable-libdrm --enable-opencl --enable-libxml2 --enable-gpl --cpu=native --enable-opengl --enable-libfdk-aac --enable-libx264 --enable-libx265 --enable-openssl --enable-librtmp --enable-libpulse --enable-libzvbi --enable-librav1e --enable-libvmaf --enable-libglslang --enable-vulkan --enable-version3 --enable-pic --nvccflags='-gencode arch=compute_75,code=sm_75 -O2' --extra-libs='-lpthread -lm -lz -ldl' --enable-nonfree
  libavutil 56. 39.100 / 56. 39.100
  libavcodec 58. 67.101 / 58. 67.101
  libavformat 58. 37.100 / 58. 37.100
  libavdevice 58. 9.103 / 58. 9.103
  libavfilter 7. 74.100 / 7. 74.100
  libswscale 5. 6.100 / 5. 6.100
  libswresample 3. 6.100 / 3. 6.100
  libpostproc 55. 6.100 / 55. 6.100
Guessed Channel Layout for Input Stream #0.1 : 5.1
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'feeds.mp4':
  Metadata:
    major_brand : isom
    minor_version : 512
    compatible_brands: isomiso2avc1mp41
    encoder : Lavf58.37.100
  Duration: 01:00:02.13, start: 0.000000, bitrate: 14468 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 14070 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 389 kb/s (default)
    Metadata:
      handler_name : SoundHandler
Stream mapping:
  Stream #0:0 (h264) -> hwmap (graph 0)
  scale_vulkan (graph 0) -> Stream #0:0 (h264_nvenc)
  scale_vulkan (graph 0) -> Stream #0:1 (h264_nvenc)
  Stream #0:1 -> #0:2 (aac (native) -> aac (libfdk_aac))
Impossible to convert between the formats supported by the filter 'Parsed_scale_vulkan_3' and the filter 'auto_scaler_0'
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
Conversion failed!
(b). And with the hwupload filter with the derive_device=vulkan option passed through:
ffmpeg -threads 1 -loglevel info -nostdin -y \
-fflags +genpts-fastseek \
-init_hw_device cuda=cuda:0 -filter_hw_device cuda \
-reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \
-i feeds.mp4 -filter_complex \
"[0:v]hwupload=derive_device=vulkan,split=2[s0][s1]; \
[s0]scale_vulkan=w=1920:h=1080:scaler=0[v0]; \
[s1]scale_vulkan=w=1280:h=720:scaler=0[v1]" \
-map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \
-profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \
-map "[v1]" -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \
-profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \
-map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \
-flags +global_header+cgop \
-max_muxing_queue_size 9000000 -threads 2 -f tee \
"[select=\'v:0,a\'f=mp4]'hq.mp4'| \
[select=\'v:1,a\'f=mp4]'med.mp4'"
Issues the same error:
ffmpeg version N-96640-gb4f300f8ea Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 8 (Ubuntu 8.3.0-6ubuntu1~18.04.1)
  configuration: --pkg-config-flags=--static --enable-static --disable-shared --prefix=/home/brainiarc7 --bindir=/home/brainiarc7/bin --extra-cflags=-I/home/brainiarc7/include --extra-ldflags=-L/home/brainiarc7/lib --enable-cuda-nvcc --enable-cuvid --enable-libnpp --extra-cflags=-I/usr/local/cuda/include/ --extra-ldflags=-L/usr/local/cuda/lib64/ --enable-nvenc --enable-libass --disable-debug --enable-libvorbis --enable-libvpx --enable-libdrm --enable-opencl --enable-libxml2 --enable-gpl --cpu=native --enable-opengl --enable-libfdk-aac --enable-libx264 --enable-libx265 --enable-openssl --enable-librtmp --enable-libpulse --enable-libzvbi --enable-librav1e --enable-libvmaf --enable-libglslang --enable-vulkan --enable-version3 --enable-pic --nvccflags='-gencode arch=compute_75,code=sm_75 -O2' --extra-libs='-lpthread -lm -lz -ldl' --enable-nonfree
  libavutil 56. 39.100 / 56. 39.100
  libavcodec 58. 67.101 / 58. 67.101
  libavformat 58. 37.100 / 58. 37.100
  libavdevice 58. 9.103 / 58. 9.103
  libavfilter 7. 74.100 / 7. 74.100
  libswscale 5. 6.100 / 5. 6.100
  libswresample 3. 6.100 / 3. 6.100
  libpostproc 55. 6.100 / 55. 6.100
Guessed Channel Layout for Input Stream #0.1 : 5.1
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'feeds.mp4':
  Metadata:
    major_brand : isom
    minor_version : 512
    compatible_brands: isomiso2avc1mp41
    encoder : Lavf58.37.100
  Duration: 01:00:02.13, start: 0.000000, bitrate: 14468 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 14070 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 389 kb/s (default)
    Metadata:
      handler_name : SoundHandler
Stream mapping:
  Stream #0:0 (h264) -> hwupload (graph 0)
  scale_vulkan (graph 0) -> Stream #0:0 (h264_nvenc)
  scale_vulkan (graph 0) -> Stream #0:1 (h264_nvenc)
  Stream #0:1 -> #0:2 (aac (native) -> aac (libfdk_aac))
Impossible to convert between the formats supported by the filter 'Parsed_scale_vulkan_3' and the filter 'auto_scaler_0'
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
Conversion failed!
(c). And using an OpenCL device to derive a Vulkan context (from the same underlying hardware):
ffmpeg -threads 1 -loglevel info -nostdin -y \
-fflags +genpts-fastseek \
-init_hw_device opencl=cl:0.0 -filter_hw_device cl \
-reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \
-i feeds.mp4 -filter_complex \
"[0:v]hwupload=derive_device=vulkan,split=2[s0][s1]; \
[s0]scale_vulkan=w=1920:h=1080:scaler=0[v0]; \
[s1]scale_vulkan=w=1280:h=720:scaler=0[v1]" \
-map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \
-profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \
-map "[v1]" -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \
-profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \
-map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \
-flags +global_header+cgop \
-max_muxing_queue_size 9000000 -threads 2 -f tee \
"[select=\'v:0,a\'f=mp4]'hq.mp4'| \
[select=\'v:1,a\'f=mp4]'med.mp4'"
Returns a totally new error:
ffmpeg version N-96640-gb4f300f8ea Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 8 (Ubuntu 8.3.0-6ubuntu1~18.04.1)
  configuration: --pkg-config-flags=--static --enable-static --disable-shared --prefix=/home/brainiarc7 --bindir=/home/brainiarc7/bin --extra-cflags=-I/home/brainiarc7/include --extra-ldflags=-L/home/brainiarc7/lib --enable-cuda-nvcc --enable-cuvid --enable-libnpp --extra-cflags=-I/usr/local/cuda/include/ --extra-ldflags=-L/usr/local/cuda/lib64/ --enable-nvenc --enable-libass --disable-debug --enable-libvorbis --enable-libvpx --enable-libdrm --enable-opencl --enable-libxml2 --enable-gpl --cpu=native --enable-opengl --enable-libfdk-aac --enable-libx264 --enable-libx265 --enable-openssl --enable-librtmp --enable-libpulse --enable-libzvbi --enable-librav1e --enable-libvmaf --enable-libglslang --enable-vulkan --enable-version3 --enable-pic --nvccflags='-gencode arch=compute_75,code=sm_75 -O2' --extra-libs='-lpthread -lm -lz -ldl' --enable-nonfree
  libavutil 56. 39.100 / 56. 39.100
  libavcodec 58. 67.101 / 58. 67.101
  libavformat 58. 37.100 / 58. 37.100
  libavdevice 58. 9.103 / 58. 9.103
  libavfilter 7. 74.100 / 7. 74.100
  libswscale 5. 6.100 / 5. 6.100
  libswresample 3. 6.100 / 3. 6.100
  libpostproc 55. 6.100 / 55. 6.100
Guessed Channel Layout for Input Stream #0.1 : 5.1
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'feeds.mp4':
  Metadata:
    major_brand : isom
    minor_version : 512
    compatible_brands: isomiso2avc1mp41
    encoder : Lavf58.37.100
  Duration: 01:00:02.13, start: 0.000000, bitrate: 14468 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 14070 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 389 kb/s (default)
    Metadata:
      handler_name : SoundHandler
Stream mapping:
  Stream #0:0 (h264) -> hwupload (graph 0)
  scale_vulkan (graph 0) -> Stream #0:0 (h264_nvenc)
  scale_vulkan (graph 0) -> Stream #0:1 (h264_nvenc)
  Stream #0:1 -> #0:2 (aac (native) -> aac (libfdk_aac))
[Parsed_hwupload_0 @ 0x5556532b2e00] Query format failed for 'Parsed_hwupload_0': Function not implemented
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
Conversion failed!
Attempts at pixel format conversion, with or without nvdec in place, via the snippet:
ffmpeg -threads 1 -loglevel info -nostdin -y \
-fflags +genpts-fastseek \
-init_hw_device cuda=cuda:0 -filter_hw_device cuda \
-reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \
-i feeds.mp4 -filter_complex \
"[0:v]format=nv12,hwupload=derive_device=vulkan,split=2[s0][s1]; \
[s0]scale_vulkan=w=1920:h=1080:scaler=0[v0]; \
[s1]scale_vulkan=w=1280:h=720:scaler=0[v1]" \
-map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \
-profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \
-map "[v1]" -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \
-profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \
-map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \
-flags +global_header+cgop \
-max_muxing_queue_size 9000000 -threads 2 -f tee \
"[select=\'v:0,a\'f=mp4]'hq.mp4'| \
[select=\'v:1,a\'f=mp4]'med.mp4'"
Results in the same kind of error, now reported for 'Parsed_scale_vulkan_4' and 'auto_scaler_1':
ffmpeg version N-96640-gb4f300f8ea Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 8 (Ubuntu 8.3.0-6ubuntu1~18.04.1)
  configuration: --pkg-config-flags=--static --enable-static --disable-shared --prefix=/home/brainiarc7 --bindir=/home/brainiarc7/bin --extra-cflags=-I/home/brainiarc7/include --extra-ldflags=-L/home/brainiarc7/lib --enable-cuda-nvcc --enable-cuvid --enable-libnpp --extra-cflags=-I/usr/local/cuda/include/ --extra-ldflags=-L/usr/local/cuda/lib64/ --enable-nvenc --enable-libass --disable-debug --enable-libvorbis --enable-libvpx --enable-libdrm --enable-opencl --enable-libxml2 --enable-gpl --cpu=native --enable-opengl --enable-libfdk-aac --enable-libx264 --enable-libx265 --enable-openssl --enable-librtmp --enable-libpulse --enable-libzvbi --enable-librav1e --enable-libvmaf --enable-libglslang --enable-vulkan --enable-version3 --enable-pic --nvccflags='-gencode arch=compute_75,code=sm_75 -O2' --extra-libs='-lpthread -lm -lz -ldl' --enable-nonfree
  libavutil 56. 39.100 / 56. 39.100
  libavcodec 58. 67.101 / 58. 67.101
  libavformat 58. 37.100 / 58. 37.100
  libavdevice 58. 9.103 / 58. 9.103
  libavfilter 7. 74.100 / 7. 74.100
  libswscale 5. 6.100 / 5. 6.100
  libswresample 3. 6.100 / 3. 6.100
  libpostproc 55. 6.100 / 55. 6.100
Guessed Channel Layout for Input Stream #0.1 : 5.1
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'feeds.mp4':
  Metadata:
    major_brand : isom
    minor_version : 512
    compatible_brands: isomiso2avc1mp41
    encoder : Lavf58.37.100
  Duration: 01:00:02.13, start: 0.000000, bitrate: 14468 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 14070 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 389 kb/s (default)
    Metadata:
      handler_name : SoundHandler
Stream mapping:
  Stream #0:0 (h264) -> format (graph 0)
  scale_vulkan (graph 0) -> Stream #0:0 (h264_nvenc)
  scale_vulkan (graph 0) -> Stream #0:1 (h264_nvenc)
  Stream #0:1 -> #0:2 (aac (native) -> aac (libfdk_aac))
Impossible to convert between the formats supported by the filter 'Parsed_scale_vulkan_4' and the filter 'auto_scaler_1'
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
Conversion failed!
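One possible reading of the recurring message, offered as an assumption rather than a confirmed diagnosis: the filter named next to scale_vulkan is auto_scaler, a software scaler that ffmpeg inserts on its own when two neighbouring filters share no pixel format. That would fit frames leaving scale_vulkan as Vulkan hardware frames that h264_nvenc cannot take directly. The sketch below only builds the text of a filtergraph branch that copies frames back to system memory with hwdownload before the encoder. The vk_branch helper and the choice of yuv420p (the input stream's format) are illustrative, and the resulting graph has not been run on this system.

```shell
# Hypothetical helper: print one filtergraph branch that scales on the Vulkan
# device and then downloads the result to system memory.
# Assumption (untested here): hwdownload followed by format= gives the encoder
# a software pixel format it can accept.
vk_branch() {
    # $1 = input label, $2 = width, $3 = height, $4 = output label
    printf '[%s]scale_vulkan=w=%s:h=%s:scaler=0,hwdownload,format=yuv420p[%s]' \
        "$1" "$2" "$3" "$4"
}

# Assemble the two-output graph used throughout this report.
graph="[0:v]hwupload=derive_device=vulkan,split=2[s0][s1];$(vk_branch s0 1920 1080 v0);$(vk_branch s1 1280 720 v1)"
echo "$graph"
```

The printed string would be passed to -filter_complex in place of the graphs above. Downloading costs a copy through system memory, so this is a way to test the hypothesis rather than a tuned pipeline.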
Relatedly, here is the platform information:
ffmpeg -hide_banner -v verbose -init_hw_device list
Supported hardware device types:
vdpau
cuda
drm
opencl
vulkan
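When scripting checks like this, the listing can be tested for a given device type before any filter is attempted. This is a minimal sketch: has_hw_device_type is a hypothetical helper, and the sample text is copied from the listing above rather than read from a live ffmpeg run.

```shell
# Hypothetical helper: succeed if the given device type appears as a whole
# word in the "-init_hw_device list" output read from stdin.
has_hw_device_type() {
    tr -s '[:space:]' '\n' | grep -qx -- "$1"
}

# Sample text from this report; a live check would pipe in
#   ffmpeg -hide_banner -init_hw_device list 2>&1
listing='Supported hardware device types: vdpau cuda drm opencl vulkan'

if printf '%s\n' "$listing" | has_hw_device_type vulkan; then
    echo "vulkan: listed"
else
    echo "vulkan: not listed"
fi
```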
And on Vulkan in particular:
ffmpeg -hide_banner -v verbose -init_hw_device vulkan
[AVHWDeviceContext @ 0x5569f5d90440] GPU listing:
[AVHWDeviceContext @ 0x5569f5d90440]     0: GeForce RTX 2080 (discrete) (0x1ed0)
[AVHWDeviceContext @ 0x5569f5d90440] Using device: GeForce RTX 2080
[AVHWDeviceContext @ 0x5569f5d90440] Alignments:
[AVHWDeviceContext @ 0x5569f5d90440]     optimalBufferCopyOffsetAlignment: 1
[AVHWDeviceContext @ 0x5569f5d90440]     optimalBufferCopyRowPitchAlignment: 1
[AVHWDeviceContext @ 0x5569f5d90440]     minMemoryMapAlignment: 64
[AVHWDeviceContext @ 0x5569f5d90440] Using queue family 0 for graphics, flags: (graphics) (compute) (transfer) (sparse)
[AVHWDeviceContext @ 0x5569f5d90440] Using queue family 2 for compute, flags: (compute) (transfer) (sparse)
[AVHWDeviceContext @ 0x5569f5d90440] Using queue family 1 for transfers, flags: (transfer) (sparse)
[AVHWDeviceContext @ 0x5569f5d90440] Using device extension "VK_KHR_external_memory_fd"
[AVHWDeviceContext @ 0x5569f5d90440] Extension "VK_EXT_external_memory_dma_buf" not found!
[AVHWDeviceContext @ 0x5569f5d90440] Extension "VK_EXT_image_drm_format_modifier" not found!
[AVHWDeviceContext @ 0x5569f5d90440] Using device extension "VK_KHR_external_semaphore_fd"
I'm assuming that the warnings about the two Vulkan extensions not being found, i.e. "VK_EXT_external_memory_dma_buf" and "VK_EXT_image_drm_format_modifier", may point to issues with the NVIDIA driver.
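To make such warnings easy to quote in a report or to compare across driver versions, the extension names can be pulled out of the device log. This is a minimal sketch, assuming the log wording shown above. The missing_vk_extensions helper is a hypothetical name, and the sample lines are copied from this report.

```shell
# Hypothetical helper: read an ffmpeg device log on stdin and print the name
# of every extension reported as not found, one per line.
missing_vk_extensions() {
    grep -o 'Extension "[^"]*" not found' | cut -d '"' -f 2
}

# Sample lines from this report; a live check would pipe in
#   ffmpeg -hide_banner -v verbose -init_hw_device vulkan 2>&1
sample='[AVHWDeviceContext @ 0x5569f5d90440] Using device extension "VK_KHR_external_memory_fd"
[AVHWDeviceContext @ 0x5569f5d90440] Extension "VK_EXT_external_memory_dma_buf" not found!
[AVHWDeviceContext @ 0x5569f5d90440] Extension "VK_EXT_image_drm_format_modifier" not found!'

printf '%s\n' "$sample" | missing_vk_extensions
```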
Here's the output from Vulkaninfo:
========== VULKANINFO ========== Vulkan Instance Version: 1.1.126 Instance Extensions: count = 18 ==================== VK_EXT_acquire_xlib_display : extension revision 1 VK_EXT_debug_report : extension revision 8 VK_EXT_debug_utils : extension revision 1 VK_EXT_direct_mode_display : extension revision 1 VK_EXT_display_surface_counter : extension revision 1 VK_KHR_device_group_creation : extension revision 1 VK_KHR_display : extension revision 23 VK_KHR_external_fence_capabilities : extension revision 1 VK_KHR_external_memory_capabilities : extension revision 1 VK_KHR_external_semaphore_capabilities : extension revision 1 VK_KHR_get_display_properties2 : extension revision 1 VK_KHR_get_physical_device_properties2 : extension revision 1 VK_KHR_get_surface_capabilities2 : extension revision 1 VK_KHR_surface : extension revision 25 VK_KHR_surface_protected_capabilities : extension revision 1 VK_KHR_wayland_surface : extension revision 6 VK_KHR_xcb_surface : extension revision 6 VK_KHR_xlib_surface : extension revision 6 Layers: count = 0 ======= Presentable Surfaces: ===================== GPU id : 0 (GeForce RTX 2080): Surface types: count = 2 VK_KHR_xcb_surface VK_KHR_xlib_surface Formats: count = 2 SurfaceFormat[0]: format = FORMAT_B8G8R8A8_UNORM colorSpace = COLOR_SPACE_SRGB_NONLINEAR_KHR SurfaceFormat[1]: format = FORMAT_B8G8R8A8_SRGB colorSpace = COLOR_SPACE_SRGB_NONLINEAR_KHR Present Modes: count = 3 PRESENT_MODE_FIFO_KHR PRESENT_MODE_FIFO_RELAXED_KHR PRESENT_MODE_IMMEDIATE_KHR VkSurfaceCapabilitiesKHR: ------------------------- minImageCount = 2 maxImageCount = 8 currentExtent: width = 256 height = 256 minImageExtent: width = 256 height = 256 maxImageExtent: width = 256 height = 256 maxImageArrayLayers = 1 supportedTransforms: SURFACE_TRANSFORM_IDENTITY_BIT_KHR currentTransform: SURFACE_TRANSFORM_IDENTITY_BIT_KHR supportedCompositeAlpha: COMPOSITE_ALPHA_OPAQUE_BIT_KHR supportedUsageFlags: IMAGE_USAGE_TRANSFER_SRC_BIT IMAGE_USAGE_TRANSFER_DST_BIT 
IMAGE_USAGE_SAMPLED_BIT IMAGE_USAGE_STORAGE_BIT IMAGE_USAGE_COLOR_ATTACHMENT_BIT IMAGE_USAGE_INPUT_ATTACHMENT_BIT VkSurfaceCapabilities2EXT: -------------------------- supportedSurfaceCounters: None VkSurfaceProtectedCapabilitiesKHR: ---------------------------------- supportsProtected = false Groups: ======= Device Group Properties (Group 0): physicalDeviceCount: count = 1 GeForce RTX 2080 (ID: 0) subsetAllocation = 0 Device Group Present Capabilities (Group 0): GeForce RTX 2080 (ID: 0) Can present images from the following devices: GeForce RTX 2080 (ID: 0) Present modes: DEVICE_GROUP_PRESENT_MODE_LOCAL_BIT_KHR Device Properties and Extensions: ================================= GPU0: VkPhysicalDeviceProperties: --------------------------- apiVersion = 4198519 (1.1.119) driverVersion = 1846034496 (0x6e084040) vendorID = 0x10de deviceID = 0x1ed0 deviceType = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU deviceName = GeForce RTX 2080 VkPhysicalDeviceLimits: ----------------------- maxImageDimension1D = 32768 maxImageDimension2D = 32768 maxImageDimension3D = 16384 maxImageDimensionCube = 32768 maxImageArrayLayers = 2048 maxTexelBufferElements = 134217728 maxUniformBufferRange = 65536 maxStorageBufferRange = 4294967295 maxPushConstantsSize = 256 maxMemoryAllocationCount = 4294967295 maxSamplerAllocationCount = 4000 bufferImageGranularity = 0x00000400 sparseAddressSpaceSize = 0xffffffffffffffff maxBoundDescriptorSets = 32 maxPerStageDescriptorSamplers = 1048576 maxPerStageDescriptorUniformBuffers = 1048576 maxPerStageDescriptorStorageBuffers = 1048576 maxPerStageDescriptorSampledImages = 1048576 maxPerStageDescriptorStorageImages = 1048576 maxPerStageDescriptorInputAttachments = 1048576 maxPerStageResources = 4294967295 maxDescriptorSetSamplers = 1048576 maxDescriptorSetUniformBuffers = 1048576 maxDescriptorSetUniformBuffersDynamic = 15 maxDescriptorSetStorageBuffers = 1048576 maxDescriptorSetStorageBuffersDynamic = 16 maxDescriptorSetSampledImages = 1048576 
maxDescriptorSetStorageImages = 1048576 maxDescriptorSetInputAttachments = 1048576 maxVertexInputAttributes = 32 maxVertexInputBindings = 32 maxVertexInputAttributeOffset = 2047 maxVertexInputBindingStride = 2048 maxVertexOutputComponents = 128 maxTessellationGenerationLevel = 64 maxTessellationPatchSize = 32 maxTessellationControlPerVertexInputComponents = 128 maxTessellationControlPerVertexOutputComponents = 128 maxTessellationControlPerPatchOutputComponents = 120 maxTessellationControlTotalOutputComponents = 4216 maxTessellationEvaluationInputComponents = 128 maxTessellationEvaluationOutputComponents = 128 maxGeometryShaderInvocations = 32 maxGeometryInputComponents = 128 maxGeometryOutputComponents = 128 maxGeometryOutputVertices = 1024 maxGeometryTotalOutputComponents = 1024 maxFragmentInputComponents = 128 maxFragmentOutputAttachments = 8 maxFragmentDualSrcAttachments = 1 maxFragmentCombinedOutputResources = 16 maxComputeSharedMemorySize = 49152 maxComputeWorkGroupCount: count = 3 2147483647 65535 65535 maxComputeWorkGroupInvocations = 1024 maxComputeWorkGroupSize: count = 3 1024 1024 64 subPixelPrecisionBits = 8 subTexelPrecisionBits = 8 mipmapPrecisionBits = 8 maxDrawIndexedIndexValue = 4294967295 maxDrawIndirectCount = 4294967295 maxSamplerLodBias = 15 maxSamplerAnisotropy = 16 maxViewports = 16 maxViewportDimensions: count = 2 32768 32768 viewportBoundsRange: count = 2 -65536 65536 viewportSubPixelBits = 8 minMemoryMapAlignment = 64 minTexelBufferOffsetAlignment = 0x00000010 minUniformBufferOffsetAlignment = 0x00000040 minStorageBufferOffsetAlignment = 0x00000010 minTexelOffset = -8 maxTexelOffset = 7 minTexelGatherOffset = -32 maxTexelGatherOffset = 31 minInterpolationOffset = -0.5 maxInterpolationOffset = 0.4375 subPixelInterpolationOffsetBits = 4 maxFramebufferWidth = 32768 maxFramebufferHeight = 32768 maxFramebufferLayers = 2048 framebufferColorSampleCounts: SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT 
framebufferDepthSampleCounts: SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT framebufferStencilSampleCounts: SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT SAMPLE_COUNT_16_BIT framebufferNoAttachmentsSampleCounts: SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT SAMPLE_COUNT_16_BIT maxColorAttachments = 8 sampledImageColorSampleCounts: SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT sampledImageIntegerSampleCounts: SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT sampledImageDepthSampleCounts: SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT sampledImageStencilSampleCounts: SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT SAMPLE_COUNT_16_BIT storageImageSampleCounts: SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT maxSampleMaskWords = 1 timestampComputeAndGraphics = true timestampPeriod = 1 maxClipDistances = 8 maxCullDistances = 8 maxCombinedClipAndCullDistances = 8 discreteQueuePriorities = 2 pointSizeRange: count = 2 1 2047.94 lineWidthRange: count = 2 1 64 pointSizeGranularity = 0.0625 lineWidthGranularity = 0.0625 strictLines = true standardSampleLocations = true optimalBufferCopyOffsetAlignment = 0x00000001 optimalBufferCopyRowPitchAlignment = 0x00000001 nonCoherentAtomSize = 0x00000040 VkPhysicalDeviceSparseProperties: --------------------------------- residencyStandard2DBlockShape = true residencyStandard2DMultisampleBlockShape = true residencyStandard3DBlockShape = true residencyAlignedMipSize = false residencyNonResidentStrict = true VkPhysicalDeviceBlendOperationAdvancedPropertiesEXT: ---------------------------------------------------- advancedBlendMaxColorAttachments = 8 advancedBlendIndependentBlend = false advancedBlendNonPremultipliedSrcColor = true advancedBlendNonPremultipliedDstColor = true advancedBlendCorrelatedOverlap = true 
advancedBlendAllOperations = true VkPhysicalDeviceConservativeRasterizationPropertiesEXT: ------------------------------------------------------- primitiveOverestimationSize = 0.00195312 maxExtraPrimitiveOverestimationSize = 0.75 extraPrimitiveOverestimationSizeGranularity = 0.25 primitiveUnderestimation = true conservativePointAndLineRasterization = true degenerateTrianglesRasterized = true degenerateLinesRasterized = true fullyCoveredFragmentShaderInputVariable = true conservativeRasterizationPostDepthCoverage = true VkPhysicalDeviceDepthStencilResolvePropertiesKHR: ------------------------------------------------- supportedDepthResolveModes: RESOLVE_MODE_SAMPLE_ZERO_BIT_KHR RESOLVE_MODE_AVERAGE_BIT_KHR RESOLVE_MODE_MIN_BIT_KHR RESOLVE_MODE_MAX_BIT_KHR supportedStencilResolveModes: RESOLVE_MODE_SAMPLE_ZERO_BIT_KHR RESOLVE_MODE_MIN_BIT_KHR RESOLVE_MODE_MAX_BIT_KHR independentResolveNone = true independentResolve = true VkPhysicalDeviceDescriptorIndexingPropertiesEXT: ------------------------------------------------ maxUpdateAfterBindDescriptorsInAllPools = 4294967295 shaderUniformBufferArrayNonUniformIndexingNative = true shaderSampledImageArrayNonUniformIndexingNative = true shaderStorageBufferArrayNonUniformIndexingNative = true shaderStorageImageArrayNonUniformIndexingNative = true shaderInputAttachmentArrayNonUniformIndexingNative = true robustBufferAccessUpdateAfterBind = true quadDivergentImplicitLod = true maxPerStageDescriptorUpdateAfterBindSamplers = 1048576 maxPerStageDescriptorUpdateAfterBindUniformBuffers = 1048576 maxPerStageDescriptorUpdateAfterBindStorageBuffers = 1048576 maxPerStageDescriptorUpdateAfterBindSampledImages = 1048576 maxPerStageDescriptorUpdateAfterBindStorageImages = 1048576 maxPerStageDescriptorUpdateAfterBindInputAttachments = 1048576 maxPerStageUpdateAfterBindResources = 4294967295 maxDescriptorSetUpdateAfterBindSamplers = 1048576 maxDescriptorSetUpdateAfterBindUniformBuffers = 1048576 
maxDescriptorSetUpdateAfterBindUniformBuffersDynamic = 15 maxDescriptorSetUpdateAfterBindStorageBuffers = 1048576 maxDescriptorSetUpdateAfterBindStorageBuffersDynamic = 16 maxDescriptorSetUpdateAfterBindSampledImages = 1048576 maxDescriptorSetUpdateAfterBindStorageImages = 1048576 maxDescriptorSetUpdateAfterBindInputAttachments = 1048576 VkPhysicalDeviceDiscardRectanglePropertiesEXT: ---------------------------------------------- maxDiscardRectangles = 8 VkPhysicalDeviceDriverPropertiesKHR: ------------------------------------ driverID = DRIVER_ID_NVIDIA_PROPRIETARY_KHR driverName = NVIDIA driverInfo = 440.33.01 conformanceVersion = 1.1.6.0 VkPhysicalDeviceFloatControlsPropertiesKHR: ------------------------------------------- denormBehaviorIndependence = SHADER_FLOAT_CONTROLS_INDEPENDENCE_ALL_KHR roundingModeIndependence = SHADER_FLOAT_CONTROLS_INDEPENDENCE_ALL_KHR shaderSignedZeroInfNanPreserveFloat16 = true shaderSignedZeroInfNanPreserveFloat32 = true shaderSignedZeroInfNanPreserveFloat64 = true shaderDenormPreserveFloat16 = true shaderDenormPreserveFloat32 = false shaderDenormPreserveFloat64 = false shaderDenormFlushToZeroFloat16 = false shaderDenormFlushToZeroFloat32 = false shaderDenormFlushToZeroFloat64 = false shaderRoundingModeRTEFloat16 = true shaderRoundingModeRTEFloat32 = true shaderRoundingModeRTEFloat64 = true shaderRoundingModeRTZFloat16 = false shaderRoundingModeRTZFloat32 = true shaderRoundingModeRTZFloat64 = true VkPhysicalDeviceIDProperties: ----------------------------- deviceUUID = e437e085-c68a-481d-f32-2acf2d506f69 driverUUID = c1b695f8-3620-ed2a-a1eb-5e40ab29d4e deviceNodeMask = 1 deviceLUIDValid = false VkPhysicalDeviceInlineUniformBlockPropertiesEXT: ------------------------------------------------ maxInlineUniformBlockSize = 256 maxPerStageDescriptorInlineUniformBlocks = 32 maxPerStageDescriptorUpdateAfterBindInlineUniformBlocks = 32 maxDescriptorSetInlineUniformBlocks = 32 maxDescriptorSetUpdateAfterBindInlineUniformBlocks = 32 
VkPhysicalDeviceLineRasterizationPropertiesEXT: ----------------------------------------------- lineSubPixelPrecisionBits = 8 VkPhysicalDeviceMaintenance3Properties: --------------------------------------- maxPerSetDescriptors = 4294967295 maxMemoryAllocationSize = 0xffe00000 VkPhysicalDeviceMultiviewProperties: ------------------------------------ maxMultiviewViewCount = 32 maxMultiviewInstanceIndex = 134217727 VkPhysicalDevicePCIBusInfoPropertiesEXT: ---------------------------------------- pciDomain = 0 pciBus = 1 pciDevice = 0 pciFunction = 0 VkPhysicalDevicePointClippingProperties: ---------------------------------------- pointClippingBehavior = POINT_CLIPPING_BEHAVIOR_USER_CLIP_PLANES_ONLY VkPhysicalDeviceProtectedMemoryProperties: ------------------------------------------ protectedNoFault = false VkPhysicalDevicePushDescriptorPropertiesKHR: -------------------------------------------- maxPushDescriptors = 32 VkPhysicalDeviceSampleLocationsPropertiesEXT: --------------------------------------------- sampleLocationSampleCounts: SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT SAMPLE_COUNT_16_BIT maxSampleLocationGridSize: width = 1 height = 1 sampleLocationCoordinateRange: count = 2 0 0.9375 sampleLocationSubPixelBits = 4 variableSampleLocations = true VkPhysicalDeviceSamplerFilterMinmaxPropertiesEXT: ------------------------------------------------- filterMinmaxSingleComponentFormats = true filterMinmaxImageComponentMapping = true VkPhysicalDeviceSubgroupProperties: ----------------------------------- subgroupSize = 32 supportedStages: SHADER_STAGE_VERTEX_BIT SHADER_STAGE_TESSELLATION_CONTROL_BIT SHADER_STAGE_TESSELLATION_EVALUATION_BIT SHADER_STAGE_GEOMETRY_BIT SHADER_STAGE_FRAGMENT_BIT SHADER_STAGE_COMPUTE_BIT SHADER_STAGE_ALL_GRAPHICS SHADER_STAGE_ALL SHADER_STAGE_RAYGEN_BIT_NV SHADER_STAGE_ANY_HIT_BIT_NV SHADER_STAGE_CLOSEST_HIT_BIT_NV SHADER_STAGE_MISS_BIT_NV SHADER_STAGE_INTERSECTION_BIT_NV SHADER_STAGE_CALLABLE_BIT_NV 
SHADER_STAGE_TASK_BIT_NV SHADER_STAGE_MESH_BIT_NV supportedOperations: SUBGROUP_FEATURE_BASIC_BIT SUBGROUP_FEATURE_VOTE_BIT SUBGROUP_FEATURE_ARITHMETIC_BIT SUBGROUP_FEATURE_BALLOT_BIT SUBGROUP_FEATURE_SHUFFLE_BIT SUBGROUP_FEATURE_SHUFFLE_RELATIVE_BIT SUBGROUP_FEATURE_CLUSTERED_BIT SUBGROUP_FEATURE_QUAD_BIT SUBGROUP_FEATURE_PARTITIONED_BIT_NV quadOperationsInAllStages = true VkPhysicalDeviceSubgroupSizeControlPropertiesEXT: ------------------------------------------------- minSubgroupSize = 32 maxSubgroupSize = 32 maxComputeWorkgroupSubgroups = 2097152 requiredSubgroupSizeStages: SHADER_STAGE_VERTEX_BIT SHADER_STAGE_TESSELLATION_CONTROL_BIT SHADER_STAGE_TESSELLATION_EVALUATION_BIT SHADER_STAGE_GEOMETRY_BIT SHADER_STAGE_FRAGMENT_BIT SHADER_STAGE_COMPUTE_BIT SHADER_STAGE_ALL_GRAPHICS SHADER_STAGE_ALL SHADER_STAGE_RAYGEN_BIT_NV SHADER_STAGE_ANY_HIT_BIT_NV SHADER_STAGE_CLOSEST_HIT_BIT_NV SHADER_STAGE_MISS_BIT_NV SHADER_STAGE_INTERSECTION_BIT_NV SHADER_STAGE_CALLABLE_BIT_NV SHADER_STAGE_TASK_BIT_NV SHADER_STAGE_MESH_BIT_NV VkPhysicalDeviceTexelBufferAlignmentPropertiesEXT: -------------------------------------------------- storageTexelBufferOffsetAlignmentBytes = 0x00000010 storageTexelBufferOffsetSingleTexelAlignment = true uniformTexelBufferOffsetAlignmentBytes = 0x00000010 uniformTexelBufferOffsetSingleTexelAlignment = true VkPhysicalDeviceTimelineSemaphorePropertiesKHR: ----------------------------------------------- maxTimelineSemaphoreValueDifference = 18446744073709551615 VkPhysicalDeviceTransformFeedbackPropertiesEXT: ----------------------------------------------- maxTransformFeedbackStreams = 4 maxTransformFeedbackBuffers = 4 maxTransformFeedbackBufferSize = 0xffffffffffffffff maxTransformFeedbackStreamDataSize = 2048 maxTransformFeedbackBufferDataSize = 512 maxTransformFeedbackBufferDataStride = 2048 transformFeedbackQueries = true transformFeedbackStreamsLinesTriangles = false transformFeedbackRasterizationStreamSelect = true transformFeedbackDraw = true 
VkPhysicalDeviceVertexAttributeDivisorPropertiesEXT: ---------------------------------------------------- maxVertexAttribDivisor = 4294967295 Device Extensions: count = 100 VK_EXT_blend_operation_advanced : extension revision 2 VK_EXT_buffer_device_address : extension revision 2 VK_EXT_calibrated_timestamps : extension revision 1 VK_EXT_conditional_rendering : extension revision 1 VK_EXT_conservative_rasterization : extension revision 1 VK_EXT_depth_clip_enable : extension revision 1 VK_EXT_depth_range_unrestricted : extension revision 1 VK_EXT_descriptor_indexing : extension revision 2 VK_EXT_discard_rectangles : extension revision 1 VK_EXT_display_control : extension revision 1 VK_EXT_fragment_shader_interlock : extension revision 1 VK_EXT_global_priority : extension revision 2 VK_EXT_host_query_reset : extension revision 1 VK_EXT_index_type_uint8 : extension revision 1 VK_EXT_inline_uniform_block : extension revision 1 VK_EXT_line_rasterization : extension revision 1 VK_EXT_memory_budget : extension revision 1 VK_EXT_pci_bus_info : extension revision 2 VK_EXT_pipeline_creation_feedback : extension revision 1 VK_EXT_post_depth_coverage : extension revision 1 VK_EXT_sample_locations : extension revision 1 VK_EXT_sampler_filter_minmax : extension revision 1 VK_EXT_scalar_block_layout : extension revision 1 VK_EXT_separate_stencil_usage : extension revision 1 VK_EXT_shader_demote_to_helper_invocation : extension revision 1 VK_EXT_shader_subgroup_ballot : extension revision 1 VK_EXT_shader_subgroup_vote : extension revision 1 VK_EXT_shader_viewport_index_layer : extension revision 1 VK_EXT_subgroup_size_control : extension revision 2 VK_EXT_texel_buffer_alignment : extension revision 1 VK_EXT_transform_feedback : extension revision 1 VK_EXT_vertex_attribute_divisor : extension revision 3 VK_EXT_ycbcr_image_arrays : extension revision 1 VK_KHR_16bit_storage : extension revision 1 VK_KHR_8bit_storage : extension revision 1 VK_KHR_bind_memory2 : extension revision 1 
VK_KHR_create_renderpass2 : extension revision 1 VK_KHR_dedicated_allocation : extension revision 3 VK_KHR_depth_stencil_resolve : extension revision 1 VK_KHR_descriptor_update_template : extension revision 1 VK_KHR_device_group : extension revision 3 VK_KHR_draw_indirect_count : extension revision 1 VK_KHR_driver_properties : extension revision 1 VK_KHR_external_fence : extension revision 1 VK_KHR_external_fence_fd : extension revision 1 VK_KHR_external_memory : extension revision 1 VK_KHR_external_memory_fd : extension revision 1 VK_KHR_external_semaphore : extension revision 1 VK_KHR_external_semaphore_fd : extension revision 1 VK_KHR_get_memory_requirements2 : extension revision 1 VK_KHR_image_format_list : extension revision 1 VK_KHR_imageless_framebuffer : extension revision 1 VK_KHR_maintenance1 : extension revision 2 VK_KHR_maintenance2 : extension revision 1 VK_KHR_maintenance3 : extension revision 1 VK_KHR_multiview : extension revision 1 VK_KHR_pipeline_executable_properties : extension revision 1 VK_KHR_push_descriptor : extension revision 2 VK_KHR_relaxed_block_layout : extension revision 1 VK_KHR_sampler_mirror_clamp_to_edge : extension revision 1 VK_KHR_sampler_ycbcr_conversion : extension revision 1 VK_KHR_shader_atomic_int64 : extension revision 1 VK_KHR_shader_draw_parameters : extension revision 1 VK_KHR_shader_float16_int8 : extension revision 1 VK_KHR_shader_float_controls : extension revision 4 VK_KHR_storage_buffer_storage_class : extension revision 1 VK_KHR_swapchain : extension revision 70 VK_KHR_swapchain_mutable_format : extension revision 1 VK_KHR_timeline_semaphore : extension revision 2 VK_KHR_uniform_buffer_standard_layout : extension revision 1 VK_KHR_variable_pointers : extension revision 1 VK_KHR_vulkan_memory_model : extension revision 3 VK_NVX_binary_import : extension revision 1 VK_NVX_device_generated_commands : extension revision 3 VK_NVX_image_view_handle : extension revision 1 VK_NVX_multiview_per_view_attributes : extension 
revision 1 VK_NV_clip_space_w_scaling : extension revision 1 VK_NV_compute_shader_derivatives : extension revision 1 VK_NV_cooperative_matrix : extension revision 1 VK_NV_corner_sampled_image : extension revision 2 VK_NV_coverage_reduction_mode : extension revision 1 VK_NV_dedicated_allocation : extension revision 1 VK_NV_dedicated_allocation_image_aliasing : extension revision 1 VK_NV_device_diagnostic_checkpoints : extension revision 2 VK_NV_fill_rectangle : extension revision 1 VK_NV_fragment_coverage_to_color : extension revision 1 VK_NV_fragment_shader_barycentric : extension revision 1 VK_NV_framebuffer_mixed_samples : extension revision 1 VK_NV_geometry_shader_passthrough : extension revision 1 VK_NV_mesh_shader : extension revision 1 VK_NV_ray_tracing : extension revision 3 VK_NV_representative_fragment_test : extension revision 1 VK_NV_sample_mask_override_coverage : extension revision 1 VK_NV_scissor_exclusive : extension revision 1 VK_NV_shader_image_footprint : extension revision 1 VK_NV_shader_sm_builtins : extension revision 1 VK_NV_shader_subgroup_partitioned : extension revision 1 VK_NV_shading_rate_image : extension revision 3 VK_NV_viewport_array2 : extension revision 1 VK_NV_viewport_swizzle : extension revision 1 VkQueueFamilyProperties[0]: ========================== minImageTransferGranularity = (1, 1, 1) queueCount = 16 queueFlags = QUEUE_GRAPHICS | QUEUE_COMPUTE | QUEUE_TRANSFER | QUEUE_SPARSE_BINDING timestampValidBits = 64 present support: VK_KHR_xcb_surface = true VK_KHR_xlib_surface = true VkQueueFamilyProperties[1]: ========================== minImageTransferGranularity = (1, 1, 1) queueCount = 2 queueFlags = QUEUE_TRANSFER | QUEUE_SPARSE_BINDING timestampValidBits = 64 present support = false VkQueueFamilyProperties[2]: ========================== minImageTransferGranularity = (1, 1, 1) queueCount = 8 queueFlags = QUEUE_COMPUTE | QUEUE_TRANSFER | QUEUE_SPARSE_BINDING timestampValidBits = 64 present support: VK_KHR_xcb_surface = false 
VK_KHR_xlib_surface = true VkPhysicalDeviceMemoryProperties: ================================= memoryHeaps: count = 2 memoryHeaps[0]: size = 8589934592 (0x200000000) (8.00 GiB) budget = 8033009664 usage = 0 flags: MEMORY_HEAP_DEVICE_LOCAL_BIT memoryHeaps[1]: size = 50584952832 (0xbc7192000) (47.11 GiB) budget = 50584952832 usage = 0 flags: None memoryTypes: count = 11 memoryTypes[0]: heapIndex = 1 propertyFlags = 0x0000: None usable for: IMAGE_TILING_OPTIMAL: None IMAGE_TILING_LINEAR: None memoryTypes[1]: heapIndex = 1 propertyFlags = 0x0000: None usable for: IMAGE_TILING_OPTIMAL: color images IMAGE_TILING_LINEAR: None memoryTypes[2]: heapIndex = 1 propertyFlags = 0x0000: None usable for: IMAGE_TILING_OPTIMAL: FORMAT_D16_UNORM IMAGE_TILING_LINEAR: None memoryTypes[3]: heapIndex = 1 propertyFlags = 0x0000: None usable for: IMAGE_TILING_OPTIMAL: FORMAT_X8_D24_UNORM_PACK32, FORMAT_D24_UNORM_S8_UINT IMAGE_TILING_LINEAR: None memoryTypes[4]: heapIndex = 1 propertyFlags = 0x0000: None usable for: IMAGE_TILING_OPTIMAL: FORMAT_D32_SFLOAT IMAGE_TILING_LINEAR: None memoryTypes[5]: heapIndex = 1 propertyFlags = 0x0000: None usable for: IMAGE_TILING_OPTIMAL: FORMAT_D32_SFLOAT_S8_UINT IMAGE_TILING_LINEAR: None memoryTypes[6]: heapIndex = 1 propertyFlags = 0x0000: None usable for: IMAGE_TILING_OPTIMAL: FORMAT_S8_UINT IMAGE_TILING_LINEAR: None memoryTypes[7]: heapIndex = 0 propertyFlags = 0x0001: MEMORY_PROPERTY_DEVICE_LOCAL_BIT usable for: IMAGE_TILING_OPTIMAL: color images, FORMAT_D16_UNORM, FORMAT_X8_D24_UNORM_PACK32, FORMAT_D32_SFLOAT, FORMAT_S8_UINT, FORMAT_D24_UNORM_S8_UINT, FORMAT_D32_SFLOAT_S8_UINT IMAGE_TILING_LINEAR: None memoryTypes[8]: heapIndex = 0 propertyFlags = 0x0001: MEMORY_PROPERTY_DEVICE_LOCAL_BIT usable for: IMAGE_TILING_OPTIMAL: None IMAGE_TILING_LINEAR: None memoryTypes[9]: heapIndex = 1 propertyFlags = 0x0006: MEMORY_PROPERTY_HOST_VISIBLE_BIT MEMORY_PROPERTY_HOST_COHERENT_BIT usable for: IMAGE_TILING_OPTIMAL: None IMAGE_TILING_LINEAR: None memoryTypes[10]: 
heapIndex = 1 propertyFlags = 0x000e: MEMORY_PROPERTY_HOST_VISIBLE_BIT MEMORY_PROPERTY_HOST_COHERENT_BIT MEMORY_PROPERTY_HOST_CACHED_BIT usable for: IMAGE_TILING_OPTIMAL: None IMAGE_TILING_LINEAR: None VkPhysicalDeviceFeatures: ========================= robustBufferAccess = true fullDrawIndexUint32 = true imageCubeArray = true independentBlend = true geometryShader = true tessellationShader = true sampleRateShading = true dualSrcBlend = true logicOp = true multiDrawIndirect = true drawIndirectFirstInstance = true depthClamp = true depthBiasClamp = true fillModeNonSolid = true depthBounds = true wideLines = true largePoints = true alphaToOne = true multiViewport = true samplerAnisotropy = true textureCompressionETC2 = false textureCompressionASTC_LDR = false textureCompressionBC = true occlusionQueryPrecise = true pipelineStatisticsQuery = true vertexPipelineStoresAndAtomics = true fragmentStoresAndAtomics = true shaderTessellationAndGeometryPointSize = true shaderImageGatherExtended = true shaderStorageImageExtendedFormats = true shaderStorageImageMultisample = true shaderStorageImageReadWithoutFormat = true shaderStorageImageWriteWithoutFormat = true shaderUniformBufferArrayDynamicIndexing = true shaderSampledImageArrayDynamicIndexing = true shaderStorageBufferArrayDynamicIndexing = true shaderStorageImageArrayDynamicIndexing = true shaderClipDistance = true shaderCullDistance = true shaderFloat64 = true shaderInt64 = true shaderInt16 = true shaderResourceResidency = true shaderResourceMinLod = true sparseBinding = true sparseResidencyBuffer = true sparseResidencyImage2D = true sparseResidencyImage3D = true sparseResidency2Samples = true sparseResidency4Samples = true sparseResidency8Samples = true sparseResidency16Samples = true sparseResidencyAliased = true variableMultisampleRate = true inheritedQueries = true VkPhysicalDevice16BitStorageFeatures: ------------------------------------- storageBuffer16BitAccess = true uniformAndStorageBuffer16BitAccess = true 
storagePushConstant16 = true storageInputOutput16 = false VkPhysicalDevice8BitStorageFeaturesKHR: --------------------------------------- storageBuffer8BitAccess = true uniformAndStorageBuffer8BitAccess = true storagePushConstant8 = true VkPhysicalDeviceBlendOperationAdvancedFeaturesEXT: -------------------------------------------------- advancedBlendCoherentOperations = true VkPhysicalDeviceBufferDeviceAddressFeaturesEXT: ----------------------------------------------- bufferDeviceAddress = true bufferDeviceAddressCaptureReplay = false bufferDeviceAddressMultiDevice = true VkPhysicalDeviceConditionalRenderingFeaturesEXT: ------------------------------------------------ conditionalRendering = true inheritedConditionalRendering = true VkPhysicalDeviceDepthClipEnableFeaturesEXT: ------------------------------------------- depthClipEnable = true VkPhysicalDeviceDescriptorIndexingFeaturesEXT: ---------------------------------------------- shaderInputAttachmentArrayDynamicIndexing = true shaderUniformTexelBufferArrayDynamicIndexing = true shaderStorageTexelBufferArrayDynamicIndexing = true shaderUniformBufferArrayNonUniformIndexing = true shaderSampledImageArrayNonUniformIndexing = true shaderStorageBufferArrayNonUniformIndexing = true shaderStorageImageArrayNonUniformIndexing = true shaderInputAttachmentArrayNonUniformIndexing = true shaderUniformTexelBufferArrayNonUniformIndexing = true shaderStorageTexelBufferArrayNonUniformIndexing = true descriptorBindingUniformBufferUpdateAfterBind = true descriptorBindingSampledImageUpdateAfterBind = true descriptorBindingStorageImageUpdateAfterBind = true descriptorBindingStorageBufferUpdateAfterBind = true descriptorBindingUniformTexelBufferUpdateAfterBind = true descriptorBindingStorageTexelBufferUpdateAfterBind = true descriptorBindingUpdateUnusedWhilePending = true descriptorBindingPartiallyBound = true descriptorBindingVariableDescriptorCount = true runtimeDescriptorArray = true 
VkPhysicalDeviceFragmentShaderInterlockFeaturesEXT: --------------------------------------------------- fragmentShaderSampleInterlock = true fragmentShaderPixelInterlock = true fragmentShaderShadingRateInterlock = true VkPhysicalDeviceHostQueryResetFeaturesEXT: ------------------------------------------ hostQueryReset = true VkPhysicalDeviceImagelessFramebufferFeaturesKHR: ------------------------------------------------ imagelessFramebuffer = true VkPhysicalDeviceIndexTypeUint8FeaturesEXT: ------------------------------------------ indexTypeUint8 = true VkPhysicalDeviceInlineUniformBlockFeaturesEXT: ---------------------------------------------- inlineUniformBlock = true descriptorBindingInlineUniformBlockUpdateAfterBind = true VkPhysicalDeviceLineRasterizationFeaturesEXT: --------------------------------------------- rectangularLines = true bresenhamLines = true smoothLines = true stippledRectangularLines = true stippledBresenhamLines = true stippledSmoothLines = true VkPhysicalDeviceMultiviewFeatures: ---------------------------------- multiview = true multiviewGeometryShader = true multiviewTessellationShader = true VkPhysicalDevicePipelineExecutablePropertiesFeaturesKHR: -------------------------------------------------------- pipelineExecutableInfo = true VkPhysicalDeviceProtectedMemoryFeatures: ---------------------------------------- protectedMemory = false VkPhysicalDeviceSamplerYcbcrConversionFeatures: ----------------------------------------------- samplerYcbcrConversion = true VkPhysicalDeviceScalarBlockLayoutFeaturesEXT: --------------------------------------------- scalarBlockLayout = true VkPhysicalDeviceShaderAtomicInt64FeaturesKHR: --------------------------------------------- shaderBufferInt64Atomics = true shaderSharedInt64Atomics = true VkPhysicalDeviceShaderDemoteToHelperInvocationFeaturesEXT: ---------------------------------------------------------- shaderDemoteToHelperInvocation = true VkPhysicalDeviceShaderDrawParametersFeatures: 
--------------------------------------------- shaderDrawParameters = true VkPhysicalDeviceShaderFloat16Int8FeaturesKHR: --------------------------------------------- shaderFloat16 = true shaderInt8 = true VkPhysicalDeviceSubgroupSizeControlFeaturesEXT: ----------------------------------------------- subgroupSizeControl = true computeFullSubgroups = true VkPhysicalDeviceTexelBufferAlignmentFeaturesEXT: ------------------------------------------------ texelBufferAlignment = true VkPhysicalDeviceTimelineSemaphoreFeaturesKHR: --------------------------------------------- timelineSemaphore = true VkPhysicalDeviceTransformFeedbackFeaturesEXT: --------------------------------------------- transformFeedback = true geometryStreams = true VkPhysicalDeviceUniformBufferStandardLayoutFeaturesKHR: ------------------------------------------------------- uniformBufferStandardLayout = true VkPhysicalDeviceVariablePointersFeatures: ----------------------------------------- variablePointersStorageBuffer = true variablePointers = true VkPhysicalDeviceVertexAttributeDivisorFeaturesEXT: -------------------------------------------------- vertexAttributeInstanceRateDivisor = true vertexAttributeInstanceRateZeroDivisor = true VkPhysicalDeviceVulkanMemoryModelFeaturesKHR: --------------------------------------------- vulkanMemoryModel = true vulkanMemoryModelDeviceScope = true vulkanMemoryModelAvailabilityVisibilityChains = true VkPhysicalDeviceYcbcrImageArraysFeaturesEXT: -------------------------------------------- ycbcrImageArrays = true
And on OpenCL, I get:
ffmpeg -hide_banner -v verbose -init_hw_device opencl [AVHWDeviceContext @ 0x560de098d440] 0.0: NVIDIA CUDA / GeForce RTX 2080
In conclusion:
- What pixel formats are supported by the scale_vulkan filter? So far, no such information is provided in the documentation. See the output of:
ffmpeg -h filter=scale_vulkan Filter scale_vulkan Scale Vulkan frames Inputs: #0: default (video) Outputs: #0: default (video) scale_vulkan AVOptions: w <string> ..FV...... Output video width (default "iw") h <string> ..FV...... Output video height (default "ih") scaler <int> ..FV...... Scaler function (from 0 to 2) (default bilinear) bilinear 0 ..FV...... Bilinear interpolation (fastest) nearest 1 ..FV...... Nearest (useful for pixel art)
- With the new code path for HW-HW transfers, as documented in this patch: https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/d7210ce7f5418508d6f8eec6e90d978e06a2d49e
What CLI options, apart from passing the hardware name to initialize via ffmpeg as shown above, are required to get the scale_vulkan filter working on NVIDIA hardware? As shown above, neither hwupload nor hwmap with device derivation for Vulkan yields usable results.
Change History (7)
comment:1 by , 5 years ago
comment:2 by , 5 years ago
Hello @richardpl,
Your advice above works very well.
Here are two samples that work, one with NVDEC and the other without:
(a). With nvdec:
ffmpeg -threads 1 -loglevel info -nostdin -y \ -fflags +genpts-fastseek \ -init_hw_device vulkan=gpu:0.0 -filter_hw_device gpu \ -hwaccel nvdec -hwaccel_device 0 -extra_hw_frames 2 \ -reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \ -i feeds.mp4 -filter_complex \ "[0:v]format=nv12,hwupload,split=2[s0][s1]; \ [s0]scale_vulkan=w=1920:h=1080:scaler=0,hwdownload,format=nv12[v0]; \ [s1]scale_vulkan=w=1280:h=720:scaler=0,hwdownload,format=nv12[v1]" \ -map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \ -profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \ -map "[v1]" -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \ -profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \ -map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \ -flags +global_header+cgop \ -max_muxing_queue_size 9000000 -threads 2 -f tee \ "[select=\'v:0,a\':f=mp4]'hq.mp4'| \ [select=\'v:1,a\':f=mp4]'med.mp4'"
(b). No hwaccel decode:
ffmpeg -threads 1 -loglevel info -nostdin -y \ -fflags +genpts-fastseek \ -init_hw_device vulkan=gpu:0.0 -filter_hw_device gpu \ -reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \ -i feeds.mp4 -filter_complex \ "[0:v]format=nv12,hwupload,split=2[s0][s1]; \ [s0]scale_vulkan=w=1920:h=1080:scaler=0,hwdownload,format=nv12[v0]; \ [s1]scale_vulkan=w=1280:h=720:scaler=0,hwdownload,format=nv12[v1]" \ -map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \ -profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \ -map "[v1]" -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \ -profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \ -map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \ -flags +global_header+cgop \ -max_muxing_queue_size 9000000 -threads 2 -f tee \ "[select=\'v:0,a\':f=mp4]'hq.mp4'| \ [select=\'v:1,a\':f=mp4]'med.mp4'"
Note the repeated use of the hwdownload filter trailing each scale_vulkan instance.
Where it all falls apart is if you attempt device derivation via hwupload.
The example above with NVDEC will fail if you attempt device derivation in hwupload so as to re-use the same context for hwaccel via nvdec:
ffmpeg -threads 1 -loglevel info -nostdin -y \ -fflags +genpts-fastseek \ -init_hw_device vulkan=gpu:0.0 -filter_hw_device gpu \ -hwaccel nvdec -hwaccel_device cuda -extra_hw_frames 2 \ -reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \ -i feeds.mp4 -filter_complex \ "[0:v]format=nv12,hwupload=derive_device=cuda,split=2[s0][s1]; \ [s0]scale_vulkan=w=1920:h=1080:scaler=0,hwdownload,format=nv12[v0]; \ [s1]scale_vulkan=w=1280:h=720:scaler=0,hwdownload,format=nv12[v1]" \ -map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \ -profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \ -map "[v1]" -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \ -profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \ -map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \ -flags +global_header+cgop \ -max_muxing_queue_size 9000000 -threads 2 -f tee \ "[select=\'v:0,a\':f=mp4]'hq.mp4'| \ [select=\'v:1,a\':f=mp4]'med.mp4'"
Error message:
Impossible to convert between the formats supported by the filter 'Parsed_split_2' and the filter 'auto_scaler_0' Error reinitializing filters! Failed to inject frame into filter network: Function not implemented Error while processing the decoded data for stream #0:0 Conversion failed!
comment:4 by , 5 years ago
Owner: | set to |
---|---|
Status: | new → open |
You need to tell the decoder to actually output to GPU memory - by default hwaccel always copies back to system memory.
So here's an example that decodes with nvdec, scales with vulkan and encodes with nvenc, all in GPU memory with no copy back to system memory. If you want to encode with a software encoder, then you need the hwdownload,format=nv12 at the end instead of the second hwupload.
ffmpeg -hwaccel nvdec -hwaccel_output_format cuda -i in.mp4 -init_hw_device cuda=cuda:0 -filter_hw_device cuda -vf hwupload=derive_device=vulkan,scale_vulkan=w=1920:h=1440,hwupload=derive_device=cuda -c:v hevc_nvenc out.mp4
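The software-encoder variant described above can be sketched as follows. This is only an illustration under stated assumptions: the file names in.mp4 and out.mp4, the choice of libx264, and the small run helper with its DRY_RUN variable are not from the ticket. The filter chain is the one from the command above, with the trailing hwupload=derive_device=cuda swapped for hwdownload,format=nv12.

```shell
#!/bin/sh
# Sketch only: in.mp4, out.mp4, libx264 and the run/DRY_RUN helper are
# assumptions made for illustration.

# Same chain as the all-GPU command, except that the final
# hwupload=derive_device=cuda is replaced by hwdownload,format=nv12 so the
# scaled frames are copied back to system memory for a software encoder.
VF="hwupload=derive_device=vulkan,scale_vulkan=w=1920:h=1440,hwdownload,format=nv12"

# Print the command; execute it only when DRY_RUN=0 is set.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = "1" ] || "$@"
}

run ffmpeg -hwaccel nvdec -hwaccel_output_format cuda -i in.mp4 \
    -init_hw_device cuda=cuda:0 -filter_hw_device cuda \
    -vf "$VF" -c:v libx264 out.mp4
```

By default the script only prints the command line; invoking it with DRY_RUN=0 runs ffmpeg. Expect this path to be slower than the all-GPU one, since every frame makes a round trip through system memory.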
comment:5 by , 5 years ago
Hello @philipl,
With your recommendation:
ffmpeg -threads 1 -loglevel info -nostdin -y \ -fflags +genpts-fastseek \ -init_hw_device cuda=cuda:0 -filter_hw_device cuda \ -hwaccel nvdec -hwaccel_output_format cuda -extra_hw_frames 2 \ -reinit_filter 1 -vsync 1 -async 1 -filter_threads 2 -filter_complex_threads 2 \ -i feeds.mp4 -filter_complex \ "[0:v]hwupload=derive_device=vulkan,split=2[s0][s1]; \ [s0]scale_vulkan=w=1920:h=1080:scaler=0,hwupload=derive_device=cuda[v0]; \ [s1]scale_vulkan=w=1280:h=720:scaler=0,hwupload=derive_device=cuda[v1]" \ -map "[v0]" -b:v:0 5800k -minrate:v:0 5800k -maxrate:v:0 5800k -bufsize:v:0 5800k -c:v:0 h264_nvenc -r:v:0 ntsc \ -profile:v:0 high -preset:v:0 llhp -rc:v:0 cbr_ld_hq -g:v:0 60 -gpu:v:0 0 -strict_gop:v:0 1 -bf:v:0 0 \ -map "[v1]" -b:v:1 4000k -minrate:v:1 4000k -maxrate:v:1 4000k -bufsize:v:1 4000k -c:v:1 h264_nvenc -r:v:1 ntsc \ -profile:v:1 high -preset:v:1 llhp -rc:v:1 cbr_ld_hq -g:v:1 60 -gpu:v:1 0 -strict_gop:v:1 1 -bf:v:1 0 \ -map 0:a -c:a libfdk_aac -ac 2 -ar 48000 -b:a 128k \ -flags +global_header+cgop \ -max_muxing_queue_size 9000000 -threads 2 -f tee \ "[select=\'v:0,a\':f=mp4]'hq.mp4'| \ [select=\'v:1,a\':f=mp4]'med.mp4'"
That command works, and it's exceptionally *faster* than the prior variant above.
The encoder in this case runs at ~10x on an RTX 2080.
The previous command (with the repeated hwdownload instances) ran at a paltry ~1.5x.
That's a ~7x speedup!
Thank you so much :-)
I can now (safely) close this ticket.
comment:6 by , 5 years ago
Resolution: | → fixed |
---|---|
Status: | open → closed |
Great. Mind you, I would assume scale_cuda would be a better choice in your scenario, if that's the only filter you're actually using, and would probably be even faster.
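A minimal sketch of the scale_cuda alternative suggested here, assuming a build with the CUDA filters enabled (the reporter's configure line includes --enable-cuda-nvcc). The file names, the 1920x1080 target, and the run helper with its DRY_RUN variable are assumptions for illustration. Frames are decoded into CUDA memory, scaled there, and passed straight to NVENC, so no Vulkan device derivation is involved.

```shell
#!/bin/sh
# Sketch only: in.mp4, out.mp4, the 1920x1080 target and the run/DRY_RUN
# helper are assumptions made for illustration.

# scale_cuda operates directly on CUDA frames produced by the decoder.
VF="scale_cuda=w=1920:h=1080"

# Print the command; execute it only when DRY_RUN=0 is set.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = "1" ] || "$@"
}

run ffmpeg -hwaccel nvdec -hwaccel_output_format cuda -i in.mp4 \
    -vf "$VF" -c:v h264_nvenc out.mp4
```

Whether this is actually faster than the scale_vulkan pipeline on a given system would need to be measured; the ticket records no numbers for it.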
comment:7 by , 5 years ago
Resolution: | fixed → invalid |
---|
-init_hw_device vulkan=gpu:0.0 -filter_hw_device gpu
with usual hwdownload/hwupload filters.