Opened 17 hours ago

Last modified 14 hours ago

#11352 new defect

On the state of FFmpeg's ffv1_vulkan encoder implementation on Intel & NVIDIA

Reported by: Dennis E. Mungai
Owned by:
Priority: normal
Component: avcodec
Version: git-master
Keywords: vulkan ffv1 ffv1_vulkan
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

This ticket documents the current behaviour of the ffv1_vulkan encoder on Intel and NVIDIA GPUs.
Sample command lines are provided for reference; see the runtime notes below.

(a). Intel:

ffmpeg \
-init_hw_device vulkan=vk:0 \
-hwaccel vulkan -hwaccel_output_format vulkan \
-hwaccel_device vk -filter_hw_device vk \
-loglevel info \
-fflags +genpts \
-i '4KIOS.mov' \
-vf "bwdif_vulkan,libplacebo=format=yuv420p" \
-c:v ffv1_vulkan \
-r:v 60 -g:v 2 -level:v 4 -strict -2 -coder:v 2 -context:v 1 \
-c:a libfdk_aac -b:a 128k -ar 48000 -ac 2 \
-map "0:v" -map "0:a" \
-max_muxing_queue_size 8192 -max_interleave_delta 0 \
-flags -global_header+cgop \
-y -f matroska "ticketx_ffv1.mkv"

(b). NVIDIA:

ffmpeg \
-init_hw_device vulkan=vk:0 \
-hwaccel vulkan -hwaccel_output_format vulkan \
-hwaccel_device vk -filter_hw_device vk \
-loglevel info \
-fflags +genpts \
-i '4KIOS.mov' \
-vf "bwdif_vulkan,libplacebo=format=yuv420p" \
-c:v ffv1_vulkan \
-r:v 60 -g:v 2 -level:v 4 -strict -2 -coder:v 2 -context:v 1 \
-c:a libfdk_aac -b:a 128k -ar 48000 -ac 2 \
-map "0:v" -map "0:a" \
-max_delay 5000000 -max_muxing_queue_size 8192 -max_interleave_delta 0 \
-flags -global_header+cgop \
-y -f matroska "ticket4_ffv1.mkv"

Runtime notes:

The encoder works well both with and without Vulkan hardware-accelerated decode, producing consistent results across multiple runs on each test platform.

As noted by Lynne on IRC regarding this encoder, handling yuv420p requires the following conditions to be met:

  1. Either 64 pixel aligned images for both h and v.
  2. And using version 4 (Pass -level 4 -strict -2).

The example commands above have the flags -strict -2 and -level:v 4 set to match the constraints defined above.
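As a rough aid for checking the first condition, the sketch below tests whether each frame dimension is a multiple of 64 and, if not, reports the next aligned size. It assumes that "64 pixel aligned" means divisible by 64. The helper names is_aligned64 and align_up64 are hypothetical, and 3840x2160 is an assumed example frame size, not the measured size of the sample used in this ticket.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of FFmpeg.

# Succeeds (exit status 0) when the dimension is a multiple of 64.
is_aligned64() {
    [ $(( $1 % 64 )) -eq 0 ]
}

# Prints the dimension rounded up to the next multiple of 64.
align_up64() {
    echo $(( ( $1 + 63 ) / 64 * 64 ))
}

# Assumed example frame size: 3840x2160.
for dim in 3840 2160; do
    if is_aligned64 "$dim"; then
        echo "$dim is 64-aligned"
    else
        echo "$dim is not 64-aligned; next aligned size is $(align_up64 "$dim")"
    fi
done
```

Under these assumptions, a width of 3840 is already aligned while a height of 2160 is not (the next aligned height is 2176).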

Oddities/bugs:

  1. Even with identical parameters, the file sizes produced by Intel's and NVIDIA's ffv1_vulkan instances differ significantly.

Intel's ANV driver produces far larger files:
The file from NVIDIA's session, named "ticket4_ffv1.mkv", is about 4.9G in size.
Intel's session output with the same settings, named "ticketx_ffv1.mkv", is 41 GB.

  2. NVIDIA's performance with this encoder, as tested on the RTX 4060 Max-Q with a 4K input sample, is about 0.63x speed.

Intel's integrated Raptor Lake GPU with the ANV driver manages only 0.0907x, roughly 7x slower than NVIDIA.

  3. With NVIDIA, the ffv1_vulkan encoder hangs at a sporadic frame in each session if -g:v is higher than 2, regardless of the -level:v setting, all else being equal.
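To put numbers on the first two oddities above, the following sketch computes the size and speed ratios from the figures reported in this ticket. It assumes that "4.9G" and "41 GB" are expressed in the same unit. The ratio helper is hypothetical.

```shell
#!/bin/sh
# ratio A B: prints A divided by B to two decimal places (hypothetical helper).
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# File size: Intel output (41 GB) against NVIDIA output (4.9G), same unit assumed.
echo "size ratio (Intel/NVIDIA):  $(ratio 41 4.9)x"

# Encode speed: NVIDIA (0.63x) against Intel (0.0907x).
echo "speed ratio (NVIDIA/Intel): $(ratio 0.63 0.0907)x"
```

With these inputs the Intel file is a little over 8x larger and the Intel encode is just under 7x slower.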

Change History (1)

comment:1 by Lynne, 14 hours ago

Try setting -async_depth 1. The current algorithm to calculate the spare RAM doesn't work that well.
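For reference, one way to apply this suggestion to the NVIDIA command from the description is sketched below. Placing -async_depth 1 directly after -c:v ffv1_vulkan, as an encoder option, is an assumption. The encode_async1 wrapper and the FFMPEG override variable are hypothetical conveniences, and all other flags are copied from the original command.

```shell
#!/bin/sh
# Hypothetical wrapper: encode_async1 INPUT OUTPUT
# Set FFMPEG=echo to print the argument list instead of running an encode.
encode_async1() {
    "${FFMPEG:-ffmpeg}" \
        -init_hw_device vulkan=vk:0 \
        -hwaccel vulkan -hwaccel_output_format vulkan \
        -hwaccel_device vk -filter_hw_device vk \
        -loglevel info \
        -fflags +genpts \
        -i "$1" \
        -vf "bwdif_vulkan,libplacebo=format=yuv420p" \
        -c:v ffv1_vulkan -async_depth 1 \
        -r:v 60 -g:v 2 -level:v 4 -strict -2 -coder:v 2 -context:v 1 \
        -c:a libfdk_aac -b:a 128k -ar 48000 -ac 2 \
        -map "0:v" -map "0:a" \
        -max_delay 5000000 -max_muxing_queue_size 8192 -max_interleave_delta 0 \
        -flags -global_header+cgop \
        -y -f matroska "$2"
}
```

Example invocation: encode_async1 '4KIOS.mov' 'ticket4_ffv1.mkv'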

ANV completely breaks the shaders we use during compilation. The developers know about it and it's not our issue.
