Opened 5 years ago
Last modified 5 years ago
#8396 new defect
hwdownload always uses the 0th device (hwaccel_device 0)
Reported by: | darn | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | undetermined |
Version: | unspecified | Keywords: | nvenc |
Cc: | | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no | | |
Description
Summary of the bug:
I have a system with three GeForce GTX 1080 Ti cards.
I am affected by the "smart" libavfilter behaviour described in #5587, so I need to use the hwdownload filter.
When I use hwdownload in filter_complex, ffmpeg always uses hardware device 0.
Run without hwdownload and hwaccel_device 2:
ffmpeg-cuda \
  -hide_banner \
  -probesize 10M \
  -analyzeduration 10M \
  -nostats \
  -hwaccel cuvid \
  -hwaccel_device 2 \
  -c:v h264_cuvid \
  -i "udp://hidden_ip:58631?reuse=1&pkt_size=1316&buffer_size=2621440&fifo_size=2621440" \
  -filter_complex " \
    [0:v]scale_npp=-1:-1:format=yuv420p:interp_algo=lanczos[v0] " \
  -c:v h264_nvenc \
  -preset:v llhq \
  -rc:v vbr_hq \
  -profile:v:0 high \
  -level:0 4.1 \
  -b:v:0 5600000 \
  -forced-idr 1 \
  -strict_gop 1 \
  -no-scenecut 1 \
  -g 125 \
  -r 25 \
  -keyint_min 125 \
  -c:a aac \
  -b:a 96k \
  -ac 2 \
  -ar 48000 \
  -map "[v0]" \
  -map 0:a:0 \
  -f mpegts "udp://233.34.2.219:60041?reuse=1"
ffmpeg starts with PID 11156.
$ nvidia-smi | grep 11156
| 2 11156 C ffmpeg-cuda 313MiB |
Everything works fine.
Run with hwdownload and hwaccel_device 2:
ffmpeg-cuda \
  -hide_banner \
  -probesize 10M \
  -analyzeduration 10M \
  -nostats \
  -hwaccel cuvid \
  -hwaccel_device 2 \
  -c:v h264_cuvid \
  -i "udp://hidden_ip:58631?reuse=1&pkt_size=1316&buffer_size=2621440&fifo_size=2621440" \
  -filter_complex " \
    [0:v]scale_npp=-1:-1:format=yuv420p:interp_algo=lanczos,hwdownload,format=yuv420p[v0] " \
  -c:v h264_nvenc \
  -preset:v llhq \
  -rc:v vbr_hq \
  -profile:v:0 high \
  -level:0 4.1 \
  -b:v:0 5600000 \
  -forced-idr 1 \
  -strict_gop 1 \
  -no-scenecut 1 \
  -g 125 \
  -r 25 \
  -keyint_min 125 \
  -c:a aac \
  -b:a 96k \
  -ac 2 \
  -ar 48000 \
  -map "[v0]" \
  -map 0:a:0 \
  -f mpegts "udp://233.34.2.219:60041?reuse=1"
ffmpeg starts with PID 936.
$ nvidia-smi | grep 936
| 0 936 C ffmpeg-cuda 196MiB |
| 2 936 C ffmpeg-cuda 259MiB |
This does not work as expected: the stream was also copied to device 0.
Run with hwdownload and hwaccel_device 0:
ffmpeg-cuda \
  -hide_banner \
  -probesize 10M \
  -analyzeduration 10M \
  -nostats \
  -hwaccel cuvid \
  -hwaccel_device 0 \
  -c:v h264_cuvid \
  -i "udp://hidden_ip:58631?reuse=1&pkt_size=1316&buffer_size=2621440&fifo_size=2621440" \
  -filter_complex " \
    [0:v]scale_npp=-1:-1:format=yuv420p:interp_algo=lanczos,hwdownload,format=yuv420p[v0] " \
  -c:v h264_nvenc \
  -preset:v llhq \
  -rc:v vbr_hq \
  -profile:v:0 high \
  -level:0 4.1 \
  -b:v:0 5600000 \
  -forced-idr 1 \
  -strict_gop 1 \
  -no-scenecut 1 \
  -g 125 \
  -r 25 \
  -keyint_min 125 \
  -c:a aac \
  -b:a 96k \
  -ac 2 \
  -ar 48000 \
  -map "[v0]" \
  -map 0:a:0 \
  -f mpegts "udp://233.34.2.219:60041?reuse=1"
ffmpeg starts with PID 3952.
$ nvidia-smi | grep 3952
| 0 3952 C ffmpeg-cuda 456MiB |
Everything works fine.
As you can see, hwdownload always uses hwaccel_device 0.
Is this the expected behaviour?
How can I make ffmpeg stop copying the stream to the 0th device?
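A plain `grep <pid>` on the nvidia-smi output can also match unrelated lines that merely contain the same digits. A minimal sketch of a stricter check follows. It assumes the older process-table layout shown in this ticket (`| <gpu> <pid> <type> <name> <memory> |`), and `gpus_for_pid` is a hypothetical helper name, not part of nvidia-smi.

```shell
# List the GPU indices on which a given PID holds a context.
# Assumption: process rows look like "| <gpu> <pid> <type> <name> <memory> |",
# as in the nvidia-smi output quoted in this ticket. Newer drivers print
# extra columns, so the field numbers would need adjusting there.
gpus_for_pid() {
    # $1 is the PID to look for; rows are read from standard input.
    awk -v pid="$1" '$1 == "|" && $3 == pid { print $2 }'
}

# Usage: nvidia-smi | gpus_for_pid 936
```

Fed the two rows from the hwdownload run above, this prints `0` and `2`, one per line, which makes the extra context on device 0 easy to spot.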
Change History (3)
comment:1 by , 5 years ago
Component: | ffmpeg → undetermined |
---|---|
Keywords: | nvenc added; hwdownload hwaccel_device removed |
Version: | 4.2 → unspecified |
follow-up: 3 comment:2 by , 5 years ago
hwdownload copies frames _from_ the device, and thus uses the context of the frames it gets as input. The output frames are in system RAM and not tied to any device.
The second CUDA context on the default device (0) you are seeing is nvenc being fed non-CUDA frames, which triggers it to create its own CUDA context to re-upload the frames on.
nvenc has its own option (-gpu) controlling on which device it creates that context.
But really, why even download in the first place, just for nvenc to re-upload immediately?
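As a rough illustration of where that option goes, the sketch below only prints a command line in which the decoder (`-hwaccel_device`) and nvenc (`-gpu`) are pinned to the same device index. `build_cmd`, `INPUT` and `OUTPUT` are hypothetical placeholders, the option set is cut down from the report, and whether this removes the extra context on device 0 for a given build is not established here.

```shell
# Print (do not run) an ffmpeg command line that passes the same GPU index
# to the cuvid decoder and to h264_nvenc.
# build_cmd, INPUT and OUTPUT are hypothetical names used for illustration.
build_cmd() {
    gpu="$1"
    # -gpu is an encoder option, so it follows -c:v h264_nvenc on the output side.
    printf '%s ' ffmpeg -hwaccel cuvid -hwaccel_device "$gpu" -c:v h264_cuvid -i INPUT \
        -vf 'scale_npp=-1:-1:format=yuv420p:interp_algo=lanczos,hwdownload,format=yuv420p' \
        -c:v h264_nvenc -gpu "$gpu" -f mpegts OUTPUT
    printf '\n'
}

build_cmd 2
```

Dropping `hwdownload,format=yuv420p` from the chain, as in the first command of the report, keeps the frames on the decoding device and avoids the re-upload altogether.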
comment:3 by , 5 years ago
Replying to oromit:
> hwdownload copies frames _from_ the device, and thus uses the context of the frames it gets as input. The output frames are in system RAM and not tied to any device.
Yes.
hwdownload -- copies from GPU memory to system memory.
hwupload -- copies from system memory to GPU memory.
Is that correct?
> The second CUDA context on the default device (0) you are seeing is nvenc being fed non-CUDA frames, which triggers it to create its own CUDA context to re-upload the frames on.
> nvenc has its own option (-gpu) controlling on which device it creates that context.
I tried the "-gpu 2" setting; the result is the same.
> But really, why even download in the first place, just for nvenc to re-upload immediately?
As far as I understand, my filter_complex configuration
-filter_complex "[0:v]scale_npp=-1:-1:format=yuv420p:interp_algo=lanczos,hwdownload,format=yuv420p[v0]"
works as follows:
- The input stream "[0:v]" is copied to GPU memory;
- The GPU runs "scale_npp" on the input stream with "-1:-1" and "format=yuv420p:interp_algo=lanczos";
- "hwdownload" with "format=yuv420p" copies the result from GPU memory to system memory under the label [v0].
Is that correct?
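One way to check that reading is to split the chain into its individual filters: `hwdownload` and the trailing `format=yuv420p` are two separate filters, the second one only constraining the pixel format of the downloaded frames. The sketch below is illustrative. `describe_chain` is a hypothetical helper that knows only the three filters used in this ticket, does not handle escaped commas, and assumes that with `-hwaccel cuvid` the decoded frames already sit in GPU memory when they reach `scale_npp`.

```shell
# Split a simple filter chain on commas and label where each stage's
# output frames live. describe_chain is a hypothetical helper covering
# only the filters used in this ticket.
describe_chain() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r stage; do
        case "$stage" in
            scale_npp*)  where='GPU memory (CUDA frames)' ;;
            hwdownload*) where='copied from GPU memory to system memory' ;;
            format=*)    where='system memory (restricts the pixel format)' ;;
            *)           where='unknown' ;;
        esac
        # Print the filter name (text before the first "=") and its location.
        printf '%s -> %s\n' "${stage%%=*}" "$where"
    done
}

describe_chain 'scale_npp=-1:-1:format=yuv420p:interp_algo=lanczos,hwdownload,format=yuv420p'
```

The last stage hands system-memory frames to the encoder, which is the point at which nvenc has to upload them again.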
If this is meant to be a bug report please test current FFmpeg git head and provide the simplified (!) command line together with the complete, uncut console output to make this a valid ticket.