FFmpeg QSV backend uses >2x more GPU memory compared to VAAPI or MSDK
|Reported by:||eero-t||Owned by:|
|Blocking:||Reproduced by developer:||no|
|Analyzed by developer:||no|
Summary of the bug:
GPU-accelerated video transcoding can use GBs of RAM, but that memory is held in DRI/GEM objects, which don't show up in normal memory statistics, can't be limited, and can't be swapped out. It can therefore easily cause OOM-kill havoc in the rest of the system.
The FFmpeg QSV backend uses far too much of these resources.
How to reproduce:
- Monitor GEM object usage (as root):
# watch cat /sys/kernel/debug/dri/0/i915_gem_objects
- Transcode with FFmpeg / QSV:
$ LIBVA_DRIVER_NAME=iHD ffmpeg -hwaccel qsv -qsv_device /dev/dri/renderD128 -c:v hevc_qsv -i Netflix_FoodMarket_4096x2160_10bit_420_100mbs_600.h265 -c:v hevc_qsv -b:v 20M -async_depth 4 output.h265
- Transcode with MediaSDK sample app:
$ LIBVA_DRIVER_NAME=iHD sample_multi_transcode -i::h265 Netflix_FoodMarket_4096x2160_10bit_420_100mbs_600.h265 -o::h265 output.h265 -b 20000 -async 4 -hw
- Transcode with FFmpeg / VAAPI:
$ LIBVA_DRIVER_NAME=iHD ffmpeg -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -hwaccel_output_format vaapi -i Netflix_FoodMarket_4096x2160_10bit_420_100mbs_600.h265 -c:v hevc_vaapi -b:v 20M output.h265
Expected result:
- Both FFmpeg backends use about the same amount of GPU resources, since they do about the same thing, and that usage is reasonable
- "Reasonable" means some (tens of) frames kept for prediction, i.e. 4K * 16-bit * frames = a few hundred MB
Actual result:
- The VAAPI backend uses ~1.1GB of GEM resources
- The QSV backend uses ~2.5GB of GEM resources, more than double
- The MSDK app uses ~1.1GB of GEM resources, like the VAAPI backend, so the QSV backend issue doesn't seem to be caused by MSDK
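The "4K * 16-bit * frames" estimate can be spelled out with a little shell arithmetic. This is only a rough sketch: it assumes a P010-style layout (4:2:0, so 1.5 samples per pixel, each stored in 2 bytes), ignores any driver padding or alignment, and the frame count of 16 is an arbitrary stand-in for "tens of frames". The function names are made up for illustration.

```shell
#!/bin/sh
# Size of one frame in bytes, assuming 4:2:0 chroma subsampling
# (1.5 samples per pixel) and 2 bytes per sample. Padding is ignored.
frame_bytes() {
    # $1 = width, $2 = height
    echo $(( $1 * $2 * 3 / 2 * 2 ))
}

# Size of N such frames in MiB (integer division, rounds down).
frames_mib() {
    # $1 = width, $2 = height, $3 = number of frames (assumed value)
    echo $(( $(frame_bytes "$1" "$2") * $3 / 1024 / 1024 ))
}

frame_bytes 4096 2160      # prints 26542080 (about 25 MiB per frame)
frames_mib 4096 2160 16    # prints 405
```

Under these assumptions, even a few dozen 4K surfaces stay below 1GB, so surface count alone would have to be unusually high to explain the 2.5GB seen with the QSV backend.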
I tried playing with the -async/-async_depth options. I was able to raise the MSDK app's GEM object usage to 2.5GB with "-async 20", but even with "-async_depth 1" the QSV backend's usage only dropped to 2.2GB of GEM objects, so the cause must be something else.
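For comparing runs like these, recording the peak is easier than eyeballing "watch" output. The helper below is a minimal sketch, not a tested tool: it assumes the first line of i915_gem_objects ends in "<N> bytes" (the exact wording of that line differs between kernel versions, so check it on your system), and the function names are hypothetical.

```shell
#!/bin/sh
# Print the byte total from the summary (first) line of i915_gem_objects,
# read from stdin. ASSUMPTION: that line ends in "... <N> bytes".
gem_bytes() {
    sed -n '1s/.* \([0-9][0-9]*\) bytes.*/\1/p'
}

# Poll the debugfs file once per second and print every new peak in MiB.
# Needs root to read debugfs; stop with Ctrl-C.
track_peak() {
    file=${1:-/sys/kernel/debug/dri/0/i915_gem_objects}
    peak=0
    while :; do
        cur=$(gem_bytes < "$file")
        # Skip samples where the summary line could not be parsed.
        if [ -n "$cur" ] && [ "$cur" -gt "$peak" ]; then
            peak=$cur
            echo "peak: $((peak / 1024 / 1024)) MiB"
        fi
        sleep 1
    done
}
```

Running track_peak as root in one terminal while a transcode runs in another leaves the last printed line as the peak for that run. The total covers all GEM objects on the device, so other GPU clients should be idle during the measurement.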