HWAccelIntro


Version 47 (modified by jkqxz, 5 months ago)

Improve introduction and summary tables

Many platforms offer access to dedicated hardware to perform a range of video-related tasks. Using such hardware allows some operations like decoding, encoding or filtering to be completed faster or using less of other resources (particularly CPU), but may give different or inferior results, or impose additional restrictions which are not present when using software only. On PC-like platforms, video hardware is typically integrated into a GPU (from AMD, Intel or Nvidia), while on mobile SoC-type platforms it is generally an independent IP core (many different vendors).

Hardware decoders will generate equivalent output to software decoders, but may use less power and CPU to do so. Feature support varies: for more complex codecs with many different profiles, hardware decoders rarely implement all of them (for example, hardware decoders tend not to implement anything beyond YUV 4:2:0 at 8-bit depth for H.264). A common feature of many hardware decoders is the ability to generate output in hardware surfaces suitable for use by other components (with discrete graphics cards, this means surfaces in the memory on the card rather than in system memory). This is often useful for playback, as no further copying is required before rendering the output, and in some cases it can also be used with encoders supporting hardware surface input to avoid any copying at all in transcode cases.

Hardware encoders typically generate output of significantly lower quality than good software encoders like x264, but are generally faster and do not use much CPU resource. (That is, they require a higher bitrate to make output with the same perceptual quality, or they make output with a lower perceptual quality at the same bitrate.)

Systems with decode and/or encode capability may also offer access to other related filtering features. Scaling and deinterlacing are common; other postprocessing may be available depending on the system. Where hardware surfaces are usable, these filters will generally act on them rather than on normal frames in system memory.

There are a lot of different APIs of varying standardisation status available. FFmpeg offers access to many of these, with varying support.

Platform API Availability

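| API | Linux AMD | Linux Intel | Linux Nvidia | Windows AMD | Windows Intel | Windows Nvidia | Android | Apple macOS | Apple iOS | Other: Raspberry Pi |
|---|---|---|---|---|---|---|---|---|---|---|
| CUDA / CUVID / NVENC | N | N | Y | N | N | Y | N | N | N | N |
| Direct3D 11 | N | N | N | Y | Y | Y | N | N | N | N |
| Direct3D 9 (DXVA2) | N | N | N | Y | Y | Y | N | N | N | N |
| libmfx | N | Y | N | N | Y | N | N | N | N | N |
| MediaCodec | N | N | N | N | N | N | Y | N | N | N |
| Media Foundation | N | N | N | Y | Y | Y | N | N | N | N |
| MMAL | N | N | N | N | N | N | N | N | N | Y |
| OpenCL | Y | Y | Y | Y | Y | Y | P | Y | N | N |
| OpenMAX | P | N | N | N | N | N | P | N | N | Y |
| V4L2 M2M | N | N | N | N | N | N | P | N | N | N |
| VAAPI | P | Y | P | N | N | N | N | N | N | N |
| VDA | N | N | N | N | N | N | N | Y | N | N |
| VDPAU | P | N | Y | N | N | N | N | N | N | N |
| VideoToolbox | N | N | N | N | N | N | N | Y | Y | N |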

Key:

  • Y Fully usable.
  • P Partial support (some devices / some features).
  • N Not possible.

FFmpeg API Implementation Status

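| API | Decoder: internal | Decoder: standalone | Decoder: hardware output | Encoder: standalone | Encoder: hardware input | Other: filtering | Other: hardware context | Other: usable from ffmpeg CLI |
|---|---|---|---|---|---|---|---|---|
| CUDA / CUVID / NVENC | N | Y | Y | Y | Y | Y | Y | Y |
| Direct3D 11 | Y | - | Y | - | - | F | Y | Y |
| Direct3D 9 / DXVA2 | Y | - | Y | - | - | N | Y | Y |
| libmfx | - | Y | Y | Y | Y | Y | Y | Y |
| MediaCodec | - | Y | Y | N | N | - | N | N |
| Media Foundation | - | N | N | N | N | N | N | N |
| MMAL | - | Y | Y | N | N | - | N | N |
| OpenCL | - | - | - | - | - | Y | F | F |
| OpenMAX | - | N | N | Y | N | N | N | Y |
| RockChip MPP | - | F | F | N | N | - | F | F |
| V4L2 M2M | - | N | N | N | N | N | N | N |
| VAAPI | Y | - | Y | Y | Y | Y | Y | Y |
| VDA | Y | N | Y | - | - | - | N | Y |
| VDPAU | Y | - | Y | - | - | N | Y | Y |
| VideoToolbox | Y | N | Y | Y | Y | - | Y | Y |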

Key:

  • - Not applicable to this API.
  • Y Working.
  • N Possible but not implemented.
  • F Not yet integrated, but work is being done in this area.
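The tables describe what FFmpeg can support in principle; what a particular binary actually includes depends on how it was configured. As a minimal sketch, the helper below wraps `ffmpeg -hwaccels`, which lists the hardware acceleration methods compiled into the build. The function names `list_hwaccels` and `has_hwaccel` are hypothetical, and the method names printed may differ between FFmpeg versions.

```shell
# List the hardware acceleration methods compiled into this ffmpeg build.
# `ffmpeg -hwaccels` prints a header line followed by one method name per line.
list_hwaccels() {
    ffmpeg -hide_banner -hwaccels 2>/dev/null | sed -e '1d' -e '/^$/d'
}

# Succeed if the given method name (for example vaapi or dxva2) is listed.
has_hwaccel() {
    list_hwaccels | grep -qx -- "$1"
}

# Example: report on a few methods, but only when ffmpeg is installed.
if command -v ffmpeg >/dev/null 2>&1; then
    for method in vdpau vaapi dxva2 videotoolbox; do
        if has_hwaccel "$method"; then
            echo "$method: available"
        else
            echo "$method: not in this build"
        fi
    done
fi
```

A method being listed only means the build contains the code; whether it works still depends on the drivers and hardware present.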

VDPAU

Video Decode and Presentation API for Unix. Developed by NVIDIA for UNIX/Linux systems. To enable this you typically need the libvdpau development package in your distribution, and a compatible graphics card.

Note that VDPAU cannot be used to decode frames in memory: the compressed frames are sent by libavcodec to the GPU device supported by VDPAU, and the decoded image can then be accessed using the VDPAU API. This is not done automatically by FFmpeg, but must be done at the application level (see, for example, the ffmpeg_vdpau.c file used by ffmpeg.c). Also note that with this API it is not possible to move the decoded frame back to RAM, for example when the decoded frame needs to be encoded again (e.g. when transcoding on a server).

Several decoders are currently supported through VDPAU in libavcodec, in particular H.264, MPEG-1/2/4, and VC-1.

XvMC

XVideo Motion Compensation. This is an extension of the X video extension (Xv) for the X Window System (and thus, again, only available on UNIX/Linux).

Official specification is available here: http://www.xfree86.org/~mvojkovi/XvMC_API.txt

VA-API

Video Acceleration API (VA-API) is a non-proprietary and royalty-free open source software library ("libVA") and API specification. It was initially developed by Intel but can also be used in combination with other devices. Linux only: https://en.wikipedia.org/wiki/Video_Acceleration_API

DXVA2

DirectX Video Acceleration API, developed by Microsoft (supports Windows and Xbox 360).

Link to MSDN documentation: http://msdn.microsoft.com/en-us/library/windows/desktop/cc307941%28v=vs.85%29.aspx

Several decoders are currently supported, in particular H.264, MPEG-2, VC-1 and WMV3.

DXVA2 hardware acceleration only works on Windows. In order to build FFmpeg with DXVA2 support, you need to install the dxva2api.h header. For MinGW this can be done by downloading the header maintained by VLC:

http://download.videolan.org/pub/contrib/dxva2api.h

and installing it in the include path (for example in /usr/include/).

For MinGW64, dxva2api.h is provided by default. One way to install mingw-w64 is through a pacman repository, using one of the following two commands depending on the architecture:

pacman -S mingw-w64-i686-gcc
pacman -S mingw-w64-x86_64-gcc

To enable DXVA2, use the --enable-dxva2 ffmpeg configure switch.

To test decoding, use the following command:

ffmpeg -hwaccel dxva2 -threads 1 -i INPUT -f null - -benchmark
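To judge whether hardware decoding actually helps on a given machine, it can be useful to run the same test with and without the hwaccel and compare the `bench:` lines that `-benchmark` prints. The following is a minimal sketch under that assumption; the helper name `bench_decode` is hypothetical, and INPUT stands for a real file.

```shell
# Decode a file to the null muxer and print ffmpeg's "bench:" lines.
# $1 = input file, $2 = optional hwaccel name (empty means software decode).
bench_decode() {
    input=$1
    accel=${2:-}
    if [ -n "$accel" ]; then
        ffmpeg -hide_banner -benchmark -hwaccel "$accel" -threads 1 \
            -i "$input" -f null - 2>&1 | grep '^bench:'
    else
        ffmpeg -hide_banner -benchmark -threads 1 \
            -i "$input" -f null - 2>&1 | grep '^bench:'
    fi
}

# Usage (INPUT is a placeholder):
#   bench_decode INPUT dxva2   # hardware decode
#   bench_decode INPUT         # software decode, for comparison
```

The software run keeps `-threads 1` so that both runs use the same thread setting; dropping it shows what multithreaded software decoding achieves instead.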

VDA

Video Decoding API, only supported on macOS. H.264 decoding is available in FFmpeg/libavcodec.

Developer documentation: https://developer.apple.com/library/mac/technotes/tn2267/_index.html

NVENC

NVENC is an API developed by NVIDIA which enables the use of NVIDIA GPU cards to perform H.264 and HEVC encoding. FFmpeg supports NVENC through the h264_nvenc and hevc_nvenc encoders. In order to enable it in FFmpeg you need:

  • A supported GPU
  • Supported drivers
  • ffmpeg configured without --disable-nvenc

Visit NVIDIA Video Codec SDK to download the SDK and to read more about the supported GPUs and supported drivers.

Usage example:

ffmpeg -i input -c:v h264_nvenc -profile high444p -pix_fmt yuv444p -preset default output.mp4

You can see available presets, other options, and encoder info with ffmpeg -h encoder=h264_nvenc or ffmpeg -h encoder=hevc_nvenc.

Note: If you get the "No NVENC capable devices found" error, make sure you're encoding to a supported pixel format. See the encoder info as shown above.
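Before looking at GPU or driver problems, it may help to confirm that the ffmpeg binary was built with the NVENC encoders at all. The sketch below parses the output of `ffmpeg -encoders`; the helper name `has_encoder` is hypothetical.

```shell
# Succeed if the local ffmpeg build lists the named encoder.
# Each encoder line of `ffmpeg -encoders` has a flags field, then the encoder name.
has_encoder() {
    ffmpeg -hide_banner -encoders 2>/dev/null |
        awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Example: check both NVENC encoders, but only when ffmpeg is installed.
if command -v ffmpeg >/dev/null 2>&1; then
    for enc in h264_nvenc hevc_nvenc; do
        if has_encoder "$enc"; then
            echo "$enc: present"
        else
            echo "$enc: missing (build may lack NVENC support)"
        fi
    done
fi
```

An encoder being present in the build says nothing about whether a supported GPU and driver are installed; that only shows up when an encode is attempted.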

CUDA/CUVID/NvDecode

CUVID, which is also called nvdec by Nvidia now, can be used for decoding on Windows and Linux. In combination with nvenc it offers full hardware transcoding.

CUVID offers decoders for H.264, HEVC, MJPEG, MPEG-1/2/4, VP8/9 and VC-1. Codec support varies by hardware; the full set of codecs is available only on Pascal hardware, which adds VP9 and 10-bit support.

While decoding 10 bit video is supported, it is not possible to do full hardware transcoding currently (See the partial hardware example below).

Sample decode using CUVID (the cuvid decoder copies the frames to system memory in this case):

ffmpeg -c:v h264_cuvid -i input output.mkv

Full hardware transcode with CUVID and NVENC:

ffmpeg -hwaccel cuvid -c:v h264_cuvid -i input -c:v h264_nvenc -preset slow output.mkv

Partial hardware transcode, with frames passed through system memory (This is necessary for transcoding 10bit content):

ffmpeg -c:v h264_cuvid -i input -c:v h264_nvenc -preset slow output.mkv

If ffmpeg was compiled with support for libnpp, it can be used to insert a GPU based scaler into the chain:

ffmpeg -hwaccel_device 0 -hwaccel cuvid -c:v h264_cuvid -i input -vf scale_npp=-1:720 -c:v h264_nvenc -preset slow output.mkv

The -hwaccel_device option can be used to specify the GPU to be used by the cuvid hwaccel in ffmpeg.
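The choice between the full and partial hardware transcode commands can be sketched as a small script. This is an illustration only: the helper names `video_pix_fmt` and `cuvid_transcode_cmd` are hypothetical, and detecting 10-bit input by looking for `10le` or `10be` in the pixel format name reported by ffprobe is an assumption of this sketch. The function prints the command rather than running it, so file names containing spaces would need quoting before use.

```shell
# Print the pixel format of the first video stream, e.g. yuv420p or yuv420p10le.
video_pix_fmt() {
    ffprobe -v error -select_streams v:0 \
        -show_entries stream=pix_fmt -of csv=p=0 "$1"
}

# Print an ffmpeg command line for transcoding $1 to $2.
# $3 = CUVID decoder name (defaults to h264_cuvid).
# 10-bit input gets the partial path, with frames passed through system memory.
cuvid_transcode_cmd() {
    input=$1
    output=$2
    decoder=${3:-h264_cuvid}
    case "$(video_pix_fmt "$input")" in
        *10le*|*10be*)
            echo "ffmpeg -c:v $decoder -i $input -c:v h264_nvenc -preset slow $output" ;;
        *)
            echo "ffmpeg -hwaccel cuvid -c:v $decoder -i $input -c:v h264_nvenc -preset slow $output" ;;
    esac
}

# Usage (file names are placeholders):
#   cuvid_transcode_cmd input.mkv output.mkv hevc_cuvid
```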

Intel QSV

Intel QSV (Quick Sync Video) is a technology which allows decoding and encoding using the integrated GPU of recent Intel CPUs. Note that the (CPU)GPU needs to be compatible with both QSV and OpenCL; some (older) QSV-enabled GPUs aren't compatible with OpenCL. See: http://www.intel.com/content/www/us/en/architecture-and-technology/quick-sync-video/quick-sync-video-general.html https://software.intel.com/en-us/articles/intel-sdk-for-opencl-applications-2013-release-notes

To enable QSV support, you need the Intel Media SDK integrated in the Intel Media Server Studio: https://software.intel.com/en-us/intel-media-server-studio

The Intel Media Server Studio is available for both Linux and Windows, and contains the libva and libdrm libraries, the libmfx dispatcher library and the Intel drivers. libmfx is the library which selects the codec depending on the system capabilities, falling back to a software implementation if the hardware-accelerated codec is not available.

FFmpeg QSV support relies on libmfx, but the library provided by Intel does not come with pkg-config files and a proper installer. Thus the easiest way to install the library is to use the libmfx version packaged by lu_zero here: https://github.com/lu-zero/mfx_dispatch

Requirements on Windows: install the Intel Media SDK packaged in the Intel Media Server Studio, which comes with a graphical installer, and a MinGW compilation environment (for example provided by MSYS2 with a corresponding Mingw-w64 package). Then you need to build libmfx and install it in a path recognized by pkg-config. For example, if you install in /usr/local then you need to update the $PKG_CONFIG_PATH environment variable to make it point to /usr/local/lib/pkgconfig.

Requirements on Linux: you need either to rely on the Intel Media Server Studio for Linux, or use a recent enough supported system, with the libva and libdrm libraries, the libva Intel drivers, and the libmfx library packaged by lu_zero. Note: if you use the Intel Media Server Studio generic installation script, it may overwrite your system libraries and break the system.
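The pkg-config step can be sketched as follows, assuming libmfx was installed under /usr/local and that the installed pkg-config module is named `libmfx` (the name used by the mfx_dispatch package). The helper name `add_pkg_config_dir` is hypothetical.

```shell
# Prepend a directory to PKG_CONFIG_PATH unless it is already listed.
add_pkg_config_dir() {
    dir=$1
    case ":${PKG_CONFIG_PATH:-}:" in
        *":$dir:"*) ;;  # already present, nothing to do
        *) PKG_CONFIG_PATH="$dir${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}" ;;
    esac
    export PKG_CONFIG_PATH
}

add_pkg_config_dir /usr/local/lib/pkgconfig

# Report whether pkg-config can now see libmfx (skipped if pkg-config is absent).
if command -v pkg-config >/dev/null 2>&1; then
    if pkg-config --exists libmfx; then
        echo "libmfx found, version $(pkg-config --modversion libmfx)"
    else
        echo "libmfx not found with PKG_CONFIG_PATH=$PKG_CONFIG_PATH"
    fi
fi
```

Running this in the same shell session before FFmpeg's configure script lets configure pick up the library.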

Check the following website for updated information about the Intel Graphics stack on the various Linux platforms: https://01.org/linuxgraphics

To enable QSV support in the FFmpeg build, configure with --enable-libmfx.

Support for decoding and encoding is integrated in FFmpeg through several codecs identified by the _qsv suffix. In particular, it currently supports MPEG2 video, VC1 (decoding only), H.264 and H.265.

For example to encode to H.264 using h264_qsv, you can use the command:

ffmpeg -i INPUT -c:v h264_qsv -preset:v faster out.qsv.mp4

If you have a Kaby Lake CPU, you can encode with HEVC using hevc_qsv:

ffmpeg -i INPUT -c:v hevc_qsv -load_plugin hevc_hw -preset:v faster out.qsv.mp4

OpenCL

Official website:

https://www.khronos.org/opencl/

Currently only used in filtering (deshake and unsharp filters). To use the OpenCL code you need to enable it in the build with --enable-opencl. An API for using OpenCL from FFmpeg is provided in libavutil/opencl.h. No decoding or encoding is supported yet.

For --enable-opencl to work you need to install the drivers for your graphics card, as well as the corresponding SDK, and then build against its .lib files and headers.

AMD VCE

AMD VCE is exposed through VA-API on Linux. For Windows there have been port attempts but nothing official yet.

External resources