Opened 4 months ago

Closed 4 months ago

#6405 closed sponsoring request (fixed)

Compile troubles with “cuvid”, “nvenc” and “npp”

Reported by: ahakon
Owned by:
Priority: normal
Component: build system
Version: git-master
Keywords: cuvid, nvenc, npp, cuda-sdk
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Hi,

Regarding the latest tree in the repository (as of 2017-05-19), two problems appear when you compile with cuvid, nvenc and npp.

First of all, you don’t need the CUDA SDK at all to compile. The FFmpeg compat/cuda directory has the headers. So, in order to use the cuvid & nvenc modules you can just configure with “--enable-cuvid --enable-nvenc --enable-cuda --disable-cuda-sdk”.

When you compile this way, the resulting binary doesn’t need to link with the CUDA library (-lcuda). The executable dynamically links with the video driver, not with the CUDA library.

The trouble, however, is the bump in the required driver version. Today the “long-lived” video driver version for Linux is 375.66, but the current headers target 378.13 or newer. So please add a warning for when the driver fails, and document the requirement.

From the original patch from Timo Rothenpieler:
“compat/cuda: update cuvid/nvdec headers to Video Codec SDK 8.0.14

This raises the required minimum NVIDIA display driver versions:
NVIDIA Linux display driver 378.13 or newer
NVIDIA Windows display driver 378.66 or newer”
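
The kind of warning requested above could be sketched as a small shell check. This is only an illustration, not something FFmpeg provides: the function names `driver_at_least` and `warn_if_driver_too_old` are hypothetical, the comparison assumes GNU `sort` with its `-V` (version sort) option, and the installed version is passed in by the caller. The minimum version is the Linux one quoted in the commit message.

```shell
#!/bin/sh
# Minimum Linux display driver quoted in the commit message above.
MIN_LINUX_DRIVER="378.13"

# driver_at_least INSTALLED REQUIRED
# Succeeds (exit status 0) when INSTALLED >= REQUIRED.
# Assumption: GNU sort with -V (version sort) is available.
driver_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# warn_if_driver_too_old INSTALLED
# Hypothetical helper: prints a warning to stderr and returns 1
# when INSTALLED is older than MIN_LINUX_DRIVER.
warn_if_driver_too_old() {
    if ! driver_at_least "$1" "$MIN_LINUX_DRIVER"; then
        echo "WARNING: NVIDIA driver $1 is older than the required $MIN_LINUX_DRIVER" >&2
        return 1
    fi
    return 0
}
```

With these definitions, `warn_if_driver_too_old 375.66` prints the warning and returns a non-zero status, while `warn_if_driver_too_old 378.13` prints nothing. How the installed version is obtained is left to the caller.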

The second problem is related to compiling the scale_npp filter. The current configuration requires cuda_sdk to be enabled. Some modules, like scale_cuda, really need to link with the CUDA library (-lcuda). However, this is not the case for the scale_npp filter: it uses the libnpp* libraries. This simple patch to the configure file resolves the problem:

-scale_npp_filter_deps="cuda_sdk libnpp"
+scale_npp_filter_deps="cuda libnpp"

I hope someone takes note of these problems and agrees to fix them.
Regards.

Change History (8)

comment:1 Changed 4 months ago by oromit

The libnpp* libraries do not come with the nvidia driver.
Thus they need the full CUDA SDK to compile _and run_.

And encoder initialization should cleanly fail if the driver is too old.
Nvenc offers no information about the driver version, so giving that as an error is unfortunately impossible.

comment:2 Changed 4 months ago by ahakon

Hi,

Yes, the libnpp* libraries do not come with the nvidia driver.
However, you can link them statically (instead of the regular dynamic linking to libnpp*.so).

Moreover, the libnpp* libraries are not linked to CUDA in any sense, as they are linked to the driver and not to libcuda*.so. So you can use them without linking with "-lcuda".

So the requirement to "link with the CUDA SDK" is not necessary to enable libnpp.
You can check it:

  • Install the CUDA SDK and compile FFmpeg.
  • If you compile with "--enable-libnpp --enable-cuda --enable-cuda-sdk" then the binary runs and is linked to libcuda*.so (check it with ldd).
  • If you make the change in the configuration file (scale_npp_filter_deps="cuda libnpp") and compile with "--enable-libnpp --enable-cuda --disable-cuda-sdk" then everything works: the binary runs and is not linked to libcuda*.so
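
The check in the steps above can be sketched as a small helper around `ldd`. The function names `links_against` and `report_cuda_link` are hypothetical, and the path `./ffmpeg` in the comments is an assumption about where the built binary ends up. `links_against` reads `ldd` output on standard input, so it can be tried without an FFmpeg build at hand.

```shell
#!/bin/sh
# links_against NAME
# Reads `ldd` output on stdin; succeeds if any line contains NAME
# (matched as a fixed string, not as a regular expression).
links_against() {
    grep -q -F -- "$1"
}

# report_cuda_link BINARY
# Hypothetical wrapper: reports whether BINARY has a load-time
# dependency on libcuda according to ldd.
report_cuda_link() {
    if ldd "$1" | links_against "libcuda.so"; then
        echo "$1: linked against libcuda"
    else
        echo "$1: not linked against libcuda"
    fi
}

# Intended use after each of the two builds described above
# (the second one needs the scale_npp_filter_deps change in configure):
#   ./configure --enable-libnpp --enable-cuda --enable-cuda-sdk  && make && report_cuda_link ./ffmpeg
#   ./configure --enable-libnpp --enable-cuda --disable-cuda-sdk && make && report_cuda_link ./ffmpeg
```

Note that `ldd` only shows libraries resolved at load time. Libraries opened at run time by the program itself do not appear in its output.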

In fact, this confusion arose when someone updated the CUDA SDK files in FFmpeg.
Before this change the old config was:

[scale_npp_filter_deps="cuda libnpp"]

and now it is:

[scale_npp_filter_deps="cuda_sdk libnpp"].

For sure you need to install the CUDA SDK, but the correct configuration is the old one: the requirement is "cuda", not "cuda_sdk" (for libnpp).

Please update the configuration file to reflect this fact.

I can confirm this because I'm compiling and using a custom FFmpeg built this way.

comment:3 Changed 4 months ago by ahakon

Regarding the driver requirement, it can be very easy:

I suggest stating the minimum required driver version in the source files that use CUDA.
A simple WARNING log when running a CUDA component would also be useful.

When I tried to compile an updated version of FFmpeg I found this problem:

  • A recent version of the CUDA SDK is installed.
  • I can compile a git snapshot of ffmpeg from some months ago.
  • After updating to a new git snapshot of ffmpeg, the same environment can compile it.
  • However, execution fails... and nothing indicates the driver requirement.
  • After updating the driver on my workstation, I can start using the new version.

But the real problem is the lack of sufficient information. The minimum driver version is not described in the source code. I spent several hours finding the solution:
Only the original patch submitted to the mailing list has the info about the driver version bump.
So it would be very useful to include at least a comment in the source code.

Do you agree?

comment:4 Changed 4 months ago by ahakon

Hi,

Regarding the linking to the CUDA SDK, here is more specific info:

  • The source file "compat/cuda/dynlink_loader.h" contains all the references to the libraries that are loaded dynamically for NV hardware:

# define CUDA_LIBNAME "libcuda.so.1"
# define NVCUVID_LIBNAME "libnvcuvid.so.1"
# define NVENC_LIBNAME "libnvidia-encode.so.1"

  • All these files are installed by the driver in "/usr/lib/*"
  • All CUDA SDK libraries (dynamic/static) are installed in "/usr/local/cuda/lib*"

So you don't need to link to the CUDA SDK (-lcuda) unless it is strictly necessary.
And when you use the "scale_npp" filter it is not necessary to link to the CUDA SDK, as the CUDA functions it uses are loaded dynamically from "libcuda.so.1", as "dynlink_loader.h" describes.
This is the same behaviour as h264_nvenc; it loads "libcuda.so.1" and "libnvidia-encode.so.1" dynamically.
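
Because these libraries are opened at run time rather than at link time, a missing one only shows up when FFmpeg is executed. A minimal sketch for checking them beforehand, assuming a glibc system where `ldconfig -p` prints the linker cache; the function name `missing_runtime_libs` is hypothetical, and the library names are the ones from "dynlink_loader.h" quoted above.

```shell
#!/bin/sh
# Library names taken from compat/cuda/dynlink_loader.h as quoted above.
RUNTIME_LIBS="libcuda.so.1 libnvcuvid.so.1 libnvidia-encode.so.1"

# missing_runtime_libs
# Reads `ldconfig -p` output on stdin and prints each expected
# library that does not appear in it, one name per line.
missing_runtime_libs() {
    cache=$(cat)
    for lib in $RUNTIME_LIBS; do
        # The trailing space matches the "name (arch) => path" layout
        # that ldconfig -p uses for each cache entry.
        if ! printf '%s\n' "$cache" | grep -q -F -- "$lib "; then
            echo "$lib"
        fi
    done
}

# Intended use on a machine with the driver installed:
#   ldconfig -p | missing_runtime_libs
```

An empty result means all three names are visible to the dynamic loader. It does not say anything about whether the installed driver version is recent enough.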

So, the configuration that forces linking to the CUDA SDK when enabling the "scale_npp" filter is erroneous.
Please fix the configure file to use:

[scale_npp_filter_deps="cuda libnpp"]

instead of the wrong:

[scale_npp_filter_deps="cuda_sdk libnpp"]

For sure, for "scale_cuda" the requirement of "cuda_sdk" is correct. This only applies to the "scale_npp" filter.

I hope you fix it soon!

comment:5 follow-up: Changed 4 months ago by oromit

I have no idea what you are getting at. To build the scale_npp filter, you need libraries and headers that are part of the CUDA SDK, not of the set of re-implemented headers ffmpeg supplies, which is what the cuda dependency in configure ultimately means.
The cuda_sdk dependency was newly introduced and libnpp updated for them accordingly.

I agree that the minimum required driver version should be indicated in some place visible from a libavcodec binary. Might just add it to the error message when encoder initialization fails, but that might confuse people, as there are numerous reasons why that function might fail.

comment:6 in reply to: ↑ 5 Changed 4 months ago by ahakon

Hi Oromit,

Replying to oromit:

I have no idea what you are getting at. To build the scale_npp filter, you need libraries and headers that are part of the CUDA SDK, not of the set of re-implemented headers ffmpeg supplies, which is what the cuda dependency in configure ultimately means.
The cuda_sdk dependency was newly introduced and libnpp updated for them accordingly.

Sorry. Let me try to explain more...

If you compile FFmpeg after configuring it with "--enable-cuda-sdk", the link step adds "-lcuda". The result is a binary with a dynamic dependency on "libnvidia-fatbinaryloader.so.XXX.YY" (check it with ldd), where XXX.YY is your current driver version.

However, this dynamic dependency is superfluous, because the headers currently included in the FFmpeg source have the code to load the functions exported by the shared library "libcuda.so.1" at run time. This file is installed by the driver and has the same name regardless of the installed driver version.

One example of this is the "h264_nvenc" component of FFmpeg. This module uses some libcuda functions, but when it is enabled in the compilation, the configuration does not pull in the CUDA SDK (so it does not add "-lcuda").

And the same is true when you compile with the "scale_npp" filter. This filter uses some libnpp* functions. These functions are linked directly, not through "libnvidia-fatbinaryloader.so", and the CUDA functions are loaded at run time from the library "libcuda.so.1".

The result is: the old configuration for the "scale_npp" filter is correct ("cuda libnpp") and the current one is superfluous ("cuda_sdk libnpp").

If you would like to check it, try recompiling FFmpeg with "--enable-libnpp --enable-cuda --disable-cuda-sdk" and the CUDA SDK installed. If you revert the configuration to [scale_npp_filter_deps="cuda libnpp"], you will see that everything works as expected. Why then force linking with the CUDA SDK? If you really want to, you can (--enable-cuda-sdk). However, it is not required to compile the "scale_npp" filter.

Moreover, the current configuration creates these problems:

1) After compiling FFmpeg with "scale_npp" enabled, if you change the NV driver you need to recompile (without the "cuda_sdk" requirement this problem doesn't appear).

2) If you compile FFmpeg with the static version of the libnpp* libraries (/usr/local/lib/*_static.a), then even if you enable the scale_npp filter, your binary only requires "libcuda.so.1" at run-time.

So, the conclusion is quite simple: It's best to use [scale_npp_filter_deps="cuda libnpp"] in the configuration file.

Replying to oromit:

I agree that the minimum required driver version should be indicated in some place visible from a libavcodec binary. Might just add it to the error message when encoder initialization fails, but that might confuse people, as there are numerous reasons why that function might fail.

Great. Perhaps it's best to start with a simple comment in the source code.
Do you agree?

comment:7 Changed 4 months ago by bubbleguuum

Is it possible that this issue could be the cause of #6431 (Linux ffmpeg segfaulting attempting to use nvenc or nvdec)?

comment:8 Changed 4 months ago by oromit

  • Component changed from undetermined to build system
  • Resolution set to fixed
  • Status changed from new to closed
  • Type changed from defect to sponsoring request