Opened 4 years ago

Last modified 4 years ago

#8438 new defect

ffmpeg nvenc encoding failed with Tesla T4 card

Reported by: haisk
Owned by:
Priority: normal
Component: avcodec
Version: git-master
Keywords: nvenc
Cc: dmngaie@gmail.com
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

My build is based on the master branch with CUDA 10.2 (from the CUDA toolkit cuda_10.2.89_440.33.01_linux). When I use this build to encode an MP4 sample, ffmpeg runs at 0.1 fps on a Tesla T4 instance, while it works fine on a Tesla M60 or Tesla V100 instance.

ffmpeg -v verbose -hwaccel nvdec -hwaccel_output_format cuda -i den.mp4 -c:v h264_nvenc -b:v 2M -y output.mp4
ffmpeg version N-96097-g99f505d2df Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.4.0-1ubuntu1~18.04.1)
  configuration: --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --enable-gpl --enable-libx264 --enable-libx265 --enable-libfdk-aac --enable-nonfree --enable-openssl --enable-nvenc --enable-cuda-nvcc --enable-cuvid --enable-libnpp --enable-nvdec --enable-filter=scale_cuda --enable-filter=thumbnail_cuda --enable-filter=yadif_cuda --enable-libfdk_aac --disable-ffplay --bindir=/usr/local/bin
  libavutil      56. 36.101 / 56. 36.101
  libavcodec     58. 65.100 / 58. 65.100
  libavformat    58. 35.101 / 58. 35.101
  libavdevice    58.  9.101 / 58.  9.101
  libavfilter     7. 69.101 /  7. 69.101
  libswscale      5.  6.100 /  5.  6.100
  libswresample   3.  6.100 /  3.  6.100
  libpostproc    55.  6.100 / 55.  6.100
[h264 @ 0x55fb298f78c0] Reinit context to 1920x816, pix_fmt: yuv420p
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'den.mp4':
  Metadata:
    major_brand     : dash
    minor_version   : 0
    compatible_brands: iso6avc1mp41
    creation_time   : 2019-11-26T09:10:49.000000Z
  Duration: 00:04:56.88, start: 0.000000, bitrate: 1407 kb/s
    Stream #0:0(und): Video: h264 (High), 1 reference frame (avc1 / 0x31637661), yuv420p(tv, bt709, progressive, left), 1920x804 (1920x816) [SAR 1:1 DAR 160:67], 44 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
    Metadata:
      creation_time   : 2019-11-26T09:10:49.000000Z
      handler_name    : ISO Media file produced by Google Inc.
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (h264_nvenc))
Press [q] to stop, [?] for help
[h264 @ 0x55fb2a2310c0] NVDEC capabilities:
[h264 @ 0x55fb2a2310c0] format supported: yes, max_mb_count: 65536
[h264 @ 0x55fb2a2310c0] min_width: 48, max_width: 4096
[h264 @ 0x55fb2a2310c0] min_height: 16, max_height: 4096
[h264 @ 0x55fb2a2310c0] Reinit context to 1920x816, pix_fmt: cuda
frame=    0 fps=0.0 q=0.0 size=       0kB time=-577014:32:22.77 bitrate=  -0.0kbits/s speed=N/A
[graph 0 input from stream 0:0 @ 0x55fb2a5514c0] w:1920 h:804 pixfmt:cuda tb:1/12800 fr:25/1 sar:1/1 sws_param:flags=2
[h264_nvenc @ 0x55fb29920640] Loaded Nvenc version 9.1
[h264_nvenc @ 0x55fb29920640] Nvenc initialized successfully
Output #0, mp4, to 'output.mp4':
  Metadata:
    major_brand     : dash
    minor_version   : 0
    compatible_brands: iso6avc1mp41
    encoder         : Lavf58.35.101
    Stream #0:0(und): Video: h264 (h264_nvenc) (Main), 1 reference frame (avc1 / 0x31637661), cuda(left), 1920x804 [SAR 1:1 DAR 160:67], q=-1--1, 2000 kb/s, 25 fps, 12800 tbn, 25 tbc (default)
    Metadata:
      creation_time   : 2019-11-26T09:10:49.000000Z
      handler_name    : ISO Media file produced by Google Inc.
      encoder         : Lavc58.65.100 h264_nvenc
    Side data:
      cpb: bitrate max/min/avg: 0/0/2000000 buffer size: 4000000 vbv_delay: N/A
frame=    3 fps=0.2 q=25.0 size=       0kB time=00:00:00.00 bitrate=4923.1kbits/s speed=6.28e-06x
frame=    4 fps=0.2 q=25.0 size=       0kB time=00:00:00.04 bitrate=   9.6kbits/s speed=0.00174x
frame=    5 fps=0.1 q=24.0 size=       0kB time=00:00:00.08 bitrate=   4.8kbits/s speed=0.00239x
frame=    6 fps=0.1 q=23.0 size=       0kB time=00:00:00.12 bitrate=   3.2kbits/s speed=0.00272x
frame=    7 fps=0.1 q=22.0 size=       0kB time=00:00:00.16 bitrate=   2.4kbits/s speed=0.00293x
frame=    8 fps=0.1 q=21.0 size=       0kB time=00:00:00.20 bitrate=   1.9kbits/s speed=0.00306x
frame=    9 fps=0.1 q=20.0 size=       0kB time=00:00:00.24 bitrate=   1.6kbits/s speed=0.00316x
frame=   10 fps=0.1 q=19.0 size=       0kB time=00:00:00.28 bitrate=   1.4kbits/s speed=0.00324x
frame=   11 fps=0.1 q=18.0 size=       0kB time=00:00:00.32 bitrate=   1.2kbits/s speed=0.0033x
frame=   12 fps=0.1 q=17.0 size=       0kB time=00:00:00.36 bitrate=   1.1kbits/s speed=0.00335x
frame=   13 fps=0.1 q=16.0 size=       0kB time=00:00:00.40 bitrate=   1.0kbits/s speed=0.00339x
frame=   14 fps=0.1 q=15.0 size=       0kB time=00:00:00.44 bitrate=   0.9kbits/s speed=0.00342x
frame=   15 fps=0.1 q=14.0 size=       0kB time=00:00:00.48 bitrate=   0.8kbits/s speed=0.00345x
frame=   16 fps=0.1 q=13.0 size=       0kB time=00:00:00.52 bitrate=   0.7kbits/s speed=0.00347x
frame=   17 fps=0.1 q=12.0 size=       0kB time=00:00:00.56 bitrate=   0.7kbits/s speed=0.00349x
frame=   18 fps=0.1 q=11.0 size=       0kB time=00:00:00.60 bitrate=   0.6kbits/s speed=0.00351x
frame=   19 fps=0.1 q=11.0 size=       0kB time=00:00:00.64 bitrate=   0.6kbits/s speed=0.00352x
frame=   20 fps=0.1 q=10.0 size=       0kB time=00:00:00.68 bitrate=   0.6kbits/s speed=0.00354x
frame=   21 fps=0.1 q=10.0 size=       0kB time=00:00:00.72 bitrate=   0.5kbits/s speed=0.00355x
frame=   22 fps=0.1 q=10.0 size=       0kB time=00:00:00.76 bitrate=   0.5kbits/s speed=0.00356x
frame=   23 fps=0.1 q=10.0 size=       0kB time=00:00:00.80 bitrate=   0.5kbits/s speed=0.00357x
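
ffmpeg writes its progress status to stderr and separates updates with carriage returns rather than newlines, so a captured log can show many status updates on a single line. The following is a minimal sketch for pulling the fps values out of such a capture; the function name `extract_fps` is a hypothetical helper, not part of ffmpeg, and the sample input is abbreviated from the log above and from the later software-decode run.

```shell
#!/bin/sh
# extract_fps (hypothetical helper): read an ffmpeg log on stdin and print
# every progress "fps=" value, one per line.
extract_fps() {
    # Turn carriage returns into newlines, then pick out each fps=<number>
    # token. Stream descriptions such as "25 fps," contain no "=" after
    # "fps", so they are not matched.
    tr '\r' '\n' | grep -o 'fps= *[0-9.][0-9.]*' | sed 's/fps= *//'
}

# Two abbreviated status updates, separated by a carriage return.
printf 'frame=    3 fps=0.2 q=25.0\rframe= 7102 fps=278 q=22.0\n' | extract_fps
# prints 0.2 and then 278, each on its own line
```

With a real run, the stderr stream could be fed in directly, for example `ffmpeg ... 2>&1 | extract_fps`.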

Change History (5)

comment:1 by haisk, 4 years ago

I just found out that it only happens if I use -hwaccel nvdec -hwaccel_output_format cuda or -hwaccel cuvid -c:v h264_cuvid before the input. If I use software decoding, it works fine. However, that defeats the purpose, as I want the entire transcode to stay in the GPU pipeline.

I also tried compiling the 4.2 and 4.1 branches, but they show the same issue. The OS is Ubuntu 16.04.

With CUVID

ffmpeg -v verbose -hwaccel cuvid -c:v h264_cuvid -i input.mp4 -c:v h264_nvenc -b:v 3M -y output.mp4
ffmpeg version n4.1.4-22-g08d3cc2 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.12) 20160609
  configuration: --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --enable-gpl --enable-libx264 --enable-libx265 --enable-libfdk-aac --enable-nonfree --enable-openssl --enable-nvenc --enable-cuda-sdk --enable-cuvid --enable-libnpp --enable-nvdec --enable-filter=scale_cuda --enable-filter=thumbnail_cuda --enable-filter=yadif_cuda --enable-libfdk_aac --disable-ffplay --bindir=/usr/local/bin
  libavutil      56. 22.100 / 56. 22.100
  libavcodec     58. 35.100 / 58. 35.100
  libavformat    58. 20.100 / 58. 20.100
  libavdevice    58.  5.100 / 58.  5.100
  libavfilter     7. 40.101 /  7. 40.101
  libswscale      5.  3.100 /  5.  3.100
  libswresample   3.  3.100 /  3.  3.100
  libpostproc    55.  3.100 / 55.  3.100
[h264 @ 0x3ded340] Reinit context to 1920x1088, pix_fmt: yuv420p
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.mp4':
  Metadata:
    major_brand     : dash
    minor_version   : 0
    compatible_brands: iso6avc1mp41
    creation_time   : 2019-12-19T06:29:54.000000Z
  Duration: 00:04:56.21, start: 0.000000, bitrate: 3834 kb/s
    Stream #0:0(und): Video: h264 (High), 1 reference frame (avc1 / 0x31637661), yuv420p(tv, bt709, progressive, left), 1920x1080 (1920x1088) [SAR 1:1 DAR 16:9], 29 kb/s, 23.98 fps, 23.98 tbr, 24k tbn, 47.95 tbc (default)
    Metadata:
      creation_time   : 2019-12-19T06:29:54.000000Z
      handler_name    : ISO Media file produced by Google Inc.
[h264_cuvid @ 0x3e13880] Initializing cuvid hwaccel
[h264_cuvid @ 0x3e13880] CUVID capabilities for h264_cuvid:
[h264_cuvid @ 0x3e13880] 8 bit: supported: 1, min_width: 48, max_width: 4096, min_height: 16, max_height: 4096
[h264_cuvid @ 0x3e13880] 10 bit: supported: 0, min_width: 0, max_width: 0, min_height: 0, max_height: 0
[h264_cuvid @ 0x3e13880] 12 bit: supported: 0, min_width: 0, max_width: 0, min_height: 0, max_height: 0
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (h264_cuvid) -> h264 (h264_nvenc))
Press [q] to stop, [?] for help
[h264_cuvid @ 0x3e13880] Initializing cuvid hwaccel
[h264_cuvid @ 0x3e13880] Formats: Original: cuda | HW: cuda | SW: nv12
frame=    0 fps=0.0 q=0.0 size=       0kB time=-577014:32:22.77 bitrate=  -0.0kbits/s speed=N/A
[graph 0 input from stream 0:0 @ 0x4ada6c0] w:1920 h:1080 pixfmt:cuda tb:1/24000 fr:24000/1001 sar:1/1 sws_param:flags=2
[h264_nvenc @ 0x3e13280] Loaded Nvenc version 9.1
[h264_nvenc @ 0x3e13280] Nvenc initialized successfully
Output #0, mp4, to 'output.mp4':
  Metadata:
    major_brand     : dash
    minor_version   : 0
    compatible_brands: iso6avc1mp41
    encoder         : Lavf58.20.100
    Stream #0:0(und): Video: h264 (h264_nvenc) (Main), 1 reference frame (avc1 / 0x31637661), cuda(left), 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 3000 kb/s, 23.98 fps, 24k tbn, 23.98 tbc (default)
    Metadata:
      creation_time   : 2019-12-19T06:29:54.000000Z
      handler_name    : ISO Media file produced by Google Inc.
      encoder         : Lavc58.35.100 h264_nvenc
    Side data:
      cpb: bitrate max/min/avg: 0/0/3000000 buffer size: 6000000 vbv_delay: -1
frame=    3 fps=0.1 q=24.0 size=       0kB time=00:00:00.00 bitrate=9142.9kbits/s speed=1.9e-06x
frame=    4 fps=0.1 q=33.0 size=       0kB time=00:00:00.04 bitrate=   9.2kbits/s speed=0.000965x
frame=    5 fps=0.1 q=31.0 size=       0kB time=00:00:00.08 bitrate=   4.6kbits/s speed=0.00155x
frame=    6 fps=0.1 q=30.0 size=       0kB time=00:00:00.12 bitrate=   3.1kbits/s speed=0.00167x
frame=    7 fps=0.1 q=30.0 size=       0kB time=00:00:00.16 bitrate=   2.3kbits/s speed=0.00195x
frame=    8 fps=0.1 q=29.0 size=       0kB time=00:00:00.20 bitrate=   1.8kbits/s speed=0.00217x

With software decode

ffmpeg -v verbose -i input.mp4 -c:v h264_nvenc -b:v 3M -y output.mp4
ffmpeg version n4.1.4-22-g08d3cc2 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.12) 20160609
  configuration: --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --enable-gpl --enable-libx264 --enable-libx265 --enable-libfdk-aac --enable-nonfree --enable-openssl --enable-nvenc --enable-cuda-sdk --enable-cuvid --enable-libnpp --enable-nvdec --enable-filter=scale_cuda --enable-filter=thumbnail_cuda --enable-filter=yadif_cuda --enable-libfdk_aac --disable-ffplay --bindir=/usr/local/bin
  libavutil      56. 22.100 / 56. 22.100
  libavcodec     58. 35.100 / 58. 35.100
  libavformat    58. 20.100 / 58. 20.100
  libavdevice    58.  5.100 / 58.  5.100
  libavfilter     7. 40.101 /  7. 40.101
  libswscale      5.  3.100 /  5.  3.100
  libswresample   3.  3.100 /  3.  3.100
  libpostproc    55.  3.100 / 55.  3.100
[h264 @ 0x406c240] Reinit context to 1920x1088, pix_fmt: yuv420p
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.mp4':
  Metadata:
    major_brand     : dash
    minor_version   : 0
    compatible_brands: iso6avc1mp41
    creation_time   : 2019-12-19T06:29:54.000000Z
  Duration: 00:04:56.21, start: 0.000000, bitrate: 3834 kb/s
    Stream #0:0(und): Video: h264 (High), 1 reference frame (avc1 / 0x31637661), yuv420p(tv, bt709, progressive, left), 1920x1080 (1920x1088) [SAR 1:1 DAR 16:9], 29 kb/s, 23.98 fps, 23.98 tbr, 24k tbn, 47.95 tbc (default)
    Metadata:
      creation_time   : 2019-12-19T06:29:54.000000Z
      handler_name    : ISO Media file produced by Google Inc.
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (h264_nvenc))
Press [q] to stop, [?] for help
[h264 @ 0x4093c00] Reinit context to 1920x1088, pix_fmt: yuv420p
[graph 0 input from stream 0:0 @ 0x44e5440] w:1920 h:1080 pixfmt:yuv420p tb:1/24000 fr:24000/1001 sar:1/1 sws_param:flags=2
[h264_nvenc @ 0x4091d00] Loaded Nvenc version 9.1
[h264_nvenc @ 0x4091d00] Nvenc initialized successfully
[h264_nvenc @ 0x4091d00] 1 CUDA capable devices found
[h264_nvenc @ 0x4091d00] [ GPU #0 - < Tesla T4 > has Compute SM 7.5 ]
[h264_nvenc @ 0x4091d00] supports NVENC
Output #0, mp4, to 'output.mp4':
  Metadata:
    major_brand     : dash
    minor_version   : 0
    compatible_brands: iso6avc1mp41
    encoder         : Lavf58.20.100
    Stream #0:0(und): Video: h264 (h264_nvenc) (Main), 1 reference frame (avc1 / 0x31637661), yuv420p(left), 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 3000 kb/s, 23.98 fps, 24k tbn, 23.98 tbc (default)
    Metadata:
      creation_time   : 2019-12-19T06:29:54.000000Z
      handler_name    : ISO Media file produced by Google Inc.
      encoder         : Lavc58.35.100 h264_nvenc
    Side data:
      cpb: bitrate max/min/avg: 0/0/3000000 buffer size: 6000000 vbv_delay: -1
No more output streams to write to, finishing.e=00:04:52.37 bitrate=3034.1kbits/s speed=11.6x    
frame= 7102 fps=278 q=22.0 Lsize=  109408kB time=00:04:56.17 bitrate=3026.2kbits/s speed=11.6x    
video:109378kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.027704%
Input file #0 (input.mp4):
  Input stream #0:0 (video): 7102 packets read (141882400 bytes); 7102 frames decoded; 
  Total: 7102 packets (141882400 bytes) demuxed
Output file #0 (output.mp4):
  Output stream #0:0 (video): 7102 frames encoded; 7102 packets muxed (112002911 bytes); 
  Total: 7102 packets (112002911 bytes) muxed
[AVIOContext @ 0x4092180] Statistics: 2 seeks, 431 writeouts
[h264_nvenc @ 0x4091d00] Nvenc unloaded
[AVIOContext @ 0x40734c0] Statistics: 141974669 bytes read, 0 seeks
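
To compare the three decode paths from this comment side by side, something like the sketch below could help. It only reuses options that already appear in the commands above, plus the standard `-frames:v` and `-f null -` output options to keep each run short and discard the result. The file name `input.mp4`, the 50-frame limit, and the `run_case` helper are assumptions for illustration. By default the script only prints the commands; setting `DRY_RUN=0` executes them, which requires ffmpeg and an NVIDIA GPU.

```shell
#!/bin/sh
# Compare decode paths feeding h264_nvenc.
INPUT="${INPUT:-input.mp4}"   # assumed sample file
FRAMES="${FRAMES:-50}"        # assumed limit; keeps runs short even at 0.1 fps
DRY_RUN="${DRY_RUN:-1}"       # 1 = print commands only, 0 = run them

# run_case LABEL [decoder options placed before -i ...]
run_case() {
    label="$1"; shift
    # Rebuild "$@" as the full ffmpeg command line for this case.
    set -- ffmpeg -v error "$@" -i "$INPUT" -frames:v "$FRAMES" \
        -c:v h264_nvenc -b:v 3M -f null -
    if [ "$DRY_RUN" = 1 ]; then
        echo "$label: $*"
    else
        start=$(date +%s)
        "$@"
        end=$(date +%s)
        echo "$label: $((end - start)) s for $FRAMES frames"
    fi
}

run_case software
run_case nvdec -hwaccel nvdec -hwaccel_output_format cuda
run_case cuvid -hwaccel cuvid -c:v h264_cuvid
```

If the hardware-decode cases take far longer than the software case for the same frame count, that matches the behaviour reported here.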

Last edited 4 years ago by haisk

comment:2 by Carl Eugen Hoyos, 4 years ago

Component: ffmpeg → avcodec

comment:3 by Dennis E. Mungai, 4 years ago

Cc: dmngaie@gmail.com added

Hello there,

Are you running this on a physical Tesla T4 or on a vGPU-based T4 instance, such as one provided by VMware vSphere on ESXi?

Can you show us the output of:

  1. nvidia-smi

Just run:

nvidia-smi

As is, followed by:

  2. GPU topology information via:

nvidia-smi topo -m

Your results are indeed abnormal and are most likely related not to FFmpeg but to your system's configuration. The output from the commands above will help verify that.
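
For convenience, both commands can be captured in one go. This is a small sketch only; the `collect_gpu_info` function name and the `###` header format are assumptions chosen for illustration, and the script simply reports if `nvidia-smi` is not on the PATH.

```shell
#!/bin/sh
# collect_gpu_info (hypothetical helper): print the output of both
# diagnostic commands, each under a labelled header.
collect_gpu_info() {
    for cmd in "nvidia-smi" "nvidia-smi topo -m"; do
        printf '### %s\n' "$cmd"
        if command -v nvidia-smi >/dev/null 2>&1; then
            # $cmd is intentionally unquoted so "topo -m" splits into arguments.
            $cmd 2>&1 || echo "(command exited with status $?)"
        else
            echo "nvidia-smi not found in PATH"
        fi
    done
}

collect_gpu_info
```

Redirecting the output, for example `collect_gpu_info > gpu-report.txt`, gives a single file to attach to the ticket.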

comment:4 by haisk, 4 years ago

Thanks for the response. As I said, I'm using an AWS instance named g4dn.xlarge.

nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   61C    P0    27W /  70W |      0MiB / 15109MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

nvidia-smi topo -m

	GPU0	CPU Affinity
GPU0	 X 	0-3

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  NV#  = Connection traversing a bonded set of # NVLinks

comment:5 by smallishzulu, 4 years ago
