Opened 5 years ago
Last modified 5 years ago
#8438 new defect
ffmpeg nvenc encoding failed with Tesla T4 card
Reported by: | haisk | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | avcodec |
Version: | git-master | Keywords: | nvenc |
Cc: | dmngaie@gmail.com | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no | | |
Description
My build is based on the master branch with CUDA 10.2 (from the CUDA toolkit package cuda_10.2.89_440.33.01_linux). When I use this build to encode an MP4 sample, ffmpeg runs at 0.1 fps on a Tesla T4 instance, while it works fine on Tesla M60 and Tesla V100 instances.
ffmpeg -v verbose -hwaccel nvdec -hwaccel_output_format cuda -i den.mp4 -c:v h264_nvenc -b:v 2M -y output.mp4
ffmpeg version N-96097-g99f505d2df Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.4.0-1ubuntu1~18.04.1)
  configuration: --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --enable-gpl --enable-libx264 --enable-libx265 --enable-libfdk-aac --enable-nonfree --enable-openssl --enable-nvenc --enable-cuda-nvcc --enable-cuvid --enable-libnpp --enable-nvdec --enable-filter=scale_cuda --enable-filter=thumbnail_cuda --enable-filter=yadif_cuda --enable-libfdk_aac --disable-ffplay --bindir=/usr/local/bin
  libavutil 56. 36.101 / 56. 36.101
  libavcodec 58. 65.100 / 58. 65.100
  libavformat 58. 35.101 / 58. 35.101
  libavdevice 58. 9.101 / 58. 9.101
  libavfilter 7. 69.101 / 7. 69.101
  libswscale 5. 6.100 / 5. 6.100
  libswresample 3. 6.100 / 3. 6.100
  libpostproc 55. 6.100 / 55. 6.100
[h264 @ 0x55fb298f78c0] Reinit context to 1920x816, pix_fmt: yuv420p
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'den.mp4':
  Metadata:
    major_brand : dash
    minor_version : 0
    compatible_brands: iso6avc1mp41
    creation_time : 2019-11-26T09:10:49.000000Z
  Duration: 00:04:56.88, start: 0.000000, bitrate: 1407 kb/s
    Stream #0:0(und): Video: h264 (High), 1 reference frame (avc1 / 0x31637661), yuv420p(tv, bt709, progressive, left), 1920x804 (1920x816) [SAR 1:1 DAR 160:67], 44 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
    Metadata:
      creation_time : 2019-11-26T09:10:49.000000Z
      handler_name : ISO Media file produced by Google Inc.
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (h264_nvenc))
Press [q] to stop, [?] for help
[h264 @ 0x55fb2a2310c0] NVDEC capabilities:
[h264 @ 0x55fb2a2310c0] format supported: yes, max_mb_count: 65536
[h264 @ 0x55fb2a2310c0] min_width: 48, max_width: 4096
[h264 @ 0x55fb2a2310c0] min_height: 16, max_height: 4096
[h264 @ 0x55fb2a2310c0] Reinit context to 1920x816, pix_fmt: cuda
frame= 0 fps=0.0 q=0.0 size= 0kB time=-577014:32:22.77 bitrate= -0.0kbits/s speed=N/A
[graph 0 input from stream 0:0 @ 0x55fb2a5514c0] w:1920 h:804 pixfmt:cuda tb:1/12800 fr:25/1 sar:1/1 sws_param:flags=2
[h264_nvenc @ 0x55fb29920640] Loaded Nvenc version 9.1
[h264_nvenc @ 0x55fb29920640] Nvenc initialized successfully
Output #0, mp4, to 'output.mp4':
  Metadata:
    major_brand : dash
    minor_version : 0
    compatible_brands: iso6avc1mp41
    encoder : Lavf58.35.101
    Stream #0:0(und): Video: h264 (h264_nvenc) (Main), 1 reference frame (avc1 / 0x31637661), cuda(left), 1920x804 [SAR 1:1 DAR 160:67], q=-1--1, 2000 kb/s, 25 fps, 12800 tbn, 25 tbc (default)
    Metadata:
      creation_time : 2019-11-26T09:10:49.000000Z
      handler_name : ISO Media file produced by Google Inc.
      encoder : Lavc58.65.100 h264_nvenc
    Side data:
      cpb: bitrate max/min/avg: 0/0/2000000 buffer size: 4000000 vbv_delay: N/A
frame= 3 fps=0.2 q=25.0 size= 0kB time=00:00:00.00 bitrate=4923.1kbits/s speed=6.28e-06x
frame= 4 fps=0.2 q=25.0 size= 0kB time=00:00:00.04 bitrate= 9.6kbits/s speed=0.00174x
frame= 5 fps=0.1 q=24.0 size= 0kB time=00:00:00.08 bitrate= 4.8kbits/s speed=0.00239x
frame= 6 fps=0.1 q=23.0 size= 0kB time=00:00:00.12 bitrate= 3.2kbits/s speed=0.00272x
frame= 7 fps=0.1 q=22.0 size= 0kB time=00:00:00.16 bitrate= 2.4kbits/s speed=0.00293x
frame= 8 fps=0.1 q=21.0 size= 0kB time=00:00:00.20 bitrate= 1.9kbits/s speed=0.00306x
frame= 9 fps=0.1 q=20.0 size= 0kB time=00:00:00.24 bitrate= 1.6kbits/s speed=0.00316x
frame= 10 fps=0.1 q=19.0 size= 0kB time=00:00:00.28 bitrate= 1.4kbits/s speed=0.00324x
frame= 11 fps=0.1 q=18.0 size= 0kB time=00:00:00.32 bitrate= 1.2kbits/s speed=0.0033x
frame= 12 fps=0.1 q=17.0 size= 0kB time=00:00:00.36 bitrate= 1.1kbits/s speed=0.00335x
frame= 13 fps=0.1 q=16.0 size= 0kB time=00:00:00.40 bitrate= 1.0kbits/s speed=0.00339x
frame= 14 fps=0.1 q=15.0 size= 0kB time=00:00:00.44 bitrate= 0.9kbits/s speed=0.00342x
frame= 15 fps=0.1 q=14.0 size= 0kB time=00:00:00.48 bitrate= 0.8kbits/s speed=0.00345x
frame= 16 fps=0.1 q=13.0 size= 0kB time=00:00:00.52 bitrate= 0.7kbits/s speed=0.00347x
frame= 17 fps=0.1 q=12.0 size= 0kB time=00:00:00.56 bitrate= 0.7kbits/s speed=0.00349x
frame= 18 fps=0.1 q=11.0 size= 0kB time=00:00:00.60 bitrate= 0.6kbits/s speed=0.00351x
frame= 19 fps=0.1 q=11.0 size= 0kB time=00:00:00.64 bitrate= 0.6kbits/s speed=0.00352x
frame= 20 fps=0.1 q=10.0 size= 0kB time=00:00:00.68 bitrate= 0.6kbits/s speed=0.00354x
frame= 21 fps=0.1 q=10.0 size= 0kB time=00:00:00.72 bitrate= 0.5kbits/s speed=0.00355x
frame= 22 fps=0.1 q=10.0 size= 0kB time=00:00:00.76 bitrate= 0.5kbits/s speed=0.00356x
frame= 23 fps=0.1 q=10.0 size= 0kB time=00:00:00.80 bitrate= 0.5kbits/s speed=0.00357x
Change History (5)
comment:2 by , 5 years ago
Component: | ffmpeg → avcodec |
---|
comment:3 by , 5 years ago
Cc: | added |
---|
Hello there,
Are you running this on a physical Tesla T4 or on a vGPU-based T4 instance, such as one provided by VMware's vSphere on ESXi?
Can you show us the output of:
- nvidia-smi, run as is
- GPU topology information, via nvidia-smi topo -m
Your results are indeed abnormal and are most likely related not to FFmpeg but to your system's config. The output from the commands above will help verify that.
comment:4 by , 5 years ago
Thanks for the response. As I said, I'm using an AWS instance of type g4dn.xlarge.
nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   61C    P0    27W /  70W |      0MiB / 15109MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
nvidia-smi topo -m
        GPU0    CPU Affinity
GPU0     X      0-3

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  NV#  = Connection traversing a bonded set of # NVLinks
I just found out that it only happens if I use -hwaccel nvdec -hwaccel_output_format cuda or -hwaccel cuvid -c:v h264_cuvid before the input. If I use software decode, it works fine. However, that defeats the purpose, as I want the entire transcode to stay in the GPU pipeline.
I also tried compiling the 4.2 and 4.1 branches, but they have the same issue. The OS is Ubuntu 16.04.
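The three decode paths compared in this comment can be lined up side by side with a small script. This is only a sketch: it assumes the same den.mp4 input and 2M bitrate as in the description, while the `run` helper, the `DRY_RUN` and `INPUT` variables, and the out_*.mp4 file names are made up for illustration and are not part of the original report. By default it only prints the commands.

```shell
#!/bin/sh
# Dry run by default: each command is printed instead of executed.
# Set DRY_RUN=0 to really run ffmpeg (needs an NVENC/NVDEC-enabled build).
DRY_RUN="${DRY_RUN:-1}"
INPUT="${INPUT:-den.mp4}"   # sample name taken from the report

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# 1. NVDEC hwaccel, decoded frames stay in CUDA memory (slow on the T4 here)
run ffmpeg -v verbose -hwaccel nvdec -hwaccel_output_format cuda \
    -i "$INPUT" -c:v h264_nvenc -b:v 2M -y out_nvdec.mp4

# 2. CUVID decoder (also slow on the T4 here)
run ffmpeg -v verbose -hwaccel cuvid -c:v h264_cuvid \
    -i "$INPUT" -c:v h264_nvenc -b:v 2M -y out_cuvid.mp4

# 3. Software decode, NVENC encode only (reported to work fine)
run ffmpeg -v verbose -i "$INPUT" -c:v h264_nvenc -b:v 2M -y out_sw.mp4
```

Running it with DRY_RUN=0 on both a T4 and an M60 or V100 instance, and comparing the fps values in the progress lines, should show which decode path triggers the slowdown.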
With CUVID
With software decode