Opened 7 years ago
Closed 7 years ago
#6058 closed defect (invalid)
cuvid: Transcoding hangs after several files processed
Reported by: | slomkowski | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | undetermined |
Version: | git-master | Keywords: | nvenc |
Cc: | Blocked By: | ||
Blocking: | Reproduced by developer: | no | |
Analyzed by developer: | no |
Description
I'm using ffmpeg to transcode a lot of MPEG-2 videos grabbed from DVB-S to H.264. I'm using ffmpeg with NVidia CUVID, videos are processed in batch mode, with two instances of ffmpeg running. However, after several recordings processed successfully, both ffmpeg instances hang. It appears to be working with frame rate steadily going to 0 fps; output H.264 file grows several times the size of original file. This continues until ffmpeg is eventually killed by batch managing process.
After that, ps aux
shows, that ffmpeg instances have become defunct and graphics card is no longer responding. nvidia-smi dmon
shows 100% load on decoder and 0% on encoder despite no process running. Trying to reset the card using nvidia-smi -i 0 -r
doesn't bring any good, the card becomes completely unresponsive and nvidia-smi
is no longer able to communicate with it.
I ran the transcoding with and without -hwaccel
option, each one eventually failed.
The problem is not reproducible in easy way, when you run ffmpeg on the very same files which caused the problem before, everything goes well. Just after processing several other files the processing hangs.
My configuration:
- Debian 8, kernel 4.7.0.
- graphics card: GeForce GTX 950.
- NVidia drivers tested: 375.26, 367.57.
- ffmpeg versions tested: 3.2, 3.2.2, N-82987-gd800d48.
I include logs from processing which hang.
Attachments (4)
Change History (23)
by , 7 years ago
Attachment: | cuvid-hang-ffmpeg-log-1.log added |
---|
by , 7 years ago
Attachment: | cuvid-hang-ffmeg-log-2.log added |
---|
comment:1 by , 7 years ago
follow-up: 12 comment:2 by , 7 years ago
Component: | ffmpeg → undetermined |
---|---|
Keywords: | cuvid h264 mpeg2 ts nvidia removed |
Does ffmpeg crash or hang?
Please provide a backtrace.
comment:3 by , 7 years ago
It hangs. I see ffmpeg hang just now. It's running under my own program. I include the log (using FFREPORT
env).
I just ran the same command in console (the already hanged instance is still in the background):
/home/rtv/soft/ffmpeg/ffmpeg/ffmpeg -y -loglevel warning -hwaccel_device 0 -hwaccel cuvid -c:v mpeg2_cuvid -i /mnt/rtvprocessing/artifacts/2017-01-05/RV_N24____2017-01-05_09.14__vsn0/raw-ts-file.ts -c:v h264_nvenc -rc vbr -movflags +faststart video-file.mp4
and it hangs my SSH connection (Ctrl+C, Ctrl+Z doesn't work).
by , 7 years ago
Attachment: | ffmpeg-RV_N24____2017-01-05_09.14__vsn0-20170105-093537.log added |
---|
comment:4 by , 7 years ago
Keywords: | crash removed |
---|
The logfile you attached does not indicate a hang.
In any case please provide a backtrace.
follow-up: 6 comment:5 by , 7 years ago
Since ffmpeg doesn't crash, just hangs or waits for something, I cannot provide backtrace. I just run ffmpeg once again with debug level, and this my console output (note that other ffmpeg instance is still hanged in background).
ffmpeg version N-82987-gd800d48 Copyright (c) 2000-2017 the FFmpeg developers built with gcc 4.9.2 (Debian 4.9.2-10) configuration: --prefix=../ffmpeg_build --enable-gpl --disable-debug --enable-libass --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libx264 --enable-nonfree --enable-nvenc --enable-libnpp --enable-vdpau --enable-cuda --enable-cuvid libavutil 55. 43.100 / 55. 43.100 libavcodec 57. 70.101 / 57. 70.101 libavformat 57. 61.100 / 57. 61.100 libavdevice 57. 2.100 / 57. 2.100 libavfilter 6. 68.100 / 6. 68.100 libswscale 4. 3.101 / 4. 3.101 libswresample 2. 4.100 / 2. 4.100 libpostproc 54. 2.100 / 54. 2.100 Splitting the commandline. Reading option '-y' ... matched as option 'y' (overwrite output files) with argument '1'. Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument 'debug'. Reading option '-hwaccel_device' ... matched as option 'hwaccel_device' (select a device for HW acceleration) with argument '0'. Reading option '-hwaccel' ... matched as option 'hwaccel' (use HW accelerated decoding) with argument 'cuvid'. Reading option '-c:v' ... matched as option 'c' (codec name) with argument 'mpeg2_cuvid'. Reading option '-i' ... matched as input url with argument '/mnt/rtvprocessing/artifacts/2017-01-05/RV_N24____2017-01-05_09.14__vsn0/raw-ts-file.ts'. Reading option '-c:v' ... matched as option 'c' (codec name) with argument 'h264_nvenc'. Reading option '-rc' ... matched as AVOption 'rc' with argument 'vbr'. Reading option '-movflags' ... matched as AVOption 'movflags' with argument '+faststart'. Reading option 'video-file.mp4' ... matched as output url. Finished splitting the commandline. Parsing a group of options: global . Applying option y (overwrite output files) with argument 1. Applying option loglevel (set logging level) with argument debug. Successfully parsed a group of options. Parsing a group of options: input url /mnt/rtvprocessing/artifacts/2017-01-05/RV_N24____2017-01-05_09.14__vsn0/raw-ts-file.ts. Applying option hwaccel_device (select a device for HW acceleration) with argument 0. Applying option hwaccel (use HW accelerated decoding) with argument cuvid. Applying option c:v (codec name) with argument mpeg2_cuvid. Successfully parsed a group of options. Opening an input file: /mnt/rtvprocessing/artifacts/2017-01-05/RV_N24____2017-01-05_09.14__vsn0/raw-ts-file.ts. [file @ 0x21d5060] Setting default whitelist 'file,crypto' [mpegts @ 0x21d4760] Format mpegts probed with size=2048 and score=50 [mpegts @ 0x21d4760] stream=0 stream_type=2 pid=3ff prog_reg_desc= [mpegts @ 0x21d4760] stream=1 stream_type=3 pid=400 prog_reg_desc= [mpegts @ 0x21d4760] stream=2 stream_type=6 pid=23 prog_reg_desc= [mpegts @ 0x21d4760] Before avformat_find_stream_info() pos: 0 bytes read:32768 seeks:0 nb_streams:3 [mpegts @ 0x21d4760] parser not found for codec dvb_teletext, packets or times may be invalid. Last message repeated 1 times [mpeg2video @ 0x21d9220] Invalid frame dimensions 0x0. Last message repeated 10 times [mpegts @ 0x21d4760] max_analyze_duration 5000000 reached at 5000000 microseconds st:2 [mpegts @ 0x21d4760] PES packet size mismatch [mpegts @ 0x21d4760] After avformat_find_stream_info() pos: 0 bytes read:3330192 seeks:2 frames:449 Input #0, mpegts, from '/mnt/rtvprocessing/artifacts/2017-01-05/RV_N24____2017-01-05_09.14__vsn0/raw-ts-file.ts': Duration: 00:18:00.88, start: 54479.140411, bitrate: 3785 kb/s Program 17503 Stream #0:0[0x3ff], 118, 1/90000: Video: mpeg2video (Main), 1 reference frame ([2][0][0][0] / 0x0002), yuv420p(tv, top first, left), 720x576 [SAR 64:45 DAR 16:9], 0/1, 25 fps, 25 tbr, 90k tbn, 50 tbc Stream #0:1[0x400](deu), 204, 1/90000: Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, s16p, 192 kb/s (clean effects) Stream #0:2[0x23](deu), 127, 1/90000: Subtitle: dvb_teletext ([6][0][0][0] / 0x0006) Successfully opened the file. Parsing a group of options: output url video-file.mp4. Applying option c:v (codec name) with argument h264_nvenc. Successfully parsed a group of options. Opening an output file: video-file.mp4. [file @ 0x22049a0] Setting default whitelist 'file,crypto' Successfully opened the file. Setting up CUVID transcoding
Just waits on Setting up CUVID transcoding
. Also all ffmpeg instances consume 100% CPU.
comment:6 by , 7 years ago
Replying to slomkowski:
Since ffmpeg doesn't crash, just hangs or waits for something, I cannot provide backtrace.
Why not?
Just waits on
Setting up CUVID transcoding
. Also all ffmpeg instances consume 100% CPU.
Please provide a backtrace when this happens.
comment:7 by , 7 years ago
I tried to attach to the processes using gdb
:
(gdb) attach 32749 Attaching to process 32749
and gdb just hangs here (you cannot interrupt it using Ctrl+C). Same for other ffmpeg instance.
comment:9 by , 7 years ago
I used printf
and determined ffmpeg (the new instance) hangs on function:
err = cu->cuCtxCreate(&hwctx->cuda_ctx, CU_CTX_SCHED_BLOCKING_SYNC, cu_device);
in file libavutil/hwcontext_cuda.c
, line 352. I can't tell anything about original hang.
comment:11 by , 7 years ago
err = cu->cuCtxCreate(&hwctx->cuda_ctx, 0, cu_device);
doesn't change anything.
follow-up: 14 comment:12 by , 7 years ago
I just managed to create TS file sample, which hangs ffmpeg on first call. I include the file and the log (note: lines with single numbers in log are from my debug printfs).
nvidia-smi dmon
log:
# gpu pwr temp sm mem enc dec mclk pclk # Idx W C % % % % MHz MHz 0 23 38 0 0 0 0 3304 1139 0 25 39 0 0 0 0 3304 1139 0 25 39 0 0 0 0 3304 1139 0 25 39 0 0 0 0 3304 1139 0 27 39 3 4 4 8 3304 1139 0 27 39 0 0 10 22 3304 1139 0 35 40 0 0 7 46 3304 1404 0 35 40 0 0 5 62 3304 1404 0 35 41 0 0 3 74 3304 1404 0 35 41 0 0 2 82 3304 1404 0 35 41 0 0 2 87 3304 1404 0 35 41 0 0 1 91 3304 1404 0 35 41 0 0 1 94 3304 1404 0 35 41 0 0 0 96 3304 1404 0 35 41 0 0 0 97 3304 1404 0 35 42 0 0 0 98 3304 1404 0 35 42 0 0 0 99 3304 1404 0 35 42 0 0 0 99 3304 1404 0 35 42 0 0 0 99 3304 1404 0 35 42 0 0 0 99 3304 1404 0 35 42 0 0 0 100 3304 1404 0 35 42 0 0 0 100 3304 1404 0 35 42 0 0 0 100 3304 1404 0 35 42 0 0 0 100 3304 1404
nvidia-smi pmon
log:
# gpu pid type sm mem enc dec command # Idx # C/G % % % % name 0 1781 C 0 0 0 99 ffmpeg 0 1781 C 0 0 0 99 ffmpeg
Replying to cehoyos:
Please provide a backtrace.
This time, I managed to attach ffmpeg to gdb. This is backtrace:
(gdb) attach 1731 (gdb) bt
#0 0x00007f18fd106aed in poll () at ../sysdeps/unix/syscall-template.S:81 #1 0x00007f18f3bf7367 in ?? () from /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1 #2 0x00007f18f3bf9ae9 in ?? () from /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1 #3 0x00007f18f3be631e in ?? () from /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1 #4 0x00007f18f3beb5a7 in ?? () from /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1 #5 0x00007f18f3be97d4 in cuvidDestroyDecoder () from /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1 #6 0x000000000042496f in cuvid_decode_end () #7 0x000000000044b5dc in avcodec_close () #8 0x00000000004a2e56 in transcode () #9 0x0000000000483125 in main ()
by , 7 years ago
Attachment: | 6058-hangs-ffmpeg-dec100-1.log added |
---|
log of processing 8MB file, which hangs at cuvidDestroyDecoder()
comment:13 by , 7 years ago
Since upload.ffmpeg.org seems to be down, I uploaded the TS sample here:
https://services.breitenbachmedia.com/ffmpeg/6058-hangs-ffmpeg-dec100-1.ts
follow-up: 15 comment:14 by , 7 years ago
Replying to slomkowski:
#5 0x00007f18f3be97d4 in cuvidDestroyDecoder () from /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1
Do you agree that libnvcuvid.so.1 must never hang in cuvidDestroyDecoder(), no matter what we pass as arguments?
comment:15 by , 7 years ago
Replying to cehoyos:
Do you agree that libnvcuvid.so.1 must never hang in cuvidDestroyDecoder(), no matter what we pass as arguments?
I don't understand.
comment:16 by , 7 years ago
Thank YOU for finding this bug (epsecialy for testing data!), we had alotof crashes on our systems while decoding with cuvid and MPEG2!
I can reproduce this behavior on Nvidia GTX 960 (maxwell second gen GM206), but it works without any problems on GTX 1060, 1070 and 1080 (we have alotof nvidia cards).
What is very strange is that i can't reproduce this on Quadro M4000, which is also Maxwell but GM204. So my first opinion is that affected are only GM206 cards.
Could anyone test this sample on GTX970/980 with GM204?
Solution could be for now to switch to GTX 1050 which has also 2x faster encoder :)
By the way i think it could be in CUDA or in NVDEC engine so driver fix from NVIDIA is needed. It is not bug in ffmpeg itself.
comment:18 by , 7 years ago
I created relevant thread on NVidia devtalk forum, but I haven't fill the bug report yet.
comment:19 by , 7 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
Closing this issue since it is either a driver or a hardware issue. Please feel free to add information about possible fixes here.
I confirm that the problem happens also when one ffmpeg instance is running.
nvidia-smi
output:nvidia-smi dmon
output:nvidia-smi pmon
output (noticedec
is 0 here):