Opened 10 months ago
Last modified 10 months ago
#10668 open defect
cuvid regression creates jerky output
Reported by: | Jason Dove | Owned by: | |
---|---|---|---|
Priority: | important | Component: | avcodec |
Version: | git-master | Keywords: | cuvid |
Cc: | | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no | | |
Description
Summary of the bug:
Using the h264_cuvid decoder with certain content will cause the output to be jerky.
How to reproduce:
    % ffmpeg -c:v h264_cuvid -i input.mkv -c:v libx264 -y output.mkv
    ffmpeg version N-112777-g08e97dae20 Copyright (c) 2000-2023 the FFmpeg developers
      built with gcc 13.2.1 (GCC) 20230801
      configuration: --prefix=/usr --extra-cflags=-I/opt/cuda/include --extra-ldflags=-L/opt/cuda/lib64 --enable-lto --disable-rpath --enable-gpl --enable-version3 --enable-nonfree --enable-shared --disable-static --disable-stripping --disable-htmlpages --enable-gray --enable-alsa --enable-avisynth --enable-bzlib --enable-chromaprint --enable-frei0r --enable-gcrypt --enable-gmp --enable-gnutls --enable-iconv --enable-ladspa --enable-lcms2 --enable-libaom --enable-libaribb24 --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcelt --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libdavs2 --enable-libdc1394 --enable-libfdk-aac --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-libglslang --enable-libgme --enable-libgsm --enable-libiec61883 --enable-libilbc --enable-libjack --enable-libjxl --enable-libklvanc --enable-libkvazaar --enable-liblensfun --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --disable-libopencv --enable-libopenh264 --enable-libopenjpeg --enable-libopenmpt --enable-libopenvino --enable-libopus --enable-libplacebo --enable-libpulse --enable-librabbitmq --enable-librav1e --enable-librist --enable-librsvg --enable-librubberband --enable-librtmp --enable-libshine --enable-libsmbclient --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libsvthevc --enable-libsvtvp9 --disable-libtensorflow --enable-libtesseract --enable-libtheora --disable-libtls --enable-libtwolame --enable-libuavs3d --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxavs2 --enable-libxcb --enable-libxcb-shm --enable-libxcb-xfixes --enable-libxcb-shape --enable-libxvid --enable-libxml2 --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-lzma --enable-decklink --disable-mbedtls --enable-libmysofa --enable-openal --enable-opencl --enable-opengl --disable-openssl --disable-pocketsphinx --enable-sndio --enable-sdl2 --enable-vapoursynth --enable-vulkan --enable-xlib --enable-zlib --enable-amf --enable-cuda-nvcc --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-libdrm --enable-libvpl --enable-libnpp --enable-nvdec --enable-nvenc --enable-omx --enable-rkmpp --enable-v4l2-m2m --enable-vaapi --enable-vdpau
      libavutil 58. 32.100 / 58. 32.100
      libavcodec 60. 33.100 / 60. 33.100
      libavformat 60. 17.100 / 60. 17.100
      libavdevice 60. 4.100 / 60. 4.100
      libavfilter 9. 13.100 / 9. 13.100
      libswscale 7. 6.100 / 7. 6.100
      libswresample 4. 13.100 / 4. 13.100
      libpostproc 57. 4.100 / 57. 4.100
    Input #0, matroska,webm, from 'input.mkv':
      Metadata:
        ENCODER : Lavf60.17.100
      Duration: 00:00:20.25, start: 0.000000, bitrate: 5560 kb/s
      Stream #0:0: Video: h264 (Main), yuv420p(tv, bt709, progressive), 1918x814 [SAR 1:1 DAR 959:407], 23.98 fps, 23.98 tbr, 1k tbn (default)
        Metadata:
          DURATION : 00:00:20.250000000
      Stream #0:1: Audio: aac (LC), 48000 Hz, stereo, fltp (default)
        Metadata:
          DURATION : 00:00:20.031000000
    Stream mapping:
      Stream #0:0 -> #0:0 (h264 (h264_cuvid) -> h264 (libx264))
      Stream #0:1 -> #0:1 (aac (native) -> vorbis (libvorbis))
    Press [q] to stop, [?] for help
    [libx264 @ 0x55640793d640] using SAR=1/1
    [libx264 @ 0x55640793d640] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
    [libx264 @ 0x55640793d640] profile High, level 4.0, 4:2:0, 8-bit
    [libx264 @ 0x55640793d640] 264 - core 164 r3108 31e19f9 - H.264/MPEG-4 AVC codec - Copyleft 2003-2023 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=18 lookahead_threads=3 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=23 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
    Output #0, matroska, to 'output.mkv':
      Metadata:
        encoder : Lavf60.17.100
      Stream #0:0: Video: h264 (H264 / 0x34363248), nv12(tv, bt709, progressive), 1918x814 [SAR 1:1 DAR 959:407], q=2-31, 23.98 fps, 1k tbn (default)
        Metadata:
          DURATION : 00:00:20.250000000
          encoder : Lavc60.33.100 libx264
        Side data:
          cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
      Stream #0:1: Audio: vorbis (oV[0][0] / 0x566F), 48000 Hz, stereo, fltp (default)
        Metadata:
          DURATION : 00:00:20.031000000
          encoder : Lavc60.33.100 libvorbis
    [out#0/matroska @ 0x556407921100] video:5317kB audio:191kB subtitle:0kB other streams:0kB global headers:4kB muxing overhead: 0.264545%
    frame= 482 fps= 75 q=-1.0 Lsize= 5523kB time=00:00:20.02 bitrate=2259.5kbits/s speed=3.11x
    [libx264 @ 0x55640793d640] frame I:15 Avg QP:18.37 size: 29181
    [libx264 @ 0x55640793d640] frame P:307 Avg QP:20.27 size: 14622
    [libx264 @ 0x55640793d640] frame B:160 Avg QP:17.28 size: 3236
    [libx264 @ 0x55640793d640] consecutive B-frames: 53.5% 5.4% 3.7% 37.3%
    [libx264 @ 0x55640793d640] mb I I16..4: 43.4% 53.1% 3.5%
    [libx264 @ 0x55640793d640] mb P I16..4: 10.8% 20.1% 0.2% P16..4: 39.2% 3.2% 3.1% 0.0% 0.0% skip:23.3%
    [libx264 @ 0x55640793d640] mb B I16..4: 1.8% 2.0% 0.0% B16..8: 13.9% 1.0% 0.1% direct: 3.2% skip:78.0% L0:58.0% L1:41.1% BI: 0.9%
    [libx264 @ 0x55640793d640] 8x8 transform intra:62.3% inter:93.8%
    [libx264 @ 0x55640793d640] coded y,uvDC,uvAC intra: 27.7% 53.5% 3.0% inter: 10.0% 27.8% 0.0%
    [libx264 @ 0x55640793d640] i16 v,h,dc,p: 29% 33% 17% 21%
    [libx264 @ 0x55640793d640] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 22% 19% 46% 2% 2% 2% 3% 2% 2%
    [libx264 @ 0x55640793d640] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 26% 27% 21% 3% 6% 5% 7% 3% 2%
    [libx264 @ 0x55640793d640] i8c dc,h,v,p: 52% 25% 21% 2%
    [libx264 @ 0x55640793d640] Weighted P-Frames: Y:9.1% UV:4.9%
    [libx264 @ 0x55640793d640] ref P L0: 41.1% 4.1% 26.2% 26.1% 2.4%
    [libx264 @ 0x55640793d640] ref B L0: 55.2% 26.6% 18.2%
    [libx264 @ 0x55640793d640] ref B L1: 61.4% 38.6%
    [libx264 @ 0x55640793d640] kb/s:2139.92
    ffmpeg -c:v h264_cuvid -i input.mkv -c:v libx264 -y output.mkv 59.59s user 0.61s system 888% cpu 6.777 total
Attachments (1)
Change History (9)
by , 10 months ago
Attachment: | ffmpeg-report.zip added |
---|
comment:2 by , 10 months ago
I tested some builds from https://github.com/BtbN/FFmpeg-Builds/releases to narrow down when the regression was introduced:
2023-05-31 N-110946-g859c34706d behaves correctly (output is smooth)
2023-06-30 N-111313-ge4d4d616ba does not behave correctly (output is jerky)
comment:4 by , 10 months ago
Keywords: | decoder nvidia removed |
---|---|
Status: | new → open |
Yep, very simple to reproduce with ffplay.exe -vcodec h264_cuvid C:\Users\ZAQU\Downloads\cuvid-decoder-regression-sample.mkv
The frames are reordered incorrectly. Indeed, a regression.
comment:5 by , 10 months ago
I did a git bisect and the first bad commit is 402d98c9d467dff6931d906ebb732b9a00334e0b.
I also confirmed that master with libavcodec/cuviddec.c at dc7bd7c5a5ad5ea800dfb63cc5dd15670d065527 works properly, so I at least have a workaround for now.
comment:6 by , 10 months ago
It is funny. That commit was derived from a fix for another bug, #8948, but it did not fix that bug, and cuvid was not even affected by it. So no wonder it broke other stuff.
comment:8 by , 10 months ago
Hello,
402d98c9d467dff6931d906ebb732b9a00334e0b merely changes the default value of the nb_surfaces variable and allows the user to set it via extra_hw_frames (to avoid using the deprecated option and to unify cuvid behaviour with nvdec in this respect):
    fifo_size_inc = ctx->nb_surfaces;
    ctx->nb_surfaces = FFMAX(ctx->nb_surfaces, format->min_num_decode_surfaces + 3);

    if (avctx->extra_hw_frames > 0)
        ctx->nb_surfaces += avctx->extra_hw_frames;

    fifo_size_inc = ctx->nb_surfaces - fifo_size_inc;
    if (fifo_size_inc > 0 && av_fifo_grow2(ctx->frame_queue, fifo_size_inc) < 0) {
        av_log(avctx, AV_LOG_ERROR, "Failed to grow frame queue on video sequence callback\n");
        ctx->internal_error = AVERROR(ENOMEM);
        return 0;
    }
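To make the arithmetic in that snippet easier to follow, here is a minimal standalone sketch of the same calculation. It is not the cuviddec.c code: `SurfacePlan`, `plan_surfaces` and `max_int` are hypothetical names introduced only for illustration, and the starting values used in the note below are assumptions rather than values taken from the decoder.

```c
#include <assert.h>

/* Hypothetical stand-in for the two values the snippet updates;
 * this is not the real decoder context. */
typedef struct SurfacePlan {
    int nb_surfaces;    /* surfaces to allocate after the update */
    int fifo_size_inc;  /* slots the frame queue would have to grow by */
} SurfacePlan;

static int max_int(int a, int b)
{
    return a > b ? a : b;
}

/*
 * Mirrors the arithmetic of the quoted snippet:
 *   current_surfaces        - nb_surfaces before the sequence callback
 *   min_num_decode_surfaces - minimum reported for the stream
 *   extra_hw_frames         - user-requested extra frames (<= 0 means unset)
 */
static SurfacePlan plan_surfaces(int current_surfaces,
                                 int min_num_decode_surfaces,
                                 int extra_hw_frames)
{
    SurfacePlan plan;

    assert(min_num_decode_surfaces >= 0);

    /* Never go below the reported minimum plus a margin of 3. */
    plan.nb_surfaces = max_int(current_surfaces, min_num_decode_surfaces + 3);

    /* A positive extra_hw_frames is added on top. */
    if (extra_hw_frames > 0)
        plan.nb_surfaces += extra_hw_frames;

    /* The queue only needs to grow by the difference. */
    plan.fifo_size_inc = plan.nb_surfaces - current_surfaces;
    return plan;
}
```

With an assumed starting value of 25 and a reported minimum of 4, `plan_surfaces(25, 4, 0)` keeps 25 surfaces and grows nothing, so an under-reported minimum goes unnoticed. Starting from 0 instead, the same call yields only 7 surfaces, which is where an insufficient min_num_decode_surfaces starts to matter.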
So it can be easily fixed by reverting the default nb_surfaces value:
    { "surfaces", "Maximum surfaces to be used for decoding", OFFSET(nb_surfaces), AV_OPT_TYPE_INT, { .i64 = -1 }, 25, INT_MAX, VD | AV_OPT_FLAG_DEPRECATED }
But there are 2 caveats:
1) This looks like a bug in the Video Codec SDK, which returns an insufficient min_num_decode_surfaces value.
2) A huge increase in VRAM consumption. Many video sequences require just 6-7 surfaces in the nvdec pool instead of 25.
Unfortunately, given pt. 1, it looks like there is so far no reliable way to determine the actual minimum number of surfaces required for decoding.
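To put a rough number on the second caveat, the sketch below estimates the memory taken by a pool of tightly packed NV12 surfaces. It is only a lower bound under stated assumptions: the helper names `nv12_frame_bytes` and `pool_bytes` are hypothetical, and a real driver allocation will be larger because of pitch alignment and decoder-internal padding, which are ignored here.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes for one tightly packed 8-bit NV12 frame: a full-size luma plane
 * plus one interleaved chroma plane at half resolution in each direction.
 * Alignment and padding are deliberately ignored (assumption). */
static uint64_t nv12_frame_bytes(uint32_t width, uint32_t height)
{
    assert(width > 0 && height > 0);

    uint64_t luma   = (uint64_t)width * height;
    uint64_t chroma = 2 * (uint64_t)((width + 1) / 2) * ((height + 1) / 2);
    return luma + chroma;
}

/* Lower-bound size of a pool holding nb_surfaces such frames. */
static uint64_t pool_bytes(uint32_t width, uint32_t height, unsigned nb_surfaces)
{
    return nv12_frame_bytes(width, height) * nb_surfaces;
}
```

For the 1918x814 sample from this ticket, one surface comes to 2,341,878 bytes by this estimate, so 25 surfaces take about 58.5 MB against about 16.4 MB for 7. The gap scales with resolution, so it becomes far more noticeable for higher-resolution streams.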
report output