Opened 10 months ago

Last modified 10 months ago

#10668 open defect

cuvid regression creates jerky output

Reported by: Jason Dove
Owned by:
Priority: important
Component: avcodec
Version: git-master
Keywords: cuvid
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:

Decoding certain content with the h264_cuvid decoder produces jerky output.

How to reproduce:

% ffmpeg -c:v h264_cuvid -i input.mkv -c:v libx264 -y output.mkv
ffmpeg version N-112777-g08e97dae20 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 13.2.1 (GCC) 20230801
  configuration: --prefix=/usr --extra-cflags=-I/opt/cuda/include --extra-ldflags=-L/opt/cuda/lib64 --enable-lto --disable-rpath --enable-gpl --enable-version3 --enable-nonfree --enable-shared --disable-static --disable-stripping --disable-htmlpages --enable-gray --enable-alsa --enable-avisynth --enable-bzlib --enable-chromaprint --enable-frei0r --enable-gcrypt --enable-gmp --enable-gnutls --enable-iconv --enable-ladspa --enable-lcms2 --enable-libaom --enable-libaribb24 --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcelt --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libdavs2 --enable-libdc1394 --enable-libfdk-aac --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-libglslang --enable-libgme --enable-libgsm --enable-libiec61883 --enable-libilbc --enable-libjack --enable-libjxl --enable-libklvanc --enable-libkvazaar --enable-liblensfun --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --disable-libopencv --enable-libopenh264 --enable-libopenjpeg --enable-libopenmpt --enable-libopenvino --enable-libopus --enable-libplacebo --enable-libpulse --enable-librabbitmq --enable-librav1e --enable-librist --enable-librsvg --enable-librubberband --enable-librtmp --enable-libshine --enable-libsmbclient --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libsvthevc --enable-libsvtvp9 --disable-libtensorflow --enable-libtesseract --enable-libtheora --disable-libtls --enable-libtwolame --enable-libuavs3d --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxavs2 --enable-libxcb --enable-libxcb-shm --enable-libxcb-xfixes --enable-libxcb-shape --enable-libxvid --enable-libxml2 --enable-libzimg --enable-libzmq --enable-libzvbi 
--enable-lv2 --enable-lzma --enable-decklink --disable-mbedtls --enable-libmysofa --enable-openal --enable-opencl --enable-opengl --disable-openssl --disable-pocketsphinx --enable-sndio --enable-sdl2 --enable-vapoursynth --enable-vulkan --enable-xlib --enable-zlib --enable-amf --enable-cuda-nvcc --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-libdrm --enable-libvpl --enable-libnpp --enable-nvdec --enable-nvenc --enable-omx --enable-rkmpp --enable-v4l2-m2m --enable-vaapi --enable-vdpau
  libavutil      58. 32.100 / 58. 32.100
  libavcodec     60. 33.100 / 60. 33.100
  libavformat    60. 17.100 / 60. 17.100
  libavdevice    60.  4.100 / 60.  4.100
  libavfilter     9. 13.100 /  9. 13.100
  libswscale      7.  6.100 /  7.  6.100
  libswresample   4. 13.100 /  4. 13.100
  libpostproc    57.  4.100 / 57.  4.100
Input #0, matroska,webm, from 'input.mkv':
  Metadata:
    ENCODER         : Lavf60.17.100
  Duration: 00:00:20.25, start: 0.000000, bitrate: 5560 kb/s
  Stream #0:0: Video: h264 (Main), yuv420p(tv, bt709, progressive), 1918x814 [SAR 1:1 DAR 959:407], 23.98 fps, 23.98 tbr, 1k tbn (default)
    Metadata:
      DURATION        : 00:00:20.250000000
  Stream #0:1: Audio: aac (LC), 48000 Hz, stereo, fltp (default)
    Metadata:
      DURATION        : 00:00:20.031000000
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (h264_cuvid) -> h264 (libx264))
  Stream #0:1 -> #0:1 (aac (native) -> vorbis (libvorbis))
Press [q] to stop, [?] for help
[libx264 @ 0x55640793d640] using SAR=1/1
[libx264 @ 0x55640793d640] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x55640793d640] profile High, level 4.0, 4:2:0, 8-bit
[libx264 @ 0x55640793d640] 264 - core 164 r3108 31e19f9 - H.264/MPEG-4 AVC codec - Copyleft 2003-2023 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=18 lookahead_threads=3 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=23 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, matroska, to 'output.mkv':
  Metadata:
    encoder         : Lavf60.17.100
  Stream #0:0: Video: h264 (H264 / 0x34363248), nv12(tv, bt709, progressive), 1918x814 [SAR 1:1 DAR 959:407], q=2-31, 23.98 fps, 1k tbn (default)
    Metadata:
      DURATION        : 00:00:20.250000000
      encoder         : Lavc60.33.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
  Stream #0:1: Audio: vorbis (oV[0][0] / 0x566F), 48000 Hz, stereo, fltp (default)
    Metadata:
      DURATION        : 00:00:20.031000000
      encoder         : Lavc60.33.100 libvorbis
[out#0/matroska @ 0x556407921100] video:5317kB audio:191kB subtitle:0kB other streams:0kB global headers:4kB muxing overhead: 0.264545%
frame=  482 fps= 75 q=-1.0 Lsize=    5523kB time=00:00:20.02 bitrate=2259.5kbits/s speed=3.11x
[libx264 @ 0x55640793d640] frame I:15    Avg QP:18.37  size: 29181
[libx264 @ 0x55640793d640] frame P:307   Avg QP:20.27  size: 14622
[libx264 @ 0x55640793d640] frame B:160   Avg QP:17.28  size:  3236
[libx264 @ 0x55640793d640] consecutive B-frames: 53.5%  5.4%  3.7% 37.3%
[libx264 @ 0x55640793d640] mb I  I16..4: 43.4% 53.1%  3.5%
[libx264 @ 0x55640793d640] mb P  I16..4: 10.8% 20.1%  0.2%  P16..4: 39.2%  3.2%  3.1%  0.0%  0.0%    skip:23.3%
[libx264 @ 0x55640793d640] mb B  I16..4:  1.8%  2.0%  0.0%  B16..8: 13.9%  1.0%  0.1%  direct: 3.2%  skip:78.0%  L0:58.0% L1:41.1% BI: 0.9%
[libx264 @ 0x55640793d640] 8x8 transform intra:62.3% inter:93.8%
[libx264 @ 0x55640793d640] coded y,uvDC,uvAC intra: 27.7% 53.5% 3.0% inter: 10.0% 27.8% 0.0%
[libx264 @ 0x55640793d640] i16 v,h,dc,p: 29% 33% 17% 21%
[libx264 @ 0x55640793d640] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 22% 19% 46%  2%  2%  2%  3%  2%  2%
[libx264 @ 0x55640793d640] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 26% 27% 21%  3%  6%  5%  7%  3%  2%
[libx264 @ 0x55640793d640] i8c dc,h,v,p: 52% 25% 21%  2%
[libx264 @ 0x55640793d640] Weighted P-Frames: Y:9.1% UV:4.9%
[libx264 @ 0x55640793d640] ref P L0: 41.1%  4.1% 26.2% 26.1%  2.4%
[libx264 @ 0x55640793d640] ref B L0: 55.2% 26.6% 18.2%
[libx264 @ 0x55640793d640] ref B L1: 61.4% 38.6%
[libx264 @ 0x55640793d640] kb/s:2139.92
ffmpeg -c:v h264_cuvid -i input.mkv -c:v libx264 -y output.mkv  59.59s user 0.61s system 888% cpu 6.777 total

Attachments (1)

ffmpeg-report.zip (105.6 KB) - added by Jason Dove 10 months ago.
report output

Change History (9)

by Jason Dove, 10 months ago

Attachment: ffmpeg-report.zip added

report output

comment:1 by Jason Dove, 10 months ago

The sample was uploaded as cuvid-decoder-regression-sample.mkv.

comment:2 by Jason Dove, 10 months ago

I tested some builds from https://github.com/BtbN/FFmpeg-Builds/releases to find when the regression was introduced:

2023-05-31 N-110946-g859c34706d behaves correctly (output is smooth)
2023-06-30 N-111313-ge4d4d616ba does not behave correctly (output is jerky)

comment:4 by Balling, 10 months ago

Keywords: decoder nvidia removed
Status: new → open

Yep, very simple to reproduce with ffplay.exe -vcodec h264_cuvid C:\Users\ZAQU\Downloads\cuvid-decoder-regression-sample.mkv

Wrong reordering. This is indeed a regression.

comment:5 by Jason Dove, 10 months ago

I did a git bisect and the first bad commit is 402d98c9d467dff6931d906ebb732b9a00334e0b.

I also confirmed that master with libavcodec/cuviddec.c at dc7bd7c5a5ad5ea800dfb63cc5dd15670d065527 works properly, so I at least have a workaround for now.

comment:6 by Balling, 10 months ago

Funny. That commit was derived from a fix for another bug, #8948, but it did not fix that bug, not to mention that cuvid is not affected by it. So no wonder it broke other stuff.

comment:7 by Balling, 10 months ago

Duplicate of #10409; the workaround is -surfaces 10.

comment:8 by Roman Arzumanyan, 10 months ago

Hello,

402d98c9d467dff6931d906ebb732b9a00334e0b merely changes the default value of the nb_surfaces variable and lets the user adjust it via extra_hw_frames (to avoid using the deprecated option and to unify cuvid behaviour with nvdec in this respect):

fifo_size_inc = ctx->nb_surfaces;
ctx->nb_surfaces = FFMAX(ctx->nb_surfaces, format->min_num_decode_surfaces + 3);

if (avctx->extra_hw_frames > 0)
    ctx->nb_surfaces += avctx->extra_hw_frames;

fifo_size_inc = ctx->nb_surfaces - fifo_size_inc;
if (fifo_size_inc > 0 && av_fifo_grow2(ctx->frame_queue, fifo_size_inc) < 0) {
    av_log(avctx, AV_LOG_ERROR, "Failed to grow frame queue on video sequence callback\n");
    ctx->internal_error = AVERROR(ENOMEM);
    return 0;
}

So it can be easily fixed by reverting the default nb_surfaces value:

{ "surfaces", "Maximum surfaces to be used for decoding", OFFSET(nb_surfaces), AV_OPT_TYPE_INT, { .i64 = -1 }, 25, INT_MAX, VD | AV_OPT_FLAG_DEPRECATED }

But there are two caveats:
1) The Video Codec SDK appears to have a bug that makes it return an invalid min_num_decode_surfaces value.
2) vRAM consumption increases hugely. Many video sequences require just 6-7 surfaces in the nvdec pool instead of 25.
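
To put the second caveat in rough numbers, here is a small sketch that estimates the memory held by a surface pool for the 1918x814 sample. The helper names are hypothetical, and the estimate assumes plain NV12 surfaces (12 bits per pixel) with no pitch alignment or driver overhead, so it is a lower bound rather than what nvdec actually allocates:

```c
#include <stddef.h>

/*
 * Hypothetical helper (not an FFmpeg or SDK function): bytes needed for one
 * NV12 surface, i.e. a full-resolution luma plane plus a half-height
 * interleaved chroma plane. Pitch alignment and driver-side overhead are
 * ignored, so real allocations will be larger.
 */
static size_t nv12_surface_bytes(size_t width, size_t height)
{
    return width * height * 3 / 2;
}

/* Hypothetical helper: lower-bound size of a pool of nb_surfaces surfaces. */
static size_t pool_bytes(size_t width, size_t height, size_t nb_surfaces)
{
    return nv12_surface_bytes(width, height) * nb_surfaces;
}
```

Under these assumptions, pool_bytes(1918, 814, 25) is 58546950 bytes (about 58.5 MB), while pool_bytes(1918, 814, 7) is 16393146 bytes (about 16.4 MB).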

Unfortunately, given point 1, there appears to be no reliable way so far to determine the actual minimum number of surfaces required for decoding.
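
For illustration only, the surface-count logic from the snippet quoted earlier in this comment can be modelled as pure functions. The function names are hypothetical and not part of FFmpeg. Treating an unset "surfaces" option as -1 is an assumption, and the value 4 used for min_num_decode_surfaces in the note after the block is made up rather than taken from the SDK:

```c
#include <assert.h>

/* Same definition as FFmpeg's FFMAX macro. */
#define FFMAX(a, b) ((a) > (b) ? (a) : (b))

/*
 * Hypothetical helper: resulting decode pool size.
 *   nb_surfaces             value of the "surfaces" option (-1 assumed to mean unset)
 *   min_num_decode_surfaces value reported for the stream by the SDK
 *   extra_hw_frames         avctx->extra_hw_frames; values <= 0 are ignored
 */
static int cuvid_pool_size(int nb_surfaces, int min_num_decode_surfaces,
                           int extra_hw_frames)
{
    assert(min_num_decode_surfaces >= 0);

    int n = FFMAX(nb_surfaces, min_num_decode_surfaces + 3);
    if (extra_hw_frames > 0)
        n += extra_hw_frames;
    return n;
}

/*
 * Hypothetical helper: number of slots the frame queue has to grow by,
 * mirroring fifo_size_inc in the quoted code (0 when no growth is needed).
 */
static int cuvid_fifo_growth(int nb_surfaces, int min_num_decode_surfaces,
                             int extra_hw_frames)
{
    int inc = cuvid_pool_size(nb_surfaces, min_num_decode_surfaces,
                              extra_hw_frames) - nb_surfaces;
    return inc > 0 ? inc : 0;
}
```

With these assumptions, cuvid_pool_size(-1, 4, 0) gives 7 and cuvid_pool_size(10, 4, 0) gives 10. The pool size follows the reported minimum whenever the option is left unset, so an under-reported minimum carries straight through to the pool, while an explicit -surfaces value acts as a floor.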

Version 5, edited 10 months ago by Roman Arzumanyan