Opened 6 months ago

Last modified 5 months ago

#10668 open defect

cuvid regression creates jerky output

Reported by: Jason Dove
Owned by:
Priority: important
Component: avcodec
Version: git-master
Keywords: cuvid
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:

Decoding certain content with the h264_cuvid decoder produces jerky output.

How to reproduce:

% ffmpeg -c:v h264_cuvid -i input.mkv -c:v libx264 -y output.mkv
ffmpeg version N-112777-g08e97dae20 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 13.2.1 (GCC) 20230801
  configuration: --prefix=/usr --extra-cflags=-I/opt/cuda/include --extra-ldflags=-L/opt/cuda/lib64 --enable-lto --disable-rpath --enable-gpl --enable-version3 --enable-nonfree --enable-shared --disable-static --disable-stripping --disable-htmlpages --enable-gray --enable-alsa --enable-avisynth --enable-bzlib --enable-chromaprint --enable-frei0r --enable-gcrypt --enable-gmp --enable-gnutls --enable-iconv --enable-ladspa --enable-lcms2 --enable-libaom --enable-libaribb24 --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcelt --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libdavs2 --enable-libdc1394 --enable-libfdk-aac --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-libglslang --enable-libgme --enable-libgsm --enable-libiec61883 --enable-libilbc --enable-libjack --enable-libjxl --enable-libklvanc --enable-libkvazaar --enable-liblensfun --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --disable-libopencv --enable-libopenh264 --enable-libopenjpeg --enable-libopenmpt --enable-libopenvino --enable-libopus --enable-libplacebo --enable-libpulse --enable-librabbitmq --enable-librav1e --enable-librist --enable-librsvg --enable-librubberband --enable-librtmp --enable-libshine --enable-libsmbclient --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libsvthevc --enable-libsvtvp9 --disable-libtensorflow --enable-libtesseract --enable-libtheora --disable-libtls --enable-libtwolame --enable-libuavs3d --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxavs2 --enable-libxcb --enable-libxcb-shm --enable-libxcb-xfixes --enable-libxcb-shape --enable-libxvid --enable-libxml2 --enable-libzimg --enable-libzmq --enable-libzvbi 
--enable-lv2 --enable-lzma --enable-decklink --disable-mbedtls --enable-libmysofa --enable-openal --enable-opencl --enable-opengl --disable-openssl --disable-pocketsphinx --enable-sndio --enable-sdl2 --enable-vapoursynth --enable-vulkan --enable-xlib --enable-zlib --enable-amf --enable-cuda-nvcc --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-libdrm --enable-libvpl --enable-libnpp --enable-nvdec --enable-nvenc --enable-omx --enable-rkmpp --enable-v4l2-m2m --enable-vaapi --enable-vdpau
  libavutil      58. 32.100 / 58. 32.100
  libavcodec     60. 33.100 / 60. 33.100
  libavformat    60. 17.100 / 60. 17.100
  libavdevice    60.  4.100 / 60.  4.100
  libavfilter     9. 13.100 /  9. 13.100
  libswscale      7.  6.100 /  7.  6.100
  libswresample   4. 13.100 /  4. 13.100
  libpostproc    57.  4.100 / 57.  4.100
Input #0, matroska,webm, from 'input.mkv':
  Metadata:
    ENCODER         : Lavf60.17.100
  Duration: 00:00:20.25, start: 0.000000, bitrate: 5560 kb/s
  Stream #0:0: Video: h264 (Main), yuv420p(tv, bt709, progressive), 1918x814 [SAR 1:1 DAR 959:407], 23.98 fps, 23.98 tbr, 1k tbn (default)
    Metadata:
      DURATION        : 00:00:20.250000000
  Stream #0:1: Audio: aac (LC), 48000 Hz, stereo, fltp (default)
    Metadata:
      DURATION        : 00:00:20.031000000
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (h264_cuvid) -> h264 (libx264))
  Stream #0:1 -> #0:1 (aac (native) -> vorbis (libvorbis))
Press [q] to stop, [?] for help
[libx264 @ 0x55640793d640] using SAR=1/1
[libx264 @ 0x55640793d640] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x55640793d640] profile High, level 4.0, 4:2:0, 8-bit
[libx264 @ 0x55640793d640] 264 - core 164 r3108 31e19f9 - H.264/MPEG-4 AVC codec - Copyleft 2003-2023 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=18 lookahead_threads=3 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=23 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, matroska, to 'output.mkv':
  Metadata:
    encoder         : Lavf60.17.100
  Stream #0:0: Video: h264 (H264 / 0x34363248), nv12(tv, bt709, progressive), 1918x814 [SAR 1:1 DAR 959:407], q=2-31, 23.98 fps, 1k tbn (default)
    Metadata:
      DURATION        : 00:00:20.250000000
      encoder         : Lavc60.33.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
  Stream #0:1: Audio: vorbis (oV[0][0] / 0x566F), 48000 Hz, stereo, fltp (default)
    Metadata:
      DURATION        : 00:00:20.031000000
      encoder         : Lavc60.33.100 libvorbis
[out#0/matroska @ 0x556407921100] video:5317kB audio:191kB subtitle:0kB other streams:0kB global headers:4kB muxing overhead: 0.264545%
frame=  482 fps= 75 q=-1.0 Lsize=    5523kB time=00:00:20.02 bitrate=2259.5kbits/s speed=3.11x
[libx264 @ 0x55640793d640] frame I:15    Avg QP:18.37  size: 29181
[libx264 @ 0x55640793d640] frame P:307   Avg QP:20.27  size: 14622
[libx264 @ 0x55640793d640] frame B:160   Avg QP:17.28  size:  3236
[libx264 @ 0x55640793d640] consecutive B-frames: 53.5%  5.4%  3.7% 37.3%
[libx264 @ 0x55640793d640] mb I  I16..4: 43.4% 53.1%  3.5%
[libx264 @ 0x55640793d640] mb P  I16..4: 10.8% 20.1%  0.2%  P16..4: 39.2%  3.2%  3.1%  0.0%  0.0%    skip:23.3%
[libx264 @ 0x55640793d640] mb B  I16..4:  1.8%  2.0%  0.0%  B16..8: 13.9%  1.0%  0.1%  direct: 3.2%  skip:78.0%  L0:58.0% L1:41.1% BI: 0.9%
[libx264 @ 0x55640793d640] 8x8 transform intra:62.3% inter:93.8%
[libx264 @ 0x55640793d640] coded y,uvDC,uvAC intra: 27.7% 53.5% 3.0% inter: 10.0% 27.8% 0.0%
[libx264 @ 0x55640793d640] i16 v,h,dc,p: 29% 33% 17% 21%
[libx264 @ 0x55640793d640] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 22% 19% 46%  2%  2%  2%  3%  2%  2%
[libx264 @ 0x55640793d640] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 26% 27% 21%  3%  6%  5%  7%  3%  2%
[libx264 @ 0x55640793d640] i8c dc,h,v,p: 52% 25% 21%  2%
[libx264 @ 0x55640793d640] Weighted P-Frames: Y:9.1% UV:4.9%
[libx264 @ 0x55640793d640] ref P L0: 41.1%  4.1% 26.2% 26.1%  2.4%
[libx264 @ 0x55640793d640] ref B L0: 55.2% 26.6% 18.2%
[libx264 @ 0x55640793d640] ref B L1: 61.4% 38.6%
[libx264 @ 0x55640793d640] kb/s:2139.92
ffmpeg -c:v h264_cuvid -i input.mkv -c:v libx264 -y output.mkv  59.59s user 0.61s system 888% cpu 6.777 total

Attachments (1)

ffmpeg-report.zip (105.6 KB ) - added by Jason Dove 6 months ago.
report output


Change History (9)

by Jason Dove, 6 months ago

Attachment: ffmpeg-report.zip added

report output

comment:1 by Jason Dove, 6 months ago

The sample was uploaded under the name cuvid-decoder-regression-sample.mkv

comment:2 by Jason Dove, 6 months ago

I tested some builds from https://github.com/BtbN/FFmpeg-Builds/releases to find when the regression was introduced:

2023-05-31 N-110946-g859c34706d behaves correctly (output is smooth)
2023-06-30 N-111313-ge4d4d616ba does not behave correctly (output is jerky)

comment:4 by Balling, 6 months ago

Keywords: decoder nvidia removed
Status: new → open

Yep, this is very simple to reproduce with ffplay.exe -vcodec h264_cuvid C:\Users\ZAQU\Downloads\cuvid-decoder-regression-sample.mkv

The frames are reordered incorrectly. It is indeed a regression.

comment:5 by Jason Dove, 5 months ago

I did a git bisect and the first bad commit is 402d98c9d467dff6931d906ebb732b9a00334e0b.

I also confirmed that master with libavcodec/cuviddec.c at dc7bd7c5a5ad5ea800dfb63cc5dd15670d065527 works properly, so I at least have a workaround for now.

comment:6 by Balling, 5 months ago

Funny: that commit was derived from a fix for another bug, #8948, but it did not fix that bug, and cuvid is not affected by it in any case. So it is no wonder that it broke other things.

comment:7 by Balling, 5 months ago

Duplicate of #10409; the workaround is -surfaces 10

comment:8 by Roman Arzumanyan, 5 months ago

Hello,

402d98c9d467dff6931d906ebb732b9a00334e0b merely changes the default value of the nb_surfaces variable and allows the user to set it via extra_hw_frames (to avoid using the deprecated option and to unify cuvid behaviour with nvdec in this respect):

fifo_size_inc = ctx->nb_surfaces;
ctx->nb_surfaces = FFMAX(ctx->nb_surfaces, format->min_num_decode_surfaces + 3);

if (avctx->extra_hw_frames > 0)
    ctx->nb_surfaces += avctx->extra_hw_frames;

fifo_size_inc = ctx->nb_surfaces - fifo_size_inc;
if (fifo_size_inc > 0 && av_fifo_grow2(ctx->frame_queue, fifo_size_inc) < 0) {
    av_log(avctx, AV_LOG_ERROR, "Failed to grow frame queue on video sequence callback\n");
    ctx->internal_error = AVERROR(ENOMEM);
    return 0;
}

So it can be easily fixed by reverting the default nb_surfaces value:

{ "surfaces", "Maximum surfaces to be used for decoding", OFFSET(nb_surfaces), AV_OPT_TYPE_INT, { .i64 = -1 }, 25, INT_MAX, VD | AV_OPT_FLAG_DEPRECATED }

But there are two caveats:
1) This looks like a bug in the Video Codec SDK, which returns an insufficient min_num_decode_surfaces value.
2) A huge increase in vRAM consumption. Many video sequences require just 6-7 surfaces in the nvdec pool instead of 25.
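
To put caveat 2 in rough numbers, here is a back-of-the-envelope sketch. It assumes NV12 surfaces (12 bits per pixel, the format shown in the log output above) and ignores pitch alignment and any driver-side overhead, so real allocations will be larger. The helper names nv12_surface_bytes and pool_bytes are hypothetical and are not part of FFmpeg or the Video Codec SDK.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Approximate size of one NV12 surface: a full-resolution luma plane
 * plus an interleaved chroma plane at half resolution in each direction.
 * Assumption: no pitch/height alignment, which real decoders do apply.
 */
static size_t nv12_surface_bytes(size_t width, size_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    return width * height * 3 / 2;
}

/* Approximate memory held by a pool of nb_surfaces such surfaces. */
static size_t pool_bytes(size_t width, size_t height, size_t nb_surfaces)
{
    return nv12_surface_bytes(width, height) * nb_surfaces;
}
```

Under these assumptions, a 1920x1080 stream needs about 77.8 MB for 25 surfaces versus about 21.8 MB for 7, which is the kind of gap the caveat describes.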

Unfortunately, given point 1, there is so far no reliable way to determine the actual minimum number of surfaces required for decoding.
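
For readers following along, the surface-count logic quoted above can be sketched as a standalone function. This is only an illustration of the arithmetic, not the code in cuviddec.c: effective_surfaces and max_int are hypothetical names, and the min_num_decode_surfaces value of 4 used in the note below is a made-up figure, not one reported for the sample file.

```c
#include <assert.h>

static int max_int(int a, int b) { return a > b ? a : b; }

/*
 * configured:          value of the "surfaces" option
 *                      (-1 = default after the commit, 25 = old default)
 * min_decode_surfaces: value the driver reports as min_num_decode_surfaces
 * extra_hw_frames:     AVCodecContext.extra_hw_frames (<= 0 means unset)
 */
static int effective_surfaces(int configured, int min_decode_surfaces,
                              int extra_hw_frames)
{
    assert(min_decode_surfaces >= 0);

    /* Never go below the driver-reported minimum plus a margin of 3. */
    int n = max_int(configured, min_decode_surfaces + 3);

    /* User-requested extra frames are added on top. */
    if (extra_hw_frames > 0)
        n += extra_hw_frames;

    return n;
}
```

With a hypothetical driver-reported minimum of 4, the new default yields 7 surfaces, the old default of 25 stays at 25, and extra_hw_frames = 3 on top of the new default yields 10. If the reported minimum is too low for a given stream, as point 1 suggests, the new default ends up with too few surfaces, while an explicit -surfaces 10 or the old default of 25 masks the problem.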

Last edited 5 months ago by Roman Arzumanyan