Opened 14 months ago

#10263 new defect

Regression: Huge increase in FFmpeg QSV memory usage + OOMs

Reported by: eero-t
Owned by:
Priority: normal
Component: undetermined
Version: unspecified
Keywords:
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:

Between the following FFmpeg commits:

  • 2023-02-02: 7d49fef8b4 lavc/vaapi_encode: fix propagating durations and opaques
  • 2023-02-03: 9a820ec8b1 ffmpeg: add video heartbeat capability to fix_sub_duration
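
The range can be narrowed further with git bisect. A minimal sketch, assuming the commands run inside an FFmpeg git checkout, and reading the first commit above as the last known good one and the second as the first known bad one; the helper name list_suspects is made up for illustration:

```shell
# Hypothetical helper: list the commits after GOOD up to and including BAD.
list_suspects() {
    git log --oneline "$1..$2"
}

# Inside an FFmpeg checkout, with the hashes from this report:
#   list_suspects 7d49fef8b4 9a820ec8b1
#   git bisect start 9a820ec8b1 7d49fef8b4    # bad revision first, then good
#   ... rebuild, run the reproducer, then "git bisect good" or "git bisect bad"
```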

H.264 transcoding with downscaling and FPS conversion started to take enormous amounts of memory, to the point that FFmpeg gets OOM-killed regardless of how much free memory the host has:

[11930.611964] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=user.slice,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/session-c3.scope,task=ffmpeg,pid=19970,uid=1000
[11930.612006] Out of memory: Killed process 19970 (ffmpeg) total-vm:9158180kB, anon-rss:7277512kB, file-rss:128kB, shmem-rss:0kB, UID:1000 pgtables:17128kB oom_score_adj:0

Earlier this worked fine even on a machine with only 8GB of RAM, but now it OOMs on a machine with 32GB, after using up all of that RAM...
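
To put the kernel's numbers in more familiar units, the anon-rss field can be converted from kB to GiB. A small sketch using sed and awk; the function name oom_rss_gib is hypothetical, and it assumes the kill line has the same layout as the one quoted above:

```shell
# Hypothetical helper: read kernel log lines on stdin and print the OOM
# victim's anonymous RSS in GiB (1 GiB = 1048576 kB).
# Lines without an "anon-rss:<number>kB" field are ignored.
oom_rss_gib() {
    sed -n 's/.*anon-rss:\([0-9][0-9]*\)kB.*/\1/p' |
        LC_ALL=C awk '{ printf "%.2f GiB\n", $1 / 1048576 }'
}
```

Fed the kill line quoted above (for example via dmesg | oom_rss_gib), this prints 6.94 GiB.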

How to reproduce:

% ffmpeg -loglevel verbose -vsync passthrough -fpsprobesize 300 -analyzeduration 500K -hwaccel qsv -hwaccel_output_format qsv -qsv_device /dev/dri/renderD128 -c:v h264_qsv -i 1280x720p_29.97_10mb_h264_cabac.264 -c:v h264_qsv -b:v 800K -vf scale_qsv=w=352:h=240,fps=15 -compression_level 4 -an -vframes 2400 -y output.h264
ffmpeg version N-110014-ga6e9d01f88 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 11 (Ubuntu 11.3.0-1ubuntu1~22.04)
  configuration: --prefix=/opt/install/ --enable-libmfx --enable-vaapi --enable-sdl2 --disable-libx265 --disable-libx264 --disable-libvpx --enable-libvorbis --enable-libopus --disable-libmp3lame --disable-libass --disable-sndio --enable-libfreetype --enable-gpl --disable-doc
...
Input #0, h264, from '1280x720p_29.97_10mb_h264_cabac.264':
  Duration: N/A, bitrate: N/A
  Stream #0:0: Video: h264 (High), 1 reference frame, yuv420p(tv, bt709, progressive, left), 1280x720 [SAR 1:1 DAR 16:9], 29.97 fps, 29.97 tbr, 1200k tbn
[h264_mp4toannexb @ 0x562553551580] The input looks like it is Annex B already
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (h264_qsv) -> h264 (h264_qsv))
Press [q] to stop, [?] for help
[AVHWDeviceContext @ 0x562553551440] VAAPI driver: Intel iHD driver for Intel(R) Gen Graphics - 23.1.4 (02d29220e).
[AVHWDeviceContext @ 0x562553551440] Driver not found in known nonstandard list, using standard behaviour.
[h264_qsv @ 0x562553539500] Decoder: output is video memory surface
[h264_qsv @ 0x562553539500] Use Intel(R) Media SDK to create MFX session, the required implementation version is 1.35
[AVHWDeviceContext @ 0x5625535bd300] VAAPI driver: Intel iHD driver for Intel(R) Gen Graphics - 23.1.4 (02d29220e).
[AVHWDeviceContext @ 0x5625535bd300] Driver not found in known nonstandard list, using standard behaviour.
[h264_qsv @ 0x562553539500] Decoder: output is video memory surface
[h264_qsv @ 0x562553539500] Use Intel(R) Media SDK to create MFX session, the required implementation version is 1.35
[graph 0 input from stream 0:0 @ 0x5625535fa900] w:1280 h:720 pixfmt:qsv tb:1/1200000 fr:30000/1001 sar:1/1
[AVHWDeviceContext @ 0x5625535f7f00] VAAPI driver: Intel iHD driver for Intel(R) Gen Graphics - 23.1.4 (02d29220e).
[AVHWDeviceContext @ 0x5625535f7f00] Driver not found in known nonstandard list, using standard behaviour.
[Parsed_scale_qsv_0 @ 0x5625535f9980] Use Intel(R) Media SDK to create MFX session, API version is 1.35, the required implementation version is 1.35
[Parsed_scale_qsv_0 @ 0x5625535f9980] VPP: input is video memory surface
[Parsed_scale_qsv_0 @ 0x5625535f9980] VPP: output is video memory surface
[Parsed_fps_1 @ 0x5625535fa000] fps=15/1
[Parsed_fps_1 @ 0x5625535fa000] Set first pts to 0
[h264_qsv @ 0x562553551dc0] Using input frames context (format qsv) with h264_qsv encoder.
[h264_qsv @ 0x562553551dc0] Encoder: input is video memory surface
[h264_qsv @ 0x562553551dc0] Use Intel(R) Media SDK to create MFX session, the required implementation version is 1.35
[h264_qsv @ 0x562553551dc0] Using the variable bitrate (VBR) ratecontrol method
[h264_qsv @ 0x562553551dc0] MFMode:2
[h264_qsv @ 0x562553551dc0] profile: avc high; level: 20
[h264_qsv @ 0x562553551dc0] GopPicSize: 256; GopRefDist: 3; GopOptFlag: closed; IdrInterval: 0
[h264_qsv @ 0x562553551dc0] TargetUsage: 4; RateControlMethod: VBR
[h264_qsv @ 0x562553551dc0] BufferSizeInKB: 300; InitialDelayInKB: 150; TargetKbps: 800; MaxKbps: 1200; BRCParamMultiplier: 1
[h264_qsv @ 0x562553551dc0] NumSlice: 1; NumRefFrame: 2
[h264_qsv @ 0x562553551dc0] RateDistortionOpt: OFF
[h264_qsv @ 0x562553551dc0] RecoveryPointSEI: OFF
[h264_qsv @ 0x562553551dc0] VDENC: OFF
[h264_qsv @ 0x562553551dc0] Entropy coding: CABAC; MaxDecFrameBuffering: 2
[h264_qsv @ 0x562553551dc0] NalHrdConformance: ON; SingleSeiNalUnit: ON; VuiVclHrdParameters: OFF VuiNalHrdParameters: ON
[h264_qsv @ 0x562553551dc0] FrameRateExtD: 1; FrameRateExtN: 15 
[h264_qsv @ 0x562553551dc0] IntRefType: 0; IntRefCycleSize: 0; IntRefQPDelta: 0
[h264_qsv @ 0x562553551dc0] MaxFrameSize: 67584; MaxSliceSize: 0
[h264_qsv @ 0x562553551dc0] BitrateLimit: ON; MBBRC: OFF; ExtBRC: OFF
[h264_qsv @ 0x562553551dc0] Trellis: auto
[h264_qsv @ 0x562553551dc0] RepeatPPS: OFF; NumMbPerSlice: 0; LookAheadDS: 2x
[h264_qsv @ 0x562553551dc0] AdaptiveI: OFF; AdaptiveB: OFF; BRefType:off
[h264_qsv @ 0x562553551dc0] MinQPI: 0; MaxQPI: 0; MinQPP: 0; MaxQPP: 0; MinQPB: 0; MaxQPB: 0
[h264_qsv @ 0x562553551dc0] DisableDeblockingIdc: 0 
[h264_qsv @ 0x562553551dc0] SkipFrame: no_skip
[h264_qsv @ 0x562553551dc0] PRefType: default
[h264_qsv @ 0x562553551dc0] TransformSkip: unknown 
[h264_qsv @ 0x562553551dc0] IntRefCycleDist: 0
[h264_qsv @ 0x562553551dc0] LowDelayBRC: OFF
[h264_qsv @ 0x562553551dc0] MaxFrameSizeI: 0; MaxFrameSizeP: 0
[h264_qsv @ 0x562553551dc0] ScenarioInfo: 0
Output #0, h264, to 'output/0030_HD22_1.0.h264':
  Metadata:
    encoder         : Lavf60.4.100
  Stream #0:0: Video: h264, 1 reference frame, qsv(progressive), 352x240 (0x0) [SAR 1:1 DAR 22:15], q=2-31, 800 kb/s, 15 fps, 15 tbn
    Metadata:
      encoder         : Lavc60.6.101 h264_qsv
    Side data:
      cpb: bitrate max/min/avg: 0/0/800000 buffer size: 0 vbv_delay: N/A
frame=    0 fps=0.0 q=0.0 size=       0kB time=-577014:32:22.77 bitrate=  -0.0kbits/s speed=N/A    
frame=  361 fps=0.0 q=20.0 size=    2048kB time=00:00:24.00 bitrate= 699.1kbits/s speed=46.8x    
frame=  689 fps=680 q=23.0 size=    4352kB time=00:00:45.86 bitrate= 777.3kbits/s speed=45.3x    
frame= 1037 fps=685 q=23.0 size=    6656kB time=00:01:09.06 bitrate= 789.5kbits/s speed=45.6x    
frame= 1396 fps=693 q=26.0 size=    8960kB time=00:01:33.00 bitrate= 789.3kbits/s speed=46.2x    
frame= 1745 fps=694 q=20.0 size=   11008kB time=00:01:56.26 bitrate= 775.6kbits/s speed=46.2x    
[in#0/h264 @ 0x562553530d40] EOF while reading input
[in#0/h264 @ 0x562553530d40] Terminating demuxer thread
[h264_qsv @ 0x562553539500] A decode call did not consume any data: expect more data at input (-10)
    Last message repeated 2 times
[out_0_0 @ 0x5625535fb540] 100 buffers queued in out_0_0, something may be wrong.
[out_0_0 @ 0x5625535fb540] 1000 buffers queued in out_0_0, something may be wrong.
[out_0_0 @ 0x5625535fb540] 10000 buffers queued in out_0_0, something may be wrong.
[out_0_0 @ 0x5625535fb540] 100000 buffers queued in out_0_0, something may be wrong.
[out_0_0 @ 0x5625535fb540] 1000000 buffers queued in out_0_0, something may be wrong.
[out_0_0 @ 0x5625535fb540] 10000000 buffers queued in out_0_0, something may be wrong.

(The errors at the end of the output are probably due to FFmpeg's memory allocations being denied before it is OOM-killed.)
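
Memory growth can also be watched while the reproducer runs, before the OOM killer steps in. A rough sketch for Linux, assuming /proc is mounted; the helper names status_rss_kb and watch_rss are hypothetical:

```shell
# Print the resident set size in kB from a /proc/<pid>/status style file.
status_rss_kb() {
    awk '$1 == "VmRSS:" { print $2 }' "$1" 2>/dev/null
}

# Hypothetical helper: print the RSS of process PID once per second
# for as long as the process exists.
watch_rss() {
    while kill -0 "$1" 2>/dev/null; do
        status_rss_kb "/proc/$1/status"
        sleep 1
    done
}
```

One way to use it is to start the ffmpeg command from "How to reproduce" in the background with & and then run watch_rss $! in the same shell.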

This is an FFmpeg bug, because doing the same with the MSDK tool works fine:

sample_multi_transcode -i::h264 input/1280x720p_29.97_10mb_h264_cabac.264 -o::h264 output/0030_HD22_1.0.h264 -b 800 -u 4 -n 2400 -f 15 -w 352 -h 240 -FRC::PT -async 4 -hw

I'm not seeing this with FFmpeg VA-API or with the other FFmpeg tests I'm running, so it is most likely related to using QSV for both downscaling and FPS conversion.
