Opened 21 months ago
#10263 new defect
Regression: Huge increase in FFmpeg QSV memory usage + OOMs
Reported by: | eero-t | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | undetermined |
Version: | unspecified | Keywords: | |
Cc: | | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no | | |
Description
Summary of the bug:
The regression appeared between the following FFmpeg commits:
- 2023-02-02: 7d49fef8b4 lavc/vaapi_encode: fix propagating durations and opaques
- 2023-02-03: 9a820ec8b1 ffmpeg: add video heartbeat capability to fix_sub_duration
Since then, H.264 transcoding with downscaling and FPS conversion consumes enormous amounts of memory, and FFmpeg gets OOM-killed regardless of how much free memory the host has:
[11930.611964] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=user.slice,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/session-c3.scope,task=ffmpeg,pid=19970,uid=1000
[11930.612006] Out of memory: Killed process 19970 (ffmpeg) total-vm:9158180kB, anon-rss:7277512kB, file-rss:128kB, shmem-rss:0kB, UID:1000 pgtables:17128kB oom_score_adj:0
Earlier it worked fine even on a machine with only 8GB of RAM, but now it OOMs on a machine with 32GB, after using all of that RAM...
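To compare memory figures across runs or machines, the kernel's OOM-kill line can be parsed mechanically. The following is a minimal Python sketch, not part of the original report: the function name `parse_oom_kill` is hypothetical, and the regular expressions assume the message layout shown in the kernel log above. For the quoted line, the anonymous RSS works out to roughly 6.9 GiB.

```python
import re

# Kernel message copied from the log quoted in this ticket.
OOM_LINE = (
    "Out of memory: Killed process 19970 (ffmpeg) total-vm:9158180kB, "
    "anon-rss:7277512kB, file-rss:128kB, shmem-rss:0kB, UID:1000 "
    "pgtables:17128kB oom_score_adj:0"
)


def parse_oom_kill(line):
    """Return pid, process name and the kB-valued fields of an OOM-kill line.

    Returns None when the line does not look like an OOM-kill message.
    """
    match = re.search(r"Killed process (\d+) \(([^)]+)\)", line)
    if match is None:
        return None
    # Collect every "name:<number>kB" pair, e.g. total-vm, anon-rss, pgtables.
    sizes_kb = {key: int(value) for key, value in re.findall(r"([\w-]+):(\d+)kB", line)}
    return {"pid": int(match.group(1)), "name": match.group(2), "kB": sizes_kb}


info = parse_oom_kill(OOM_LINE)
anon_gib = info["kB"]["anon-rss"] / 1024**2  # kB -> GiB
print(f"{info['name']} (pid {info['pid']}): {anon_gib:.1f} GiB anonymous RSS")
```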
How to reproduce:
% ffmpeg -loglevel verbose -vsync passthrough -fpsprobesize 300 -analyzeduration 500K -hwaccel qsv -hwaccel_output_format qsv -qsv_device /dev/dri/renderD128 -c:v h264_qsv -i 1280x720p_29.97_10mb_h264_cabac.264 -c:v h264_qsv -b:v 800K -vf scale_qsv=w=352:h=240,fps=15 -compression_level 4 -an -vframes 2400 -y output.h264
ffmpeg version N-110014-ga6e9d01f88 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 11 (Ubuntu 11.3.0-1ubuntu1~22.04)
  configuration: --prefix=/opt/install/ --enable-libmfx --enable-vaapi --enable-sdl2 --disable-libx265 --disable-libx264 --disable-libvpx --enable-libvorbis --enable-libopus --disable-libmp3lame --disable-libass --disable-sndio --enable-libfreetype --enable-gpl --disable-doc
...
Input #0, h264, from '1280x720p_29.97_10mb_h264_cabac.264':
  Duration: N/A, bitrate: N/A
  Stream #0:0: Video: h264 (High), 1 reference frame, yuv420p(tv, bt709, progressive, left), 1280x720 [SAR 1:1 DAR 16:9], 29.97 fps, 29.97 tbr, 1200k tbn
[h264_mp4toannexb @ 0x562553551580] The input looks like it is Annex B already
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (h264_qsv) -> h264 (h264_qsv))
Press [q] to stop, [?] for help
[AVHWDeviceContext @ 0x562553551440] VAAPI driver: Intel iHD driver for Intel(R) Gen Graphics - 23.1.4 (02d29220e).
[AVHWDeviceContext @ 0x562553551440] Driver not found in known nonstandard list, using standard behaviour.
[h264_qsv @ 0x562553539500] Decoder: output is video memory surface
[h264_qsv @ 0x562553539500] Use Intel(R) Media SDK to create MFX session, the required implementation version is 1.35
[AVHWDeviceContext @ 0x5625535bd300] VAAPI driver: Intel iHD driver for Intel(R) Gen Graphics - 23.1.4 (02d29220e).
[AVHWDeviceContext @ 0x5625535bd300] Driver not found in known nonstandard list, using standard behaviour.
[h264_qsv @ 0x562553539500] Decoder: output is video memory surface
[h264_qsv @ 0x562553539500] Use Intel(R) Media SDK to create MFX session, the required implementation version is 1.35
[graph 0 input from stream 0:0 @ 0x5625535fa900] w:1280 h:720 pixfmt:qsv tb:1/1200000 fr:30000/1001 sar:1/1
[AVHWDeviceContext @ 0x5625535f7f00] VAAPI driver: Intel iHD driver for Intel(R) Gen Graphics - 23.1.4 (02d29220e).
[AVHWDeviceContext @ 0x5625535f7f00] Driver not found in known nonstandard list, using standard behaviour.
[Parsed_scale_qsv_0 @ 0x5625535f9980] Use Intel(R) Media SDK to create MFX session, API version is 1.35, the required implementation version is 1.35
[Parsed_scale_qsv_0 @ 0x5625535f9980] VPP: input is video memory surface
[Parsed_scale_qsv_0 @ 0x5625535f9980] VPP: output is video memory surface
[Parsed_fps_1 @ 0x5625535fa000] fps=15/1
[Parsed_fps_1 @ 0x5625535fa000] Set first pts to 0
[h264_qsv @ 0x562553551dc0] Using input frames context (format qsv) with h264_qsv encoder.
[h264_qsv @ 0x562553551dc0] Encoder: input is video memory surface
[h264_qsv @ 0x562553551dc0] Use Intel(R) Media SDK to create MFX session, the required implementation version is 1.35
[h264_qsv @ 0x562553551dc0] Using the variable bitrate (VBR) ratecontrol method
[h264_qsv @ 0x562553551dc0] MFMode:2
[h264_qsv @ 0x562553551dc0] profile: avc high; level: 20
[h264_qsv @ 0x562553551dc0] GopPicSize: 256; GopRefDist: 3; GopOptFlag: closed; IdrInterval: 0
[h264_qsv @ 0x562553551dc0] TargetUsage: 4; RateControlMethod: VBR
[h264_qsv @ 0x562553551dc0] BufferSizeInKB: 300; InitialDelayInKB: 150; TargetKbps: 800; MaxKbps: 1200; BRCParamMultiplier: 1
[h264_qsv @ 0x562553551dc0] NumSlice: 1; NumRefFrame: 2
[h264_qsv @ 0x562553551dc0] RateDistortionOpt: OFF
[h264_qsv @ 0x562553551dc0] RecoveryPointSEI: OFF
[h264_qsv @ 0x562553551dc0] VDENC: OFF
[h264_qsv @ 0x562553551dc0] Entropy coding: CABAC; MaxDecFrameBuffering: 2
[h264_qsv @ 0x562553551dc0] NalHrdConformance: ON; SingleSeiNalUnit: ON; VuiVclHrdParameters: OFF VuiNalHrdParameters: ON
[h264_qsv @ 0x562553551dc0] FrameRateExtD: 1; FrameRateExtN: 15
[h264_qsv @ 0x562553551dc0] IntRefType: 0; IntRefCycleSize: 0; IntRefQPDelta: 0
[h264_qsv @ 0x562553551dc0] MaxFrameSize: 67584; MaxSliceSize: 0
[h264_qsv @ 0x562553551dc0] BitrateLimit: ON; MBBRC: OFF; ExtBRC: OFF
[h264_qsv @ 0x562553551dc0] Trellis: auto
[h264_qsv @ 0x562553551dc0] RepeatPPS: OFF; NumMbPerSlice: 0; LookAheadDS: 2x
[h264_qsv @ 0x562553551dc0] AdaptiveI: OFF; AdaptiveB: OFF; BRefType:off
[h264_qsv @ 0x562553551dc0] MinQPI: 0; MaxQPI: 0; MinQPP: 0; MaxQPP: 0; MinQPB: 0; MaxQPB: 0
[h264_qsv @ 0x562553551dc0] DisableDeblockingIdc: 0
[h264_qsv @ 0x562553551dc0] SkipFrame: no_skip
[h264_qsv @ 0x562553551dc0] PRefType: default
[h264_qsv @ 0x562553551dc0] TransformSkip: unknown
[h264_qsv @ 0x562553551dc0] IntRefCycleDist: 0
[h264_qsv @ 0x562553551dc0] LowDelayBRC: OFF
[h264_qsv @ 0x562553551dc0] MaxFrameSizeI: 0; MaxFrameSizeP: 0
[h264_qsv @ 0x562553551dc0] ScenarioInfo: 0
Output #0, h264, to 'output/0030_HD22_1.0.h264':
  Metadata:
    encoder : Lavf60.4.100
  Stream #0:0: Video: h264, 1 reference frame, qsv(progressive), 352x240 (0x0) [SAR 1:1 DAR 22:15], q=2-31, 800 kb/s, 15 fps, 15 tbn
    Metadata:
      encoder : Lavc60.6.101 h264_qsv
    Side data:
      cpb: bitrate max/min/avg: 0/0/800000 buffer size: 0 vbv_delay: N/A
frame= 0 fps=0.0 q=0.0 size= 0kB time=-577014:32:22.77 bitrate= -0.0kbits/s speed=N/A
frame= 361 fps=0.0 q=20.0 size= 2048kB time=00:00:24.00 bitrate= 699.1kbits/s speed=46.8x
frame= 689 fps=680 q=23.0 size= 4352kB time=00:00:45.86 bitrate= 777.3kbits/s speed=45.3x
frame= 1037 fps=685 q=23.0 size= 6656kB time=00:01:09.06 bitrate= 789.5kbits/s speed=45.6x
frame= 1396 fps=693 q=26.0 size= 8960kB time=00:01:33.00 bitrate= 789.3kbits/s speed=46.2x
frame= 1745 fps=694 q=20.0 size= 11008kB time=00:01:56.26 bitrate= 775.6kbits/s speed=46.2x
[in#0/h264 @ 0x562553530d40] EOF while reading input
[in#0/h264 @ 0x562553530d40] Terminating demuxer thread
[h264_qsv @ 0x562553539500] A decode call did not consume any data: expect more data at input (-10)
    Last message repeated 2 times
[out_0_0 @ 0x5625535fb540] 100 buffers queued in out_0_0, something may be wrong.
[out_0_0 @ 0x5625535fb540] 1000 buffers queued in out_0_0, something may be wrong.
[out_0_0 @ 0x5625535fb540] 10000 buffers queued in out_0_0, something may be wrong.
[out_0_0 @ 0x5625535fb540] 100000 buffers queued in out_0_0, something may be wrong.
[out_0_0 @ 0x5625535fb540] 1000000 buffers queued in out_0_0, something may be wrong.
[out_0_0 @ 0x5625535fb540] 10000000 buffers queued in out_0_0, something may be wrong.
(Errors at the end of output are probably due to its memory allocations getting denied before it is OOM-killed.)
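The "buffers queued" warnings at the end of the log grow tenfold each time, up to 10000000 queued buffers. When checking other builds for the same symptom, a saved log can be scanned for these warnings. A minimal Python sketch follows; the helper names `queued_buffer_counts` and `keeps_growing` are hypothetical, and the pattern assumes the exact warning wording shown in the log above.

```python
import re

# Three of the warnings from the log above, used as sample input.
SAMPLE_LOG = """\
[out_0_0 @ 0x5625535fb540] 100 buffers queued in out_0_0, something may be wrong.
[out_0_0 @ 0x5625535fb540] 1000 buffers queued in out_0_0, something may be wrong.
[out_0_0 @ 0x5625535fb540] 10000 buffers queued in out_0_0, something may be wrong.
"""


def queued_buffer_counts(log_text):
    """Return the buffer counts from every 'buffers queued' warning, in log order."""
    pattern = r"(\d+) buffers queued in \S+, something may be wrong"
    return [int(count) for count in re.findall(pattern, log_text)]


def keeps_growing(counts):
    """True when at least two warnings were seen and each count exceeds the previous one."""
    return len(counts) >= 2 and all(b > a for a, b in zip(counts, counts[1:]))


counts = queued_buffer_counts(SAMPLE_LOG)
print(counts, "growing" if keeps_growing(counts) else "not growing")
```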
This is an FFmpeg bug, because doing the same with the MSDK tool works fine:
sample_multi_transcode -i::h264 input/1280x720p_29.97_10mb_h264_cabac.264 -o::h264 output/0030_HD22_1.0.h264 -b 800 -u 4 -n 2400 -f 15 -w 352 -h 240 -FRC::PT -async 4 -hw
I'm not seeing this with FFmpeg VA-API, or with the other FFmpeg tests I'm running, so it's most likely related to using QSV to do both downscaling and FPS conversion.
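When re-running the reproducer, for example while bisecting between the two commits, it may help to cap the process's memory so that a bad build fails early instead of exhausting the host's RAM. Below is a minimal Python sketch under several assumptions: it is Linux-only, the helper name `run_with_memory_cap` and the 4 GiB cap are illustrative choices, and `RLIMIT_AS` limits virtual address space rather than resident memory, so a suitable cap for a working QSV pipeline (including driver mappings) would need to be found by experiment.

```python
import resource
import subprocess
import sys


def run_with_memory_cap(cmd, cap_bytes):
    """Run cmd with its virtual address space capped; return True if it exited with 0."""

    def apply_cap():
        # Runs in the child process just before exec. Only the soft limit is
        # changed, and it is never set above an existing hard limit.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        soft = cap_bytes if hard == resource.RLIM_INFINITY else min(cap_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    result = subprocess.run(cmd, preexec_fn=apply_cap)
    return result.returncode == 0


# Harmless demonstration: a trivial child process runs fine under a 4 GiB cap.
CAP_BYTES = 4 * 1024**3  # illustrative value, not taken from the ticket
print(run_with_memory_cap([sys.executable, "-c", "pass"], CAP_BYTES))
```

To use it with the reproducer, the `ffmpeg` command line from "How to reproduce" would be passed as the `cmd` list. A wrapper that exits with 0 on success and 1 on failure could then be driven by `git bisect run`, which treats exit status 0 as good and 1 to 127 (except 125) as bad.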