Opened 3 years ago

Last modified 3 years ago

#9285 open defect

Excessive GPU memory usage with nvdec hwaccel

Reported by: Ridley Combs
Owned by: Timo R.
Priority: normal
Component: avcodec
Version: unspecified
Keywords: nvdec nvidia
Cc:
Blocked By:
Blocking:
Reproduced by developer: yes
Analyzed by developer: yes

Description

When decoding video using the CUDA hwaccel, ff_nvdec_decode_init() sets both ulNumDecodeSurfaces and ulNumOutputSurfaces to frames_ctx->initial_pool_size. That value is first set by ff_nvdec_frame_params() to dpb_size + 2, then has 3 added by ff_decode_get_hw_frames_ctx(), and finally has extra_hw_frames + thread_count added by avcodec_get_hw_frames_parameters().
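To make the arithmetic concrete, here is a minimal sketch of how the additions described above stack up. pool_size_as_described() is a hypothetical helper written for this ticket, not an FFmpeg function, and it assumes frame threading is active so that thread_count is always added.

```c
/*
 * Illustrative only: pool_size_as_described() is a hypothetical helper,
 * not part of FFmpeg. It mirrors the chain of additions described in this
 * ticket and assumes frame threading is active.
 */
static int pool_size_as_described(int dpb_size, int extra_hw_frames,
                                  int thread_count)
{
    int pool = dpb_size + 2;      /* base value set by the nvdec code */

    pool += 3;                    /* added by ff_decode_get_hw_frames_ctx() */

    if (extra_hw_frames > 0)      /* user-requested extra output surfaces */
        pool += extra_hw_frames;

    pool += thread_count;         /* one extra surface per decoding thread */

    return pool;                  /* used for BOTH decode and output surfaces */
}
```

For example, with dpb_size = 16, no extra_hw_frames, and 16 threads, this gives 37, and that value is then used for both ulNumDecodeSurfaces and ulNumOutputSurfaces.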

This is excessive. Only ulNumDecodeSurfaces needs additional frames based on thread count (the output surfaces are only used in nvdec_retrieve_data, which runs on the consumer's single thread), while only ulNumOutputSurfaces needs the 3 additional output frames from ff_decode_get_hw_frames_ctx() or the ones from extra_hw_frames (the decode surfaces are never exposed to the consumer).
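One way to picture the split suggested above is the sketch below. The struct and function names are hypothetical, and keeping dpb_size + 2 as the base for both counts is an assumption made for illustration; the right base for the output surfaces may well be smaller.

```c
/*
 * Illustrative only: surface_counts and split_surface_counts() are
 * hypothetical names, not FFmpeg API. Assumes both counts start from the
 * same dpb_size + 2 base and that frame threading is active.
 */
struct surface_counts {
    int decode;   /* candidate value for ulNumDecodeSurfaces */
    int output;   /* candidate value for ulNumOutputSurfaces */
};

static struct surface_counts split_surface_counts(int dpb_size,
                                                  int extra_hw_frames,
                                                  int thread_count)
{
    struct surface_counts c;
    int base = dpb_size + 2;

    /* Only decode surfaces scale with the number of decoding threads. */
    c.decode = base + thread_count;

    /* Only output surfaces are exposed to the consumer, so only they get
     * the 3 generic extra frames and any user-requested extra_hw_frames. */
    c.output = base + 3 + (extra_hw_frames > 0 ? extra_hw_frames : 0);

    return c;
}
```

Under these assumptions, dpb_size = 16 with 16 threads and no extra_hw_frames yields 34 decode surfaces and 21 output surfaces, compared with 37 of each today.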

I'm not sure what the best way to handle this is. Maybe nvdec should ignore what the generic code sets initial_pool_size to altogether and instead calculate its buffer counts internally, duplicating the generic code's behavior only where appropriate? The initial_pool_size value seems to be designed for systems where the decoder's internal buffered frames are returned directly to the user, but that's not the case here.

Additionally, multithreading doesn't seem to serve any purpose with the CUDA hwaccel; I see no performance gain when using multiple threads versus one. Is it useful with any hardware decoder? Should we default multithreading to off when using a hwaccel, or force it off unless the hwaccel fails and software fallback occurs? As it stands, the default can result in some pretty hefty GPU memory usage for no reason on many-core machines.

Change History (1)

comment:1 by Balling, 3 years ago

Owner: set to Timo R.
Status: new → open

Can you comment?
