Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#942 closed defect (invalid)

swresample leaks memory when sampling up

Reported by: beafdefx Owned by: michael
Priority: normal Component: swresample
Version: git-master Keywords:
Cc: Blocked By:
Blocking: Reproduced by developer: no
Analyzed by developer: no

Description

Version 0.10

Cross compiling for Android/ARM with C++, if that matters.

Decoding and resampling are great for me until I have swresample resample upward in bitrate. The memory usage is very obviously headed upward with no ceiling as I feed it data to resample. Everything goes back to normal when I free the resample context.

The leaking is very obvious when going from 44100 to 48000. It's still present when going from 44100 to 44200, but much slower. No problems downsampling.

Perhaps there could also be a problem if resampling to a larger greater channel config or sample format. Haven't tested that myself.

Attachments (2)

test.tar.gz (10.9 KB) - added by beafdefx 5 years ago.
Test C program and valgrind out
valgrind_requested.gz (3.2 KB) - added by beafdefx 5 years ago.
Requested valgrind output

Download all attachments as: .zip

Change History (17)

comment:1 Changed 5 years ago by cehoyos

  • Priority changed from important to normal

Please add valgrind output, it is needed for memleak reports.

comment:2 Changed 5 years ago by beafdefx

Sorry, I only work on Android right now -- we have no such useful tools. I'm just looking at the output of top and watching my process die over time.

comment:3 Changed 5 years ago by cehoyos

Can't you test your command on a x86 Linux box?

At least add a command line and complete, uncut console output, maybe somebody else will test.

comment:4 Changed 5 years ago by beafdefx

There is no command. On Android, all native code is compiled and loaded as shared libraries and invoked via JNI wrappers, so the code can not stand alone as an exe.

The code path is pretty simple. Allocate the resample context with the destination rate > source rate, feed it lots of samples, watch it use memory. Memory goes away when the context is freed. The resampling otherwise functions perfectly.

I can try to extract the relevant bits into a command line program and run it that way, but I won't have time to get to that for another few days.

Changed 5 years ago by beafdefx

Test C program and valgrind out

comment:5 Changed 5 years ago by beafdefx

OK, I wrote a simple console program that just uses swresample to resample random data and included some valgrind output in the tar.gz. The contents of the audio input buffer isn't important, it's the memory use with repeated calls to swr_convert that I'm concerned about.

There are two defines that you can play with to alter the way the program executes. OUTRATE_MULTIPLIER determines the sample rate of the output in relation to the input (input is locked at 16kHz). RESAMPLE_REPS determines how many times swr_convert is called repeatedly. When OUTRATE_MULTIPLIER is greater than 1, the program leaks. The bigger the value of RESAMPLE_REPS, the more it leaks.

DO_SWR_FREE determines if swr_free is called. I leave this OFF for the purpose of measuring memory leaking because swr_free cleans up all the excessive memory that was used during resampling. I'm interested in showing the excessive memory use during a long resampling session, for example, during extended playback of a song or stream.

The included valgrind output is for 100 reps with a 2x output sample rate mulitpler compared to 1000 reps of the same:

For 100 reps:

==45151== HEAP SUMMARY:
==45151== in use at exit: 4,294,524 bytes in 42 blocks
==45151== total heap usage: 51 allocs, 9 frees, 8,230,844 bytes allocated

For 1000 reps:

==45133== HEAP SUMMARY:
==45133== in use at exit: 32,966,524 bytes in 42 blocks
==45133== total heap usage: 54 allocs, 12 frees, 65,478,940 bytes allocated

This kind of excess usage doesn't make sense. Also, this excess DOES NOT OCCUR if OUTRATE_MULITIPLER is 1 or less. I would hope that resampling reuses existing buffers if possible, but it appears that is not the case.

Hope this helps.

comment:6 follow-up: Changed 5 years ago by cehoyos

Please rerun valgrind with "--leak-check=full --show-reachable=yes"

Changed 5 years ago by beafdefx

Requested valgrind output

comment:7 in reply to: ↑ 6 Changed 5 years ago by beafdefx

Replying to cehoyos:

Please rerun valgrind with "--leak-check=full --show-reachable=yes"

Attached

comment:8 Changed 5 years ago by reimar

  • Resolution set to invalid
  • Status changed from new to closed

You tell it to process 16000 samples and give it an output buffer of 16000.
If you upsample, that input of 16000 will obviously give more than 16000 samples after resampling, what do you expect to happen to those?
Your "memleak" is the resampler taking care of all those samples you pushed in but never gave it a chance to return back out.

Last edited 5 years ago by reimar (previous) (diff)

comment:9 Changed 5 years ago by beafdefx

I don't think I understand. The output buffer is bigger than the input buffer by exactly the ratio of the input sample rate to the output sample rate, and I'm telling the resampler I have exactly room for 16000 samples in each buffer. The return value from swr_convert is exactly 16000, so I assume is is processing every sample in and placing every resampled sample out. Am I perhaps misunderstanding the API?

Like I mentioned before, the resampling is working perfectly in my Android code, empirically speaking. There is nothing wrong with the output. The internal buffers inside swresample just appear to be growing out of control when the output sample rate is greater than the input sample rate.

comment:10 Changed 5 years ago by reimar

The output buffer is bigger than the input buffer

Yes, it is, but you don't tell FFmpeg that. You pass it exactly the same value twice, i.e. you tell it the output buffer has the same size as the input buffer.

comment:11 Changed 5 years ago by beafdefx

OK, I'm looking at the documentation for swr_convert:

http://www.ffmpeg.org/doxygen/trunk/swresample_8c.html#81af226d8969df314222218c56396f6a

Parameters:

s: allocated Swr context, with parameters set
out: output buffers, only the first one need be set in case of packed audio
out_count: amount of space available for output in samples per channel
in: input buffers, only the first one need to be set in case of packed audio
in_count: number of input samples available in one channel

The counts are documented as "samples per channel". If the counts are supposed to be buffer sizes, then the documentation is incorrect or I am misunderstanding the documentation. If the values are supposed to be measured in samples per channel, then I would assume passing 16000 for both values is correct since I have allocated exactly room for 16000 samples in both the input buffer and the output buffer, each according to their requested sampling rate.

comment:12 Changed 5 years ago by reimar

I don't understand where your problem in understanding lies.
If you resample from 16000 samples per second to 32000 samples per second, then your input of 16000 samples (i.e. one second) will become 32000 samples (since it is still one second of audio).
If your output buffer has only space available for 16000 samples that leaves another 16000 of those resampled 32000 samples that the library keeps buffered internally.
out_count is of course _not_ the "number of input samples that would expand to a size that exactly fills this output buffer" because then everyone getting the rounding wrong (the ratio will not always be a nice 1:2) would end up with a buffer overflow. If you have a better way of clarifying this: patch welcome. It seemed clear enough to me, but that doesn't mean much.

comment:13 Changed 5 years ago by beafdefx

I know where I went wrong. I was not realizing here that the number of samples changes during a resample. For some reason I was thinking that the width of a sample changes, which is not going to be the case unless the sample format changes as specified in the context. So that was just my misunderstanding.

But you touched on an auxiliary issue about rounding. If I have arbitrarily large amount of audio to resample over time, and the ratio is not a nice multiple, what is the best way to avoid accumulating extra bits in the internal swresample buffer without having to push extra non-samples at the end of one of my buffers? And, for that matter, I don't think this behavior of implicit buffering made perfectly clear to callers of the API so we can knowingly avoid or compensate.

By the way, thanks for hanging with me on this one. swresample definitely meets some specific needs of mine and dealing with memory well is absolutely crucial on limited mobile devices.

comment:14 Changed 5 years ago by reimar

I am not sure what kind of solution you are thinking of exactly, and I haven't tested ant of these.
Most of the time, just making the buffer a bit larger so it can take a few extra samples when necessary should work I guess.
An alternative if that's no good is to call the resample function with 0 size input buffer.
If you absolutely always want to have a completely filled fixed-size output buffer you'd probably have to keep track of the rounding error and do the 0-size input buffer exactly at the right point.
However that is complex and very brittle code.
Another option would be to add a function to libswresample that will return the number of buffered samples.

comment:15 Changed 5 years ago by beafdefx

In my particular case, I have allocated a large buffer (whatever FFmpeg says is the max audio buffer size, 192000 I think) to contain converted and resampled PCM and reuse that across the decoding session. It gets allocated in Java and passed into native code which gets written into directly by FFmpeg. I generally don't worry about overflows because I only process one packet at a time and bubble that PCM back up to the Java layer for output. So I guess I can just tell the resampler to just use the entire contents of that buffer after the current AVPacket has been decoded into an AVFrame.

As far as clarity for developers, I think your idea of providing a function to obtain the number of samples held in swr's internal buffers is a great idea. It not only makes it clear that there is internal buffering, but it also makes it easy to tell at runtime if buffering is going on without having to drain the buffer.

Note: See TracTickets for help on using tickets.