swresample can introduce significant audio distortion
|Reported by:||Gregory Beauregard||Owned by:|
|Blocking:||Reproduced by developer:||yes|
|Analyzed by developer:||no|
Description
UPDATE: see bottom, swresample is at fault
Summary of the bug: The loudnorm filter can badly goof the loudness measurement in certain situations (e.g. particular inaudible noise after a resample). This results in significant distortion when using it in dynamic mode, and in wrong values when using it to get measurements.
ffmpeg used is git master as of 2021-07-29.
Using the reproducer file loudnorm_samp.mkv (4.7 MB, 48 kHz DTS-HD) at https://stream.gably.net/images/loudnorm_samp.mkv, analyze the loudness with loudnorm to get an integrated loudness of about -21, "input_i" : "-21.57" (output of this command attached):
ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo,loudnorm=print_format=json -f null -
However, when we reanalyze it by giving the resampler a dither method and specifying a 48 kHz output rate, the loudnorm filter goofs the integrated loudness measurement and reports "input_i" : "0.13":
ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000,loudnorm=print_format=json -f null -
This results in a badly wrong re-normalization and therefore significant distortion. If we run ebur128 to analyze the loudness instead of loudnorm in the situation where it goofs, ebur128 outputs the expected -22.2 integrated loudness, and the value doesn't change between the two situations above:
ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000,ebur128 -f null -
So it seems like there's some odd sensitivity in the loudnorm filter to certain inaudible noise that's messing it up where ebur128 is still ok.
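The two loudnorm runs above can also be compared from a script. This is only a minimal Python sketch: it assumes ffmpeg is on PATH and that the JSON report from loudnorm is the last {...} block on stderr. The helper names (parse_loudnorm_json, measure_input_i) are made up for illustration and are not part of ffmpeg.

```python
import json
import subprocess


def parse_loudnorm_json(stderr_text):
    """Return the loudnorm JSON report found in ffmpeg's stderr as a dict.

    Assumption: the report is flat (no nested braces) and is the last
    {...} block in the text.
    """
    start = stderr_text.rfind("{")
    end = stderr_text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no loudnorm JSON block found")
    return json.loads(stderr_text[start:end + 1])


def measure_input_i(path, resample_opts):
    """Run aresample=<resample_opts>,loudnorm and return input_i as a float."""
    cmd = [
        "ffmpeg", "-hide_banner", "-nostdin", "-i", path,
        "-af", f"aresample={resample_opts},loudnorm=print_format=json",
        "-f", "null", "-",
    ]
    proc = subprocess.run(cmd, stderr=subprocess.PIPE, text=True,
                          errors="replace", check=True)
    return float(parse_loudnorm_json(proc.stderr)["input_i"])
```

Calling measure_input_i("loudnorm_samp.mkv", "ocl=stereo") and measure_input_i("loudnorm_samp.mkv", "ocl=stereo:dither_method=shibata:osr=48000") runs the two commands from this report and returns the two input_i values for comparison.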
Update: The wrong measurement can be reproduced with ebur128 alone by comparing the following two runs, which are similar to the above, with and without inserting ,aformat=r=192000. This resamples to 192 kHz, as the loudnorm filter does unconditionally internally (but ebur128 does not). I've attached the outputs of both commands:
ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000,aformat=r=192000,ebur128 -f null - > 192ebur.txt 2>&1
ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000,ebur128 -f null - > 48ebur.txt 2>&1
In the 192 kHz resample case the ebur128 filter wrongly measures the integrated loudness as I: -0.1 LUFS, similar to the wrong loudnorm measurement, but the 48 kHz case measures -22.2 as you'd expect. My understanding is that the two filters share some measurement code, so presumably ffmpeg's measurement code is broken at 192 kHz.
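To compare the two log files without reading them by eye, the integrated loudness can be pulled out of the ebur128 summary. This is a rough Python sketch that assumes the log ends with a "Summary:" section containing a line of the form "I: <value> LUFS"; the function names are hypothetical.

```python
import re

# Matches the summary line, e.g. "    I:         -22.2 LUFS".
_SUMMARY_I = re.compile(r"^\s*I:\s+(-?\d+(?:\.\d+)?) LUFS", re.MULTILINE)


def integrated_lufs(log_text):
    """Return the integrated loudness from the last ebur128 summary in a log."""
    summary_pos = log_text.rfind("Summary:")
    if summary_pos == -1:
        raise ValueError("no ebur128 summary found")
    # Search only after "Summary:" so per-frame "I:" readings are skipped.
    match = _SUMMARY_I.search(log_text, summary_pos)
    if match is None:
        raise ValueError("no integrated loudness line found")
    return float(match.group(1))


def loudness_gap(log_path_a, log_path_b):
    """Return the difference in integrated loudness (a minus b) in LU."""
    values = []
    for path in (log_path_a, log_path_b):
        with open(path, encoding="utf-8", errors="replace") as handle:
            values.append(integrated_lufs(handle.read()))
    return values[0] - values[1]
```

With the logs written by the two commands above, loudness_gap("192ebur.txt", "48ebur.txt") gives the gap between the 192 kHz and 48 kHz measurements.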
The internal 192 kHz resample in loudnorm hid the real culprit here: swresample itself significantly distorting the audio.
Listen to the sample with just the format conversions to see it has been significantly distorted:
ffplay -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000,aformat=r=192000
The audio is not distorted if we don't add aformat=r=192000 to the above; however, we can also reproduce the significant distortion with just one resample if we specify the s32 sample format:
ffplay -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000:osf=s32
resampler=soxr does not fix the distortion.
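As a rough objective check alongside listening, the RMS level of the decoded audio can be compared between a clean and a distorted filter chain. The Python sketch below is an illustration only, not something taken from this ticket's attachments: it assumes ffmpeg is on PATH, decodes to raw 32-bit float PCM on stdout, and uses hypothetical helper names (rms_dbfs, decode_f32).

```python
import array
import math
import subprocess
import sys


def rms_dbfs(samples):
    """Return the RMS level in dBFS of float samples (full scale = 1.0)."""
    if len(samples) == 0:
        raise ValueError("no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)


def decode_f32(path, filter_chain):
    """Decode the audio of `path` through `filter_chain` to float samples."""
    cmd = [
        "ffmpeg", "-hide_banner", "-nostdin", "-v", "error",
        "-i", path, "-vn", "-af", filter_chain,
        "-c:a", "pcm_f32le", "-f", "f32le", "-",
    ]
    raw = subprocess.run(cmd, stdout=subprocess.PIPE, check=True).stdout
    samples = array.array("f")
    samples.frombytes(raw[: len(raw) - len(raw) % 4])  # drop a partial sample
    if sys.byteorder == "big":
        samples.byteswap()  # the stream is little-endian
    return samples
```

For example, compare rms_dbfs(decode_f32("loudnorm_samp.mkv", "aresample=ocl=stereo:dither_method=shibata:osr=48000")) with the same call using the chain ending in :osf=s32. If the distortion is as severe as the loudness readings suggest, the two levels should presumably differ by a large margin.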
Change History (21)
comment:16, 2 years ago
|Component:||avfilter → swresample|
|Summary:||loudnorm filter goofs loudness measurements in some cases → swresample can introduce significant audio distortion|