Opened 3 years ago
Last modified 3 years ago
#9352 new defect
swresample can introduce significant audio distortion
Reported by: | Gregory Beauregard | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | swresample |
Version: | git-master | Keywords: | |
Cc: | | Blocked By: | |
Blocking: | | Reproduced by developer: | yes |
Analyzed by developer: | no | | |
Description
UPDATE: see bottom, swresample is at fault
Summary of the bug: The loudnorm filter can badly mismeasure loudness in certain situations (e.g. with particular inaudible noise after a resample), resulting in significant distortion when it is used in dynamic mode or to obtain measurements.
ffmpeg used is git master as of 2021-07-29.
Using the reproducer file at https://stream.gably.net/images/loudnorm_samp.mkv (loudnorm_samp.mkv, 4.7 MB, 48 kHz DTS-HD), analyze the loudness with loudnorm to get an integrated loudness of about -21, "input_i" : "-21.57" (output of this command attached):
ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo,loudnorm=print_format=json -f null -
However, when we reanalyze it with a dither set on the resampler and 48 kHz specified as the output rate, the loudnorm filter wrongly measures the integrated loudness as 0.13, "input_i" : "0.13":
ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000,loudnorm=print_format=json -f null -
This results in badly wrong re-normalization and significant distortion. If we run ebur128 instead of loudnorm to analyze the loudness in the situation where it goofs, ebur128 outputs the expected -22.2 integrated loudness, and the result doesn't change between the two situations above:
ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000,ebur128 -f null -
So it seems like there's some odd sensitivity in the loudnorm filter to certain inaudible noise that's messing it up where ebur128 is still ok.
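The two loudnorm measurements above can be compared mechanically rather than by eye. The following is a minimal sketch, assuming each run's output was redirected to a text file; the function names and log file names are hypothetical and not part of ffmpeg.

```shell
#!/bin/sh
# Sketch only: helper names and log file names are hypothetical.
#
# Assumed way of producing the logs (commands taken from this report):
#   ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo,loudnorm=print_format=json \
#       -f null - > plain.txt 2>&1
#   ffmpeg -i loudnorm_samp.mkv \
#       -af aresample=ocl=stereo:dither_method=shibata:osr=48000,loudnorm=print_format=json \
#       -f null - > dither.txt 2>&1

# Print the "input_i" value found in a saved loudnorm JSON log.
extract_input_i() {
    sed -n 's/.*"input_i"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | tail -n 1
}

# Exit 0 when two loudness values differ by more than a tolerance
# (third argument, default 1.0 LU), exit 1 otherwise.
loudness_differs() {
    awk -v a="$1" -v b="$2" -v tol="${3:-1.0}" 'BEGIN {
        d = a - b
        if (d < 0) d = -d
        exit !(d > tol)
    }'
}
```

With the logs in place, `loudness_differs "$(extract_input_i plain.txt)" "$(extract_input_i dither.txt)"` would succeed for the values reported here (-21.57 versus 0.13).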
Update: You can compare the following two ebur128 filter runs, similar to the above, with and without an inserted ,aformat=r=192000 (resampling to 192 kHz, as the loudnorm filter does unconditionally internally but ebur128 does not). I've attached the outputs as 192ebur.txt and 48ebur.txt.
ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000,aformat=r=192000,ebur128 -f null - > 192ebur.txt 2>&1
ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000,ebur128 -f null - > 48ebur.txt 2>&1
In the 192 kHz resample the ebur128 filter wrongly measures the integrated loudness as I: -0.1 LUFS, similar to the wrong loudnorm measurement, but the 48 kHz case measures -22.2 as you'd expect. My understanding is the two filters share some measurement code, so presumably ffmpeg's measurement code is broken at 192 kHz.
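To read the final integrated loudness out of 192ebur.txt and 48ebur.txt, something like the sketch below may help. It assumes the summary at the end of the ebur128 log contains a line of the form "I: <value> LUFS" (as quoted in this report) and that the summary is the last such match in the file; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: extract_ebur128_i is a hypothetical helper, not part of ffmpeg.

# Print the last "I: <value> LUFS" reading found in a saved ebur128 log.
# Per-frame lines also carry an "I:" field, so only the final match
# (assumed to be the summary) is kept.
extract_ebur128_i() {
    sed -n 's/.*I:[[:space:]]*\(-\{0,1\}[0-9][0-9.]*\)[[:space:]]*LUFS.*/\1/p' "$1" | tail -n 1
}
```

Running `extract_ebur128_i 48ebur.txt` and `extract_ebur128_i 192ebur.txt` side by side should make the discrepancy between the two sample rates easy to see.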
UPDATE 2:
The internal 192 kHz resample in loudnorm hid the real culprit here: swresample itself significantly distorting the audio.
Listen to the sample with just the format conversions to see it has been significantly distorted:
ffplay -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000,aformat=r=192000
The audio is not distorted if we don't add the aformat=r=192000 to the above; however, we can also reproduce the significant distortion with just one resample if we specify the s32 sample format:
ffplay -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000:osf=s32
Setting resampler=soxr does not fix the distortion.
Attachments (3)
Change History (21)
comment:1 by , 3 years ago
Description: | modified (diff) |
---|
comment:2 by , 3 years ago
Description: | modified (diff) |
---|
comment:3 by , 3 years ago
Description: | modified (diff) |
---|
comment:4 by , 3 years ago
Description: | modified (diff) |
---|
comment:5 by , 3 years ago
Description: | modified (diff) |
---|
comment:6 by , 3 years ago
Description: | modified (diff) |
---|---|
Version: | unspecified → git-master |
by , 3 years ago
Attachment: | loudnorm.txt added |
---|
comment:7 by , 3 years ago
Description: | modified (diff) |
---|
comment:8 by , 3 years ago
Description: | modified (diff) |
---|
comment:9 by , 3 years ago
Keywords: | loudnorm added |
---|
comment:10 by , 3 years ago
Description: | modified (diff) |
---|
comment:11 by , 3 years ago
Description: | modified (diff) |
---|
comment:12 by , 3 years ago
Keywords: | ebur128 added |
---|
comment:13 by , 3 years ago
Keywords: | ebur128 removed |
---|
Note that the test with ebur128 filter with different sample rates requires ffmpeg with commit 274112c88d89d839a27c0766f558f065f9eee0d7 - prior to that it used fixed filter coefficients and only worked at 48kHz.
Given the difference between the 48kHz and 192kHz calculations, my guess is that one of the following things might be the problem?
- The calculation of the K-weighting filter coefficients at 192kHz is incorrect, or
- The resampling of audio from 48kHz to 192kHz is increasing the power of inaudible high frequency signals (which contribute towards the loudness calculation - it doesn't use a lowpass)
The latter doesn't *seem* likely, but shouldn't be hard to rule out with a spectrogram just to be sure.
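The spectrogram check suggested above could be sketched as follows. This assumes an ffmpeg build that includes the showspectrumpic filter; the function name, the FFMPEG override variable, the image size, and the output file names are all hypothetical choices for illustration.

```shell
#!/bin/sh
# Sketch only: make_spectrogram and the FFMPEG variable are hypothetical.

# Allow the ffmpeg binary to be overridden; defaults to ffmpeg on PATH.
FFMPEG=${FFMPEG:-ffmpeg}

# Usage: make_spectrogram INPUT AUDIO_FILTER_CHAIN OUTPUT.png
# Runs the given audio filter chain and renders the result as one image.
make_spectrogram() {
    "$FFMPEG" -y -i "$1" \
        -filter_complex "[0:a]$2,showspectrumpic=s=1024x512[spec]" \
        -map "[spec]" -frames:v 1 "$3"
}

# Example calls (not executed here), using the chains from this report:
#   make_spectrogram loudnorm_samp.mkv \
#       "aresample=ocl=stereo:dither_method=shibata:osr=48000" spec48.png
#   make_spectrogram loudnorm_samp.mkv \
#       "aresample=ocl=stereo:dither_method=shibata:osr=48000,aformat=r=192000" spec192.png
```

Comparing the two images above 24 kHz should show whether the 192 kHz path carries unexpected high-frequency energy.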
comment:14 by , 3 years ago
Keywords: | ebur128 added |
---|
comment:15 by , 3 years ago
You are definitely resampling twice, which causes some clipped output with aresample=osr=48000:dither_method=shibata,aformat=r=192000
Anyway, dithering should always be the last step in processing, IIRC.
You can fix your command by using a float sample format as output, like this:
ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000:osf=flt,loudnorm=print_format=json -f null -
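One way to probe the clipping hypothesis is to append the astats filter to a chain and look at the reported peak level. The sketch below assumes the astats log contains "Peak level dB:" lines and that the last one is the overall figure; the helper name and log file names are hypothetical. A peak pinned at 0 dB in the integer chain would be consistent with clipping, though it would not prove it.

```shell
#!/bin/sh
# Sketch only: extract_peak_db is a hypothetical helper, not part of ffmpeg.
#
# Assumed way of producing the logs:
#   ffmpeg -i loudnorm_samp.mkv \
#       -af aresample=ocl=stereo:dither_method=shibata:osr=48000:osf=s32,astats \
#       -f null - > s32_astats.txt 2>&1
#   ffmpeg -i loudnorm_samp.mkv \
#       -af aresample=ocl=stereo:dither_method=shibata:osr=48000:osf=flt,astats \
#       -f null - > flt_astats.txt 2>&1

# Print the last "Peak level dB" value found in a saved astats log.
extract_peak_db() {
    sed -n 's/.*Peak level dB:[[:space:]]*\([-+0-9.a-z]*\).*/\1/p' "$1" | tail -n 1
}
```

Comparing `extract_peak_db s32_astats.txt` against `extract_peak_db flt_astats.txt` would show whether the float chain peaks above what the integer format can represent.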
comment:16 by , 3 years ago
Component: | avfilter → swresample |
---|---|
Description: | modified (diff) |
Keywords: | swresample added |
Summary: | loudnorm filter goofs loudness measurements in some cases → swresample can introduce significant audio distortion |
comment:17 by , 3 years ago
Description: | modified (diff) |
---|
comment:18 by , 3 years ago
Keywords: | loudnorm ebur128 swresample removed |
---|---|
Reproduced by developer: | set |
output of first command