Opened 6 months ago

Last modified 6 months ago

#9352 new defect

swresample can introduce significant audio distortion

Reported by: Gregory Beauregard
Owned by:
Priority: normal
Component: swresample
Version: git-master
Keywords: loudnorm, ebur128, swresample
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description (last modified by Gregory Beauregard)

UPDATE: see UPDATE 2 at the bottom; swresample is at fault.

Summary of the bug: The loudnorm filter can badly goof the loudness measurement in certain situations (e.g. particular inaudible noise after a resample). This results in significant distortion when the filter is used in dynamic mode, and in wrong values when it is used only to get measurements.

ffmpeg used is git master as of 2021-07-29.

Using the reproducer file at https://stream.gably.net/images/loudnorm_samp.mkv (loudnorm_samp.mkv, 4.7 MB, 48 kHz DTS-HD), analyze the loudness with loudnorm. It reports an integrated loudness of about -21, "input_i" : "-21.57" (the output of this command is attached as loudnorm.txt):

ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo,loudnorm=print_format=json -f null -

However, when we reanalyze it with a dither method set on the resampler and the output rate specified as 48 kHz, the loudnorm filter goofs the integrated loudness measurement and reports 0.13, "input_i" : "0.13":

ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000,loudnorm=print_format=json -f null -

This leads to a badly wrong re-normalization and, in turn, significant distortion. If we analyze the loudness with ebur128 instead of loudnorm in the case where loudnorm goofs, ebur128 outputs the expected -22.2 integrated loudness, and the value doesn't change between the two cases above:

ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000,ebur128 -f null -

So it seems like there's some odd sensitivity in the loudnorm filter to certain inaudible noise that's messing it up where ebur128 is still ok.
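To compare the two loudnorm runs without reading the logs by eye, the "input_i" field can be pulled out of each saved log. This is only a sketch: the function name loudnorm_input_i is made up for illustration, and the pattern assumes the field is printed exactly as quoted above ("input_i" : "-21.57").

```shell
# Hypothetical helper: print the "input_i" value found in a saved loudnorm log.
# Assumes the log contains a line of the form:  "input_i" : "-21.57",
# If the field appears more than once, the last occurrence is printed.
loudnorm_input_i() {
    sed -n 's/.*"input_i"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | tail -n 1
}

# Usage (ffmpeg writes the loudnorm JSON to stderr, so redirect it to a file):
#   ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo,loudnorm=print_format=json -f null - 2> loudnorm.txt
#   loudnorm_input_i loudnorm.txt
```

Running the helper on the logs of both commands gives two numbers that can be compared directly.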

Update: The following two ebur128 runs are the same as the one above, with and without an inserted ,aformat=r=192000, which resamples to 192 kHz as the loudnorm filter does unconditionally internally (but ebur128 does not). I've attached the outputs as 192ebur.txt and 48ebur.txt.

ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000,aformat=r=192000,ebur128 -f null - > 192ebur.txt 2>&1
ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000,ebur128 -f null - > 48ebur.txt 2>&1

In the 192 kHz run the ebur128 filter wrongly measures the integrated loudness as I: -0.1 LUFS, similar to the wrong loudnorm measurement, but the 48 kHz run measures -22.2 as you'd expect. My understanding is the two filters share some measurement code, so presumably ffmpeg's measurement code is broken at 192 kHz.
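The integrated loudness of the two attached logs can be read out with a small helper. This is a sketch under assumptions: the function name ebur128_integrated is hypothetical, and it assumes the ebur128 end-of-run summary contains a line whose first field is "I:", with the value as the second field.

```shell
# Hypothetical helper: print the integrated loudness from an ebur128 log.
# Assumes the summary has a line like "    I:         -22.2 LUFS".
# Per-frame lines begin with a "[Parsed_ebur128..." prefix, so their first
# field is not "I:" and they are skipped.
ebur128_integrated() {
    awk '$1 == "I:" { value = $2 } END { if (value != "") print value }' "$1"
}

# Usage with the attached logs:
#   ebur128_integrated 48ebur.txt
#   ebur128_integrated 192ebur.txt
```

Matching on the first field rather than on the text "I:" anywhere in the line keeps the per-frame running values out of the result.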

UPDATE 2:
The internal 192 kHz resample in loudnorm hid the real culprit here: swresample itself is significantly distorting the audio.

Listen to the sample with just the format conversions to see it has been significantly distorted:

ffplay -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000,aformat=r=192000

The audio is not distorted if we don't add the aformat=r=192000 to the above; however, we can also reproduce the significant distortion with just one resample if we specify s32 sample format:

ffplay -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000:osf=s32

Setting resampler=soxr does not fix the distortion.

Attachments (3)

loudnorm.txt (4.2 KB) - added by Gregory Beauregard 6 months ago.
output of first command
48ebur.txt (17.1 KB) - added by Gregory Beauregard 6 months ago.
48 kHz ebur filter run
192ebur.txt (17.1 KB) - added by Gregory Beauregard 6 months ago.
192 kHz ebur filter run


Change History (20)

comment:1 by Gregory Beauregard, 6 months ago

Description: modified (diff)

comment:2 by Gregory Beauregard, 6 months ago

Description: modified (diff)

comment:3 by Gregory Beauregard, 6 months ago

Description: modified (diff)

comment:4 by Gregory Beauregard, 6 months ago

Description: modified (diff)

comment:5 by Gregory Beauregard, 6 months ago

Description: modified (diff)

comment:6 by Gregory Beauregard, 6 months ago

Description: modified (diff)
Version: unspecified → git-master

by Gregory Beauregard, 6 months ago

Attachment: loudnorm.txt added

output of first command

comment:7 by Gregory Beauregard, 6 months ago

Description: modified (diff)

comment:8 by Gregory Beauregard, 6 months ago

Description: modified (diff)

comment:9 by Gregory Beauregard, 6 months ago

Keywords: loudnorm added

by Gregory Beauregard, 6 months ago

Attachment: 48ebur.txt added

48 kHz ebur filter run

by Gregory Beauregard, 6 months ago

Attachment: 192ebur.txt added

192 kHz ebur filter run

comment:10 by Gregory Beauregard, 6 months ago

Description: modified (diff)

comment:11 by Gregory Beauregard, 6 months ago

Description: modified (diff)

comment:12 by Gregory Beauregard, 6 months ago

Keywords: ebur128 added

comment:13 by kepstin, 6 months ago

Keywords: ebur128 removed

Note that the test with ebur128 filter with different sample rates requires ffmpeg with commit 274112c88d89d839a27c0766f558f065f9eee0d7 - prior to that it used fixed filter coefficients and only worked at 48kHz.

Given the difference between the 48kHz and 192kHz calculations, my guess is that one of the following might be the problem:

  • The calculation of the K-weighting filter coefficients at 192kHz is incorrect, or
  • The resampling of audio from 48kHz to 192kHz is increasing the power of inaudible high frequency signals (which contribute towards the loudness calculation - it doesn't use a lowpass)

The latter doesn't *seem* likely, but shouldn't be hard to rule out with a spectrogram just to be sure.
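One way to produce the spectrograms mentioned here is ffmpeg's showspectrumpic filter. The following is a sketch, not a tested recipe for this sample: the helper name spectro_graph, the image size, and the output file names are all assumptions made for illustration.

```shell
# Hypothetical helper: build a filtergraph that renders a spectrogram of the
# resampled audio. An optional first argument adds an extra stage (for
# example aformat=r=192000) after the first resample.
spectro_graph() {
    chain="aresample=ocl=stereo:dither_method=shibata:osr=48000"
    if [ -n "${1:-}" ]; then
        chain="$chain,$1"
    fi
    printf '%s,showspectrumpic=s=1024x512\n' "$chain"
}

# Usage (output file names are illustrative):
#   ffmpeg -i loudnorm_samp.mkv -lavfi "$(spectro_graph)" 48k.png
#   ffmpeg -i loudnorm_samp.mkv -lavfi "$(spectro_graph aformat=r=192000)" 192k.png
```

If the second hypothesis were true, the 192 kHz image would be expected to show noticeably more energy in the upper part of the frequency axis than the 48 kHz image can represent at all.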

comment:14 by kepstin, 6 months ago

Keywords: ebur128 added

comment:15 by Elon Musk, 6 months ago

You are definitely resampling twice, which causes some clipped output, with aresample=osr=48000:dither_method=shibata,aformat=r=192000

Anyway, dithering should always be the last step in processing, IIRC.

You can fix your command by using a float sample format as output, like this:

ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo:dither_method=shibata:osr=48000:osf=flt,loudnorm=print_format=json -f null -

comment:16 by Gregory Beauregard, 6 months ago

Component: avfilter → swresample
Description: modified (diff)
Keywords: swresample added
Summary: loudnorm filter goofs loudness measurements in some cases → swresample can introduce significant audio distortion

comment:17 by Gregory Beauregard, 6 months ago

Description: modified (diff)