Opened 4 years ago

Closed 4 years ago

#3701 closed defect (fixed)

adpcm-ima_qt encoder's trellis support is broken

Reported by: Timothy_Gu
Owned by:
Priority: important
Component: avcodec
Version: git-master
Keywords: adpcm regression
Cc:
Blocked By:
Blocking:
Reproduced by developer: yes
Analyzed by developer: no

Description


Summary of the bug

The adpcm-ima_qt encoder's trellis support is broken in two ways:

  • it does not produce reproducible output
  • it significantly degrades the output

How to reproduce

Non-reproducible output

First encoding process:

timothy_gu@ubuntu-lenovo:~/ffmpeg$ ./ffmpeg -cpuflags 0 -threads 1 -flags +bitexact -fflags +bitexact -i tests/data/asynth-44100-2.wav -threads 1 -b:a 128k -c adpcm_ima_qt -trellis 5 -flags +bitexact -fflags +bitexact -nostats -f md5 -
ffmpeg version N-63714-g1a426d5 Copyright (c) 2000-2014 the FFmpeg developers
  built on Jun  4 2014 17:34:36 with gcc 4.8 (Ubuntu 4.8.2-19ubuntu1)
  configuration: 
  libavutil      52. 89.100 / 52. 89.100
  libavcodec     55. 66.100 / 55. 66.100
  libavformat    55. 42.100 / 55. 42.100
  libavdevice    55. 13.101 / 55. 13.101
  libavfilter     4.  5.100 /  4.  5.100
  libswscale      2.  6.100 /  2.  6.100
  libswresample   0. 19.100 /  0. 19.100
Guessed Channel Layout for  Input Stream #0.0 : stereo
Input #0, wav, from 'tests/data/asynth-44100-2.wav':
  Duration: 00:00:06.00, bitrate: 1411 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 2 channels, s16, 1411 kb/s
Output #0, md5, to 'pipe:':
    Stream #0:0: Audio: adpcm_ima_qt, 44100 Hz, stereo, s16p, 352 kb/s
    Metadata:
      encoder         : Lavc adpcm_ima_qt
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le -> adpcm_ima_qt)
Press [q] to stop, [?] for help
MD5=06391007776121799859126bd4d848f3
size=       0kB time=00:00:06.00 bitrate=   0.0kbits/s    
video:0kB audio:275kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown

Second encoding process:

timothy_gu@ubuntu-lenovo:~/ffmpeg$ ./ffmpeg -cpuflags 0 -threads 1 -flags +bitexact -fflags +bitexact -i tests/data/asynth-44100-2.wav -threads 1 -b:a 128k -c adpcm_ima_qt -trellis 5 -flags +bitexact -fflags +bitexact -nostats -f md5 -
ffmpeg version N-63714-g1a426d5 Copyright (c) 2000-2014 the FFmpeg developers
  built on Jun  4 2014 17:34:36 with gcc 4.8 (Ubuntu 4.8.2-19ubuntu1)
  configuration: 
  libavutil      52. 89.100 / 52. 89.100
  libavcodec     55. 66.100 / 55. 66.100
  libavformat    55. 42.100 / 55. 42.100
  libavdevice    55. 13.101 / 55. 13.101
  libavfilter     4.  5.100 /  4.  5.100
  libswscale      2.  6.100 /  2.  6.100
  libswresample   0. 19.100 /  0. 19.100
Guessed Channel Layout for  Input Stream #0.0 : stereo
Input #0, wav, from 'tests/data/asynth-44100-2.wav':
  Duration: 00:00:06.00, bitrate: 1411 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 2 channels, s16, 1411 kb/s
Output #0, md5, to 'pipe:':
    Stream #0:0: Audio: adpcm_ima_qt, 44100 Hz, stereo, s16p, 352 kb/s
    Metadata:
      encoder         : Lavc adpcm_ima_qt
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le -> adpcm_ima_qt)
Press [q] to stop, [?] for help
MD5=353699581c94f150671616ecfc357c09
size=       0kB time=00:00:06.00 bitrate=   0.0kbits/s    
video:0kB audio:275kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown

The MD5 changed from 06391007776121799859126bd4d848f3 to 353699581c94f150671616ecfc357c09 between two identical invocations. This does not happen with any other ADPCM encoder.
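The manual check above can be automated. The following is a minimal sketch, not part of the original report: it runs a command several times and compares MD5 digests of its standard output. The helper names `output_digests` and `is_reproducible` are hypothetical, and `FFMPEG_CMD` simply repeats the command line from this ticket, so it assumes the same working directory and test file.

```python
import hashlib
import subprocess
import sys

# Command line copied from this ticket; assumes ./ffmpeg and the test file exist.
FFMPEG_CMD = [
    "./ffmpeg", "-cpuflags", "0", "-threads", "1",
    "-flags", "+bitexact", "-fflags", "+bitexact",
    "-i", "tests/data/asynth-44100-2.wav",
    "-threads", "1", "-b:a", "128k", "-c", "adpcm_ima_qt", "-trellis", "5",
    "-flags", "+bitexact", "-fflags", "+bitexact",
    "-nostats", "-f", "md5", "-",
]


def output_digests(cmd, runs=2):
    """Run cmd `runs` times and return the MD5 hex digest of each run's stdout."""
    digests = []
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,     # capture the "MD5=..." line (or any output)
            stderr=subprocess.DEVNULL,  # the banner and log go to stderr; ignore them
            check=True,                 # fail loudly if the command itself fails
        )
        digests.append(hashlib.md5(result.stdout).hexdigest())
    return digests


def is_reproducible(cmd, runs=2):
    """Return True if every run of cmd produced byte-identical stdout."""
    return len(set(output_digests(cmd, runs))) == 1


# Hypothetical usage, run from the FFmpeg source tree:
#   print(is_reproducible(FFMPEG_CMD, runs=5))
```

Running more than two iterations is a design choice: nondeterministic output can coincide across two runs by chance, so a few extra runs make a false "reproducible" verdict less likely.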

Significantly degraded output

I have omitted most of the encoding logs because they contain nothing of interest.

Encoding/decoding without trellis:

./ffmpeg -cpuflags 0 -threads 1 -flags +bitexact -fflags +bitexact -i tests/data/asynth-44100-2.wav -threads 1 -b:a 128k -c adpcm_ima_qt -flags +bitexact -fflags +bitexact -nostats nontrellis.aiff
./ffmpeg -cpuflags 0 -threads 1 -flags +bitexact -fflags +bitexact -i nontrellis.aiff -threads 1  -flags +bitexact -fflags +bitexact -nostats nontrellis.wav

Encoding/decoding with trellis:

./ffmpeg -cpuflags 0 -threads 1 -flags +bitexact -fflags +bitexact -i tests/data/asynth-44100-2.wav -threads 1 -b:a 128k -c adpcm_ima_qt -trellis 5 -flags +bitexact -fflags +bitexact trellis.aiff
./ffmpeg -cpuflags 0 -threads 1 -flags +bitexact -fflags +bitexact -i trellis.aiff -threads 1  -flags +bitexact -fflags +bitexact trellis.wav

Finding PSNR:

timothy_gu@ubuntu-lenovo:~/ffmpeg$ tests/tiny_psnr  tests/data/asynth-44100-2.wav nontrellis.wav 2
stddev:  904.76 PSNR: 37.20 MAXDIFF:34029 bytes:  1058400/  1058560
timothy_gu@ubuntu-lenovo:~/ffmpeg$ tests/tiny_psnr  tests/data/asynth-44100-2.wav trellis.wav 2
stddev: 8399.21 PSNR: 17.84 MAXDIFF:64623 bytes:  1058400/  1058560

For reference, with this specific sample, all other ADPCM encoders show a PSNR increase of about 2 dB when trellis is enabled.
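To make the numbers above easier to interpret: the stddev and PSNR values printed by tests/tiny_psnr in this ticket are consistent with PSNR = 20 * log10(65535 / stddev) for 16-bit samples. The sketch below works under that assumption and is not the tool's actual source; the names `stddev`, `psnr_db`, and `FULL_SCALE_16BIT` are hypothetical.

```python
import math

# Assumed peak value for 16-bit samples; chosen because it matches the
# stddev/PSNR pairs reported in this ticket.
FULL_SCALE_16BIT = 65535


def stddev(reference, decoded):
    """Root-mean-square difference between two equal-length sample sequences."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    sse = sum((a - b) ** 2 for a, b in zip(reference, decoded))
    return math.sqrt(sse / len(reference))


def psnr_db(std, peak=FULL_SCALE_16BIT):
    """Convert an RMS error into PSNR in decibels relative to `peak`."""
    if std == 0:
        return math.inf  # identical signals
    return 20.0 * math.log10(peak / std)


# stddev values reported in this ticket
for label, std in [("no trellis", 904.76), ("trellis", 8399.21)]:
    print(f"{label}: {psnr_db(std):.2f} dB")
```

Under this formula, the rise in stddev from 904.76 to 8399.21 corresponds to the drop from 37.20 dB to 17.84 dB shown above, a loss of roughly 19 dB, where the other ADPCM encoders gain about 2 dB.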

Change History (6)

comment:1 Changed 4 years ago by cehoyos

  • Keywords regression added
  • Reproduced by developer set
  • Status changed from new to open

Could be considered a regression since 35d3d44a

comment:2 Changed 4 years ago by cehoyos

  • Priority changed from normal to important

I originally thought that the mentioned commit improved quality, but with an older FFmpeg and trellis enabled I actually get the following, so in my opinion this is definitely a regression:

stddev:  732.40 PSNR: 39.03 MAXDIFF:29633 bytes:  1058400/  1058560

comment:3 Changed 4 years ago by Timothy_Gu

Martin Storsjö of Libav has sent a patch to fix it: https://lists.libav.org/pipermail/libav-devel/2014-June/060185.html

comment:4 Changed 4 years ago by cehoyos

This patch is not sufficient / makes no audible difference.

comment:5 Changed 4 years ago by Timothy_Gu

@cehoyos Martin also wrote this patch, which seems to fix the issue: https://lists.libav.org/pipermail/libav-devel/2014-June/060187.html (the Git commit referenced in the mail is the same one you referenced).

comment:6 Changed 4 years ago by Timothy_Gu

  • Resolution set to fixed
  • Status changed from open to closed