Opened 10 years ago
Closed 10 years ago
#3701 closed defect (fixed)
adpcm-ima_qt encoder's trellis support is broken
Reported by: | Timothy Gu | Owned by: | |
---|---|---|---|
Priority: | important | Component: | avcodec |
Version: | git-master | Keywords: | adpcm regression |
Cc: | | Blocked By: | |
Blocking: | | Reproduced by developer: | yes |
Analyzed by developer: | no | | |
Description
Summary of the bug
The adpcm_ima_qt encoder's trellis support is broken in two ways:
- it does not produce reproducible output
- it significantly degrades the output
How to reproduce
Non-reproducible output
First encoding process:
timothy_gu@ubuntu-lenovo:~/ffmpeg$ ./ffmpeg -cpuflags 0 -threads 1 -flags +bitexact -fflags +bitexact -i tests/data/asynth-44100-2.wav -threads 1 -b:a 128k -c adpcm_ima_qt -trellis 5 -flags +bitexact -fflags +bitexact -nostats -f md5 -
ffmpeg version N-63714-g1a426d5 Copyright (c) 2000-2014 the FFmpeg developers
  built on Jun 4 2014 17:34:36 with gcc 4.8 (Ubuntu 4.8.2-19ubuntu1)
  configuration:
  libavutil 52. 89.100 / 52. 89.100
  libavcodec 55. 66.100 / 55. 66.100
  libavformat 55. 42.100 / 55. 42.100
  libavdevice 55. 13.101 / 55. 13.101
  libavfilter 4. 5.100 / 4. 5.100
  libswscale 2. 6.100 / 2. 6.100
  libswresample 0. 19.100 / 0. 19.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'tests/data/asynth-44100-2.wav':
  Duration: 00:00:06.00, bitrate: 1411 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 2 channels, s16, 1411 kb/s
Output #0, md5, to 'pipe:':
    Stream #0:0: Audio: adpcm_ima_qt, 44100 Hz, stereo, s16p, 352 kb/s
    Metadata:
      encoder : Lavc adpcm_ima_qt
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le -> adpcm_ima_qt)
Press [q] to stop, [?] for help
MD5=06391007776121799859126bd4d848f3
size= 0kB time=00:00:06.00 bitrate= 0.0kbits/s
video:0kB audio:275kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Second encoding process:
timothy_gu@ubuntu-lenovo:~/ffmpeg$ ./ffmpeg -cpuflags 0 -threads 1 -flags +bitexact -fflags +bitexact -i tests/data/asynth-44100-2.wav -threads 1 -b:a 128k -c adpcm_ima_qt -trellis 5 -flags +bitexact -fflags +bitexact -nostats -f md5 -
ffmpeg version N-63714-g1a426d5 Copyright (c) 2000-2014 the FFmpeg developers
  built on Jun 4 2014 17:34:36 with gcc 4.8 (Ubuntu 4.8.2-19ubuntu1)
  configuration:
  libavutil 52. 89.100 / 52. 89.100
  libavcodec 55. 66.100 / 55. 66.100
  libavformat 55. 42.100 / 55. 42.100
  libavdevice 55. 13.101 / 55. 13.101
  libavfilter 4. 5.100 / 4. 5.100
  libswscale 2. 6.100 / 2. 6.100
  libswresample 0. 19.100 / 0. 19.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'tests/data/asynth-44100-2.wav':
  Duration: 00:00:06.00, bitrate: 1411 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 2 channels, s16, 1411 kb/s
Output #0, md5, to 'pipe:':
    Stream #0:0: Audio: adpcm_ima_qt, 44100 Hz, stereo, s16p, 352 kb/s
    Metadata:
      encoder : Lavc adpcm_ima_qt
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le -> adpcm_ima_qt)
Press [q] to stop, [?] for help
MD5=353699581c94f150671616ecfc357c09
size= 0kB time=00:00:06.00 bitrate= 0.0kbits/s
video:0kB audio:275kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
The MD5 changed from 06391007776121799859126bd4d848f3 to 353699581c94f150671616ecfc357c09. This does not happen with any of the other ADPCM encoders.
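The check above (run the identical command twice and compare what it prints) can be automated. The following is a minimal Python sketch, not part of FFmpeg's test suite; the helper names `capture_stdout` and `is_reproducible` are hypothetical. It compares the raw stdout of each run, which for `-f md5 -` is the `MD5=...` line.

```python
import subprocess

# Command taken from the report above; it is only defined here, not executed.
FFMPEG_CMD = [
    "./ffmpeg", "-cpuflags", "0", "-threads", "1",
    "-flags", "+bitexact", "-fflags", "+bitexact",
    "-i", "tests/data/asynth-44100-2.wav",
    "-threads", "1", "-b:a", "128k", "-c", "adpcm_ima_qt", "-trellis", "5",
    "-flags", "+bitexact", "-fflags", "+bitexact",
    "-nostats", "-f", "md5", "-",
]


def capture_stdout(cmd):
    """Run cmd and return its stdout as bytes (stderr, i.e. the log, is discarded)."""
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, check=True
    )
    return result.stdout


def is_reproducible(cmd, runs=2):
    """Return True if every run of cmd produces byte-identical stdout."""
    outputs = {capture_stdout(cmd) for _ in range(runs)}
    return len(outputs) == 1
```

From the FFmpeg source directory, `is_reproducible(FFMPEG_CMD)` would be expected to return `False` on an affected build, since the two MD5 lines differ.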
Significantly degraded output
I will omit part of the encoding log because there is nothing interesting.
Encoding/decoding without trellis:
./ffmpeg -cpuflags 0 -threads 1 -flags +bitexact -fflags +bitexact -i tests/data/asynth-44100-2.wav -threads 1 -b:a 128k -c adpcm_ima_qt -flags +bitexact -fflags +bitexact -nostats nontrellis.aiff
./ffmpeg -cpuflags 0 -threads 1 -flags +bitexact -fflags +bitexact -i nontrellis.aiff -threads 1 -flags +bitexact -fflags +bitexact -nostats nontrellis.wav
Encoding/decoding with trellis:
./ffmpeg -cpuflags 0 -threads 1 -flags +bitexact -fflags +bitexact -i tests/data/asynth-44100-2.wav -threads 1 -b:a 128k -c adpcm_ima_qt -trellis 5 -flags +bitexact -fflags +bitexact trellis.aiff
./ffmpeg -cpuflags 0 -threads 1 -flags +bitexact -fflags +bitexact -i trellis.aiff -threads 1 -flags +bitexact -fflags +bitexact trellis.wav
Finding PSNR:
timothy_gu@ubuntu-lenovo:~/ffmpeg$ tests/tiny_psnr tests/data/asynth-44100-2.wav nontrellis.wav 2
stddev: 904.76 PSNR: 37.20 MAXDIFF:34029 bytes: 1058400/ 1058560
timothy_gu@ubuntu-lenovo:~/ffmpeg$ tests/tiny_psnr tests/data/asynth-44100-2.wav trellis.wav 2
stddev: 8399.21 PSNR: 17.84 MAXDIFF:64623 bytes: 1058400/ 1058560
For reference, with this specific sample, all other ADPCM encoders gain about 2 dB of PSNR when trellis is enabled.
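To put the two results on a common scale: the PSNR values printed above are consistent with 20·log10(peak / stddev), where peak is 65535 for 2-byte samples (the trailing `2` on the tiny_psnr command line). The sketch below assumes that relationship, inferred from the figures in this report; `psnr_from_stddev` is a hypothetical helper, not a function from the tool.

```python
import math


def psnr_from_stddev(stddev, sample_bytes=2):
    """Approximate PSNR in dB from the standard deviation of the sample error.

    Assumes the peak value is the largest unsigned integer that fits in
    sample_bytes bytes (65535 for 16-bit audio).
    """
    peak = (1 << (8 * sample_bytes)) - 1
    return 20.0 * math.log10(peak / stddev)


# stddev values copied from the tiny_psnr output in this report
for label, stddev in [("without trellis", 904.76), ("with trellis", 8399.21)]:
    print(f"{label}: {psnr_from_stddev(stddev):.2f} dB")
```

Under this assumption the computed values agree with the reported 37.20 and 17.84 to within rounding, so enabling trellis costs roughly 19 dB on this sample instead of gaining about 2 dB.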
Change History (6)
comment:1 by , 10 years ago
Keywords: | regression added |
---|---|
Reproduced by developer: | set |
Status: | new → open |
comment:2 by , 10 years ago
Priority: | normal → important |
---|
I originally thought that the mentioned commit improved quality, but with an older FFmpeg and trellis I actually get the following, so in my opinion this is definitely a regression:
stddev: 732.40 PSNR: 39.03 MAXDIFF:29633 bytes: 1058400/ 1058560
comment:3 by , 10 years ago
Martin Storsjö of Libav has sent a patch to fix it: https://lists.libav.org/pipermail/libav-devel/2014-June/060185.html
comment:5 by , 10 years ago
@cehoyos Martin also wrote this patch: https://lists.libav.org/pipermail/libav-devel/2014-June/060187.html which seems to fix this issue (the Git commit referenced in the mail is the same as the Git commit you referenced).
comment:6 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | open → closed |
Fixed by Martin Storsjö in a32765c4252eb106a2ade543026ef6f59e699bfa and fa8f060b75bf9074792a0f9ff4ed002652ef62b8.
Could be considered a regression since 35d3d44a