Opened 12 years ago

Closed 8 years ago

#2686 closed defect (fixed)

Native AAC encoder collapses at high bitrates on some samples

Reported by: Kamedo2
Owned by: klaussfreire
Priority: normal
Component: avcodec
Version: git-master
Keywords: aac regression
Cc: klaussfreire@gmail.com, timothygu99@gmail.com, atomnuker@gmail.com, rodger.combs@gmail.com
Blocked By:
Blocking:
Reproduced by developer: yes
Analyzed by developer: yes

Description

Summary of the bug:
The FFmpeg native AAC encoder outputs horrible sound at around 256kbps or more on particular samples. The quality degrades as I increase the bitrate, and is worst at 320-400kbps.

How to reproduce:

ffmpeg -i ffmpeg_aac320k_collapse.flac -vn -c:a aac -strict experimental -b:a 320k ffmpeg_aac320k_collapse.mp4

I couldn't reproduce the problem when I trimmed the most problematic sample down to 8 seconds, but after adding 10 seconds of silence before the sample, the bug could be reproduced. So I'm going to upload the sample with 10 seconds of silence attached. The native AAC encoder was OK on many music clips at 320kbps; only some clips produce AAC files of noticeably bad quality, to an extent that I'd call a bug.

Console Output:

ffmpeg version N-54096-ge41bf19 Copyright (c) 2000-2013 the FFmpeg developers
  built on Jun 19 2013 00:20:06 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-libmp3lame --enable-lib
vorbis --enable-nonfree --enable-libfdk-aac --enable-libvo_aacenc --enable-libfa
ac --extra-ldflags=-static --extra-cflags='-march=nocona -mfpmath=sse' --optflag
s=-O2
  libavutil      52. 37.101 / 52. 37.101
  libavcodec     55. 16.100 / 55. 16.100
  libavformat    55.  9.100 / 55.  9.100
  libavdevice    55.  2.100 / 55.  2.100
  libavfilter     3. 77.101 /  3. 77.101
  libswscale      2.  3.100 /  2.  3.100
  libswresample   0. 17.102 /  0. 17.102
  libpostproc    52.  3.100 / 52.  3.100
[flac @ 0003f160] max_analyze_duration 5000000 reached at 5015510 microseconds
Input #0, flac, from '05-true_my_heart_2m50s.flac':
  Duration: 00:00:18.01, bitrate: 573 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to '05-true_my_heart_2m50s_320k.mp4':
  Metadata:
    encoder         : Lavf55.9.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 32
0 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (flac -> aac)
Press [q] to stop, [?] for help
size=     331kB time=00:00:18.01 bitrate= 150.4kbits/s
video:0kB audio:327kB subtitle:0 global headers:0kB muxing overhead 1.151111%

Attachments (68)

ffmpeg_aac320k_collapse.flac (1.2 MB ) - added by Kamedo2 12 years ago.
The sound file that cripples a native AAC encoder. True My Heart [DVTS-2121][07.09.03] Track05 2m50s~58s
ffmpeg_aac320k_collapse2.flac (1.5 MB ) - added by Kamedo2 12 years ago.
A sound that degrades on FFmpeg native aac encoder. Sounds like a spray can. Billie Holiday : I'm A Fool To Want You (trimmed to 20sec, first and last)
aac-improvements-wip.patch (25.1 KB ) - added by klaussfreire 12 years ago.
AAC native encoder improvements, work in progress
ffmpeg_aac320k_collapse3.flac (1.0 MB ) - added by Kamedo2 12 years ago.
A sound that degrades on FFmpeg native aac encoder. Euphoria - Yui Makino [VTCL-35073][06.4.26] Track04 Amefuribana(inst.) 2:45~2:55
whitenoise_256k.mp4 (141.4 KB ) - added by Kamedo2 12 years ago.
White noise, encoded by native aac encoder at 256kbps. The sound is obviously collapsed.
Whitenoise.flac (946.2 KB ) - added by Kamedo2 12 years ago.
White noise, created by SoundEngine Free ver.4.59. Using aevalsrc as in comment:11 does the same job.
aac-improvements-wip-v2-rclookahead.patch (30.5 KB ) - added by klaussfreire 11 years ago.
Second version of AAC improvements, with improvements on rate control, hopefully gets rid of all remaining "collapsations on high bit rates". Tested various music tracks on 64k, 128k, 256k and 384k.
aac-improvements-wip-v2-rclookahead.2.patch (30.0 KB ) - added by klaussfreire 11 years ago.
Second version of AAC improvements, with improvements on rate control, hopefully gets rid of all remaining "collapsations on high bit rates". Tested various music tracks on 64k, 128k, 256k and 384k.
ffmpeg_aac320k_collapse4.flac (1.4 MB ) - added by Kamedo2 11 years ago.
A sound that degrades on FFmpeg native aac encoder.
18.6_22kHz_noise.flac (2.2 MB ) - added by Kamedo2 11 years ago.
Partial white noise, clipped by 256th-order lanczos function, to include only signals between 18.6 and 22kHz. the signal wanders around the freq.
ffmpeg_aac320k_collapse5.flac (901.5 KB ) - added by Kamedo2 11 years ago.
A sound that degrades on FFmpeg native aac encoder.
ffmpeg_aacvbr_pulse1.flac (1.6 MB ) - added by Kamedo2 11 years ago.
Sound disappears for about 20ms in VBR mode -q:a 5, -q:a 10. Sounds like an annoying pulse.
aac-improvements-wip-v3-vbr.patch (35.9 KB ) - added by klaussfreire 11 years ago.
VBR improvements over wip-v2-rclookahead
ffmpeg_aacvbr_pulse2.flac (2.2 MB ) - added by Kamedo2 11 years ago.
Partial white noise, split by a 256th-order lanczos filter. HF pulse noise that sounds like a stopwatch is added in VBR around -q:a 0.3
aac-improvements-wip-v4-vbr.patch (40.4 KB ) - added by klaussfreire 11 years ago.
Improved VBR, fixed psy threshold reduction bug
fdkaac_10_12.zip (2.1 MB ) - added by Kamedo2 11 years ago.
samples #10-#12 encoded by fdkaac. *2.mp4 are the 128kbps samples, the others are the 96kbps samples.
fdkaac_13_16.zip (2.2 MB ) - added by Kamedo2 11 years ago.
samples # 10 - # 12 encoded by fdkaac. *2.mp4 are the 128kbps samples, the others are the 96kbps samples.
aac-improvements-wip-v5.patch (40.5 KB ) - added by klaussfreire 11 years ago.
V5 patch, twoloop RD fixed (I think)
aac-improvements-wip-v6.patch (42.3 KB ) - added by klaussfreire 11 years ago.
Improved (mostly constrained) VBR, fixed RC bug from v5. There's some dead code that begs to be removed, but it's better to start testing before cleaning.
ffmpeg_aacvbr_degrade1.flac (1.4 MB ) - added by Kamedo2 11 years ago.
A sound that degrades on VBR. from GIZA studio Masterpiece BLEND 2001 Disc2 Track3 Stand Up (Mai Kuraki)
ffmpeg_aac_lead_voice.flac (1.4 MB ) - added by Kamedo2 11 years ago.
Degrades on FFmpeg aac encoder, both on vbr and abr. The original sound is very odd and may not be worth a lot of effort to improve.
aac-improvements-wip-v7.patch (48.2 KB ) - added by klaussfreire 11 years ago.
v7 patch - mostly bugfixing on v6, but quite significant bugs - still incomplete (needs sample rate fixes and Mahler still sounds weird)
sine_tester.flac (1006.7 KB ) - added by Kamedo2 11 years ago.
Sine waves for a warbling test. 50 440 1000 3000 7000 10000 20000Hz. 24bit 48kHz PCM.
aac-improvements-wip-v8.patch (71.8 KB ) - added by klaussfreire 11 years ago.
v8 patch - tweaked tonal band priorization, especially in transients, fixed M/S encoding and made default, and other assorted bugs. Added missing include changes.
Whitenoise_left.flac (479.1 KB ) - added by Kamedo2 11 years ago.
Whitenoise.flac without the sound of right channel. A strange noise appears in the center in v8.
ffmpeg_aac256k_degrade.flac (1.9 MB ) - added by Kamedo2 11 years ago.
The sound degrades on v8 around 256kbps. Mainly right channel suffers. from Kohmi Hirose GIFT/Ai wa tokkoyaku Track3
ItCouldBeSweet.ffv8_128k.diff.flac (1.7 MB ) - added by Kamedo2 11 years ago.
The diff of the ItCouldBeSweet, before and after the v8 AAC encode, 128kbps.
ItCouldBeSweet.ffv8_192k.diff.flac (1.9 MB ) - added by Kamedo2 11 years ago.
The diff of the ItCouldBeSweet, before and after the v8 AAC encode, 192kbps.
ItCouldBeSweet.ffv8_320k.diff.flac (1.3 MB ) - added by Kamedo2 11 years ago.
The diff of the ItCouldBeSweet, before and after the v8 AAC encode, 320kbps.
ItCouldBeSweet.ffv8_q1.5.diff.flac (1.5 MB ) - added by Kamedo2 11 years ago.
The diff of the ItCouldBeSweet, before and after the v8 AAC encode, quality option -q:a 1.5
aac-improvements-wip-v8-fix.patch (1.8 KB ) - added by klaussfreire 11 years ago.
Cumulative patch over v8 to fix M/S coding
ItCouldBeSweet.qaac_cvbr128k.diff.flac (1.7 MB ) - added by Kamedo2 11 years ago.
Just for comparison. The diff of the ItCouldBeSweet, between the original and qaac encode, 128kbps.
ItCouldBeSweet.fdk_128k.diff.flac (1.7 MB ) - added by Kamedo2 11 years ago.
Just for comparison. The diff of the ItCouldBeSweet, between the original and FDK-AAC encode, 128kbps.
aac-improvements-wip-v8f.patch (73.6 KB ) - added by klaussfreire 11 years ago.
Combined v8 + fix
ItCouldBeSweet.ffv8f_128k.diff.flac (1.7 MB ) - added by Kamedo2 11 years ago.
The diff of the ItCouldBeSweet, between the original and the patch v8f AAC encode, 128kbps.
ItCouldBeSweet.ffv8f_192k.diff.flac (1.6 MB ) - added by Kamedo2 11 years ago.
The diff of the ItCouldBeSweet, between the original and the patch v8f AAC encode, 192kbps.
ItCouldBeSweet.ffv8f_320k.diff.flac (1.4 MB ) - added by Kamedo2 11 years ago.
The diff of the ItCouldBeSweet, between the original and the patch v8f AAC encode, 320kbps.
aac-improvements-wip-v8g.patch (76.6 KB ) - added by klaussfreire 11 years ago.
Fix M/S encoding in ABR
ItCouldBeSweet.ffv8g_128k.diff.flac (1.8 MB ) - added by Kamedo2 11 years ago.
ItCouldBeSweet.ffv8g_192k.diff.flac (1.7 MB ) - added by Kamedo2 11 years ago.
The diff of the ItCouldBeSweet, between the original and the patch v8g AAC encode, 192kbps.
ItCouldBeSweet.ffv8g_320k.diff.flac (1.4 MB ) - added by Kamedo2 11 years ago.
The diff of the ItCouldBeSweet, between the original and the patch v8g AAC encode, 320kbps.
aac-improvements-wip-v7-new.patch (48.2 KB ) - added by Kamedo2 10 years ago.
v7 patch altered to reflect the latest change by Michael Niedermayer at 20140525. This should work for the git head.
aac-improvements-wip-v8g-new.patch (76.6 KB ) - added by Kamedo2 10 years ago.
v8g patch altered to reflect the latest change by Michael Niedermayer at 20140525. This should work for the git head.
ffmpeg_anmr_error.flac (157.7 KB ) - added by Kamedo2 10 years ago.
It causes the assertion error at aacenc.c line 399 by -aac_coder anmr on all -b:a and -q:a 0.1695 or bigger.
ffmpeg_anmr_error2.flac (1.1 MB ) - added by Kamedo2 10 years ago.
EBU–TECH 3253 Sound Quality Assessment Material recordings for subjective tests, 50 Male speech, English.
aac-improvements-wip-v9.patch (92.9 KB ) - added by klaussfreire 10 years ago.
Hopefully final version of the AAC patch
ItCouldBeSweet.ffv9_128k.diff.flac (1.7 MB ) - added by Kamedo2 10 years ago.
The diff of the ItCouldBeSweet, between the original and the patch v9 AAC encode, 128kbps.
ItCouldBeSweet.ffv9_192k.diff.flac (1.7 MB ) - added by Kamedo2 10 years ago.
The diff of the ItCouldBeSweet, between the original and the patch v9 AAC encode, 192kbps.
ItCouldBeSweet.ffv9_320k.diff.flac (1.4 MB ) - added by Kamedo2 10 years ago.
The diff of the ItCouldBeSweet, between the original and the patch v9 AAC encode, 320kbps.
ffmpeg_anmr_error3.flac (221.4 KB ) - added by Kamedo2 10 years ago.
EBU–TECH 3253 Sound Quality Assessment Material recordings for subjective tests, 3 Electronic gong 100 Hz.(sine wave)
FFmpeg_anmr_error4.flac (250.4 KB ) - added by Kamedo2 10 years ago.
This causes the assertion error on both -b:a 128k and -q:a 1. 4000Hz sine wave, stereo.
FFmpeg_anmr_error5.flac (140.0 KB ) - added by Kamedo2 10 years ago.
This causes the assertion error on both -b:a 128k and -q:a 1. 11000Hz sine wave, stereo.
aac-improvements-wip-v9b.patch (98.6 KB ) - added by klaussfreire 10 years ago.
v9b version, based on v9, matched behavior against v7
FFmpeg_anmr_error6.flac (267.1 KB ) - added by Kamedo2 10 years ago.
This causes the assertion error on -b:a 96k, 128k, 160k on v9b. -q:a is OK. 9000Hz sine wave, stereo.
FFmpeg_anmr_error7.flac (2.4 MB ) - added by Kamedo2 10 years ago.
This causes the assertion error on -b:a 192k on v9b. Dave Matthews Band - Crush, http://www.hydrogenaud.io/forums/index.php?showtopic=102079&hl=
aac-improvements-wip-v7-new.2.patch (48.2 KB ) - added by Kamedo2 10 years ago.
v7 patch altered to reflect the latest changes. This should work for the git head.
aac-improvements-wip-v9b-new.patch (98.6 KB ) - added by Kamedo2 10 years ago.
v9b patch altered to reflect the latest changes. This should work for the git head.
FFmpeg_anmr_error8.flac (1.8 MB ) - added by Kamedo2 10 years ago.
This causes the assertion error on -q:a 1 on v9b. -q:a 0.99 or 1.01 is safe. Suzanne Vega, Tom's Diner http://www.rarewares.org/test_samples/
ffmpeg_aac_error1.flac (1.7 MB ) - added by Kamedo2 9 years ago.
FFmpeg doesn't stop when the sample rate is 8kHz and the bitrate is high. -ar 8000 -b:a 96k, -q:a 0.958 or higher. Fear Factory, Digimortal, Linchpin.
SinceAlways.flac (2.2 MB ) - added by Kamedo2 9 years ago.
This is one exceptional case that degrades on v9b.
mybloodrusts.flac (2.4 MB ) - added by Kamedo2 9 years ago.
This is one exceptional case that degrades on v9b.
castanets.flac (588.9 KB ) - added by Kamedo2 9 years ago.
This is one exceptional case that degrades on v9b.
ffmpeg_96k_error.flac (58.5 KB ) - added by Kamedo2 9 years ago.
Low Freq. Sine Sweep Stereo with right channel inverted; inaudible on mono.
mybloodrusts.ff74961_128k.mp4 (323.3 KB ) - added by Kamedo2 9 years ago.
mybloodrusts.flac encoded at -b:a 128k by ffmpeg74961-g61009a7.
mybloodrusts.ff75043_128k.mp4 (323.4 KB ) - added by Kamedo2 9 years ago.
mybloodrusts.flac encoded at -b:a 128k by ffmpeg75043-gb31041a.
assertion_diff_shimoseka.m4a (795.5 KB ) - added by llogan 9 years ago.
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
ffmpeg_aac_error2.flac (1.2 MB ) - added by Kamedo2 9 years ago.
This causes an error on -profile:a aac_ltp -b:a 96k. The error messages are "av_interleaved_write_frame(): Not enough space" or "Audio encoding failed (avcodec_encode_audio2)". The sound is 08._Sarah_McLachlan_Ice_ringing.flac
short_block_test_2.flac (167.1 KB ) - added by Kamedo2 9 years ago.

Change History (577)

by Kamedo2, 12 years ago

The sound file that cripples a native AAC encoder. True My Heart [DVTS-2121][07.09.03] Track05 2m50s~58s

by Kamedo2, 12 years ago

A sound that degrades on FFmpeg native aac encoder. Sounds like a spray can. Billie Holiday : I'm A Fool To Want You (trimmed to 20sec, first and last)

comment:1 by Carl Eugen Hoyos, 12 years ago

Component: FFmpeg → avcodec
Keywords: native encoder sound quality 256kbps 320kbps removed
Version: 1.0.7 → git-master

Did the output (aac) files sound better with the (original!) release 1.2?
(Not a later release of the 1.2 series.)

comment:2 by Kamedo2, 12 years ago

Yes, the output aac files sounded better with release 1.2.1 I've downloaded from
http://www.ffmpeg.org/releases/ffmpeg-1.2.1.tar.bz2

Still, the quality of the native aac at 320kbps is poorer than the native aac 256kbps.

ffmpeg version 1.2.1 Copyright (c) 2000-2013 the FFmpeg developers
  built on Jun 19 2013 12:38:13 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-libmp3lame --enable-lib
vorbis --enable-nonfree --enable-libfdk-aac --enable-libvo_aacenc --enable-libfa
ac --extra-ldflags=-static --extra-cflags='-march=nocona -mfpmath=sse' --optflag
s=-O2
  libavutil      52. 18.100 / 52. 18.100
  libavcodec     54. 92.100 / 54. 92.100
  libavformat    54. 63.104 / 54. 63.104
  libavdevice    54.  3.103 / 54.  3.103
  libavfilter     3. 42.103 /  3. 42.103
  libswscale      2.  2.100 /  2.  2.100
  libswresample   0. 17.102 /  0. 17.102
  libpostproc    52.  2.100 / 52.  2.100
[flac @ 01405c20] max_analyze_duration 5000000 reached at 5015510 microseconds
Input #0, flac, from 'ffmpeg_aac320k_collapse.flac
':
  Duration: 00:00:18.01, bitrate: 573 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'ffmpeg_aac320k_collapse.mp4':
  Metadata:
    encoder         : Lavf54.63.104
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 32
0 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (flac -> aac)
Press [q] to stop, [?] for help
size=     289kB time=00:00:18.01 bitrate= 131.3kbits/s
video:0kB audio:285kB subtitle:0 global headers:0kB muxing overhead 1.321136%

comment:3 by Kamedo2, 12 years ago

Oops, you said original release 1.2.
Release 1.2 and 1.2.1 had the same behavior -- the first sample collapses at 432-464kbps.
As for N-54096-ge41bf19 I've got from git -- the first sample collapses at 256-432kbps.
These two groups have distinct degradation ranges. Releases 1.2 and 1.2.1 have a much narrower degradation range, and the degradation within it is less severe. N-54096-ge41bf19 is at its worst at 352kbps.

ffmpeg version 1.2 Copyright (c) 2000-2013 the FFmpeg developers
  built on Jun 20 2013 03:06:34 with gcc 4.8.1 (GCC)
  configuration: --enable-version3 --enable-nonfree --enable-libfdk-aac --extra-
ldflags=-static --extra-cflags='-march=native' --optflags=-O2
  libavutil      52. 18.100 / 52. 18.100
  libavcodec     54. 92.100 / 54. 92.100
  libavformat    54. 63.104 / 54. 63.104
  libavdevice    54.  3.103 / 54.  3.103
  libavfilter     3. 42.103 /  3. 42.103
  libswscale      2.  2.100 /  2.  2.100
  libswresample   0. 17.102 /  0. 17.102
[flac @ 03295c20] max_analyze_duration 5000000 reached at 5015510 microseconds
Input #0, flac, from 'C:\Users\PCC\Documents\ABC-HR\ffmpeg_aac320k_collapse.flac
':
  Duration: 00:00:18.01, bitrate: 573 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'C:\Users\PCC\Documents\ABC-HR\05-true_my_heart_2m50s_320k_12
.mp4':
  Metadata:
    encoder         : Lavf54.63.104
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 32
0 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (flac -> aac)
Press [q] to stop, [?] for help
size=     289kB time=00:00:18.01 bitrate= 131.3kbits/s
video:0kB audio:285kB subtitle:0 global headers:0kB muxing overhead 1.321136%

comment:4 by klaussfreire, 12 years ago

This patch I'm going to attach fixes both issues. But I must warn that it's a WIP, I still have to split it into individual issues and fix a bug it exhibits in rare circumstances when working in VBR mode.

by klaussfreire, 12 years ago

Attachment: aac-improvements-wip.patch added

AAC native encoder improvements, work in progress

comment:5 by Carl Eugen Hoyos, 12 years ago

Keywords: regression added
Reproduced by developer: set
Status: new → open

comment:6 by Kamedo2, 12 years ago

I appreciate your effort, klaussfreire.
I want to test the aac-improvements-wip.patch, but how can I do that?

/c/mingw/ffmpeg/ffmpeg-1.2
$ patch -u -p1 < aac-improvements-wip.patch
patching file libavcodec/aaccoder.c
Hunk #3 FAILED at 711.
Hunk #4 succeeded at 776 (offset -5 lines).
Hunk #5 succeeded at 818 (offset -5 lines).
Hunk #6 FAILED at 845.
Hunk #7 FAILED at 1055.
Hunk #8 FAILED at 1068.
Hunk #9 FAILED at 1092.
Hunk #10 FAILED at 1110.
6 out of 10 hunks FAILED -- saving rejects to file libavcodec/aaccoder.c.rej
patching file libavcodec/aacenc.c
Hunk #3 FAILED at 622.
1 out of 3 hunks FAILED -- saving rejects to file libavcodec/aacenc.c.rej
patching file libavcodec/aacpsy.c
Hunk #1 succeeded at 293 (offset -4 lines).
Hunk #2 succeeded at 385 (offset -4 lines).
Hunk #3 succeeded at 646 (offset -33 lines).
patching file libavcodec/psymodel.h

comment:7 by Carl Eugen Hoyos, 12 years ago

Without trying myself, I would bet that the patch only applies to current git head.

comment:8 by Kamedo2, 12 years ago

I tried $ git clone git://source.ffmpeg.org/ffmpeg.git, but still, the patch fails.

comment:9 by Carl Eugen Hoyos, 12 years ago

I can confirm that the patch does not apply.

comment:10 by Kamedo2, 12 years ago

I tried the wip patch again. No good. I think the patch is broken.

$ patch -p1 < aac-improvements-wip.patch
patching file libavcodec/aaccoder.c
Hunk #3 FAILED at 711.
Hunk #4 succeeded at 776 (offset -5 lines).
Hunk #5 succeeded at 818 (offset -5 lines).
Hunk #6 FAILED at 845.
Hunk #7 FAILED at 1055.
Hunk #8 FAILED at 1068.
Hunk #9 FAILED at 1092.
Hunk #10 FAILED at 1110.
6 out of 10 hunks FAILED -- saving rejects to file libavcodec/aaccoder.c.rej
patching file libavcodec/aacenc.c
Hunk #1 FAILED at 591.
Hunk #2 FAILED at 609.
Hunk #3 FAILED at 621.
3 out of 3 hunks FAILED -- saving rejects to file libavcodec/aacenc.c.rej
patching file libavcodec/aacpsy.c
Hunk #1 succeeded at 299 (offset 2 lines).
Hunk #2 succeeded at 391 (offset 2 lines).
Hunk #3 succeeded at 681 (offset 2 lines).
patching file libavcodec/psymodel.h

by Kamedo2, 12 years ago

A sound that degrades on FFmpeg native aac encoder. Euphoria - Yui Makino [VTCL-35073][06.4.26] Track04 Amefuribana(inst.) 2:45~2:55

comment:11 by Kamedo2, 12 years ago

I successfully applied the patch. klaussfreire's repository is in here. http://ffmpeg.org/pipermail/ffmpeg-devel/2013-May/143216.html
Or, you can use https://dl.dropboxusercontent.com/u/81238453/aac.patch (Thank you Takuan @K4095) to patch from current git head.

However, it still has a distinctive bug: the sound partially disappears when the input is white-noise-like.
Bug #2706 was that the sound warbled when the input was a sine wave. That was solved by this patch, but it creates a new problem.

ffmpeg54292 -v 9 -loglevel 99 -filter_complex "aevalsrc=-0.5+random(0)" -c:a aac -strict experimental -ar 44100 -ac 2 -b:a 256k -t 4 "C:\Users\PCC\Documents\ABC-HR\whitenoise_256k.mp4"
ffmpeg version N-54292-g97947d9 Copyright (c) 2000-2013 the FFmpeg developers
  built on Jun 30 2013 20:34:13 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libfdk
-aac --extra-ldflags=-static --extra-cflags='-march=nocona -mfpmath=sse' --optfl
ags=-O2
  libavutil      52. 38.100 / 52. 38.100
  libavcodec     55. 18.100 / 55. 18.100
  libavformat    55. 10.100 / 55. 10.100
  libavdevice    55.  2.100 / 55.  2.100
  libavfilter     3. 77.101 /  3. 77.101
  libswscale      2.  3.100 /  2.  3.100
  libswresample   0. 17.102 /  0. 17.102
  libpostproc    52.  3.100 / 52.  3.100
Splitting the commandline.
Reading option '-v' ... matched as option 'v' (set logging level) with argument
'9'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level)
with argument '99'.
Reading option '-filter_complex' ... matched as option 'filter_complex' (create
a complex filtergraph) with argument 'aevalsrc=-0.5+random(0)'.
Reading option '-c:a' ... matched as option 'c' (codec name) with argument 'aac'
.
Reading option '-strict' ... matched as AVOption 'strict' with argument 'experim
ental'.
Reading option '-ar' ... matched as option 'ar' (set audio sampling rate (in Hz)
) with argument '44100'.
Reading option '-ac' ... matched as option 'ac' (set number of audio channels) w
ith argument '2'.
Reading option '-b:a' ... matched as option 'b' (video bitrate (please use -b:v)
) with argument '256k'.
Reading option '-t' ... matched as option 't' (record or transcode "duration" se
conds of audio/video) with argument '4'.
Reading option 'C:\Users\PCC\Documents\ABC-HR\whitenoise_256k.mp4' ... matched a
s output file.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 9.
Applying option filter_complex (create a complex filtergraph) with argument aeva
lsrc=-0.5+random(0).
Successfully parsed a group of options.
Parsing a group of options: output file C:\Users\PCC\Documents\ABC-HR\whitenoise
_256k.mp4.
Applying option c:a (codec name) with argument aac.
Applying option ar (set audio sampling rate (in Hz)) with argument 44100.
Applying option ac (set number of audio channels) with argument 2.
Applying option b:a (video bitrate (please use -b:v)) with argument 256k.
Applying option t (record or transcode "duration" seconds of audio/video) with a
rgument 4.
Successfully parsed a group of options.
Opening an output file: C:\Users\PCC\Documents\ABC-HR\whitenoise_256k.mp4.
detected 8 logical cores
[Parsed_aevalsrc_0 @ 0140bea0] compat: called with args=[-0.5+random(0)]
[Parsed_aevalsrc_0 @ 0140bea0] Setting 'exprs' to value '-0.5+random(0)'
[audio format for output stream 0:0 @ 01412880] Setting 'sample_fmts' to value '
fltp'
[audio format for output stream 0:0 @ 01412880] Setting 'sample_rates' to value
'44100'
[audio format for output stream 0:0 @ 01412880] Setting 'channel_layouts' to val
ue '0x3'
Successfully opened the file.
[audio format for output stream 0:0 @ 01412880] auto-inserting filter 'auto-inse
rted resampler 0' between the filter 'Parsed_aevalsrc_0' and the filter 'audio f
ormat for output stream 0:0'
[AVFilterGraph @ 0039f3c0] query_formats: 3 queried, 6 merged, 3 already done, 0
 delayed
[Parsed_aevalsrc_0 @ 0140bea0] sample_rate:44100 chlayout:mono duration:-1.00000
0
[auto-inserted resampler 0 @ 0039f2a0] [SWR @ 00393160] Using double precision m
ode
0.707107
0.707107
[auto-inserted resampler 0 @ 0039f2a0] ch:1 chl:mono fmt:dblp r:44100Hz -> ch:2
chl:stereo fmt:fltp r:44100Hz
Output #0, mp4, to 'C:\Users\PCC\Documents\ABC-HR\whitenoise_256k.mp4':
  Metadata:
    encoder         : Lavf55.10.100
    Stream #0:0, 0, 1/44100: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, ster
eo, fltp, 256 kb/s
Stream mapping:
  aevalsrc -> Stream #0:0 (aac)
Press [q] to stop, [?] for help
No more output streams to write to, finishing.
size=     141kB time=00:00:04.01 bitrate= 288.4kbits/s
video:0kB audio:140kB subtitle:0 global headers:0kB muxing overhead 1.001409%
0 frames successfully decoded, 0 decoding errors
[AVIOContext @ 0141b640] Statistics: 30 seeks, 197 writeouts

The output mp4 I'm going to post sounds nothing like white noise.

by Kamedo2, 12 years ago

Attachment: whitenoise_256k.mp4 added

White noise, encoded by native aac encoder at 256kbps. The sound is obviously collapsed.

comment:12 by Kamedo2, 12 years ago

Another bug, which typically happens when hi-hats are present: the sound disappears for about 20ms.
It is short, but it's still audible and sounds like an annoying pulse.

http://i40.tinypic.com/v3dt39.png

When these problems are solved, I'm going to conduct an extensive blind listening test, to assess sound quality of AAC encoders available from FFmpeg.

comment:13 by Kamedo2, 12 years ago

Another type of holes. There are no holes like this in the original sound, but are present in encoded mp4s.
http://i41.tinypic.com/axisf5.png

comment:14 by klaussfreire, 12 years ago

Sorry, I expected to get email notifications, but got none.

That bug is probably a ratecontrol bug I thought I had eradicated. I'll try to test with white noise, but just in case the exact input matters, can you attach a flac version?

in reply to:  14 comment:15 by Carl Eugen Hoyos, 12 years ago

Replying to klaussfreire:

Sorry, I expected to get email notifications, but got none.

You will get them if you add yourself to CC.

comment:16 by klaussfreire, 12 years ago

Cc: klaussfreire@gmail.com added

comment:17 by klaussfreire, 12 years ago

In aacenc.c, replacing

s->lambda *= ratio;

with

s->lambda *= sqrtf(sqrtf(ratio));

fixes the white noise problem, so it is indeed a ratecontrol (RC) mess-up.

But that brings some other trouble in more normal signals, so I guess I'll have to play with RC a little bit more.

comment:18 by klaussfreire, 12 years ago

I think AAC's ratecontrol needs a lookahead buffer.

by Kamedo2, 12 years ago

Attachment: Whitenoise.flac added

White noise, created by SoundEngine Free ver.4.59. Using aevalsrc as in comment:11 does the same job.

in reply to:  16 comment:19 by Carl Eugen Hoyos, 12 years ago

Replying to klaussfreire:

You may also want to look at ticket #2706.
(Is it a duplicate of this ticket?)

in reply to:  18 comment:20 by Kamedo2, 11 years ago

Replying to klaussfreire:

I think AAC's ratecontrol needs a lookahead buffer.

Can you implement the feature by July 13th?
I'm going to be free and have time to do some double-blind listening tests of the codec.
Results will be like this: http://www.hydrogenaudio.org/forums/index.php?showtopic=100896

comment:21 by klaussfreire, 11 years ago

Maybe a very simple one-block one. I've been thinking such a simple lookahead might be enough to fix the bugs, with a better one perhaps for a further patch.

I'll give this high priority, but we're only 3 days away from that deadline you know...
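
For readers wondering what a one-block lookahead could look like, the following is a purely hypothetical sketch and not the code from any of the attached patches. It assumes the encoder has some non-negative "need" estimate per frame (for example a perceptual entropy figure) and shares the nominal budget of two frames between the current and the next one in proportion to that estimate.

```c
#include <assert.h>

/*
 * Hypothetical one-frame lookahead: split the budget of two frames between
 * the current and the next frame according to their estimated need.
 * Returns the number of bits granted to the current frame.
 */
int lookahead_frame_bits(int bits_per_frame, float need_cur, float need_next)
{
    assert(bits_per_frame > 0 && need_cur >= 0.0f && need_next >= 0.0f);
    float total = need_cur + need_next;
    if (total <= 0.0f)
        return bits_per_frame;  /* no estimate available: nominal budget */
    return (int)(2.0f * bits_per_frame * (need_cur / total));
}
```

A frame that needs three times as much as its successor would get 1.5 times the nominal budget here. A real implementation would also have to respect the bit reservoir and the per-frame maximum, which this sketch ignores.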

in reply to:  21 comment:22 by Kamedo2, 11 years ago

Replying to klaussfreire:

Maybe a very simple one-block one. I've been thinking such a simple lookahead might be enough to fix the bugs, with a better one perhaps for a further patch.

I'll give this high priority, but we're only 3 days away from that deadline you know...

Thank you very much! A delay of some days is acceptable.

comment:23 by klaussfreire, 11 years ago

Alright, attaching another version. This seems to work better, but it's a bit rushed. I'll try to improve on it, but if I delay, feel free to test this version.

by klaussfreire, 11 years ago

Second version of AAC improvements, with improvements on rate control, hopefully gets rid of all remaining "collapsations on high bit rates". Tested various music tracks on 64k, 128k, 256k and 384k.

in reply to:  23 comment:24 by Carl Eugen Hoyos, 11 years ago

Replying to klaussfreire:

Alright, attaching another version.

The patch does not apply here to current git head.

comment:25 by Kamedo2, 11 years ago

The patch does not apply here, either. I read http://ffmpeg.org/pipermail/ffmpeg-devel/2013-May/143216.html and http://ffmpeg.org/pipermail/ffmpeg-devel/2013-May/143222.html and guessed what I should do, but it still fails.

by klaussfreire, 11 years ago

Second version of AAC improvements, with improvements on rate control, hopefully gets rid of all remaining "collapsations on high bit rates". Tested various music tracks on 64k, 128k, 256k and 384k.

comment:26 by klaussfreire, 11 years ago

Yes, sorry, I'm not working on a clean checkout.

I should move to a clean checkout.

There I attached a rebased patch.

comment:27 by Kamedo2, 11 years ago

Very good one! The only serious artifact I've heard so far is whitenoise.flac at 8, 16, 24, 32kbps and 192kbps.

comment:28 by Kamedo2, 11 years ago

Whitenoise.flac at 384kbps, ffmpeg_aac320k_collapse.flac at 320kbps is strange, too.

in reply to:  28 comment:29 by klaussfreire, 11 years ago

Replying to Kamedo2:

Whitenoise.flac at 384kbps, ffmpeg_aac320k_collapse.flac at 320kbps is strange, too.

I didn't try the collapse ones at 320k, though I tried at 384 and sounded nice. I'll try again when I have a chance though.

However, whitenoise 384 gives me an error, seems 384kbps is too much for mono. The whitenoise I mention is generated with the random generator, I'll try with the flac first chance I get.

comment:30 by Kamedo2, 11 years ago

Isn't the lower spreading function applied too much? The quality of lower frequency is bad when the higher frequency bin is strong. And what makes 320kbps particularly bad? The quality degrades when we have enough ('overkill') bits. I think something fatal is happening, like integer overflow or something.

by Kamedo2, 11 years ago

A sound that degrades on FFmpeg native aac encoder.

comment:31 by Kamedo2, 11 years ago

Isn't line 334 of libavcodec/aacpsy.c:

        for (g = 0; g < ctx->num_bands[j]-1; g++) {
            AacPsyCoeffs *coeff = &coeffs[g];
            float bark_width = coeffs[g+1].barks - coeffs->barks;
            coeff->spread_low[0] = pow(10.0, -bark_width * PSY_3GPP_THR_SPREAD_LOW);
            coeff->spread_hi [0] = pow(10.0, -bark_width * PSY_3GPP_THR_SPREAD_HI);
            coeff->spread_low[1] = pow(10.0, -bark_width * en_spread_low);
            coeff->spread_hi [1] = pow(10.0, -bark_width * en_spread_hi);
            pe_min = bark_pe * bark_width;
            minsnr = exp2(pe_min / band_sizes[g]) - 1.5f;
            coeff->min_snr = av_clipf(1.0f / minsnr, PSY_SNR_25DB, PSY_SNR_1DB);
        }

strange? I doubt the sanity of the lower spreading function at the highest band, because using the -cutoff 18000 option improves the quality on problematic samples, and these problematic samples always include strong 20-22kHz sounds. (The default cutoff is 18k at 192kbps, 20k at 256kbps, and 22k at 320kbps.)

by Kamedo2, 11 years ago

Attachment: 18.6_22kHz_noise.flac added

Partial white noise, cut by a 256th-order Lanczos function to include only signals between 18.6 and 22kHz. The signal wanders around in frequency.

comment:32 by Kamedo2, 11 years ago

I've got it. When the native aac encoder calculates a masking curve, almost inaudible sounds such as 18kHz, 20kHz, and 22kHz are taken into account, and audible sounds such as 14kHz are masked by the inaudible ones. Add the inaudible noise above to the source sound and the encoded sound will be significantly degraded. I recommend that any signal above 16kHz be disregarded in psychoacoustic engines.

comment:33 by klaussfreire, 11 years ago

Alright. Good catch.

I'd recommend not ignoring, because masking within that band will still be important for bit allocation purposes. Rather, back-spreading rolloff (towards the lower frequencies) should be tweaked a bit.

comment:34 by Kamedo2, 11 years ago

Things start to make sense.

Could you tweak the back-spreading and provide the patch for me? I'd like to test that.

comment:35 by klaussfreire, 11 years ago

Yes, will do this tonight (at work right now).

by Kamedo2, 11 years ago

A sound that degrades on FFmpeg native aac encoder.

comment:36 by Kamedo2, 11 years ago

-cutoff 18000 seems to work, but the lowpass filter's rolloff is too gentle compared to many practical encoders. libavcodec/psymodel.c has the constant FILT_ORDER, and changing the order from 4 to 8 sharpens the filter. But 12 and 16 fail somehow.

comment:37 by klaussfreire, 11 years ago

I hope you're testing with good headphones. HF quality is hard to gauge with speakers, especially since good speakers cost a fortune.

comment:38 by Kamedo2, 11 years ago

Yes, I'm testing with good headphones.

in reply to:  38 ; comment:39 by klaussfreire, 11 years ago

Replying to Kamedo2:

Yes, I'm testing with good headphones.

The reason I mention this is because, from my experience, FAAC tends to have a low cutoff for some bitrates, that seem optimal with speakers, but sound noticeably dull with headphones.

in reply to:  39 comment:40 by Kamedo2, 11 years ago

Replying to klaussfreire:

The reason I mention this is because, from my experience, FAAC tends to have a low cutoff for some bitrates, that seem optimal with speakers, but sound noticeably dull with headphones.

Exactly. FAAC cutoff is rather annoyingly low in 96kbps, 64kbps, and 32kbps, and the filter is the major reason why FAAC never beats Nero.

BTW, any prospects for fixing samples 1, 4, 5, and white noise? 4 and 5 are bad at 320kbps and whitenoise.flac is bad at 384kbps. Both regain quality with -cutoff 18000.

comment:41 by Kamedo2, 11 years ago

from line 300:

    const int chan_bitrate = ctx->avctx->bit_rate / ((ctx->avctx->flags & CODEC_FLAG_QSCALE) ? 2.0f : ctx->avctx->channels);

to:

    const int chan_bitrate = FFMIN(ctx->avctx->bit_rate, 240000) / ((ctx->avctx->flags & CODEC_FLAG_QSCALE) ? 2.0 : ctx->avctx->channels);

significantly improves the quality. Bitrates remain relatively high in this change.
I have not tested all cases, but it works on 256kbps, 320kbps, and 384kbps on many sounds.

comment:42 by Kamedo2, 11 years ago

I've listened to over 100 samples of diverse music and speech records. No problem so far. It works on 96, 112, 128,... 256kbps, but hangs on 288kbps.

comment:43 by klaussfreire, 11 years ago

Yeah, but because you're capping psy's bitrate target to non-problematic rates. I don't think that's ideal, though that indeed proves the problem lies in psy.

in reply to:  43 comment:44 by Kamedo2, 11 years ago

Replying to klaussfreire:

Yeah, but because you're capping psy's bitrate target to non-problematic rates. I don't think that's ideal, though that indeed proves the problem lies in psy.

Rates go up even after capping. So it's not merely a cap. I think we're close to the solution.

comment:45 by klaussfreire, 11 years ago

They go up because twoloop will push all scalefactors down uniformly until it achieves the desired bitrate, but:

  • It won't work with VBR, VBR almost wholly depends on psy to dictate scalefactor band noise floors. Twoloop will push scalefactors down a bit more I think but not much at those high bitrates
  • It's still suboptimal, it's better to let psy decide, since psy understands perceptual entropy better

Sadly, I didn't have time today to work on it. Let's hope I can do so tomorrow. With your analysis I'm confident I can patch psy without having to cap anything.

comment:46 by Kamedo2, 11 years ago

How is the development going?

comment:47 by klaussfreire, 11 years ago

Reading the specs right now. I had a hunch that the spec might say something about this.

comment:48 by klaussfreire, 11 years ago

There. Line 308:

pctx->frame_bits   = chan_bitrate * AAC_BLOCK_SIZE_LONG / ctx->avctx->sample_rate;

Must be

pctx->frame_bits   = FFMIN(3000, chan_bitrate * AAC_BLOCK_SIZE_LONG / ctx->avctx->sample_rate);

That is indeed said on the spec.

Step 15 of subpart 4: Steps in threshold calculation: then bit allocation is limited to 0 < bit_allocation < 3000. It seems they thought of it all.
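As a rough illustration of why the clamp matters, the sketch below evaluates the frame_bits expression for a given per-channel bitrate. The helper name and the max_bits parameter are hypothetical; only the formula and the value 1024 for AAC_BLOCK_SIZE_LONG follow the code quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define AAC_BLOCK_SIZE_LONG 1024 /* samples per long AAC block */

/* Bits available per channel per long block, clamped to max_bits. */
static int frame_bits_capped(int chan_bitrate, int sample_rate, int max_bits)
{
    assert(sample_rate > 0);
    /* 64-bit intermediate so the product cannot overflow */
    int64_t bits = (int64_t)chan_bitrate * AAC_BLOCK_SIZE_LONG / sample_rate;
    return bits < max_bits ? (int)bits : max_bits;
}
```

For 320kbps stereo at 44.1kHz, chan_bitrate is 160000 and the unclamped result is 3715 bits, which is above the 3000 limit; at 128kbps stereo it is 1486 bits and the clamp never triggers.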

comment:49 by Kamedo2, 11 years ago

Great! I'm going to have time to test that improvement in 5 hours, so I'll test it then. Extensively. And I think I have to look for ways to sharpen the LPF by using a higher order, at the cost of more computation time. Currently it's not very clear cut.

comment:50 by klaussfreire, 11 years ago

2560 (the number you found) works better for us though. That's certainly in relation to some deficiency in twoloop, but hey. Let's just document that this should be 3000 but can't be, and be done with it.

comment:51 by klaussfreire, 11 years ago

The LPF could be accomplished by zeroing the coefficients in the FFT. To get the lowest possible ripple, the boundary coefficient needs some care, but AFAIR it's the best method, and it's free for something that's already doing FFT.

comment:52 by Kamedo2, 11 years ago

It's not a regression, but the surround bitrate seems to be capped and does not change with -b:a 256k, 320k, or 384k.
A surround sample file is here: http://people.xiph.org/~xiphmont/demo/opus/demo3.shtml
I'm currently using pctx->frame_bits = FFMIN(3000,...
No obvious bugs so far.

comment:53 by Kamedo2, 11 years ago

http://i44.tinypic.com/2s805y0.png

I used pctx->frame_bits = FFMIN(2560, and psymodel.h line 32:

#define AAC_CUTOFF(s) (s->bit_rate ? FFMIN3(FFMIN3(s->bit_rate/s->channels/2, 4000 + s->bit_rate/s->channels/4, 12000 + s->bit_rate/s->channels/16), 20000, s->sample_rate / 2): (s->sample_rate / 2))

This is better on mono, surround, and on very low bitrates (such as 32kbps stereo).
truncut.wav has little HF content, so the bitrate saturates at 172kbps.

comment:54 by Kamedo2, 11 years ago

In 4 hours of hearing more than 100 musical, vocal, ambient and artificial sounds, on 64-480kbps, 44.1kHz, 48kHz, stereo, surround, I have found no problematic samples. This solution is great. Thank you for fixing, klaussfreire.

I think I'm going to test mono, collecting more surround samples to test, 32kHz or less, and VBR modes tomorrow.

comment:55 by Kamedo2, 11 years ago

http://i43.tinypic.com/15714c5.png
All are stereo. I listened to some of the encoded AACs, and found no problems.

comment:56 by Kamedo2, 11 years ago

Should I use ffmpeg_g to spot the bug? Thousands of diverse sound files are now encoded to see whether it doesn't freeze or fail.

comment:57 by Kamedo2, 11 years ago

Recommended cutoff frequency for FFmpeg AAC.
http://i41.tinypic.com/28al1fn.png
psymodel.h line 32:

#define AAC_CUTOFF(s) (s->bit_rate ? FFMIN3(FFMIN3(s->bit_rate/s->channels/2, 3000 + s->bit_rate/s->channels/4, 12000 + s->bit_rate/s->channels/16), 20000, s->sample_rate / 2): (s->sample_rate / 2))

The LPF is not applied in VBR now, resulting in noticeably poor quality.
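For readers who want to check the numbers, the macro above can be written as a plain function. This is only a sketch: the function and helper names are made up, and the arithmetic is copied from the macro with the same integer division order.

```c
#include <assert.h>

static int min_int(int a, int b) { return a < b ? a : b; }

/* Function form of the AAC_CUTOFF macro proposed above; returns Hz. */
static int aac_cutoff_hz(int bit_rate, int channels, int sample_rate)
{
    assert(channels > 0);
    if (!bit_rate)
        return sample_rate / 2; /* no bitrate set: keep the full bandwidth */

    int per_ch = bit_rate / channels;
    int c = min_int(per_ch / 2,
                    min_int(3000 + per_ch / 4, 12000 + per_ch / 16));
    return min_int(c, min_int(20000, sample_rate / 2));
}
```

For 44.1kHz stereo this gives 7000 Hz at 32kbps, 11000 Hz at 64kbps, 16000 Hz at 128kbps, 18000 Hz at 192kbps, and 20000 Hz from 256kbps upward.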

comment:58 by Kamedo2, 11 years ago

http://i40.tinypic.com/14smbo0.png
songs: 5 min snippets of pops and jazz, 44.1kHz, stereo
non-music sounds: 16 min of artificial sounds, difficult samples, speech, etc, 48kHz, stereo

LAME equivalent   Bitrate (kbps)   VBR number
(none)            16               0.029
-V9.9             32               0.053
(none)            48               0.097
-V9               64               0.23
-V8               80               0.43
-V7               96               0.55
-V6               112              0.66
-V5               128              0.86
(none)            144              1.06
-V4               160              1.17
-V3               176              1.29
-V2               192              1.43
-V1               224              2.2
-V0               256              4.3
(none)            288              6.2
(none)            320              7
(none)            352              7.7
(none)            384              10

comment:59 by klaussfreire, 11 years ago

How about the subjective quality of the various VBR modes, as compared to CBR (actually ABR, since a CBR setting in AAC produces ABR)?

I worked hard to get good results, but there are still problematic samples that sound better on equivalent ABR than on VBR.

in reply to:  57 ; comment:60 by klaussfreire, 11 years ago

Replying to Kamedo2:

psymodel.h line 32:

#define AAC_CUTOFF(s) (s->bit_rate ? FFMIN3(FFMIN3(s->bit_rate/s->channels/2, 3000 + s->bit_rate/s->channels/4, 12000 + s->bit_rate/s->channels/16), 20000, s->sample_rate / 2): (s->sample_rate / 2))

The LPF is not applied in VBR now, resulting in noticeably poor quality.

Try this cutoff:

#define _AAC_CUTOFF(bit_rate,channels,sample_rate) (bit_rate ? FFMIN3(FFMIN3( \
    bit_rate/channels, \
    3000 + bit_rate/channels/2, \
    16000 + bit_rate/channels/8), \
    20000, \
    sample_rate / 2): (sample_rate / 2))
#define AAC_CUTOFF(s) ( \
    (s->flags & CODEC_FLAG_QSCALE) \
    ? _AAC_CUTOFF(s->bit_rate, s->channels, s->sample_rate) \
    : _AAC_CUTOFF((int)(s->bit_rate * (s->global_quality ? s->global_quality : 120) / 120.0), 2, s->sample_rate) \
)

I find it works better; the other one was pretty dull for 64k/ch, which ought to be transparent for AAC. This one also works on VBR.

by Kamedo2, 11 years ago

Attachment: ffmpeg_aacvbr_pulse1.flac added

Sound disappears for about 20ms in VBR mode -q:a 5, -q:a 10. Sounds like an annoying pulse.

in reply to:  60 comment:61 by Kamedo2, 11 years ago

Replying to klaussfreire:

Try this cutoff:

#define _AAC_CUTOFF(bit_rate,channels,sample_rate) (bit_rate ? FFMIN3(FFMIN3( \
    bit_rate/channels, \
    3000 + bit_rate/channels/2, \
    16000 + bit_rate/channels/8), \
    20000, \
    sample_rate / 2): (sample_rate / 2))
#define AAC_CUTOFF(s) ( \
    (s->flags & CODEC_FLAG_QSCALE) \
    ? _AAC_CUTOFF(s->bit_rate, s->channels, s->sample_rate) \
    : _AAC_CUTOFF((int)(s->bit_rate * (s->global_quality ? s->global_quality : 120) / 120.0), 2, s->sample_rate) \
)

I tried, but isn't this cutoff strange? It sounds like the lowpass is always 20kHz.
The problem of ffmpeg_aacvbr_pulse1.flac is solved by this.

I'm using current git head 54813 + aac-improvements-wip-v2-rclookahead.2.patch + aacpsy.c Line 308

pctx->frame_bits   = FFMIN(2560, chan_bitrate * AAC_BLOCK_SIZE_LONG / ctx->avctx->sample_rate);

comment:62 by klaussfreire, 11 years ago

LOL, sorry, the VBR condition is backwards. An old idiocy of mine, I always reverse if conditions. Kinda like coding dyslexia.

It should be

#define _AAC_CUTOFF(bit_rate,channels,sample_rate) (bit_rate ? FFMIN3(FFMIN3( \
    bit_rate/channels, \
    3000 + bit_rate/channels/2, \
    12000 + bit_rate/channels/8), \
    20000, \
    sample_rate / 2): (sample_rate / 2))
#define AAC_CUTOFF(s) ( \
    (s->flags & CODEC_FLAG_QSCALE) \
    ? _AAC_CUTOFF((int)(s->bit_rate * (s->global_quality ? s->global_quality : 120) / 120.0), 2, s->sample_rate) \
    : _AAC_CUTOFF(s->bit_rate, s->channels, s->sample_rate) \
)

Though I'm getting some weird results with very low quality settings.

comment:63 by Kamedo2, 11 years ago

Aren't you trying to access s->bit_rate when it's VBR? Or am I missing something?

comment:64 by Kamedo2, 11 years ago

Is s->global_quality different from VBR number -q:a x?

LAME equivalent   Stereo bitrate (kbps)   VBR number   Recommended cutoff (Hz)
(none)            16                      0.029        4000
-V9.9             32                      0.053        7000
(none)            48                      0.097        9000
-V9               64                      0.23         11000
-V8               80                      0.43         13000
-V7               96                      0.55         15000
-V6               112                     0.66         15500
-V5               128                     0.86         16000
(none)            144                     1.06         16500
-V4               160                     1.17         17000
-V3               176                     1.29         17500
-V2               192                     1.43         18000
-V1               224                     2.2          19000
-V0               256                     4.3          20000
(none)            288                     6.2          20000
(none)            320                     7            20000
(none)            352                     7.7          20000
(none)            384                     10           20000

in reply to:  63 ; comment:65 by klaussfreire, 11 years ago

Replying to Kamedo2:

Aren't you trying to access s->bit_rate when it's VBR? Or am I missing something?

Yes, bit_rate in that case holds the default of 128kbps. Psy does the same, but it works well since that's considered to be AAC's transparent rate. So, for VBR, you make psy work at transparent settings, and compensate bit allocation based on RD scaling.

in reply to:  64 comment:66 by klaussfreire, 11 years ago

Replying to Kamedo2:

Is s->global_quality different from VBR number -q:a x?

It's x * 120 AFAIK

comment:67 by klaussfreire, 11 years ago

I think I finally got VBR to talk to psy.

It's looking good. I'll post an updated patch with all this in a while (still lots of tests to perform)

in reply to:  65 comment:68 by Kamedo2, 11 years ago

Replying to klaussfreire:

Yes, bit_rate in that case holds the default of 128kbps. Psy does the same, but it works well since that's considered to be AAC's transparent rate.

AAC is not transparent in 128kbps stereo, although Apple used to advertise that way. http://d.hatena.ne.jp/kamedo2/20111029/1319840519

in reply to:  60 comment:69 by Kamedo2, 11 years ago

Replying to klaussfreire:

#define _AAC_CUTOFF(bit_rate,channels,sample_rate) (bit_rate ? FFMIN3(FFMIN3( \
    bit_rate/channels, \
    3000 + bit_rate/channels/2, \
    16000 + bit_rate/channels/8), \
    20000, \
    sample_rate / 2): (sample_rate / 2))
#define AAC_CUTOFF(s) ( \
    (s->flags & CODEC_FLAG_QSCALE) \
    ? _AAC_CUTOFF(s->bit_rate, s->channels, s->sample_rate) \
    : _AAC_CUTOFF((int)(s->bit_rate * (s->global_quality ? s->global_quality : 120) / 120.0), 2, s->sample_rate) \
)

I find it works better; the other one was pretty dull for 64k/ch, which ought to be transparent for AAC. This one also works on VBR.

The high cutoff causes trouble for whitenoise.flac below 55kbps.
And I'm almost certain 16kHz is optimal at 128kbps stereo.
http://d.hatena.ne.jp/kamedo2/20120221/1329845124
http://d.hatena.ne.jp/kamedo2/20120729/1343545890
http://i43.tinypic.com/cmhx3.png
http://i39.tinypic.com/2ecdv0o.png

comment:70 by Kamedo2, 11 years ago

I recommend psymodel.h line 24 to be:

#include "libavutil/libm.h"
#include "avcodec.h"

/** maximum possible number of bands */
#define PSY_MAX_BANDS 128
/** maximum number of channels */
#define PSY_MAX_CHANS 24

#define _AAC_CUTOFF(bit_rate,channels,sample_rate) (bit_rate ? FFMIN3(FFMIN3( \
    bit_rate/channels/2, \
    3000 + bit_rate/channels/4, \
    12000 + bit_rate/channels/16), \
    20000, \
    sample_rate / 2): (sample_rate / 2))
#define AAC_CUTOFF(s) ( \
    (s->flags & CODEC_FLAG_QSCALE) \
    ? _AAC_CUTOFF(((int)(135000.0f*sqrtf(s->global_quality ? s->global_quality/120.0f : 1.0f))), 2, s->sample_rate) \
    : _AAC_CUTOFF(s->bit_rate, s->channels, s->sample_rate) \
)

In this way, I can set cutoff to VBR modes as well.
PSY_MAX_CHANS 24 is to accommodate NHK 22.2ch.

I notice that in -q:a 0.2 and -q:a 0.4, the lower freq is in trouble. It sounds like a thunder far away.

comment:71 by klaussfreire, 11 years ago

Yes, I'm fixing the lower frequency right now. It's an issue with tonal band prioritization, which doesn't really work as intended in VBR. I'm preparing a better patch now. I'll test your cutoffs.

comment:72 by Kamedo2, 11 years ago

After applying the new LPF at comment:70, the resulting bitrate of music changed a bit. I think I have to replot the graph. And one more problem: -q:a 0.029 or -q:a 10 is unfriendly for an average user. I think the value should be roughly equivalent to LAME's. I mean, if one uses -q:a 2, the result for average sound should be roughly 96kbps/channel, which is the same behavior as LAME -V2. Is applying the new LPF method from comment:51 easy?

comment:73 by klaussfreire, 11 years ago

After two days of toying around, the butterworth filter used in psy is actually counterproductive. Keeping all things equal, lowering the cutoff actually increases bitrate, if a fixed RD is forced. So, for VBR, it's a no-no.

I'm trying an FFT-based LP by simply zeroing coeffs, with care at the boundary to minimize ripple, and it seems to work a lot better, at least for VBR.

Right now, the implementation is just a POC. It's very dirty. But I'm getting convinced this is the way for VBR... and maybe for ABR too. I'm not sure.

Edit: And, to boot, an FFT is phase-linear. I can actually hear group delay with the butterworth. Ugly.

Last edited 11 years ago by klaussfreire

comment:74 by Kamedo2, 11 years ago

Is that FFT, not MDCT?
I'm guessing that the bitrate increase from lowering the cutoff is the effect described in comment:32. Very strange, as HF content usually takes up more bits, but it makes sense.

comment:75 by klaussfreire, 11 years ago

You're right, the one I have done right now is MDCT, because it's done within the bit allocator. But I've been meaning to implement an actual FFT filter later on, if not too hard, and if the technique pans out.

comment:76 by klaussfreire, 11 years ago

Thing is, the butterworth doesn't really remove that much content, and it changes the masking thresholds in a way that actually requires more bits to encode. A higher-order butterworth might work, but it would have way too much group delay.

comment:77 by klaussfreire, 11 years ago

BTW, wait before you redo that graph, I have a much better VBR patch almost ready.

comment:78 by klaussfreire, 11 years ago

Alright, I'm attaching a new VBR patch. CBR/ABR shouldn't have changed (shouldn't, but might). I will probably want to apply the same logic to CBR/ABR as well, since it works very well (i.e.: cutoff not with a filter but with the bit allocator, stop spending bits on HF if we're starving for bits).

A heads-up: VBR's q-to-kbps curve has changed, and there are some artifacts that sound like scratchy noises (especially audible in the sine sample) that are due to clipping. I think it's not specific to this patch, but I just noticed it. I'm not sure how to attack it. Normally, I'd apply compression on the IMDCT stage, but since that's on the decoder side, I'll probably have to find a clever way to predict clipping on the encoder and compensate. Craptastic.

Anyway, I do think VBR has been greatly improved on this patch. Let me know what you think.

by klaussfreire, 11 years ago

VBR improvements over wip-v2-rclookahead

comment:79 by Carl Eugen Hoyos, 11 years ago

I believe your latest patch contains trailing whitespace (that cannot be committed to FFmpeg git), consider running tools/patcheck over the diff.

comment:80 by Kamedo2, 11 years ago

I successfully applied the patch from latest git head N-54889-g47d57f2.

comment:81 by Kamedo2, 11 years ago

http://i41.tinypic.com/28mo7jn.png
Very strange behavior, and whitenoise.flac at -q:a 1 completely lacks LF contents.
Somehow, this encoder tends to omit the lowest tone in white noise which is audible.
-q:a 1.7 and -q:a 2.7 (the peak of bitrate) of whitenoise.flac is strange, too.

comment:82 by klaussfreire, 11 years ago

Yeah it seems to have an anomaly around 1. I had only tested whitenoise up to 0.7. I'll try to patch it up.

comment:83 by klaussfreire, 11 years ago

Ah, yeah, I know. It's probably the scaler offset. It must be unpredictable in whitenoise because of how flat the envelope is.

comment:84 by Kamedo2, 11 years ago

I don't recommend ambitiously trying to preserve HF content above 18kHz when there are enough bits. It sounds unstable. Some early 1990s MP3 encoders used this tactic, but none of them were good. Rather, a clean, fixed LPF should be applied at all times. Avoid the situation where one can hear 12-20kHz content in some parts of the music and a dull, 12kHz-LPF-like sound in other parts.

As for

pctx->frame_bits   = FFMIN(2560, chan_bitrate * AAC_BLOCK_SIZE_LONG / ctx->avctx->sample_rate);

do we get more stable results when the number 2560 is lowered?
(240kbps is a 'megadose' or 'overkill' bitrate for AAC, so slight degradation is not a major problem.)

comment:85 by Kamedo2, 11 years ago

I notice that the LPF on some short blocks is not working in at least -q:a 0.3 and 0.4.

http://i42.tinypic.com/okor9c.png

Stereo bitrate (kbps)   VBR number
32                      0.14
64                      0.25
96                      0.33
128                     0.39
160                     0.46
192                     0.55

in reply to:  84 ; comment:86 by klaussfreire, 11 years ago

Replying to Kamedo2:

I don't recommend ambitiously trying to preserve HF content above 18kHz when there are enough bits. It sounds unstable. Some early 1990s MP3 encoders used this tactic, but none of them were good. Rather, a clean, fixed LPF should be applied at all times. Avoid the situation where one can hear 12-20kHz content in some parts of the music and a dull, 12kHz-LPF-like sound in other parts.

I just want to preserve the HF component of transients. There might be better ways of doing that. I guess I'll keep iterating on it. However, I believe the way it's being done now works well. If you check, the LP cutoff is chosen from the allocation given by psy. Psy contains bit reservoir logic, which means it will momentarily increase bits (and cutoff) for some difficult transients. Right now, it works wonders for hi-hats.

I will probably have to be stricter about the cutoff, though. As you say, when the signal by itself (not by psy's indication, but signal strength alone) suddenly jumps in HF content, the result is unpleasant. I think I have cleaned up most of those cases, but who knows. It's hard to discern those from actual transients.

As for

pctx->frame_bits   = FFMIN(2560, chan_bitrate * AAC_BLOCK_SIZE_LONG / ctx->avctx->sample_rate);

do we get more stable results when the number 2560 is lowered?
(240kbps is a 'megadose' or 'overkill' bitrate for AAC, so slight degradation is not a major problem.)

If it doesn't limit the ability to increase allocation for transients, it might. I'll look into it.

in reply to:  86 ; comment:87 by Kamedo2, 11 years ago

Replying to klaussfreire:

I just want to preserve the HF component of transients. There might be better ways of doing that. I guess I'll keep iterating on it. However, I believe the way it's being done now works well. If you check, the LP cutoff is chosen from the allocation given by psy. Psy contains bit reservoir logic, which means it will momentarily increase bits (and cutoff) for some difficult transients. Right now, it works wonders for hi-hats.

So, if there is a group of beat sounds that is on the threshold of tonal/transients, the LPF is sometimes on and sometimes off? Currently, the on/off switch itself is audible and is quite annoying. It sounds like a stopwatch.

I will probably have to be stricter about the cutoff, though. As you say, when the signal by itself (not by psy's indication, but signal strength alone) suddenly jumps in HF content, the result is unpleasant. I think I have cleaned up most of those cases, but who knows. It's hard to discern those from actual transients.

ffmpeg_aacvbr_pulse1.flac at -q:a 0.25 produces strange HF sounds.

by Kamedo2, 11 years ago

Attachment: ffmpeg_aacvbr_pulse2.flac added

Partial white noise, split by a 256th-order Lanczos filter. An HF pulse noise that sounds like a stopwatch is added in VBR around -q:a 0.3.

in reply to:  87 comment:88 by klaussfreire, 11 years ago

Replying to Kamedo2:

Replying to klaussfreire:

I just want to preserve the HF component of transients. There might be better ways of doing that. I guess I'll keep iterating on it. However, I believe the way it's being done now works well. If you check, the LP cutoff is chosen from the allocation given by psy. Psy contains bit reservoir logic, which means it will momentarily increase bits (and cutoff) for some difficult transients. Right now, it works wonders for hi-hats.

So, if there is a group of beat sounds that is on the threshold of tonal/transients, the LPF is sometimes on and sometimes off? Currently, the on/off switch itself is audible and is quite annoying. It sounds like a stopwatch.

No, the cutoff moves up and down, but the LP remains on.

I'll have to check the sample

comment:89 by Kamedo2, 11 years ago

You seem to be using the heuristic that transient HF components are loud and tonal HF components are quiet.

in reply to:  89 comment:90 by klaussfreire, 11 years ago

Replying to Kamedo2:

You seem to be using the heuristic that transient HF components are loud and tonal HF components are quiet.

No, I let psy detect the transients. The only heuristic, is that I attempt to encode a little bit more of the HF with decreased quality.

Ie, from 0-cutoff, normal quantization. From cutoff-cutoff * 1.2, coarse (progressively coarser in fact) quantization. Now, I let bit allocation zero out beyond 1.2. I may have to force it to avoid the artifacts you mention.

comment:91 by Kamedo2, 11 years ago

Seeing the spectrogram, sometimes, up to 22kHz is encoded. No way we can hear that high. However, because of your algorithm, the cutoff seems to be much higher than it actually is, and the sound is much clearer in typical cases. But we have to be careful of exceptions. I think I feel strange when the encoded_highest_sound - normal_cutoff is more than 3kHz. Sounds something like plip, plip. Is coarse quantization at cutoff~cutoff*1.2 applied only to transients?

comment:92 by klaussfreire, 11 years ago

No, that's applied to tonal signals as well. A way to squeeze a little extra bandwidth. It proved to be a winning move for music, though I didn't test that much with noise.

Last edited 11 years ago by klaussfreire

comment:93 by Kamedo2, 11 years ago

Is that included in a wip-v3-vbr.patch, or a new feature? It sounds like the extra HF content encode is only on transients. And some transients are indeed encoded up to 22kHz.
Are HF contents over cutoff*1.2 totally discarded? (I believe this is the best move.)

comment:94 by Kamedo2, 11 years ago

The LAME sometimes acts like your algorithm, but within 2kHz or so. It's related to -Y switch, and LAME sometimes encodes 16~18kHz contents.

in reply to:  93 comment:95 by klaussfreire, 11 years ago

Replying to Kamedo2:

Is that included in a wip-v3-vbr.patch

Yes

Are HF contents over cutoff*1.2 totally discarded? (I believe this is the best move.)

No, and maybe that's the problem. 1.2 just happens to be the point at which the increased quantization floor starts zeroing out all components. Until that, RD optimization brings down the quantization floor to maintain acceptable quality, so you don't notice the floor rising (and in fact it doesn't for fully tonal bands, that's what RD optimization is about, whereas it does rise for noisy ones).

So, in essence, up to cutoff * 1.2, tonal components are retained at the expense of HF noise, which seems like a sensible tradeoff.

What must be happening, is that, on some signals, the zeroing point happens above 1.2, significantly above. So it's perhaps wise to hardcode that 1.2 value, and force a zero on those bands instead.

comment:96 by Kamedo2, 11 years ago

I think we should hardcode min(cutoff+2500, cutoff*1.2). When cutoff is 18kHz, cutoff*1.2 is 21.6kHz which is too high. Could you provide the relation between -q:a value and cutoff so we can have better grasp on what's happening?
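The proposed limit, combined with the three zones described in comment:90, could look roughly like the sketch below. The enum, the function names, and the idea of classifying a band by its start frequency are assumptions made for illustration; the thread does not specify how the encoder would apply the limit.

```c
#include <assert.h>

enum hf_zone { HF_NORMAL, HF_COARSE, HF_ZERO };

/* Upper edge of the coarse zone: min(cutoff + 2500, cutoff * 1.2), in Hz. */
static int hf_hard_limit(int cutoff_hz)
{
    assert(cutoff_hz >= 0);
    int by_offset = cutoff_hz + 2500;
    int by_ratio  = cutoff_hz + cutoff_hz / 5; /* cutoff * 1.2 in integers */
    return by_offset < by_ratio ? by_offset : by_ratio;
}

/* Decide how a band starting at band_start_hz would be treated. */
static enum hf_zone classify_band(int band_start_hz, int cutoff_hz)
{
    if (band_start_hz < cutoff_hz)
        return HF_NORMAL;  /* normal quantization */
    if (band_start_hz < hf_hard_limit(cutoff_hz))
        return HF_COARSE;  /* progressively coarser quantization */
    return HF_ZERO;        /* forced to zero */
}
```

With an 18kHz cutoff the limit becomes 20500 Hz instead of 21600 Hz, while for cutoffs below 12.5kHz the 1.2 factor is still the tighter bound (for example, 12000 Hz for a 10kHz cutoff).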

comment:97 by klaussfreire, 11 years ago

So, I tracked the anomaly near -q:a 1 to the ESC_BT codebook. It seems when noise floors are too low, the coefficients can't be properly encoded, and all kinds of bad things ensue. I'll see how to fix it.

comment:98 by Kamedo2, 11 years ago

I noticed that this new VBR encoder has zero delay. The ABR encoder at 64kbps stereo has a 1 sample delay. Probably because of the lack of the butterworth LPF.

comment:99 by klaussfreire, 11 years ago

That's why I want to get rid of the butterworth. It's good, but FFT is better, since it's phase-linear. With all the quantization noise I don't think we care that much about ripple, but even if we did, FFTs can be made to minimize it.

comment:100 by Kamedo2, 11 years ago

I think I can start the blind test from August 3rd. With the results, we can overwrite the outdated FFmpeg AAC Encoding Guide. https://trac.ffmpeg.org/wiki/AACEncodingGuide

comment:101 by Kamedo2, 11 years ago

Is the comment:97 fixable? I think it will contribute to higher quality in 160kbps and 192kbps. Currently, it is still worse than the mighty Apple AAC.

I assume most blocks are long (1024-sample) tonal blocks, and that the short, transient blocks, which are apparently causing the problems, are rare. Am I right?

comment:102 by klaussfreire, 11 years ago

Yes, I have a fix in the works. That limitation is the reason the standard limits allocation to 3000 bits, most likely.

comment:103 by Kamedo2, 11 years ago

Isn't aaccoder.c line 787~795 strange? I believe the source of the trouble is somewhere in how the cutoff value is computed or used, which causes weird sounds at low bitrates such as -q:a 0.25.

comment:104 by klaussfreire, 11 years ago

So, I tried a whole new approach, and it seems vastly superior.

I modified psy's "Rate control" to work differently for VBR. Instead of using the bit reservoir, it just computes the optimum PE and scales it by quality. And it works nicely. I still had to push scalers a bit more on the allocator and do the LP filtering to reach the very low bit rates with VBR, but it's sounding a lot better.

I'll do some more testing and then upload the updated patch.

comment:105 by Kamedo2, 11 years ago

Wow, great!

comment:106 by Kamedo2, 11 years ago

I inserted

av_log(NULL, AV_LOG_DEBUG, "\n cutoff=%d, lambda=%f, frame_bit_rate=%d, bandwidth=%d\n",cutoff,lambda,frame_bit_rate,bandwidth);

in aaccoder.c twoloop line 795, and found that cutoff differs between frames. I used -q:a 0.4, stereo 44.1kHz. I assume cutoffs below 99 are the short blocks and cutoffs above 500 are the long tonal blocks. The cutoff varies throughout the same piece of music: 11.7k~13.6k for the short blocks, 11.5k~13.2k for the long blocks. (Calculated from the 25 raw examples below.)

 cutoff=77, lambda=47.000000, frame_bit_rate=46034, bandwidth=14508

 cutoff=614, lambda=47.000000, frame_bit_rate=45648, bandwidth=14412

 cutoff=76, lambda=47.000000, frame_bit_rate=45648, bandwidth=14412

 cutoff=612, lambda=47.000000, frame_bit_rate=45417, bandwidth=14354

 cutoff=76, lambda=47.000000, frame_bit_rate=45417, bandwidth=14354

 cutoff=532, lambda=47.000000, frame_bit_rate=37937, bandwidth=12484
    Last message repeated 1 times

 cutoff=538, lambda=47.000000, frame_bit_rate=38477, bandwidth=12619
    Last message repeated 1 times
size=     242kB time=00:00:15.80 bitrate= 125.2kbits/s
 cutoff=68, lambda=47.000000, frame_bit_rate=39017, bandwidth=12754

 cutoff=544, lambda=47.000000, frame_bit_rate=39017, bandwidth=12754

 cutoff=548, lambda=47.000000, frame_bit_rate=39402, bandwidth=12850
    Last message repeated 1 times

 cutoff=551, lambda=47.000000, frame_bit_rate=39711, bandwidth=12927
    Last message repeated 1 times

 cutoff=554, lambda=47.000000, frame_bit_rate=39942, bandwidth=12985
    Last message repeated 1 times

 cutoff=69, lambda=47.000000, frame_bit_rate=40173, bandwidth=13043

 cutoff=556, lambda=47.000000, frame_bit_rate=40173, bandwidth=13043

 cutoff=69, lambda=47.000000, frame_bit_rate=40405, bandwidth=13101

 cutoff=558, lambda=47.000000, frame_bit_rate=40405, bandwidth=13101

 cutoff=561, lambda=47.000000, frame_bit_rate=40636, bandwidth=13159
    Last message repeated 1 times

 cutoff=562, lambda=47.000000, frame_bit_rate=40713, bandwidth=13178
    Last message repeated 1 times

 cutoff=71, lambda=47.000000, frame_bit_rate=41870, bandwidth=13467

 cutoff=574, lambda=47.000000, frame_bit_rate=41870, bandwidth=13467

 cutoff=79, lambda=47.000000, frame_bit_rate=47653, bandwidth=14913
    Last message repeated 1 times

 cutoff=78, lambda=47.000000, frame_bit_rate=46651, bandwidth=14662
    Last message repeated 1 times

 cutoff=76, lambda=47.000000, frame_bit_rate=45031, bandwidth=14257
    Last message repeated 1 times
[output stream 0:0 @ 04adab60] EOF on sink link output stream 0:0:default.
No more output streams to write to, finishing.

 cutoff=75, lambda=47.000000, frame_bit_rate=44337, bandwidth=14084
    Last message repeated 1 times

 cutoff=68, lambda=47.000000, frame_bit_rate=39711, bandwidth=12927
    Last message repeated 1 times
[aac @ 04aaf580] Trying to remove 504 more samples than there are in the queue
size=     253kB time=00:00:16.10 bitrate= 128.9kbits/s
video:0kB audio:250kB subtitle:0 global headers:0kB muxing overhead 1.475195%
755 frames successfully decoded, 0 decoding errors
[AVIOContext @ 04ad0440] Statistics: 30 seeks, 779 writeouts
[AVIOContext @ 04d6f8a0] Statistics: 3123324 bytes read, 2 seeks
ffmpeg54890g.exe -v 9 -loglevel 99 -i ffmpeg_aacvbr_pulse2.wav -c:a aac -strict experimental -q:a 0.4 ffmpeg_aacvbr_pulse2.mp4

I tried to automate it with a batch script that also preserves the av_log output, but somehow it freezes.

comment:107 by klaussfreire, 11 years ago

Don't worry: for the new patch I'm using refbits instead of destbits. refbits is derived directly from lambda, so it won't change. I couldn't make the changing bandwidth work in a stable fashion without a lot more work, so I'll reserve that for a later patch, maybe.

comment:108 by Kamedo2, 11 years ago

The next patch seems to be a good one.

comment:109 by Kamedo2, 11 years ago

Is the patch available now?

comment:110 by klaussfreire, 11 years ago

Patience. Later today, or perhaps tomorrow, depending on your time zone

comment:111 by klaussfreire, 11 years ago

Damn. The patch works wonderfully well in VBR, but breaks CBR. I'll have to look into it during the weekend.

Patience indeed.

comment:112 by Kamedo2, 11 years ago

Yes, the VBR sounds dull and is currently (at v3) poorer than CBR, and it should have a lot of room to improve.

comment:113 by Kamedo2, 11 years ago

I've encoded weeks' worth of AAC audio with the v3 patch, using diverse samples and bitrates, and there were no problems (empty files, error returns, freezes).

comment:114 by Kamedo2, 11 years ago

klaussfreire, could you provide the VBR-only patch? I'd like to test it. I may be able to detect the problem(s).

by klaussfreire, 11 years ago

Improved VBR, fixed psy threshold reduction bug

comment:115 by klaussfreire, 11 years ago

Attached the current WIP.

An explanation of what caused the bug for high q values: there was a bug in psy's threshold reduction for hole avoidance. When a second pass was needed, it would accumulate errors due to a simple typo (reduction += instead of reduction =).

I don't have the 3GPP spec to check, but I just noticed the code made no sense with the +=, but did with =.
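To illustrate the kind of error accumulation described above, here is a minimal sketch. It is not the psy model code: the function name, the per-pass reduction formula and the `accumulate` switch are all hypothetical, chosen only to show why `+=` and `=` agree on the first pass and diverge once a second pass is needed.

```python
def reduce_threshold(threshold, passes, step, accumulate):
    """Toy model of a multi-pass threshold reduction (hypothetical).

    Each pass computes its own reduction (pass_index * step) and applies
    it to the ORIGINAL threshold. With accumulate=True the reduction is
    summed across passes (the '+=' typo); with accumulate=False it is
    recomputed on every pass (the '=' fix).
    """
    reduction = 0.0
    result = threshold
    for pass_index in range(1, passes + 1):
        per_pass = pass_index * step
        if accumulate:
            reduction += per_pass  # buggy: earlier passes are counted again
        else:
            reduction = per_pass   # fixed: only the current pass counts
        result = threshold - reduction
    return result
```

With a single pass both variants give the same result, which would be consistent with the problem only showing up when a second pass is needed.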

Then there's the ESC_BT thing.

I think the most serious anomalies in this bug have been fixed. I haven't had time to properly test CBR, but it seems to mostly work now. That was a very subtle bit reservoir bug in my "lookahead" patch that didn't surface until I fixed psy.

Anyway, I still would like to make VBR achieve lower bitrates without having to resort to LP filtering. I somehow sense it should be possible. In any case, I made CBR also use the same scalefactor-band-based LP filtering to remove the need for the Butterworth filter, which didn't save many bits anyway, and now it responds to the -cutoff argument, so if you don't like the default cutoff you can override it yourself. It seemed worth parameterizing since I've found some sources that sound better at low bit rates with higher cutoffs, and some that don't. So it's source-dependent.

Anyway, enjoy the patch. I'm not sure I'll have time to work on a more permanent one (one that I'd push to trunk) till next weekend.

comment:116 by Kamedo2, 11 years ago

Yes, the cutoff is quite source-dependent, and listener-dependent too. Older people may prefer lower cutoffs. BTW, I'm 25 yrs old.

comment:118 by Kamedo2, 11 years ago

http://i44.tinypic.com/1zmczg5.png
aaccoder.c line 806 from

                ? (refbits * 1.6f * avctx->sample_rate / 1024) 

to

                ? (refbits * 2.5f * avctx->sample_rate / 1024) 

raises the LPF, and the sound is much clearer (at the cost of more noise, but it's certainly better per real bitrate).
I feel the sound is bad only in the tonal parts of the music in VBR. And this encoder uses fewer bits, sometimes nearly half as many, for the tonal parts, unlike Opus, which has a distinctive tonality boost function.
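As a quick check of what the change does numerically: the quoted fragment is linear in the constant, so going from 1.6 to 2.5 scales its value by the same factor whatever refbits is. The sketch below models only that fragment; the surrounding ternary (its condition and other branch) is not shown in the thread and is not reproduced here, and the function name is made up.

```python
def cutoff_term(refbits, sample_rate, scale=1.6):
    """Evaluate only the quoted fragment: refbits * scale * sample_rate / 1024.

    'scale' stands for the constant being tuned (1.6 in the patch,
    2.5 in the proposed change). Units of refbits are not assumed.
    """
    return refbits * scale * sample_rate / 1024


def scale_ratio(old_scale, new_scale):
    """Factor by which the fragment grows when only the constant changes."""
    return new_scale / old_scale
```

For example, `scale_ratio(1.6, 2.5)` is about 1.56, so the term rises by roughly 56% for any input.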

comment:119 by klaussfreire, 11 years ago

Yes, I was in the middle of tweaking the rdlambda scale for VBR (which is what gives the tonality boost). It seems way off target for VBR, since a lambda that results in 64kbps in VBR will give you about 32 or less in CBR.

With that properly tweaked, we can save lots of bits from noisy bands and put them to better use on tonal bands. For VBR, that means lower bitrates for the same quality level.

Increasing cutoff like you did there has the unwanted side effect of lowering quality a bit too much on tonal bands, for a set file size. I do my tests by searching through -q:a until I get a file roughly the same size as a reference CBR-encoded version, and comparing quality among those. With higher cutoffs, that procedure resulted in noticeable distortion on the HF bands, which is why I left it at 1.6, and it's what I believe will be fixed by tweaking rdlambda for VBR.

It can also be fixed by implementing codebook 13. But that's for another (future, way future) patch, since I see no easy way to implement CB 13 with twoloop, so I'll have to rewrite it.

comment:120 by Kamedo2, 11 years ago

This paper, fig. 6 shows bit allocation curves, although this is Opus.
http://jmvalin.ca/papers/aes135_opus_celt.pdf

comment:121 by klaussfreire, 11 years ago

Cool paper. Still, everything seems quite specific to Opus.

comment:122 by Kamedo2, 11 years ago

Is aaccoder.c line 829:

                if (start >= cutoff || band->energy <= (band->threshold * zeroscale) || band->threshold == 0.0) { 

correct? Not start >= cutoff+cutoff/5?

comment:123 by klaussfreire, 11 years ago

Yep, the cutoff is used as-is in this patch, the offset is already accounted for in its computation above that.
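For readers following along, the condition quoted from line 829 can be restated as a small predicate. This is only a transcription of that one `if` into Python with plain arguments instead of the `band` struct; what the encoder does with a band once the condition holds is not shown in the thread, so the function name is an assumption.

```python
def band_condition_holds(start, cutoff, energy, threshold, zeroscale):
    """Transcription of the quoted condition.

    True when the band starts at or above the cutoff, when its energy
    does not exceed threshold * zeroscale, or when the threshold is zero.
    The cutoff is used as-is, with no extra cutoff/5 offset.
    """
    return (
        start >= cutoff
        or energy <= threshold * zeroscale
        or threshold == 0.0
    )
```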

comment:124 by Kamedo2, 11 years ago

I've encoded weeks' worth of AAC audio with the v4 patch, using diverse samples and bitrates, and there were no problems (empty files, error returns, freezes).

Is 'tweaking rdlambda for VBR' ready? If not, I think I should test v4 ABR first, because it's stable and has fewer artifacts in tonal samples. The blind test will be conducted with the ABC/HR methodology, and there should be some opponents. I'm thinking of...

  • current git head with no patch, abr
  • v4 patch(or anything latest), abr
  • fdk-aac, abr

The bitrate will be 96kbps and 128kbps.

comment:125 by Kamedo2, 11 years ago

Or, I can drop fdk-aac and instead test on 3 bitrates. Do you have any idea?

comment:126 by Carl Eugen Hoyos, 11 years ago

Comparing with libfaac would be useful...

in reply to:  126 comment:127 by Kamedo2, 11 years ago

Replying to cehoyos:

Comparing with libfaac would be useful...

Is comment:69 not enough? (The test was in 2012 July.)

comment:128 by Carl Eugen Hoyos, 11 years ago

I thought that additional improvements were made since (and if ffaac does not beat libfaac and assuming fdk-aac beats libfaac, it might make more sense to compare with libfaac) but please don't let me misguide you.

comment:129 by Kamedo2, 11 years ago

I don't think many people will use libfaac. Both libfaac and libfdk_aac are non-free, and if many people prefer fdk-aac over faac, new results for the new fdk-aac are more interesting than another set of results for the old faac. (As far as I know, there has been no blind test of fdk-aac.)

in reply to:  124 comment:130 by klaussfreire, 11 years ago

Replying to Kamedo2:

Is 'tweaking rdlambda for VBR' ready?

No, I'll have time starting tomorrow.

comment:131 by Kamedo2, 11 years ago

This is not my last test, and if there is a desire to compare this encoder with other encoders, I can do so later. By that time, I hope the new VBR will be a state-of-the-art encoder.

comment:132 by Kamedo2, 11 years ago

I'm going to use the 20 samples below. There are six opponents (the first 3 at 96kbps and the last 3 at 128kbps), so I have to score 6*20=120 sounds. The test is ready.
http://www.hydrogenaudio.org/forums/index.php?showtopic=98003

comment:133 by Mark, 11 years ago

Hi All,
Great to see that the native AAC encoder is getting some attention and that people are trying to make it mainstream. Using Windows 7 and Zeranoe's FFmpeg builds, I only get a choice of "The Native Encoder" or "libvo_aacenc".
From what I have read, "libvo_aacenc" only seems to support stereo, not 5.1 or higher.
I am no audiophile and a little hard of hearing so I cannot find fault with the Native Encoder but I can tell the difference between 2 and 6 channels :-)

Keep up the good work on a great piece of software.

Regards,
Mark

comment:134 by Kamedo2, 11 years ago

ffmpeg55212 -y -i input.wav -c:a aac -strict experimental -b:a 96k output.mp4
ffmpeg55212_patchv4 -y -i input.wav -c:a aac -strict experimental -b:a 96k output.mp4
ffmpeg55212 -y -i input.wav -c:a libfdk_aac -b:a 96k -afterburner 1 output.mp4

ffmpeg55212 -y -i input.wav -c:a aac -strict experimental -b:a 128k output.mp4
ffmpeg55212_patchv4 -y -i input.wav -c:a aac -strict experimental -b:a 128k output.mp4
ffmpeg55212 -y -i input.wav -c:a libfdk_aac -b:a 128k -afterburner 1 output.mp4

faad -b 4 -o output.float.wav output.mp4

The ABC/HR test is ongoing. These six outputs were shuffled and I listen to them without knowing which is which. I've done 2 samples out of 20. 10% done.
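The shuffling step can be sketched as below. This is not the ABC/HR tool itself, just an illustration of hiding which output came from which encoder; the label format and the function name are made up for the example.

```python
import random


def blind_labels(encodings, seed=None):
    """Assign anonymous labels to encoder outputs in random order.

    'encodings' is a list of names (for example the six command lines
    above). The returned dict maps a neutral label to the real name and
    would be kept hidden from the listener until scoring is finished.
    """
    rng = random.Random(seed)
    shuffled = list(encodings)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    return {f"sample_{i + 1}": name for i, name in enumerate(shuffled)}
```

The listener would then score `sample_1` to `sample_6` per track and only afterwards look up the mapping.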

comment:136 by klaussfreire, 11 years ago

Do you have the files encoded with fdk for comparison?

by Kamedo2, 11 years ago

Attachment: fdkaac_10_12.zip added

samples #10-#12 encoded by fdkaac. *2.mp4 are the 128kbps samples, the others are the 96kbps samples.

by Kamedo2, 11 years ago

Attachment: fdkaac_13_16.zip added

samples # 10 - # 12 encoded by fdkaac. *2.mp4 are the 128kbps samples, the others are the 96kbps samples.

comment:137 by Kamedo2, 11 years ago

Oops, samples # 10 ~ # 12 and # 13 ~ # 16.

comment:138 by klaussfreire, 11 years ago

I think I've found the source of most of the "annoying" artifacts. With the recent fix to psy's hole avoidance, lots of the rate control hacks in the lookahead code are no longer necessary, since the bit reservoir now actually works. Though if I do completely disable them, the target bit rate is largely missed, so some RC stuff is still needed.

In short, RC hacks screw up on transients. I guess I'll have to explicitly limit RC hacks to non-transients (with perhaps some hysteresis). I'm working on a v5 fixing that.

Still, to get to fdk quality, I think we'll need to fix M/S encoding (which still has some artifacts; if it didn't, it could be a big efficiency boost) and implement codebook 13 (which fdk seems to use, though I haven't confirmed this). That's a much bigger project though.

comment:139 by Kamedo2, 11 years ago

Great, I'm guessing that's the reason why some samples got much poorer results than fdk. Should I abort the v4 abr test and instead test v5 once it is released? I'm on holiday now, but after August 26th I'll move to a quieter place, so I can test more effectively.

comment:140 by klaussfreire, 11 years ago

I think I'll get you the v5 soonish, but I have an office to move this weekend so it may not be as soon as you'd like. In any case, soonish.

comment:141 by Kamedo2, 11 years ago

How is the development of v5?

in reply to:  141 comment:142 by klaussfreire, 11 years ago

Replying to Kamedo2:

How is the development of v5?

Sorry, urgent personal issues prevented me from reaching my self-imposed deadline. I'll try to dedicate some time to it as soon as I'm able, though. Next post ought to be a patch.

comment:143 by Kamedo2, 11 years ago

I resumed the ABC/HR test, and I've done 13 samples out of 20. How is the development going?

in reply to:  143 comment:144 by klaussfreire, 11 years ago

Replying to Kamedo2:

I resumed the ABC/HR test, and I've done 13 samples out of 20. How is the development going?

Stalled for now, but I'll be able to resume soon

comment:145 by Timothy Gu, 11 years ago

Cc: timothygu99@gmail.com added

comment:146 by Kamedo2, 11 years ago

Thank you. Should I upload the current data?

comment:147 by klaussfreire, 11 years ago

Yes, please do. I'll make sure to address those concerns as well, and we'll save one round trip

comment:148 by Kamedo2, 11 years ago

http://i43.tinypic.com/35l9h94.png
You can download the original sound here. http://www.hydrogenaudio.org/forums/index.php?showtopic=98003

comment:149 by Kamedo2, 11 years ago

Oops, -b:a 128k, not -b:a 96k in the 128kbps exp+v4 column.
By the way, why is the FFT used in the LPF? Couldn't it use the MDCT and simply zero the higher coefficients? Maybe I am missing something.

comment:150 by Kamedo2, 11 years ago

I'll finish the test soon (16/20, 80%). What should the opponents be in the next blind listening test, which will include the newer patch? I'm thinking of...

  • current git head with no patch, abr
  • next patch, abr
  • next patch, vbr
  • fdk-aac, abr

and possibly...

  • libopus, vbr
  • libmp3lame, vbr

Do you have any idea?

in reply to:  150 ; comment:151 by Carl Eugen Hoyos, 11 years ago

Replying to Kamedo2:

and possibly...

  • libopus, vbr
  • libmp3lame, vbr

Do you have any idea?

If you have time, it would be interesting to compare to the quality of other FFmpeg audio encoders, ie ac3, eac3 and mp2.

in reply to:  151 ; comment:152 by Kamedo2, 11 years ago

Replying to cehoyos:

If you have time, it would be interesting to compare to the quality of other FFmpeg audio encoders, ie ac3, eac3 and mp2.

I may be wrong, but I guess ac3 is the most used variant. The bitrate will be around 128kbps, so the extremely high bitrate of eac3 will not fit the frame, I think. Are there any important uses of eac3 and mp2 other than BD and VCD encoding? (For BD the space is huge and quality at lower bitrates is insignificant.)

in reply to:  152 comment:153 by Carl Eugen Hoyos, 11 years ago

Replying to Kamedo2:

Replying to cehoyos:

If you have time, it would be interesting to compare to the quality of other FFmpeg audio encoders, ie ac3, eac3 and mp2.

I may be wrong, but I guess ac3 is the most used variant. The bitrate will be around 128kbps, so the extremely high bitrate of eac3 will not fit the frame,

I am not sure I understand you.
Afaik, nobody ever made a listening test using different internal FFmpeg encoders (not even a very cursory one). It would be interesting to know that "96kb eac3 ~ 128 kb ac3 ~ 128kb aac ~ 256kb mp2" (I assume this isn't the case, just as an example). Even if done with much less effort than your above tests (if you just mention your impression of each encoder after a few tests), I believe this would be interesting information.
It was sometimes claimed that the wma encoders produce abysmal quality, so your comment on them (possibly with higher bitrates) would also be welcome.

I think. Are there any important uses of eac3 and mp2 other than BD and VCD encoding? (For BD the space is huge and quality at lower bitrates is insignificant.)

I believe that ac3 is a very important codec (WMP plays it out-of-the-box in different containers), knowing if eac3 beats it would be interesting.

comment:154 by Kamedo2, 11 years ago

I can't think of any good use of eac3 other than for BD. BD can have 32Mbps, and eac3 can have up to 6144kbps. If audio quality matters, simply use the maximum bitrate. And having more opponents in parallel slows down the test. However, we need a low anchor and possibly a high anchor. I think libopus will act as a high anchor and aac without the patch as a low anchor.

There are some good uses of wma, such as encoding for an old car stereo that plays MP3/WMA, but WMAEncode 0.2.9b is far more usable. The quality is in between LAME and Apple AAC.

https://trac.ffmpeg.org/wiki/GuidelinesHighQualityAudio

comment:155 by Kamedo2, 11 years ago

This document recommends using the -cutoff 15000 option. That is outdated; the cutoff has been applied automatically since July 2012.
http://ffmpeg.org/ffmpeg-codecs.html#aac

This is the data I sent in 2012.
http://i41.tinypic.com/24dri41.png

By the way, the progress of the listening test is 95% (19/20) now.

comment:156 by Kamedo2, 11 years ago

I finished the test and I uploaded the results.
http://www.hydrogenaudio.org/forums/index.php?showtopic=102699
http://i42.tinypic.com/1043igy.png
http://i44.tinypic.com/2ld9bhl.png

comment:157 by klaussfreire, 11 years ago

Cool. I'll try to work on this tonight.

by klaussfreire, 11 years ago

V5 patch, twoloop RD fixed (I think)

comment:158 by klaussfreire, 11 years ago

So, I attached a patch that moves in the right direction (I think).

Most of the worse-performing samples, I noticed, had to do with hole avoidance being quickly violated when using low bit rates. So I re-did twoloop's RD improvement step to better respect hole avoidance, to be asymmetric in its scale manipulation (ie: to avoid adding all 1 or all 2, which would be quickly undone by the bitrate adjustment step), and everything seemed to work a lot better.

However, on the "asymmetric" little word, there's a huge hack involved. I wouldn't want to waste your time without a warning: this hack can most assuredly be improved. But I don't think I'll waste time improving a hack, since the real solution is to implement a dynamic programming coder, which I intend to do in the future. So while hackish and probably suboptimal, I'll probably leave it as-is since it works well enough.

I haven't tested VBR much. From what I tested, it seems mostly unharmed, but it still needs a better calibrated cutoff. That will take time (let's say it'll be v6).

So, this patch should be good enough for ABR. VBR will need a v6, and some day (time permitting) I'll post the patch with the dynamic coder.

I couldn't quite match FDK performance, but I suspect there are two reasons for this. First, M/S coding isn't as good as it should be. Second, FDK probably uses a dynamic coder. So I think we'll catch FDK with the dynamic coder (which can also do the M/S part, so it'll fix both in one shot).

However, I tested most of the samples in your session, and they've all improved. Some more than others, of course. So, if not all the samples, you might want to retest the worst offenders.

Edit: I also haven't tested higher bit rates. I will tomorrow.

comment:159 by Kamedo2, 11 years ago

The v5 patch is encoding at 15-50x realtime, depending on bitrate and type of music encoded.

comment:160 by Kamedo2, 11 years ago

I changed aaccoder.c line 806 from

                ? (refbits * 1.6f * avctx->sample_rate / 1024) 

to

                ? (refbits * 2.4f * avctx->sample_rate / 1024) 

This is certainly better, although exact optimal value is debatable.

I encoded 2 days of diverse sounds with many settings, and listened to 2 hours of the results. This encoder does a relatively good job even at abr 96kbps. It's not a blind test, but I feel the improvement. Also, I compared abr 128kbps vs vbr -q 0.3, and abr is still better. The vbr exposes its weak point in relatively quiet, tonal sections: low S/N and a stronger LPF effect.

http://i40.tinypic.com/r1l7j4.png

comment:161 by Kamedo2, 11 years ago

I listened to about 8 hours of songs, movies, sine and white noise, and 5.1ch surround source. I'd say that abr is mature.

klaussfreire, could you add a "redirect" feature so that when the set bitrate is too high, the encoder falls back to the maximum possible bitrate rather than printing the error message and stopping? This would simplify many batch encodes, such as encoding hundreds of videos that have various audio sampling frequencies and numbers of channels. Currently it gets:

[aac @ 013efa60] Too many bits per frame requested

Also, I notice that this commandline

ffmpeg -i ffmpeg_aacvbr_pulse1.wav -c:a aac -strict experimental -q:a 0.1 -ar 8000 -ac 1 ffmpeg_aacvbr_pulse1.mp4

gets the same Too many bits warning, and lowering the quality with -q:a doesn't help. It only works when using -b:a, or when setting a higher sampling frequency such as -ar 22050. It could be a problem when encoding a video taken by some old digital cameras with 8kHz pcm audio attached.

The error message:

ffmpeg56470.exe -y -i ffmpeg_aacvbr_pulse1.wav -c:a aac -strict experimental -q:a
 0.3 -ar 8000 ffmpeg_aacvbr_pulse1.mp4
ffmpeg version N-56469-gf6622f9 Copyright (c) 2000-2013 the FFmpeg developers
  built on Sep 20 2013 15:29:55 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libfdk
-aac --extra-ldflags=-static --extra-cflags='-march=native -mfpmath=sse' --optfl
ags=-O2
  libavutil      52. 45.100 / 52. 45.100
  libavcodec     55. 33.100 / 55. 33.100
  libavformat    55. 18.100 / 55. 18.100
  libavdevice    55.  3.100 / 55.  3.100
  libavfilter     3. 86.102 /  3. 86.102
  libswscale      2.  5.100 /  2.  5.100
  libswresample   0. 17.103 /  0. 17.103
  libpostproc    52.  3.100 / 52.  3.100
Guessed Channel Layout for  Input Stream #0.0 : stereo
Input #0, wav, from 'ffmpeg_aacvbr_pulse1.wav':
  Metadata:
    encoder         : Coderium SoundEngine 4.59
  Duration: 00:00:12.12, bitrate: 1411 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16
, 1411 kb/s
[aac @ 030cbf00] Too many bits per frame requested
Output #0, mp4, to 'ffmpeg_aacvbr_pulse1.mp4':
  Metadata:
    encoder         : Coderium SoundEngine 4.59
    Stream #0:0: Audio: aac, 8000 Hz, stereo, fltp, 128 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le -> aac)
Error while opening encoder for output stream #0:0 - maybe incorrect parameters
such as bit_rate, rate, width or height
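The requested "redirect" behaviour could look roughly like the sketch below. It assumes the error comes from AAC's limit of 6144 bits per channel per 1024-sample frame; the thread does not show the encoder's actual check, so treat both constants and both function names as assumptions. Under that assumption the log above (8000 Hz, stereo, 128 kb/s) exceeds the limit, which would explain why raising -ar makes the error go away.

```python
FRAME_SAMPLES = 1024               # samples per AAC frame (assumed)
MAX_BITS_PER_CHANNEL_FRAME = 6144  # per-channel frame limit (assumed)


def max_bitrate(sample_rate, channels):
    """Highest bitrate in bit/s that fits the per-frame limit."""
    return MAX_BITS_PER_CHANNEL_FRAME * channels * sample_rate // FRAME_SAMPLES


def clamp_bitrate(requested, sample_rate, channels):
    """Return (bitrate_to_use, was_clamped) instead of failing.

    A caller would print a warning when was_clamped is True and then
    continue encoding at the reduced bitrate.
    """
    limit = max_bitrate(sample_rate, channels)
    return min(requested, limit), requested > limit
```

For 8000 Hz stereo this gives a ceiling of 96000 bit/s, so a 128 kb/s request would be lowered and flagged, while the same request at 44100 Hz passes through unchanged.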

I think it is about time we removed the -strict experimental flag.

in reply to:  159 comment:162 by klaussfreire, 11 years ago

Replying to Kamedo2:

The v5 patch is encoding at 15-50x realtime, depending on bitrate and type of music encoded.

I believe I may have to disappoint you there. One of the optimizations responsible for that speed is acting up on ABR. I noticed improved quality by restricting it, so the v6 with optimized VBR will have it disabled as well (and thus be a tad slower).

I thought that optimization was result-neutral, but it seems it isn't.

comment:163 by Kamedo2, 11 years ago

15x speed is 'tolerable' :)

I've encoded more than 50GB of mp4s, including surround 5.1ch with more than 1Mbps etc... and listened to 12 hours of mainly Pop music. v5 seems to be stable. Is fixing "Too many bits per frame requested" error easy?

comment:164 by klaussfreire, 11 years ago

I can make it only applicable when using ABR, but I think it's a useful message.

I could also turn it into a warning, I think.

comment:165 by Kamedo2, 11 years ago

I prefer a warning to an error message that stops the encode. It's kinder and easier to use.

By the way, I'll be free from September 28th, and I'm considering a listening test of

  • v4 abr
  • v6 abr
  • v6 vbr
  • fdk-aac vbr
  • ac3 abr
  • libmp3lame vbr

I've had requests to test libfaac, mp2, and eac3, but I'm running out of "slots".
From my normal non-blind listening of average music, my current impression is:

fdk-aac > libmp3lame > v5 abr >> v4 abr > v5 vbr > ac3

comment:166 by Kamedo2, 11 years ago

v5 vbr is still considerably worse than the abr. I feel that whenever tonal sounds are present, the frequency bins around the tone degrade. Tones are poorer than noise at masking other sounds, which is why the harpsichord remains one of the most critical and hardest instruments to code. http://wiki.hydrogenaudio.org/index.php?title=Perceptual_Noise_Substitution

comment:167 by klaussfreire, 11 years ago

Well, v6 is almost ready. I just need to clean it up a bit. I'll probably do that tonight.

In v6, my non-blind tests make me believe that v6 vbr > v6 abr > v5 abr.

Not sure how you compare abr vs vbr, what I do is pick a file or set of files, do a binary search of the quality level that results in the same overall file size, and then compare. In that kind of test, v6 vbr sometimes requires lots more bits for some pathological files (techno seems to drive it crazy, can't blame it). I exclude those, since they're pathological.
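The size-matching procedure described above can be sketched as a plain binary search. The `encoded_size` callable is hypothetical: it stands for "encode the test set at quality q and return the total size in bytes", and the search assumes that size does not decrease as q increases. The default bounds and tolerance are arbitrary.

```python
def find_quality(encoded_size, target_size, lo=0.01, hi=2.0,
                 tolerance=0.01, max_steps=40):
    """Binary-search a quality value whose output size matches a target.

    encoded_size: callable q -> size in bytes (assumed non-decreasing in q).
    target_size:  size of the reference CBR/ABR encode in bytes.
    tolerance:    accepted relative size difference (0.01 = 1%).
    """
    for _ in range(max_steps):
        mid = (lo + hi) / 2
        size = encoded_size(mid)
        if abs(size - target_size) <= tolerance * target_size:
            return mid
        if size < target_size:
            lo = mid  # file too small: raise quality
        else:
            hi = mid  # file too large: lower quality
    return (lo + hi) / 2  # best effort if the tolerance was never met
```

In practice each call to `encoded_size` is a full encode, so caching results per q value would save time.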

When I push the patches to the ML, I'll make most of what makes v6 vbr go crazy on techno (the relatively high peak bit rate allowance) configurable anyway.

in reply to:  167 ; comment:168 by Kamedo2, 11 years ago

Replying to klaussfreire:

Not sure how you compare abr vs vbr, what I do is pick a file or set of files, do a binary search of the quality level that results in the same overall file size, and then compare. In that kind of test, v6 vbr sometimes requires lots more bits for some pathological files (techno seems to drive it crazy, can't blame it). I exclude those, since they're pathological.

I compare abr vs vbr using a graph. I plot a "q vs bitrate" graph over a large "standard" set of sounds I extracted from diverse CDs. Then I search for the q value that gives the desired bitrate, and make sure that the average bitrate of the tested samples isn't very far from the "standard" bitrate. This method is common on hydrogenaudio.
http://listening-tests.hydrogenaudio.org/sebastian/mp3-128-1/index.htm


When I push the patches to the ML, I'll make most of what makes v6 vbr go crazy on techno (the relatively high peak bit rate allowance) configurable anyway.

I think it's a good idea to automatically "cap" the bitrate based on the q number. 3x of the "standard" bitrate of the q or something.

Also, I think it's beneficial for end users to set the -q:a value and typically get a file with a bitrate around the set value. If one sets -q:a 256k, one gets a file of roughly 256kbps. (Or 210kbps, 289kbps, etc. depending on the sound content, but that's fine.) iTunes has that interface, and it's easier to use. This can be controversial, as people may refer to old documentation of the -q:a option and try to do the same, but the problem can be avoided by switching to a "classic mode" when the value is very small, like -q:a 0.3.

in reply to:  168 ; comment:169 by klaussfreire, 11 years ago

Replying to Kamedo2:

Replying to klaussfreire:

Not sure how you compare abr vs vbr, what I do is pick a file or set of files, do a binary search of the quality level that results in the same overall file size, and then compare. In that kind of test, v6 vbr sometimes requires lots more bits for some pathological files (techno seems to drive it crazy, can't blame it). I exclude those, since they're pathological.

I compare abr vs vbr using a graph. I plot a "q vs bitrate" graph over a large "standard" set of sounds I extracted from diverse CDs.

Yeah, I've seen those

Then I search for the q value that gives the desired bitrate, and make sure that the average bitrate of the tested samples isn't very far from the "standard" bitrate.

Just how do you check bit rate? Because I've noticed ffmpeg -i file tends to give bogus rates when used on VBR-encoded files (not even average).

Also, I think it's beneficial for end users to set the -q:a value and typically get a file with a bitrate around the set value. If one sets -q:a 256k, one gets a file of roughly 256kbps.

That's not doable without refactoring ffmpeg. -q:a sets the global_quality parameter, which is specified to have a somewhat standardized interpretation (1.0 = 100%, what 100% means is what some other codec means by it, can't remember which OTOMH).

However, you can get (I think) a similar result by specifying both -q:a and -b:a, like so:

ffmpeg -i somefile.flac -c:a aac -b:a 256k -q:a 1 -strict experimental somefile.aac

Although that seldom gives you 256k. The bitrate there is like a lower bound (aim for 256k, spend more if needed).

in reply to:  169 ; comment:170 by Kamedo2, 11 years ago

Then I search for the q value that gives the desired bitrate, and make sure that the average bitrate of the tested samples isn't very far from the "standard" bitrate.

Just how do you check bit rate? Because I've noticed ffmpeg -i file tends to give bogus rates when used on VBR-encoded files (not even average).

filesize[Byte]*8/Sample_length[Sec]. But be careful with very short files; the result can be bogus there too.
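Written out, that calculation is just the following. The function name is made up, and the duration is assumed to come from a trusted source (for example the original WAV or FLAC), not from probing the VBR file.

```python
def average_bitrate_kbps(file_size_bytes, duration_seconds):
    """Average bitrate in kbit/s: file size in bytes * 8 / duration.

    Container overhead is included in the file size, which is one reason
    the figure is unreliable for very short files.
    """
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return file_size_bytes * 8 / duration_seconds / 1000
```

For instance, a 1,600,000-byte file lasting 100 seconds comes out at 128 kbit/s.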

Also, I think it's beneficial for end users to set the -q:a value and typically get a file with a bitrate around the set value. If one sets -q:a 256k, one gets a file of roughly 256kbps.

That's not doable without refactoring ffmpeg. -q:a sets the global_quality parameter, which is specified to have a somewhat standardized interpretation (1.0 = 100%, what 100% means is what some other codec means by it, can't remember which OTOMH).

Is LAME breaking the convention?
https://trac.ffmpeg.org/wiki/Encoding%20VBR%20%28Variable%20Bit%20Rate%29%20mp3%20audio

However, you can get (I think) a similar result by specifying both -q:a and -b:a, like so:

ffmpeg -i somefile.flac -c:a aac -b:a 256k -q:a 1 -strict experimental somefile.aac

Although that seldom gives you 256k. The bitrate there is like a lower bound (aim for 256k, spend more if needed).

Thank you for the info. That behavior seems much like the cvbr mode (the most used mode) of Apple iTunes.

in reply to:  170 comment:171 by klaussfreire, 11 years ago

Replying to Kamedo2:

Then I search for the q value that gives the desired bitrate, and make sure that the average bitrate of the tested samples isn't very far from the "standard" bitrate.

Just how do you check bit rate? Because I've noticed ffmpeg -i file tends to give bogus rates when used on VBR-encoded files (not even average).

filesize[Byte]*8/Sample_length[Sec]. But be careful with very short files; the result can be bogus there too.

As long as you're not also estimating sample_length with ffmpeg, which will also give you bogus, it should be fine ;)

Also, I think it's beneficial for end users to set the -q:a value and typically get a file with a bitrate around the set value. If one sets -q:a 256k, one gets a file of roughly 256kbps.

That's not doable without refactoring ffmpeg. -q:a sets the global_quality parameter, which is specified to have a somewhat standardized interpretation (1.0 = 100%, what 100% means is what some other codec means by it, can't remember which OTOMH).

Is LAME breaking the convention?
https://trac.ffmpeg.org/wiki/Encoding%20VBR%20%28Variable%20Bit%20Rate%29%20mp3%20audio

I think so. At least, it seems to be backwards (higher q should mean higher quality, but lame does it backwards).

comment:172 by Kamedo2, 11 years ago

libvorbis and libfaac break the convention, too. neroAacEnc.exe has a float quality value where 0 is lowest and 1 is highest, so if left unchanged, the native encoder acts much like Nero.

in reply to:  170 ; comment:173 by Timothy Gu, 11 years ago

Replying to Kamedo2:

Also, I think it's beneficial for end users to set the -q:a value and typically get a file with a bitrate around the set value. If one sets -q:a 256k, one gets a file of roughly 256kbps.

That's not doable without refactoring ffmpeg. -q:a sets the global_quality parameter, which is specified to have a somewhat standardized interpretation (1.0 = 100%, what 100% means is what some other codec means by it, can't remember which OTOMH).

Is LAME breaking the convention?
https://trac.ffmpeg.org/wiki/Encoding%20VBR%20%28Variable%20Bit%20Rate%29%20mp3%20audio

However, you can get (I think) a similar result by specifying both -q:a and -b:a, like so:

ffmpeg -i somefile.flac -c:a aac -b:a 256k -q:a 1 -strict experimental somefile.aac

Although that seldom gives you 256k. The bitrate there is like a lower bound (aim for 256k, spend more if needed).

Thank you for the info. That behavior seems much like the cvbr mode (the most used mode) of Apple iTunes.

If someone is to implement cvbr, I suggest doing it like the libopus encoder wrapper, where users can choose a "vbr" option like this: http://ffmpeg.org/ffmpeg-codecs.html#Option-Mapping.

in reply to:  173 comment:174 by Kamedo2, 11 years ago

If someone is to implement cvbr, I suggest doing it like the libopus encoder wrapper, where users can choose a "vbr" option as described at http://ffmpeg.org/ffmpeg-codecs.html#Option-Mapping.

Timothy_Gu, Thank you for the informative link. I'd like to use options like -b:a 256k -vbr.

in reply to:  169 comment:175 by Kamedo2, 11 years ago

However, you can get (I think) a similar result by specifying both -q:a and -b:a, like so:

ffmpeg -i somefile.flac -c:a aac -b:a 256k -q:a 1 -strict experimental somefile.aac

Although that seldom gives you 256k. The bitrate there is like a lower bound (aim for 256k, spend more if needed).

I tried it over 128 different songs and the result was:

-b:a 256k -q:a 1

  • Average 247kbps
  • SD +/-33kbps
  • Min 161kbps
  • Max 300kbps

-q:a 1

  • Average 235kbps
  • SD +/-30kbps
  • Min 154kbps
  • Max 287kbps

(comment:160 change is not applied in this test.)
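For reference, a minimal Python sketch of how summary figures like those above can be derived from per-file bitrates. The list `bitrates_kbps` is made-up illustration data, not the 128-song results, and since it is not stated whether the SD above is the sample or the population form, the sketch assumes the sample form.

```python
import statistics

def summarize(bitrates_kbps):
    """Return (average, sample SD, min, max) of per-file bitrates in kbps."""
    return (
        statistics.mean(bitrates_kbps),
        statistics.stdev(bitrates_kbps),  # assumption: sample SD (n - 1)
        min(bitrates_kbps),
        max(bitrates_kbps),
    )

# Hypothetical per-file bitrates, for illustration only.
bitrates_kbps = [200, 250, 300]
avg, sd, lo, hi = summarize(bitrates_kbps)
print(f"Average {avg:.0f}kbps, SD +/-{sd:.0f}kbps, Min {lo}kbps, Max {hi}kbps")
```

Running the same function on the two result sets (with and without -b:a 256k) would allow a like-for-like comparison.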

comment:176 by Kamedo2, 11 years ago

I'm preparing for the next listening test.

# Native aac patch v4 abr
ffmpeg55212 -y -i in.wav -c:a aac -strict experimental -b:a 128k out.mp4
ffmpeg56470 -y -i out.mp4 -c:a pcm_s32le out.32bit.wav

# Native aac patch v5 abr
ffmpeg56470 -y -i in.wav -c:a aac -strict experimental -b:a 128k out.mp4
ffmpeg56470 -y -i out.mp4 -c:a pcm_s32le out.32bit.wav

# Native aac patch v5 vbr
ffmpeg56470 -y -i in.wav -c:a aac -strict experimental -q:a 0.3 out.mp4
ffmpeg56470 -y -i out.mp4 -c:a pcm_s32le out.32bit.wav

# FDK-AAC vbr 3
ffmpeg56470 -y -i in.wav -c:a libfdk_aac -vbr 3 out.mp4
ffmpeg56470 -y -i out.mp4 -c:a pcm_s32le out.32bit.wav

# LAME vbr -V5
ffmpeg55010 -y -i in.wav -c:a libmp3lame -q:a 5 out.mp3
ffmpeg56470 -y -i out.mp3 -c:a pcm_s32le out.32bit.wav

# FFmpeg ac3 cbr
ffmpeg56470 -y -i in.wav -c:a ac3 -b:a 128k out.ac3
ffmpeg56470 -y -i out.ac3 -c:a pcm_s32le out.32bit.wav

I thought of using 32-bit float as the intermediate format, but FFmpeg's pcm_f32le output had half the gain it should have, and even after adjusting the gain a large error (average of |lossy-original|) remained, unlike with faad or madplay.
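The error measure mentioned here, the average of |lossy-original|, can be sketched as follows. This is a Python illustration with made-up sample values; `gain` is a hypothetical parameter standing in for the manual gain adjustment, and the factor of 2 mirrors the "half gain" observation rather than a verified property of pcm_f32le.

```python
def mean_abs_error(original, decoded, gain=1.0):
    """Average of |gain * decoded - original| over aligned samples."""
    if not original or len(original) != len(decoded):
        raise ValueError("signals must be non-empty and of equal length")
    total = sum(abs(gain * d - o) for o, d in zip(original, decoded))
    return total / len(original)

# Made-up samples: the decoded signal comes back at half the expected gain.
original = [0.0, 0.5, -0.5, 0.25]
decoded_half_gain = [0.0, 0.25, -0.25, 0.125]

print(mean_abs_error(original, decoded_half_gain))            # gain mismatch
print(mean_abs_error(original, decoded_half_gain, gain=2.0))  # gain compensated
```

A real comparison also needs the two signals to be time-aligned first, since encoder delay shifts the decoded samples.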

This is the statistics of 25 samples I'm going to use in the test.

             v4 abr            v5 abr     v5 vbr     FDK vbr  lame V5  ac3
25 Average   129               129        151        122      135      128
25 Std.Dev   5                 5          39         20       18       0
25 Min       107               108        89         86       87       128
25 Max       131               133        257        173      172      128
Max sample   25.Reunion Blues  26.French  26.French  10.      14.      29.
Std.Average  128               128        127        127      130      128

Unit is kbps. Std.Average is the average bitrate of my large collection of CDs encoded.

I've found that v5 vbr boosts the bitrate on speech samples. The speech sample 26.French was encoded at 257kbps, more than twice the average bitrate of a large set of diverse CD music. Another speech sample reached 216kbps. This is a problem, hopefully fixed in the next v6 patch.

comment:177 by klaussfreire, 11 years ago

Wait a little bit, I'll get you the v6 patch asap, even if not as clean as I'd like it to be.

comment:178 by klaussfreire, 11 years ago

Yes, I noticed the speech bug; it happens because VBR was unconstrained. v6 uses constrained VBR (loosely constrained) and performs much better. That's why I'd prefer you tested v6.

by klaussfreire, 11 years ago

Improved (mostly constrained) VBR, fixed RC bug from v5. There's some dead code that begs to be removed, but it's better to start testing before cleaning.

comment:179 by klaussfreire, 11 years ago

So... latest patch attached. It's not final yet, mostly because it needs some polish. But its performance I find quite acceptable.

comment:180 by Kamedo2, 11 years ago

Thank you. I'm encoding successfully. An extensive stability test is ongoing. I noticed that the same -q:a value produces files almost half the size of those from v4.

comment:181 by Kamedo2, 11 years ago

This is the statistics of 25 samples I'm going to use in the test.

             v4 abr            v6 abr     v6 vbr q0.7  FDK vbr3  lame V5  ac3
25 Average   129               129        144          122       135      128
25 Std.Dev   5                 5          24           20        18       0
25 Min       107               108        114          86        87       128
25 Max       131               133        218          173       172      128
Max sample   25.Reunion Blues  26.French  26.French    10.       14.      29.
Std.Average  128               128        127          127       130      128

Unit is kbps. Std.Average is the average bitrate of my large collection of CDs encoded.

comment:182 by Kamedo2, 11 years ago

http://i43.tinypic.com/2ufpzjs.png

My current impression, from non-blind test of non-samples: v5 abr = v6 abr > v6 vbr >> v4 abr

in reply to:  182 ; comment:184 by klaussfreire, 11 years ago

Replying to Kamedo2:

My current impression, from non-blind test of non-samples: v5 abr = v6 abr > v6 vbr >> v4 abr

That's weird (v6 abr > v6 vbr), because my tests showed the opposite, and you yourself said v6 vbr had decreased bit usage considerably (so it should imply higher efficiency, which is what I noticed).

Do you have an example? Could you describe in that example what you feel is inferior compared to abr?

Also, do you perhaps have something that results in abnormally large bitrates in your std calibration sample? That could be forcing you to pick a lower q to match the 128k average, and thus decrease overall quality. I did fix a few of those in v6, but maybe there's some left, or maybe it has to be further constrained.

comment:185 by klaussfreire, 11 years ago

PS: In aaccoder.c:1007, you can change

//if (mb >= ESC_BT) break;

Into

if (mb >= ESC_BT && sce->sf_idx[w*16+g] <= minscaler) break;

I think that could help because, I believe, those bitrate peaks are due to abuse of ESC_BT bands. That could also be the reason why some faint sounds get lost even at high Q: AAC enforces a maximum dynamic range in scalers, and abusing ESC_BT bands stretches that dynamic range to the detriment of faint sounds.

in reply to:  184 comment:186 by Kamedo2, 11 years ago

Replying to klaussfreire:

Do you have an example? Could you describe in that example what you feel is inferior compared to abr?

In tonal parts of the music, the vbr suffers from a lower S/N ratio and more of an LPF effect, because the cutoff frequency is lower. The difference is more pronounced at 96kbps vs -q:a 0.52.

Also, do you perhaps have something that results in abnormally large bitrates in your std calibration sample? That could be forcing you to pick a lower q to match the 128k average, and thus decrease overall quality. I did fix a few of those in v6, but maybe there's some left, or maybe it has to be further constrained.

No, the std calibration sample is from CDs, and it lacks speech samples. I tried these:
http://www.rarewares.org/test_samples/
and death2, KMFDM-Dogma and male_speech were particularly high at -q:a 0.7 (128kbps), more than 200kbps. Male speech takes more bitrate than female speech.


comment:187 by klaussfreire, 11 years ago

Well, I've got something that might work, but since it means unconstraining vbr, it'll need some testing at lower bitrates. It'll take me time.

comment:188 by Kamedo2, 11 years ago

The bitrate distribution of the encoders.
http://i44.tinypic.com/28kj5nm.png

comment:189 by klaussfreire, 11 years ago

Nice, this will be useful.

comment:190 by Kamedo2, 11 years ago

comment:185 was tested, along with qaac(Apple AAC). Apple AAC is a very good encoder, so the bitrate it uses must be close to the optimal.
http://i40.tinypic.com/2cok3rn.png
-q:a 0.3 vs -q:a 0.7 vs -q:a 0.7 vs -vbr 3 vs --tvbr 63 vs -V5
Std.bitrate is 127.5, 127.0, 126.9, 126.9, 126.2, 129.9, all of them are very close to 128k.

comment:191 by Kamedo2, 11 years ago

Which version should I test? With or without comment:185? Or should I wait for something? I'm doing a batch encode, so I can redo anything without much effort.

comment:192 by Kamedo2, 11 years ago

My current plan:

bin\ffmpeg55212_v4patch -y -i in.wav -c:a aac -strict experimental -b:a 128k out.mp4

bin\ffmpeg56667_v6patch_c185 -y -i in.wav -c:a aac -strict experimental -b:a 128k out.mp4

bin\ffmpeg56667_v6patch_c185 -y -i in.wav -c:a aac -strict experimental -q:a 0.7 out.mp4

bin\ffmpeg56667 -y -i in.wav -c:a libfdk_aac -vbr 3 out.mp4

bin\ffmpeg56667 -y -i in.wav -c:a libfaac -q:a 97 out.mp4

bin\ffmpeg56667 -y -i in.wav -c:a libmp3lame -q:a 5 out.mp3

bin\ffmpeg56667 -y -i in.wav -c:a ac3 -b:a 128k out.ac3

Many people were asking about the quality of faac. Weird, but maybe I should include it if many people are wondering. Expect the test to slow down.

The bitrate distribution of vbr encoders. Even with comment:185 applied, the speech samples take up a lot of space.
http://i39.tinypic.com/2h8be3o.png

My current FFmpeg configuration:

$ ./configure --enable-gpl --enable-version3 --enable-nonfree --enable-libfdk-aac --enable-libfaac --enable-libmp3lame --extra-ldflags=-static --extra-cflags='-march=nocona -mfpmath=sse' --optflags=-O2

Is there a way to expose more bugs so that we can fix them before the test?

comment:193 by klaussfreire, 11 years ago

Sorry about the delay. I wanted to give you a 6b patch, since I was making good progress, but I got stalled yesterday.

I managed to remove (or, rather, ameliorate) those bitrate outliers without constraining VBR. I noticed they were related to silent parts. It seems that in the absence of any significant signal, the encoder tries to encode the noise, and noise is quite hard to encode.

What I did was model the absolute hearing threshold in aacpsy, and now that's performing better. But there's still a tendency to waste bits on noisy transients. I couldn't yet confirm it's a waste: all my attempts at saving bits on those hurt quality, as if those bits were really needed. But I suspect there's still some work to be done in that regard.

So in essence, I achieved some extra efficiency by modeling absolute hearing thresholds. Since you never know what SPL the sample will be played at, I matched the masking curve's lowest point to 16-bit quantization noise. That should correctly match most playback situations, but I'd like people to comment if there was an explicit reason why absolute thresholds haven't been accounted for.
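To illustrate the idea, here is a rough Python sketch. It uses Terhardt's textbook approximation of the absolute threshold of hearing, which is not necessarily the formula the patch uses, and it assumes "16-bit quantization noise" means roughly 6.02 * 16 + 1.76 dB below a full-scale sine. The function and constant names are hypothetical.

```python
import math

def ath_db_spl(freq_hz):
    """Terhardt's approximation of the absolute hearing threshold, in dB SPL."""
    f = freq_hz / 1000.0  # formula is expressed in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

# Assumption: noise floor of 16-bit PCM relative to a full-scale sine.
QUANT_NOISE_16BIT_DBFS = -(6.02 * 16 + 1.76)

def ath_dbfs_curve(freqs_hz):
    """Shift the threshold curve so its lowest point sits on the 16-bit noise floor."""
    spl = [ath_db_spl(f) for f in freqs_hz]
    offset = QUANT_NOISE_16BIT_DBFS - min(spl)
    return [value + offset for value in spl]

freqs = list(range(20, 20001, 20))
curve = ath_dbfs_curve(freqs)
most_sensitive_hz = freqs[curve.index(min(curve))]
print(most_sensitive_hz, round(min(curve), 2))
```

Anchoring the curve's minimum to the quantization floor avoids having to guess the playback level: anything the source format cannot represent is treated as inaudible.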

As for your testing plans, you tell me. I can give you the current state of the encoder (I don't expect to make any more progress quickly; I tried lots of things and failed to improve it, so unless I get some kind of inspiration the encoder will remain as is for a while), or you can test the one you already have. Bitrate-wise, they're similar. The newest one performs a little better since it's unconstrained, but it still has a disadvantage against ABR regarding tonal, quiet passages.

comment:194 by Kamedo2, 11 years ago

I think boosting the rate on noisy transients is not a major problem; they're rare in actual encoding situations, unlike speech samples.

The v6 patch is already a very good one, so I'm very satisfied with the current quality, but fairness can be a problem, so if you have the version that reduces bitrate on speech samples, I'd like to test the one with reduced bitrate.

I'd like to test a version that is worthy to commit, so please be careful of stability issues like memory leaks and such, rather than the quality. I'll try my best to find problems before the test.

BTW, I decided to use cbr for the fdk-aac. The cbr sounds clearer.

comment:195 by Kamedo2, 11 years ago

Is the 6b patch available?

comment:196 by klaussfreire, 11 years ago

Soon.

comment:197 by Kamedo2, 11 years ago

Just for convenience: the biggest outlier among the samples below is death2 (for v6 + comment:185, q=0.7 and 1). Its bitrate is the highest, and it encodes the slowest.
http://www.rarewares.org/test_samples/

by Kamedo2, 11 years ago

Attachment: ffmpeg_aacvbr_degrade1.flac added

A sound that degrades on VBR. From GIZA studio Masterpiece BLEND 2001 Disc2 Track3 Stand Up (Mai Kuraki)

comment:198 by Kamedo2, 11 years ago

I encoded over 10,000 AAC mp4s, including really weird samples as the input, very small to very large volumes, totally odd settings, from 8kHz to 48kHz frequencies, many cutoff settings, and -q:a from -130 to 240. No apparent problem.

comment:199 by Kamedo2, 11 years ago

It fails to encode the 7.1ch surround file from here.
http://www-mmsp.ece.mcgill.ca/documents/AudioFormats/WAVE/Samples.html
I used cbr and other 7.1ch files and the results are the same.

ffmpeg_v6c185.exe -v 9 -loglevel 99 -y -i "8_Channel_ID.wav" -c:a aac -strict -2 -q:a 1 8chwav_q1.mp4
ffmpeg version N-56667-g32cde96 Copyright (c) 2000-2013 the FFmpeg developers
  built on Sep 29 2013 23:03:27 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libfdk
-aac --enable-libfaac --enable-libmp3lame --extra-ldflags=-static --extra-cflags
='-march=nocona -mfpmath=sse' --optflags=-O2
  libavutil      52. 46.100 / 52. 46.100
  libavcodec     55. 33.100 / 55. 33.100
  libavformat    55. 18.102 / 55. 18.102
  libavdevice    55.  3.100 / 55.  3.100
  libavfilter     3. 87.100 /  3. 87.100
  libswscale      2.  5.100 /  2.  5.100
  libswresample   0. 17.103 /  0. 17.103
  libpostproc    52.  3.100 / 52.  3.100
Splitting the commandline.
Reading option '-v' ... matched as option 'v' (set logging level) with argument
'9'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level)
with argument '99'.
Reading option '-y' ... matched as option 'y' (overwrite output files) with argu
ment '1'.
Reading option '-i' ... matched as input file with argument '8_Channel_ID.wav'.
Reading option '-c:a' ... matched as option 'c' (codec name) with argument 'aac'
.
Reading option '-strict' ... matched as AVOption 'strict' with argument '-2'.
Reading option '-q:a' ... matched as option 'q' (use fixed quality scale (VBR))
with argument '1'.
Reading option '8chwav_q1.mp4' ... matched as output file.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 9.
Applying option y (overwrite output files) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input file 8_Channel_ID.wav.
Successfully parsed a group of options.
Opening an input file: 8_Channel_ID.wav.
[wav @ 016bf2e0] Format wav probed with size=2048 and score=99
[wav @ 016bf2e0] File position before avformat_find_stream_info() is 128
[wav @ 016bf2e0] parser not found for codec pcm_s24le, packets or times may be i
nvalid.
[pcm_s24le @ 02c666e0] Channel layout '5.1' with 6 channels does not match speci
fied number of channels 8: ignoring specified channel layout
[wav @ 016bf2e0] parser not found for codec pcm_s24le, packets or times may be i
nvalid.
[wav @ 016bf2e0] Probe buffer size limit of 5000000 bytes reached
[wav @ 016bf2e0] File position after avformat_find_stream_info() is 5002208
Guessed Channel Layout for  Input Stream #0.0 : 7.1
Input #0, wav, from '8_Channel_ID.wav':
  Duration: 00:00:08.05, bitrate: 9216 kb/s
    Stream #0:0, 1226, 1/48000: Audio: pcm_s24le ([1][0][0][0] / 0x0001), 48000
Hz, 7.1, s32, 9216 kb/s
Successfully opened the file.
Parsing a group of options: output file 8chwav_q1.mp4.
Applying option c:a (codec name) with argument aac.
Applying option q:a (use fixed quality scale (VBR)) with argument 1.
Successfully parsed a group of options.
Opening an output file: 8chwav_q1.mp4.
Successfully opened the file.
detected 8 logical cores
[graph 0 input from stream 0:0 @ 02cd9ac0] Setting 'time_base' to value '1/48000
'
[graph 0 input from stream 0:0 @ 02cd9ac0] Setting 'sample_rate' to value '48000
'
[graph 0 input from stream 0:0 @ 02cd9ac0] Setting 'sample_fmt' to value 's32'
[graph 0 input from stream 0:0 @ 02cd9ac0] Setting 'channel_layout' to value '0x
63f'
[graph 0 input from stream 0:0 @ 02cd9ac0] tb:1/48000 samplefmt:s32 samplerate:4
8000 chlayout:0x63f
[audio format for output stream 0:0 @ 037acf80] Setting 'sample_fmts' to value '
fltp'
[audio format for output stream 0:0 @ 037acf80] Setting 'sample_rates' to value
'96000|88200|64000|48000|44100|32000|24000|22050|16000|12000|11025|8000|7350'
[audio format for output stream 0:0 @ 037acf80] auto-inserting filter 'auto-inse
rted resampler 0' between the filter 'Parsed_anull_0' and the filter 'audio form
at for output stream 0:0'
[AVFilterGraph @ 02d4fee0] query_formats: 4 queried, 6 merged, 3 already done, 0
 delayed
[auto-inserted resampler 0 @ 03672580] ch:8 chl:7.1 fmt:s32 r:48000Hz -> ch:8 ch
l:7.1 fmt:fltp r:48000Hz
[aac @ 02c66ae0] Unsupported number of channels: 8
Output #0, mp4, to '8chwav_q1.mp4':
    Stream #0:0, 0, 1/90000: Audio: aac, 48000 Hz, 7.1, fltp, 128 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s24le -> aac)
Error while opening encoder for output stream #0:0 - maybe incorrect parameters
such as bit_rate, rate, width or height
[AVIOContext @ 0390e260] Statistics: 0 seeks, 0 writeouts
[AVIOContext @ 016bf900] Statistics: 5013504 bytes read, 0 seeks

by Kamedo2, 11 years ago

Attachment: ffmpeg_aac_lead_voice.flac added

Degrades on the FFmpeg aac encoder, on both vbr and abr. The original sound is very odd, so it may not be worth putting a lot of effort into improving it.

comment:200 by Kamedo2, 11 years ago

The sound sample above is from here.
http://www.hydrogenaudio.org/forums/index.php?showtopic=50056

I've uploaded 2 problematic samples for the v6 aac, but they are extreme exceptions that rarely occur in real encoding situations. Generally, the v6 patch is a very good patch and the quality is satisfactory. If you upload the patch that prevents the speech bitrate bloat, I'll start testing. The contenders are:

  • v4 abr
  • v7 abr
  • v7 vbr
  • fdk-aac cbr
  • libfaac vbr
  • ac3 cbr
  • libmp3lame vbr

BTW, I prefer the name v7 rather than the v6b, even if the revision is minor. It's easier to explain.

comment:201 by Kamedo2, 11 years ago

FFmpeg crashes when the sampling rate is 7350Hz, on both vbr and abr.
Insignificant, but reporting it in case you missed it.
It worked on 96000, 88200, 64000, 48000, 44100, 32000, 24000, 22050, 16000, 8000Hz.

ffmpeg56668.exe -v 9 -loglevel 99 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict -2 -ar 7350 -b:a 128k ffmpeg_aac_lead_voiceb128.mp4
ffmpeg version N-56667-g32cde96 Copyright (c) 2000-2013 the FFmpeg developers
  built on Sep 29 2013 23:03:27 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libfdk
-aac --enable-libfaac --enable-libmp3lame --extra-ldflags=-static --extra-cflags
='-march=nocona -mfpmath=sse' --optflags=-O2
  libavutil      52. 46.100 / 52. 46.100
  libavcodec     55. 33.100 / 55. 33.100
  libavformat    55. 18.102 / 55. 18.102
  libavdevice    55.  3.100 / 55.  3.100
  libavfilter     3. 87.100 /  3. 87.100
  libswscale      2.  5.100 /  2.  5.100
  libswresample   0. 17.103 /  0. 17.103
  libpostproc    52.  3.100 / 52.  3.100
Splitting the commandline.
Reading option '-v' ... matched as option 'v' (set logging level) with argument
'9'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level)
with argument '99'.
Reading option '-y' ... matched as option 'y' (overwrite output files) with argu
ment '1'.
Reading option '-i' ... matched as input file with argument '\ffmpeg_aac_lead_vo
ice.flac'.
Reading option '-c:a' ... matched as option 'c' (codec name) with argument 'aac'
.
Reading option '-strict' ... matched as AVOption 'strict' with argument '-2'.
Reading option '-ar' ... matched as option 'ar' (set audio sampling rate (in Hz)
) with argument '7350'.
Reading option '-b:a' ... matched as option 'b' (video bitrate (please use -b:v)
) with argument '128k'.
Reading option 'ffmpeg_aac_lead_voiceb128.mp4' ...
 matched as output file.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 9.
Applying option y (overwrite output files) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input file ffmpeg_aac_
lead_voice.flac.
Successfully parsed a group of options.
Opening an input file: ffmpeg_aac_lead_voice.flac.

[flac @ 0159f2e0] Format flac probed with size=2048 and score=50
[flac @ 0159f2e0] File position before avformat_find_stream_info() is 4374
[flac @ 030866e0] sample/frame number mismatch in adjacent frames
    Last message repeated 114 times
[flac @ 0159f2e0] max_analyze_duration 5000000 reached at 5015510 microseconds
[flac @ 0159f2e0] File position after avformat_find_stream_info() is 377856
Input #0, flac, from 'ffmpeg_aac_lead_voice.flac':

  Metadata:
    REPLAYGAIN_TRACK_PEAK: 0.67306519
    REPLAYGAIN_TRACK_GAIN: -4.18 dB
    REPLAYGAIN_ALBUM_PEAK: 0.67306519
    REPLAYGAIN_ALBUM_GAIN: -4.18 dB
    COMMENT         : Encoded by FLAC v1.1.2a with FLAC Frontend v1.7.1
  Duration: 00:00:24.68, bitrate: 471 kb/s
    Stream #0:0, 50, 1/44100: Audio: flac, 44100 Hz, mono, s16
Successfully opened the file.
Parsing a group of options: output file ffmpeg_aac
_lead_voiceb128.mp4.
Applying option c:a (codec name) with argument aac.
Applying option ar (set audio sampling rate (in Hz)) with argument 7350.
Applying option b:a (video bitrate (please use -b:v)) with argument 128k.
Successfully parsed a group of options.
Opening an output file: ffmpeg_aac_lead_voiceb128.
mp4.
Successfully opened the file.
detected 8 logical cores
[graph 0 input from stream 0:0 @ 0159f100] Setting 'time_base' to value '1/44100
'
[graph 0 input from stream 0:0 @ 0159f100] Setting 'sample_rate' to value '44100
'
[graph 0 input from stream 0:0 @ 0159f100] Setting 'sample_fmt' to value 's16'
[graph 0 input from stream 0:0 @ 0159f100] Setting 'channel_layout' to value '0x
4'
[graph 0 input from stream 0:0 @ 0159f100] tb:1/44100 samplefmt:s16 samplerate:4
4100 chlayout:0x4
[audio format for output stream 0:0 @ 03142120] Setting 'sample_fmts' to value '
fltp'
[audio format for output stream 0:0 @ 03142120] Setting 'sample_rates' to value
'7350'
[audio format for output stream 0:0 @ 03142120] auto-inserting filter 'auto-inse
rted resampler 0' between the filter 'Parsed_anull_0' and the filter 'audio form
at for output stream 0:0'
[AVFilterGraph @ 030df400] query_formats: 4 queried, 6 merged, 3 already done, 0
 delayed
[auto-inserted resampler 0 @ 0307a240] ch:1 chl:mono fmt:s16 r:44100Hz -> ch:1 c
hl:mono fmt:fltp r:7350Hz
[aac @ 030ed9a0] Too many bits per frame requested, clamping to max

How is the development going?

comment:202 by Kamedo2, 11 years ago

I think the tables at aacenc.c lines 106 and 133 lack an entry for 7350Hz, which is the 13th sample-rate index.

static const uint8_t *swb_size_1024[] = {
    swb_size_1024_96, swb_size_1024_96, swb_size_1024_64,
    swb_size_1024_48, swb_size_1024_48, swb_size_1024_32,
    swb_size_1024_24, swb_size_1024_24, swb_size_1024_16,
    swb_size_1024_16, swb_size_1024_16, swb_size_1024_8
};
static const uint8_t *swb_size_128[] = {
    /* the last entry on the following row is swb_size_128_64 but is a
       duplicate of swb_size_128_96 */
    swb_size_128_96, swb_size_128_96, swb_size_128_96,
    swb_size_128_48, swb_size_128_48, swb_size_128_48,
    swb_size_128_24, swb_size_128_24, swb_size_128_16,
    swb_size_128_16, swb_size_128_16, swb_size_128_8
};

aactab.c lines 40 to 48 do have the data for the 13th entry, 7350Hz.

const uint8_t ff_aac_num_swb_1024[] = {
    41, 41, 47, 49, 49, 51, 47, 47, 43, 43, 43, 40, 40
};

const uint8_t ff_aac_num_swb_512[] = {
     0,  0,  0, 36, 36, 37, 31, 31,  0,  0,  0,  0,  0
};

const uint8_t ff_aac_num_swb_128[] = {
    12, 12, 12, 14, 14, 14, 15, 15, 15, 15, 15, 15, 15
};
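The mismatch can be modelled in a few lines. This is a Python model of the indexing only, not encoder code: the sample-rate order is taken from the 'sample_rates' log line earlier in this ticket, the table names from the aacenc.c excerpt, and `swb_table_for` is a hypothetical helper. In C there is no bounds check, so index 12 simply reads past the end of the 12-entry array.

```python
# Sample rates in index order, as printed in the encoder's 'sample_rates' log.
SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000, 24000,
                22050, 16000, 12000, 11025, 8000, 7350]

# Table names copied from the aacenc.c excerpt (12 entries).
SWB_SIZE_1024 = [
    "swb_size_1024_96", "swb_size_1024_96", "swb_size_1024_64",
    "swb_size_1024_48", "swb_size_1024_48", "swb_size_1024_32",
    "swb_size_1024_24", "swb_size_1024_24", "swb_size_1024_16",
    "swb_size_1024_16", "swb_size_1024_16", "swb_size_1024_8",
]

def swb_table_for(sample_rate, table):
    """Look up the band-size table for a sample rate, with a bounds check."""
    index = SAMPLE_RATES.index(sample_rate)
    if index >= len(table):
        raise IndexError(f"no table entry for {sample_rate} Hz (index {index})")
    return table[index]

print(swb_table_for(8000, SWB_SIZE_1024))
try:
    swb_table_for(7350, SWB_SIZE_1024)
except IndexError as err:
    print("lookup failed:", err)
```

Appending a 13th entry to each table makes the lookup succeed; which band layout 7350Hz should use is a separate question for the encoder authors.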

comment:203 by klaussfreire, 11 years ago

Sorry, I've been having trouble making good, provable progress with v6b/v7, especially since I've got a few deadlines coming that require a lot of my time.

I'll try to fix the crashing, and get performance comparable to v6 without the VBR constraints, which I noticed are hurting quality on high bit rates.

However, it seems the tendency to spend lots of bits on speech stems from psy itself, not anything else. It seems to estimate speech has a lot of perceptual entropy. It may be true, but there is some inefficiency that's hard to fix without deviating from the standards.

So v7 will probably just address the bugs, crashing, some bugs in tonal band prioritization in short window blocks, and stuff like that.

About 7.1ch, there's another ticket for that, IIRC. While I will still take a look with the report in this ticket, I'd suggest you post the relevant bits on that ticket as well (for traceability you know)

in reply to:  155 ; comment:204 by Timothy Gu, 11 years ago

Replying to Kamedo2:

This document recommends to use -cutoff 15000 option. Too outdated, the cutoff is automatically applied since July 2012.
http://ffmpeg.org/ffmpeg-codecs.html#aac

Hi, I wrote the documentation. Thanks for the report for that. However, I currently don't have any time to fix the doc. So it would be very kind of you to send a patch to ffmpeg-devel mailing list. Thanks.

in reply to:  203 comment:205 by Kamedo2, 11 years ago

Replying to klaussfreire:

However, it seems the tendency to spend lots of bits on speech stems from psy itself, not anything else. It seems to estimate speech has a lot of perceptual entropy. It may be true, but there is some inefficiency that's hard to fix without deviating from the standards.

Allocating fewer bits to short frames may help. It should increase the quality, although a strong tonality estimator would be ideal.

comment:206 by Kamedo2, 11 years ago

With the following change at aacenc.c lines 106 and 133, encoding worked properly, and the resulting 7350Hz aac mp4s were playable in FFmpeg, foobar2000 v1.2.9 and Media Player Classic. Decoding failed in faad and WMP.

static const uint8_t *swb_size_1024[] = {
    swb_size_1024_96, swb_size_1024_96, swb_size_1024_64,
    swb_size_1024_48, swb_size_1024_48, swb_size_1024_32,
    swb_size_1024_24, swb_size_1024_24, swb_size_1024_16,
    swb_size_1024_16, swb_size_1024_16, swb_size_1024_8, swb_size_1024_8
};
static const uint8_t *swb_size_128[] = {
    /* the last entry on the following row is swb_size_128_64 but is a
       duplicate of swb_size_128_96 */
    swb_size_128_96, swb_size_128_96, swb_size_128_96,
    swb_size_128_48, swb_size_128_48, swb_size_128_48,
    swb_size_128_24, swb_size_128_24, swb_size_128_16,
    swb_size_128_16, swb_size_128_16, swb_size_128_8, swb_size_128_8
};

in reply to:  204 ; comment:207 by Timothy Gu, 11 years ago

Replying to Timothy_Gu:

Replying to Kamedo2:

This document recommends to use -cutoff 15000 option. Too outdated, the cutoff is automatically applied since July 2012.
http://ffmpeg.org/ffmpeg-codecs.html#aac

Hi, I wrote the documentation. Thanks for the report for that. However, I currently don't have any time to fix the doc. So it would be very kind of you to send a patch to ffmpeg-devel mailing list. Thanks.

Patch sent. http://ffmpeg.org/pipermail/ffmpeg-devel/2013-October/149225.html

comment:209 by Kamedo2, 11 years ago

I'll be free from October 16th. klaussfreire, could you provide the current state of the encoder on the 16th? I'd be happier if it reduced the bitrate on speech samples, but even if it doesn't, I'll start the test.

comment:210 by Kamedo2, 11 years ago

klaussfreire, is the patch ready?

comment:211 by klaussfreire, 11 years ago

Sorry, I had the intention of posting it this weekend, but I've been up to my ears in deadlines. Will see about posting it tonight.

by klaussfreire, 11 years ago

v7 patch - mostly bugfixing on v6, but quite significant bugs - still incomplete (needs sample rate fixes and Mahler still sounds weird)

comment:212 by klaussfreire, 11 years ago

v7 patch is attached.

It's not committable or complete yet. I know that was the idea, but I ran out of time, and since you'll be free and needing improvements to test... well...

This patch fixes some important bugs regarding RD limit computation in transients. It also has a more robust tonality boost (form factor in this patch) method, which accounts for what psy already does (I noticed it does its own bit). In essence, it was necessary to actually count nonzero lines. This, I believe, is mostly what was wasting bits on speech. Speech still takes more bits, but less.

Mahler (brass in general, I guess) still sounds artifacty. I think I know how to fix it (and it might indeed fix some other stuff). But I haven't had time to dig into it.

I haven't had time to look at the sample rates issue. So maybe later; you've got it mostly done anyway ;)

comment:213 by Kamedo2, 11 years ago

klaussfreire,

Thank you for all the effort to improve FFmpeg. I've successfully patched and configured it.

comment:214 by Kamedo2, 11 years ago

The bitrate distribution of the encoders I'm going to test.

http://i42.tinypic.com/14bnl6s.png
http://i40.tinypic.com/30u2ru0.png

The size decreased, compared to the v6 patch.
http://i43.tinypic.com/5fn3mg.png

in reply to:  212 comment:215 by Kamedo2, 11 years ago

Replying to klaussfreire:

v7 patch is attached.

It's not commitable or complete yet. I know it was the idea, but I ran out of time, and since you'll be free and needing improvements to test... well...

Will it take long to make the committable version? If so, maybe I should start testing now. Otherwise, I can wait for a more complete version.

comment:216 by klaussfreire, 11 years ago

It'll take a while. I'm 1 week away from a conference that demands my full attention.

comment:217 by Kamedo2, 11 years ago

Then I'll start the test. BTW, the sine warbling problem #2706 reappeared at -q:a 0.1, 0.2, and 0.7. The 50Hz and 7000Hz sines warble with v7. Should I test v7, or v6 (+comment:185+comment:206)?

comment:218 by klaussfreire, 11 years ago

I'll retest the sine when I've got time. But try to confirm it's the same problem as before, because what I noticed at random bitrates is clipping, not warbling. That's a different (harder) problem.

comment:219 by Kamedo2, 11 years ago

Seems like a hard problem. If you commit v7, people will have trouble handling the unexpectedly large file size of speech contents such as university lectures, and you probably can't fix it before the conference. Testing v6 may make more sense, probably with comment:185 and comment:206.

comment:220 by Kamedo2, 11 years ago

v7 uses less than one third of the bitrate of v6 at -q:a 0.4 and -q:a 0.7 when encoding sine waves.

           v6+c185+c206  v7
-q:a 0.4   79            22
-q:a 0.7   107           32
-b:a 64k   52            55
-b:a 128k  87            105

The result of sine_tester.flac in kbps.
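The "less than one third" figure can be checked directly from the numbers above. A small Python sketch, where the dictionary layout and the `v7_to_v6_ratio` helper are just illustration:

```python
# Bitrates in kbps for sine_tester.flac, copied from the results above.
results = {
    "-q:a 0.4":  {"v6": 79,  "v7": 22},
    "-q:a 0.7":  {"v6": 107, "v7": 32},
    "-b:a 64k":  {"v6": 52,  "v7": 55},
    "-b:a 128k": {"v6": 87,  "v7": 105},
}

def v7_to_v6_ratio(setting):
    """Bitrate of v7 relative to v6+c185+c206 for one encoder setting."""
    row = results[setting]
    return row["v7"] / row["v6"]

for setting in results:
    print(f"{setting:10s} v7/v6 = {v7_to_v6_ratio(setting):.2f}")
```

The ratios fall below 1/3 only in the two -q:a modes; in the -b:a modes v7 spends slightly more than v6.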

by Kamedo2, 11 years ago

Attachment: sine_tester.flac added

Sine waves for a warbling test. 50 440 1000 3000 7000 10000 20000Hz. 24bit 48kHz PCM.
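A comparable test signal can be generated along these lines. This is a rough Python sketch, not how the attachment was made: the frequencies, sample rate and bit depth come from the description above, while the one-second duration per tone, the half-scale amplitude and the `sine_pcm24` helper are assumptions.

```python
import math

FREQS_HZ = [50, 440, 1000, 3000, 7000, 10000, 20000]  # from the description
RATE_HZ = 48000

def sine_pcm24(freq_hz, seconds, amplitude=0.5):
    """Return mono 24-bit little-endian PCM bytes for one sine tone."""
    peak = int(amplitude * (2 ** 23 - 1))
    out = bytearray()
    for n in range(int(RATE_HZ * seconds)):
        sample = round(peak * math.sin(2 * math.pi * freq_hz * n / RATE_HZ))
        out += sample.to_bytes(3, "little", signed=True)
    return bytes(out)

# Assumption: one second per tone at half of full scale.
pcm = b"".join(sine_pcm24(f, 1.0) for f in FREQS_HZ)
print(len(pcm), "bytes")
```

The raw bytes can be wrapped in a WAV container with Python's standard `wave` module (sample width 3) before being fed to the encoder.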

comment:221 by Kamedo2, 11 years ago

What were the "quite significant bugs" of v6? I didn't find any problem in a non-blind listening test, v6 was extensively tested over many songs, speeches, TV sources, and artificial sounds, and I believe v6 is safe and stable.

in reply to:  221 comment:222 by klaussfreire, 11 years ago

Replying to Kamedo2:

What was the "quite significant bugs" of v6? I didn't find any problem in a non-blind listening test, and the v6 was extensively tested over many songs, speeches, tv source, and artificial sounds, and I believe v6 is safe and stable.

Well, for one, holes (bands below hearing threshold) would bork a tonality boost loop, creating all sorts of issues, most notably when using short transform length, since the borking would get carried over to other windows.

In essence, transients were broken. They still sounded alright most of the time, probably by pure chance (i.e., maybe there were no holes). But signals like Mahler, castanets and harpsichords tended to expose the bugs at lower bit rates.

comment:223 by Kamedo2, 11 years ago

Then which version should I test?

comment:224 by klaussfreire, 11 years ago

Feel free to choose. I don't think I can go back to v6, but if v6 performs better than v7, it's something I'll have to account for.

comment:225 by Kamedo2, 11 years ago

Comparison between the v6 patch and the v7 patch. The sound sources are:

  • white noise, 5sec, stereo 48kHz
  • snippet of music albums, 300sec, stereo 44.1kHz
  • snippet of TV sources, 1200sec, stereo 48kHz

http://i43.tinypic.com/2dtnb79.png
The progress of the listening test is 8% (2 out of 25 samples).

comment:226 by Clément Bœsch, 11 years ago

Sorry to interrupt this awesome thread but... are there any known issues with the current patch which are not present in the current encoder?

Wouldn't it be nice to apply the current patch and move on that basis?

The patch is getting huge, and the more you wait, the less relevant the review on ffmpeg-devel will be.

Last edited 11 years ago by Clément Bœsch (previous) (diff)

comment:227 by Michael Niedermayer, 11 years ago

Also, the patch should be split into self-contained fixes, 1 issue == 1 patch, when it's submitted to ffmpeg-devel.
We can apply a huge monolithic patch too if that's the only option, but it will give everyone working on aac headaches (that is you all) when there's a regression and git bisect then ends up just pointing to a huge all-in-one change.

in reply to:  226 comment:228 by Kamedo2, 11 years ago

Replying to ubitux:

Sorry to interrupt this awesome thread but... are there any known issues with the current patch which are not present in the current encoder?

Not many, but I think of...

  • In VBR encoding, speech takes many bits and music takes fewer bits. It should be the music that gets the bits, as music quality is something more people care about.
  • The encoding is a bit slower.

According to klaussfreire, the v7, the one currently tested in a blind test, is "not commitable or complete yet."

comment:229 by klaussfreire, 11 years ago

I always intended to push this forward, when testing is done, as a series of smaller patches. I'm not sure I can split all of it, but there are quite a few worthy split points.

I think the main issue with the current patch is overall untidiness, with dead code left over from earlier attempts at some solutions, for instance. That's what I think doesn't make it committable yet; it has to be cleaned up.

Plus, there are quite a few magic numbers that ought to be tunables.

The slowness can be fixed later if needed; I believe encoding quality is more important than speed. It's not that much slower anyway, still faster than realtime. VBR wasn't even working before, so even if imperfect, any improvements to VBR are committable.

in reply to:  229 comment:230 by Michael Niedermayer, 11 years ago

Replying to klaussfreire:

I always intended to push this forward, when testing is done, as a series of smaller patches. I'm not sure I can split all of it, but there are quite a few worthy split points.

sounds good, that resolves my concerns
thanks

comment:231 by klaussfreire, 11 years ago

Woops. I just fixed M/S encoding. Had to tell ;)

I was wondering whether your tests with faac/fdk used M/S encoding?

comment:232 by Kamedo2, 11 years ago

I used faac 1.28 and fdk 0.1.2 but how should I check? I didn't use any extra options.

comment:233 by klaussfreire, 11 years ago

Well, faac has a --no-midside, so I'd venture to guess that the default is to do M/S coding.

comment:234 by Kamedo2, 11 years ago

I checked the document (fdk-aac-0.1.2/documentation/aacEncoder.pdf), and it said:

3.3 Encoder Tools
The AAC encoder supports TNS, PNS, MS, Intensity and activates these tools depending on the audio signal and the encoder configuration (i.e. bitrate or AOT). It is not required to configure these tools manually.

comment:235 by klaussfreire, 11 years ago

I'm close to posting a v8. Main improvement in v8 is, besides various subtle but significant bug fixes, that M/S coding properly works and, I believe, is robust enough to be on by default.

But v8 won't have it on by default just yet. So, when I post it (I'm performing a last round of listening tests), be sure to run it with -stereo_mode auto to get results comparable to the other contenders.

I just wanted to post the progress report early since this ticket has been silent for a while ;)

comment:236 by Kamedo2, 11 years ago

The listening test of v4 abr, v7 abr, v7 vbr, libfaac vbr, fdk-aac cbr, libmp3lame vbr, ac3 cbr (7 encoders) at 128kbps is ongoing, and I've done 10 samples out of 25 samples (40%).

Will your patch v8 properly address the sine warbling problem #2706?

comment:237 by klaussfreire, 11 years ago

Sadly, no. I thought so, but further tests (at various bitrates) show saturation. Not the same issue as the original in the OP, but an issue nevertheless.

I have only tackled M/S coding and some bit allocation inefficiencies. Those changes seemed to help with the issue during initial tests, since tonal band encoding improved significantly, but it seems that's still not enough to avoid clipping due to quantization noise.

I may have a relatively easy fix for it: since the original signal is saturated already, all quantization noise risks the same artifacts, but I believe there is a simple fix (and probably the only one), which involves tweaking the rounding of strong signals to round towards zero instead of to nearest.

That needs careful calibration, however, in order to avoid modifying behavior on non-clipping signals (since rounding towards zero generally induces more quantization noise, i.e. lower SNR), but I'll be travelling soon and won't be able to work much on it till Jan 1st.

Last edited 11 years ago by klaussfreire (previous) (diff)

comment:238 by Kamedo2, 11 years ago

You mean the clipping of the output PCM? LAME solves it by reducing the gain to around 98%.

comment:239 by Kamedo2, 11 years ago

The listening test of v4 abr, v7 abr, v7 vbr, libfaac vbr, fdk-aac cbr, libmp3lame vbr, ac3 cbr (7 encoders) at 128kbps is ongoing, and I've done 18 samples out of 25 samples (72%).

comment:240 by Kamedo2, 11 years ago

I have just finished the blind listening test and the result is here. http://www.hydrogenaudio.org/forums/index.php?showtopic=104471
http://i61.tinypic.com/2ive6mb.png
http://i60.tinypic.com/2ro1bbd.png

comment:241 by klaussfreire, 11 years ago

Alright. I'm still not done with the clipping issue (it's proving to be more challenging than I thought). Though I do believe v8 will change things, because of M/S encoding (which all the other encoders you're testing I think use, except ac3, putting ffmpeg's AAC at a significant disadvantage).

comment:242 by Kamedo2, 11 years ago

How about artificially lowering the noise level of the loudest bin a bit?

comment:243 by klaussfreire, 11 years ago

No, it's not the loudest bin. It's a group of bins. Because the sine isn't represented on the mdct by a single bin, but rather a ripple pattern that has to be accurately encoded, or the resulting sine fluctuates in amplitude (hence the clipping).

But I got a solution. I'm doing the listening tests now, but it sounds good. Basically, a two-front approach: lower the volume of near-clipping windows (and only those, to make it idempotent), and tweak tonal band prioritization (because the sine wave was getting too few bits after all).

comment:244 by klaussfreire, 11 years ago

Btw, I tested ogg and faac, and they too both bork the sine wave.

comment:245 by Kamedo2, 11 years ago

Sounds good! I'd like to test that too.

comment:246 by klaussfreire, 11 years ago

Patience. I just want to make sure there's no serious regression before posting it.

comment:247 by klaussfreire, 11 years ago

Good thing I checked for regressions, because I found a huge one (regarding the bitrate curves that are so common in this ticket)

comment:248 by klaussfreire, 11 years ago

Attached the v8 patch

comment:249 by Kamedo2, 11 years ago

I've got this error:

$ make
CC      libavcodec/aaccoder.o
libavcodec/aaccoder.c: In function 'search_for_quantizers_twoloop':
libavcodec/aaccoder.c:867:40: error: 'AACEncContext' has no member named 'cur_type'
         if (s->options.stereo_mode && s->cur_type == TYPE_CPE)
                                        ^
libavcodec/aaccoder.c:814:44: warning: variable 'energies' set but not used [-Wunused-but-set-variable]
     float dists[128] = { 0 }, uplims[128], energies[128];
                                            ^
libavcodec/aaccoder.c: At top level:
libavcodec/aaccoder.c:1527:9: warning: initialization from incompatible pointer type [enabled by default]
         quantize_and_encode_band,
         ^
libavcodec/aaccoder.c:1527:9: warning: (near initialization for 'ff_aac_coders[0].quantize_and_encode_band') [enabled by default]
libavcodec/aaccoder.c:1533:9: warning: initialization from incompatible pointer type [enabled by default]
         quantize_and_encode_band,
         ^
libavcodec/aaccoder.c:1533:9: warning: (near initialization for 'ff_aac_coders[1].quantize_and_encode_band') [enabled by default]
libavcodec/aaccoder.c:1539:9: warning: initialization from incompatible pointer type [enabled by default]
         quantize_and_encode_band,
         ^
libavcodec/aaccoder.c:1539:9: warning: (near initialization for 'ff_aac_coders[2].quantize_and_encode_band') [enabled by default]
libavcodec/aaccoder.c:1545:9: warning: initialization from incompatible pointer type [enabled by default]
         quantize_and_encode_band,
         ^
libavcodec/aaccoder.c:1545:9: warning: (near initialization for 'ff_aac_coders[3].quantize_and_encode_band') [enabled by default]
libavcodec/aaccoder.c:366:14: warning: 'find_max_absval' defined but not used [-Wunused-function]
 static float find_max_absval(int group_len, int swb_size, const float *scaled) {
              ^
make: *** [libavcodec/aaccoder.o] Error 1

comment:250 by klaussfreire, 11 years ago

Ouch, I forgot a file. Gotta wait till I get home to fix it.

by klaussfreire, 11 years ago

v8 patch - tweaked tonal band prioritization, especially in transients, fixed M/S encoding and made it the default, and fixed other assorted bugs. Added missing include changes.

comment:251 by klaussfreire, 11 years ago

Ok, patch updated.

comment:252 by Kamedo2, 11 years ago

Comparison between v8 patch and v7 patch. The sound source is:

  • white noise, 5sec, stereo 48kHz
  • snippet of music albums, 300sec, stereo 44.1kHz

http://i60.tinypic.com/b8s3ds.png

v8 has a minor flaw in handling the stereo image in the first frame, when only one channel is loud.

by Kamedo2, 11 years ago

Attachment: Whitenoise_left.flac added

Whitenoise.flac without the sound of right channel. A strange noise appears in the center in v8.

comment:253 by Kamedo2, 11 years ago

The problem of sound partially disappearing, like comment:12, reappeared in v8 in 320kbps ABR.

in reply to:  253 comment:254 by klaussfreire, 11 years ago

Replying to Kamedo2:

The problem of sound partially disappearing, like comment:12, reappeared in v8 in 320kbps ABR.

On which sample?

I tried most, although only up to 256kbps

by Kamedo2, 11 years ago

Attachment: ffmpeg_aac256k_degrade.flac added

The sound degrades on v8 around 256kbps. Mainly the right channel suffers. From Kohmi Hirose GIFT/Ai wa tokkoyaku Track3

comment:257 by klaussfreire, 11 years ago

Oh, it's not the same bug at all. It's a remaining bug in M/S encoding. I'll see what I can do about it, but it's not a regression by any means (phew)

comment:258 by Kamedo2, 11 years ago

The stereo image of Mama.wv, itCouldBeSweet.wv is strange in v8 ABR 192kbps. http://www.rarewares.org/test_samples/

comment:259 by klaussfreire, 11 years ago

It's probably the same bug, it happens sometimes with short windows I noticed.

comment:260 by Kamedo2, 11 years ago

I've successfully encoded over 10GB of AACs, but the stereo images of some vocal contents are strange at 192kbps. The bug rarely happens at 160kbps.

comment:261 by Kamedo2, 11 years ago

Have you figured out what the "remaining bug on M/S encoding" is?

by Kamedo2, 11 years ago

The diff of the ItCouldBeSweet, before and after the v8 AAC encode, 128kbps.

by Kamedo2, 11 years ago

The diff of the ItCouldBeSweet, before and after the v8 AAC encode, 192kbps.

by Kamedo2, 11 years ago

The diff of the ItCouldBeSweet, before and after the v8 AAC encode, 320kbps.

by Kamedo2, 11 years ago

The diff of the ItCouldBeSweet, before and after the v8 AAC encode, quality option -q:a 1.5

comment:262 by Kamedo2, 11 years ago

I uploaded the diff (error) of the v8 encoder. The original ItCouldBeSweet.wv is available in here. http://www.rarewares.org/test_samples/

comment:263 by Kamedo2, 11 years ago

I'd like to see the patch committed, but is there any progress? If the remaining bug of v8 is hard to fix, maybe we should consider pushing v7. It's stable.

comment:264 by klaussfreire, 11 years ago

It was a very simple thing, but very hard to find.

Anyway, fixed (I hope). I'm testing a v8b patch now; we'll see if it passes the test.

comment:265 by Kamedo2, 11 years ago

So, what was the bug? Can I join the testing?

by klaussfreire, 11 years ago

Cumulative patch over v8 to fix M/S coding

comment:266 by klaussfreire, 11 years ago

I just added a cumulative patch to fix v8's M/S coding, you're very welcome to test it.

comment:267 by Kamedo2, 11 years ago

The stereo images are still somewhat buggy, especially in ItCouldBeSweet.wv
I'll check the source code to see if there is something I can do.

The -q:a vs bitrate graph.
http://i57.tinypic.com/kndcw.png

comment:268 by klaussfreire, 11 years ago

I tested ItCouldBeSweet, and it sounds ok here.

Which parameters did you use for the encoding?

comment:269 by Kamedo2, 11 years ago

The bug is most obvious at -b:a 224k. The file is 672,362 Bytes, 267 kbps.

ffmpeg_2.1v8fix -y -i ItCouldBeSweet.wav -vn -c:a aac -strict -2 -b:a 224k ItCouldBeSweet_224k.mp4
ffmpeg version 2.1.git Copyright (c) 2000-2014 the FFmpeg developers
  built on May  3 2014 15:51:52 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --extra-ldflags=-static --extra-cflags='-march=native -mfpmath=sse' --optflags=-O2
  libavutil      52. 63.101 / 52. 63.101
  libavcodec     55. 52.101 / 55. 52.101
  libavformat    55. 32.101 / 55. 32.101
  libavdevice    55.  9.101 / 55.  9.101
  libavfilter     4.  1.102 /  4.  1.102
  libswscale      2.  5.101 /  2.  5.101
  libswresample   0. 17.104 /  0. 17.104
  libpostproc    52.  3.100 / 52.  3.100
Guessed Channel Layout for  Input Stream #0.0 : stereo
Input #0, wav, from 'ItCouldBeSweet.wav':
  Duration: 00:00:20.02, bitrate: 1411 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s
Output #0, mp4, to 'ItCouldBeSweet.ffv8f_224k.mp4':
  Metadata:
    encoder         : Lavf55.32.101
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 224 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le -> aac)
Press [q] to stop, [?] for help

by Kamedo2, 11 years ago

Just for comparison. The diff of the ItCouldBeSweet, between the original and qaac encode, 128kbps.

by Kamedo2, 11 years ago

Just for comparison. The diff of the ItCouldBeSweet, between the original and FDK-AAC encode, 128kbps.

comment:270 by klaussfreire, 11 years ago

You seem to have either a bad patch or a misapplied patch, because that's the bug I solved, and in my build it works fine. Let me test the patch...

comment:271 by Kamedo2, 11 years ago

The aac-improvements-wip-v8-fix.patch is very short. Please confirm that the patch is everything I need.
I checked the source and the patch seems to be properly applied.

The v8 diff and the v8_fix diff sound different, and the artifact of v8_fix is less severe.

comment:272 by Kamedo2, 11 years ago

I tried from the current git head, and the v8 patch seems to fail.
v8-fix patch is alright.

$ patch -p1 < aac-improvements-wip-v8.patch
patching file libavcodec/aac.h
patching file libavcodec/aaccoder.c
patching file libavcodec/aacenc.c
patching file libavcodec/aacenc.h
patching file libavcodec/aacpsy.c
patching file libavcodec/psymodel.c
Hunk #1 FAILED at 101.
1 out of 1 hunk FAILED -- saving rejects to file libavcodec/psymodel.c.rej
patching file libavcodec/psymodel.h

$ patch -p1 < aac-improvements-wip-v8-fix.patch
patching file libavcodec/aaccoder.c

$ ./configure --enable-gpl --enable-version3 --enable-nonfree --enable-libfdk-aac --enable-libmp3lame --enable-libfaac --enable-libvo-aacenc --extra-ldflags=-static --extra-cflags='-march=native -mfpmath=sse' --optflags=-O2

comment:273 by Hendrik, 11 years ago

Sounds like you're supposed to apply v8 first and then v8-fix.
The sources may have changed so that v8 needs a bit of fixing to apply cleanly again.

comment:274 by klaussfreire, 11 years ago

That's the problem, you had to apply both. I'm going to upload a combined and rebased patch (v8 indeed doesn't apply cleanly on git head).

by klaussfreire, 11 years ago

Combined v8 + fix

comment:275 by klaussfreire, 11 years ago

Alright, uploaded the rebased and combined v8f patch.

I noticed there's an issue with CBR now that needs a bigger refactoring to be fixable. Will have to do that as a further patch down the line.

comment:276 by Hendrik, 11 years ago

A new issue, or just an existing problem? CBR/ABR is imho far more important than the VBR modes, since it's much more commonly used. If the patch improves VBR but causes issues for CBR, that's bad.

comment:277 by klaussfreire, 11 years ago

It's not a serious one. It's just that it won't target the bitrate as accurately as before, and only when using M/S coding (which didn't even work before). So you could say it's not a regression: the case that doesn't properly target the bitrate didn't even work before, so it's better overall.

I can fix it (and I might post a patch fixing it), but I'm not sure how long it will take, so I don't want to delay committing these advances even more just for this that isn't even a regression.

comment:278 by Hendrik, 11 years ago

OK, that's fine, it sounded more serious. :)

comment:279 by Kamedo2, 11 years ago

This v8f patch is great! It sounds very stable.

http://i58.tinypic.com/1z4j08y.png

in reply to:  277 comment:280 by Carl Eugen Hoyos, 11 years ago

Replying to klaussfreire:

I don't want to delay committing these advances even more just for this that isn't even a regression.

Then please send your patch to the developer mailing list so Michael can apply it (or point to your public git repository so the change can be merged). You both have done an enormous amount of work, I believe there is no reason to delay the results anymore.
Please make a reference to this ticket part of the commit message.

comment:281 by Kamedo2, 11 years ago

A Result of a quick analysis, using 17 music tracks:

Default ABR 128kbps
Min / Average / Max bitrate: 129 / 130 / 131 kbps
Average Speed: 12.9x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -b:a 128k %o
This sounds great for 128 kbps.

Default VBR q1
Min / Average / Max bitrate: 169 / 192 / 211 kbps
Average Speed: 14.1x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -q:a 1 %o
Sounds very clear, but not something expected from 192 kbps.

ABR with ms_off
Min / Average / Max bitrate: 128 / 129 / 131 kbps
Average Speed: 14.1x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -b:a 128k -aac_coder anmr %o
Sounds bad when transients exist.

ABR with -aac_coder fast
Min / Average / Max bitrate: 128 / 134 / 141 kbps
Average Speed: 28.0x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -b:a 128k -aac_coder fast %o
This is fast, but with very bad quality. I don't know if there's good use of this option.

ABR with -aac_coder anmr
Min / Average / Max bitrate: 130 / 134 / 141 kbps
Average Speed: 8.0x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -b:a 128k -aac_coder anmr %o
Sounds bad when transients exist. Noticeably worse than the default ABR, and it's slow.

VBR with -aac_coder anmr
Min / Average / Max bitrate: 172 / 191 / 213 kbps
Average Speed: 7.5x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -q:a 1 -aac_coder anmr %o
Sounds slightly bad when transients exist. It's slow.

Last edited 11 years ago by Kamedo2 (previous) (diff)

comment:282 by Kamedo2, 11 years ago

I think we should redirect users to 8k if the -b:a set is less than 8k.
The ancient FFmpeg set bitrate by the unit of kbps, so a novice user of modern FFmpeg may set bitrate like -b:a 128. If something sounds, the user may notice that the bitrate set is too low.

in reply to:  282 comment:283 by Carl Eugen Hoyos, 11 years ago

Replying to Kamedo2:

I think we should redirect users to 8k if the -b:a set is less than 8k.

This is not done for any other codec.

The ancient FFmpeg set bitrate by the unit of kbps

I just tested a five year old version and the unit is bps.

so a novice user of modern FFmpeg may set bitrate like -b:a 128. If something sounds, the user may notice that the bitrate set is too low.

Current FFmpeg prints a warning.

Since this is unrelated, please don't let this delay the patch.

in reply to:  281 comment:284 by Timothy Gu, 11 years ago

Replying to Kamedo2:

ABR with -aac_coder fast
Min / Average / Max bitrate: 128 / 134 / 141 kbps
Average Speed: 28.0x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -b:a 128k -aac_coder fast %o
This is fast, but with very bad quality. I don't know if there's good use of this option.

It is supposed to sound bad: http://ffmpeg.org/ffmpeg-codecs.html#Options-2:

This method sets a constant quantizer for all bands. This is the fastest of all the methods, yet produces the worst quality.

Also, can you test -coder twoloop?

comment:285 by Kamedo2, 11 years ago

ABR with -aac_coder twoloop
Min / Average / Max bitrate: 129 / 130 / 131 kbps
Average Speed: 13.6x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -b:a 128k -aac_coder twoloop %o
This sounds great.

ABR with -aac_coder faac
Min / Average / Max bitrate: 133 / 157 / 210 kbps
Average Speed: 12.0x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -b:a 128k -aac_coder faac %o
This has the worst quality. Long windows seem to have a 2~3kHz LPF and short windows seem to have no LPF.

VBR with -aac_coder twoloop
Min / Average / Max bitrate: 111 / 129 / 146 kbps
Average Speed: 15.7x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -q:a 0.7 -aac_coder twoloop %o
This sounds great.

VBR with -aac_coder faac
Min / Average / Max bitrate: 213 / 259 / 311 kbps
Average Speed: 11.1x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -q:a 0.7 -aac_coder faac %o
Abysmal quality, comparable to ABR with -aac_coder faac, but with more bitrate.

comment:286 by Timothy Gu, 11 years ago

Sorry, twoloop is actually the default.

by Kamedo2, 11 years ago

The diff of the ItCouldBeSweet, between the original and the patch v8f AAC encode, 128kbps.

by Kamedo2, 11 years ago

The diff of the ItCouldBeSweet, between the original and the patch v8f AAC encode, 192kbps.

by Kamedo2, 11 years ago

The diff of the ItCouldBeSweet, between the original and the patch v8f AAC encode, 320kbps.

comment:287 by klaussfreire, 11 years ago

I've only worked on twoloop (which is the default).

My next step is working on anmr, which can outperform twoloop if done well.

comment:288 by klaussfreire, 11 years ago

I'm close to finishing testing of a better patch for ABR's M/S bug. All that remains is confirming it fixes the above diffs, and I'll upload it and we can move on.

comment:289 by Kamedo2, 11 years ago

Please consider adding support for 7350Hz as in comment:206.
libavcodec/aacenc.c line 107 & 134 should be

static const uint8_t *swb_size_1024[] = {
    swb_size_1024_96, swb_size_1024_96, swb_size_1024_64,
    swb_size_1024_48, swb_size_1024_48, swb_size_1024_32,
    swb_size_1024_24, swb_size_1024_24, swb_size_1024_16,
    swb_size_1024_16, swb_size_1024_16, swb_size_1024_8, swb_size_1024_8
};
static const uint8_t *swb_size_128[] = {
    /* the last entry on the following row is swb_size_128_64 but is a
       duplicate of swb_size_128_96 */
    swb_size_128_96, swb_size_128_96, swb_size_128_96,
    swb_size_128_48, swb_size_128_48, swb_size_128_48,
    swb_size_128_24, swb_size_128_24, swb_size_128_16,
    swb_size_128_16, swb_size_128_16, swb_size_128_8, swb_size_128_8
};

by klaussfreire, 11 years ago

Fix M/S encoding in ABR

comment:290 by klaussfreire, 11 years ago

Attached a better fix to ABR, and the support for 7350Hz.

comment:291 by Kamedo2, 11 years ago

Thank you, klaussfreire. I successfully patched it and configured.

comment:292 by Carl Eugen Hoyos, 11 years ago

I am assuming that this patch is meant to be applied to FFmpeg git master soon, the following is partly necessary, partly just a suggestion:

  • The patch contains trailing white space, this cannot be pushed to our git repository, please remove it. tools/patcheck can help you finding such issues.
  • Please either remove all printf's or make them av_log's.
  • The function sqrf() is duplicated iiuc, please move it to a header (if the function is necessary).
  • There are three or four blocks where you just reindent existing code. It makes reading your patch (in the future) easier if you don't reindent them right now in the same commit, just leave them where they are. I can do the reindent for you (or you can send a followup patch).
  • And finally (purely optional):

Using the following

if (condition) {
    do1;
} else {
    do2;
}

instead of

if (condition)
    do1;
else
    do2;

has the advantage that future changes are smaller and easier to read (this point is of course up to you, it is your code).

If you want me to make any of these changes and attach the result here, please say so!

comment:293 by Kamedo2, 11 years ago

http://i58.tinypic.com/dz81nd.png
Other than this, I've encoded about 20GB of many music tracks, using diverse settings.

comment:294 by Timothy Gu, 11 years ago

@cehoyos: this patch makes ANMR worse than default twoloop, even though it is theoretically better and takes more time. While Claudio has expressed interest in working on ANMR later, I don't think committing a patch that makes something worse than it should be is a good idea.

Other than that, @klaussfreire, if the patch was to be applied to master, you could split this patch to at least two patches. The change of default to M/S encoding should also be documented somehow.

in reply to:  294 ; comment:295 by Cigaes, 11 years ago

Replying to Timothy_Gu:

@cehoyos: this patch makes ANMR worse than default twoloop, even though it is theoretically better and takes more time. While Claudio has expressed interest in working on ANMR later, I don't think committing a patch that makes something worse than it should be is a good idea.

IMHO, at this point, the question is not whether the ANMR coding is worse than it should be but whether it makes it worse than it currently is.

If we blocked patches because something could be done even better, then the only acceptable patch series would be “[PATCH 0/85042] Make FFmpeg the ultimate multimedia software”.

As I understand, this patch makes some modes work much better than now, with very little or no degradation on the little that did work: in my book, this is very good for inclusion. Knowing ways of making even better is good too, but for later patches.

in reply to:  295 comment:296 by Timothy Gu, 11 years ago

Replying to Cigaes:

Replying to Timothy_Gu:

@cehoyos: this patch makes ANMR worse than default twoloop, even though it is theoretically better and takes more time. While Claudio has expressed interest in working on ANMR later, I don't think committing a patch that makes something worse than it should be is a good idea.

IMHO, at this point, the question is not whether the ANMR coding is worse than it should be but whether it makes it worse than it currently is.

If we blocked patches because something could be done even better, then the only acceptable patch series would be “[PATCH 0/85042] Make FFmpeg the ultimate multimedia software”.

As I understand, this patch makes some modes work much better than now, with very little or no degradation on the little that did work: in my book, this is very good for inclusion. Knowing ways of making even better is good too, but for later patches.

I agree with you. However this behavior now contradicts the behavior originally in the documentation, which should be either fixed in the code or documented.

On a side note, can anyone check if ANMR with patch is better than without? If so then I have no problem landing the patch (except its nits) with the documentation changes.

in reply to:  294 comment:297 by klaussfreire, 11 years ago

Replying to cehoyos:

I am assuming that this patch is meant to be applied to FFmpeg git master soon, the following is partly necessary, partly just a suggestion:

  • The patch contains trailing white space, this cannot be pushed to our git repository, please remove it. tools/patcheck can help you finding such issues.

Forgot about that. Will do. But this patch is only a POC, I'll re-do the changes (verbatim, but in steps) and post them as separate, progressive patches.

  • Please either remove all printf's or make them av_log's.

Of course.

  • The function sqrf() is duplicated iiuc, please move it to a header (if the function is necessary).

Ok.

  • There are three or four blocks where you just reindent existing code. It makes reading your patch (in the future) easier if you don't reindent them right now in the same commit, just leave them where they are. I can do the reindent for you (or you can send a followup patch).

I could do the patch with -w to make it not diff whitespace-only. Does that work equally well? (there's a lot of code that really needs reindenting or it becomes unreadable). I could post both (with and without -w).

  • And finally (purely optional):

Using the following

if (condition) {
    do1;
} else {
    do2;
}

instead of

if (condition)
    do1;
else
    do2;

has the advantage that future changes are smaller and easier to read (this point is of course up to you, it is your code).

Surely. Easy enough to change.

Replying to Timothy_Gu:

@cehoyos: this patch makes ANMR worse than default twoloop, even though it is theoretically better and takes more time. While Claudio has expressed interest in working on ANMR later, I don't think committing a patch that makes something worse than it should be is a good idea.

What I could do is try to get rid of a common hole-avoidance bug that makes all the other coders much worse. It shouldn't be hard. Other than that, I think only updating the doc is pertinent to this patch set.

Other than that, @klaussfreire, if the patch was to be applied to master, you could split this patch to at least two patches. The change of default to M/S encoding should also be documented somehow.

Of course. Bugfixes first, improvements next. I need to update my A/B testing script, so I can run all patches through it - especially bugfixes, where automated A/B testing does work.

comment:298 by klaussfreire, 11 years ago

It seems that ANMR just needs a larger search space. Expanding TRELLIS_STATES to 121 (and fixing path construction to respect SCALE_MAX_DIFF, of course) doubles the time taken (ouch), but it does improve the quality considerably. I think it's now just a matter of coding the same tonal prioritizations and making sure it works well with VBR, and ANMR is probably as good as done.

comment:299 by Kamedo2, 11 years ago

I'm about to start a preliminary listening test of:

  • FAAC abr 96k.mp4
  • FAAC vbr q30(~48kbps).mp4
  • FFmpeg native mp2 encoder 96k.mp2
  • vo-aacenc 0.1.3 abr 96k.mp4
  • Bladeenc 96k.mp3
  • FFmpeg native AAC encoder 96k.mp4
  • FFmpeg native AAC encoder+v8g patch 96k.mp4

using first 15 samples of a 2011 public multiformat listening test (30 samples).

I guess the v8g at 96kbps beats the FAAC at 96kbps, because the FAAC is quite bad at lower bitrates.

comment:300 by Kamedo2, 11 years ago

Because of my simple mistake, I'm testing these 6 encoders with a duplicate:

  • FAAC abr 96k.mp4
  • FAAC vbr q30(~48kbps).mp4
  • FAAC vbr q30(~48kbps).mp4
  • vo-aacenc 0.1.3 abr 96k.mp4
  • Bladeenc 96k.mp3
  • FFmpeg native AAC encoder 96k.mp4
  • FFmpeg native AAC encoder+v8g patch 96k.mp4

I've done 11 samples out of 15 x 2 samples. (37% done)

comment:301 by klaussfreire, 11 years ago

I don't understand. How is -q:a 30 48kbps?

Or is it 0.30?

in reply to:  301 comment:302 by Kamedo2, 11 years ago

Replying to klaussfreire:

I don't understand. How is -q:a 30 48kbps?

Or is it 0.30?

faac-1.28-mod\faac -q 30 -o out.mp4 in.44k.wav
encoder      rate   min   average   max
FAAC         96k     97        98    98
FAAC         q30     43        51    59
mp2          96k     96        96    96
vo-aacenc    96k     98        98    99
Bladeenc     96k     96        96    96
Native       96k     98        98   100
Native+v8g   96k     98        99   101

comment:303 by klaussfreire, 11 years ago

Oh, forgot FAAC uses percent in 0-100.

:)

comment:304 by Timothy Gu, 11 years ago

@Kamedo2 Just to make sure, the problem described in your recent edit to GuidelinesHighQualityAudio is fixed in v8g, right?

comment:305 by Timothy Gu, 11 years ago

Owner: set to klaussfreire

comment:306 by klaussfreire, 11 years ago

It needs updating of course. I know 96k and 64k both work reasonably well.

in reply to:  304 comment:307 by Kamedo2, 11 years ago

Replying to Timothy_Gu:

@Kamedo2 Just to make sure, the problem described in your recent edit to GuidelinesHighQualityAudio is fixed in v8g, right?

Yes.

comment:308 by Kamedo2, 11 years ago

I finished the test and the result is here.
http://www.hydrogenaud.io/forums/index.php?showtopic=105959
The v8g patch beat both unpatched AAC encoder and FAAC at 96k.
http://i62.tinypic.com/2cs6wc8.png
http://i60.tinypic.com/30b0aye.png

comment:309 by klaussfreire, 11 years ago

I must say your testing efforts are invaluable :)

comment:310 by Hendrik, 11 years ago

Any news on getting closer to submitting the improvements?
We're all looking forward to that, more every day!

comment:311 by klaussfreire, 11 years ago

I need to find some free time for that.

But yeah, close.

by Kamedo2, 11 years ago

The diff of the ItCouldBeSweet, between the original and the patch v8g AAC encode, 192kbps.

by Kamedo2, 11 years ago

The diff of the ItCouldBeSweet, between the original and the patch v8g AAC encode, 320kbps.

comment:312 by Kamedo2, 11 years ago

The glitches of v8g, stereo 48kHz diff. The black bar is 256 samples.
The sound is the intro of Fatboy Slim - Kalifornia.
http://www.hydrogenaud.io/forums/index.php?showtopic=19682
http://i62.tinypic.com/oqgh15.png

comment:313 by Kamedo2, 11 years ago

Is the development going alright?

comment:314 by klaussfreire, 11 years ago

This last issue I'm not sure how to fix.

I've been sick lately so no progress, but soon I'll re-engage and I'll prioritize sending simple bugfixes to the ML first.

From what I can tell, this is another issue with bit allocation related to constantly and quickly repeating transients.

comment:315 by Kamedo2, 10 years ago

If it's not easy to fix the bit allocation, maybe we should commit the patches without the default M/S encoding.

comment:316 by Kamedo2, 10 years ago

I'm thinking of conducting a personal listening test of the stable v7 or the experimental M/S enabled v8g (or anything latest). I'd like to hear your opinion.

comment:317 by Carl Eugen Hoyos, 10 years ago

Ticket #3816 describes an apparently problematic sample that improves with the patch(es) attached here:
http://samples.ffmpeg.org/ffmpeg-bugs/trac/ticket3816/

comment:318 by klaussfreire, 10 years ago

RL has been throwing obstacles at me lately, so I couldn't make any progress here.

I did manage to find a few low-hanging-bugs in ANMR, but, and this is quaint, fixing them makes ANMR 10x slower.

Anyway, things are coming back to normal in RL so I'll be investing some time soon into patch-submitting the small bugfixes. I'll surely have lots of rebasing to do.

I still couldn't fix "Fatboy Slim - Kalifornia", but since the issue has been eluding me, I'm going to leave this for later. I'm only going to check whether it's an issue with M/S coding (doesn't seem to be), because I'd like the patch set to end up making M/S coding the default.

by Kamedo2, 10 years ago

v7 patch altered to reflect the latest change by Michael Niedermayer at 20140525. This should work for the git head.

by Kamedo2, 10 years ago

v8g patch altered to reflect the latest change by Michael Niedermayer at 20140525. This should work for the git head.

comment:319 by Kamedo2, 10 years ago

I will start a listening test of the stable v7 patch, latest v8g patch, mp3lame, opus1.1 at 96 kbps in a week.

I will also use 40 samples from here (see the "Track download:" section):
http://listening-test.coresv.net/results.htm

in reply to:  318 comment:320 by Kamedo2, 10 years ago

Replying to klaussfreire:

I did manage to find a few low-hanging-bugs in ANMR, but, and this is quaint, fixing them makes ANMR 10x slower.

Is that

Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:399

? Rare, but it happens on some tracks.
libavcodec/aacenc.c

/**
 * Encode scalefactors.
 */
static void encode_scale_factors(AVCodecContext *avctx, AACEncContext *s,
                                 SingleChannelElement *sce)
{
    int off = sce->sf_idx[0], diff;
    int i, w;

    for (w = 0; w < sce->ics.num_windows; w += sce->ics.group_len[w]) {
        for (i = 0; i < sce->ics.max_sfb; i++) {
            if (!sce->zeroes[w*16 + i]) {
                diff = sce->sf_idx[w*16 + i] - off + SCALE_DIFF_ZERO;
                av_assert0(diff >= 0 && diff <= 120);
                off = sce->sf_idx[w*16 + i];
                put_bits(&s->pb, ff_aac_scalefactor_bits[diff], ff_aac_scalefactor_code[diff]);
            }
        }
    }
}

by Kamedo2, 10 years ago

Attachment: ffmpeg_anmr_error.flac added

With -aac_coder anmr, this sample triggers the assertion error at aacenc.c line 399 at every -b:a setting and at -q:a 0.1695 or higher.

comment:321 by klaussfreire, 10 years ago

anmr is somewhat broken yes, I have some fixes, but they cause a huge performance regression so I don't consider them submittable yet.

It doesn't cause the same assertion failure with twoloop does it?

in reply to:  321 comment:322 by Kamedo2, 10 years ago

Replying to klaussfreire:

It doesn't cause the same assertion failure with twoloop does it?

It doesn't cause the same assertion failure with -aac_coder twoloop.

by Kamedo2, 10 years ago

Attachment: ffmpeg_anmr_error2.flac added

EBU–TECH 3253 Sound Quality Assessment Material recordings for subjective tests, 50 Male speech, English.

comment:323 by Kamedo2, 10 years ago

Is there a better version than the v8g patch to test?

comment:324 by klaussfreire, 10 years ago

Not yet. I will try to hunt for that fatboy issue again tonight, and if no progress is made, I will separate the patch into progressive improvements and test each separately. Maybe the process itself yields some insight.

In any case, which do you think is the best patch yet?

comment:325 by Kamedo2, 10 years ago

I think the v7 is the best patch because of the stability.

comment:326 by klaussfreire, 10 years ago

I don't want to be too dramatic, but I think I finally fixed fatboy. Subtle bugs being compounded. This version is much better (at least in fatboy).

I'm starting a thorough testing session and then I'll patch a rebased v9.

comment:327 by Kamedo2, 10 years ago

Sorry I couldn't reply fast. I can't wait to test!

by klaussfreire, 10 years ago

Hopefully final version of the AAC patch

comment:328 by klaussfreire, 10 years ago

Attached a new version of the patch, v9. v9 ABR performs much better than v7 in all the samples I tried, including fatboy. v9 VBR performs better too except in fatboy, I'll analyze the differences to v7 next to see why that's so. But for the time being, I think it'll be a good idea to test v9.

comment:329 by Kamedo2, 10 years ago

Thank you. The v9 is successfully running on many settings and samples.

by Kamedo2, 10 years ago

The diff of the ItCouldBeSweet, between the original and the patch v9 AAC encode, 128kbps.

by Kamedo2, 10 years ago

The diff of the ItCouldBeSweet, between the original and the patch v9 AAC encode, 192kbps.

by Kamedo2, 10 years ago

The diff of the ItCouldBeSweet, between the original and the patch v9 AAC encode, 320kbps.

comment:330 by Kamedo2, 10 years ago

I have to say these sound very similar to the v8g patch.

comment:331 by klaussfreire, 10 years ago

Well, it is based on it, but it sounded very different to me, as it has important bugfixes. The encoded versions I mean, not the differences.

Namely, in codebook_trellis_rate/encode_window_bands_info, it was using the wrong scalefactors and that's a major bug when encoding transients.

And on the RD-reduction step, both v7 and v8g were assuming decreasing scalefactors had a predictable effect on distortion, and v9 just recomputes distortion, which proved to be a big improvement on VBR.

That pretty much singlehandedly fixed the biggest issues in fatboy.
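
The point about recomputing distortion instead of predicting it can be shown with a toy example. The sketch below uses a plain uniform quantizer, which is an assumption made for brevity (AAC's quantizer is non-uniform), and band_distortion() and round_nearest() are hypothetical helper names, not functions from the encoder.

```c
#include <stddef.h>

/* Round to the nearest integer, halves away from zero, without libm. */
static long round_nearest(double v)
{
    return (long)(v < 0 ? v - 0.5 : v + 0.5);
}

/* Squared error left after quantizing coef[] with a uniform step size. */
static double band_distortion(const double *coef, size_t n, double step)
{
    double sum = 0.0;

    for (size_t i = 0; i < n; i++) {
        double rec = step * (double)round_nearest(coef[i] / step);
        double err = coef[i] - rec;
        sum += err * err;
    }
    return sum;
}
```

For a single coefficient of 3.0, a step of 3.0 reconstructs it exactly (distortion 0), while the finer step of 2.0 reconstructs 4.0 (distortion 1). A finer step is therefore not guaranteed to lower the distortion of a given band, which is why measuring it again is safer than extrapolating.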

comment:332 by klaussfreire, 10 years ago

Though I am considering rolling back one of v8g changes to the RD-reduction step compared to v7, since I believe the fixes in v9 make that change obsolete. I'll do that and a round of testing and let you know.

comment:333 by Kamedo2, 10 years ago

The v9 anmr still crashes on some rare samples. I will provide details later.

ffmpeg_r67961_v9 -i in.flac -c:a aac -strict experimental -q:a 1 -aac_coder anmr out.mp4

comment:334 by klaussfreire, 10 years ago

I don't get the crashes on the earlier samples (ffmpeg_anmr_errorX). If you can attach the rare samples I can debug.

comment:335 by Kamedo2, 10 years ago

sine_tester.flac causes an assertion error at all -b:a settings and at -q:a 2.839 or higher.
http://clt.odu.edu/dropbox/ffmpeg/input.mp4 also causes an assertion error at -b:a 177k or higher.
They all cause

Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:399

by Kamedo2, 10 years ago

Attachment: ffmpeg_anmr_error3.flac added

EBU–TECH 3253 Sound Quality Assessment Material recordings for subjective tests, 3 Electronic gong 100 Hz.(sine wave)

comment:336 by Kamedo2, 10 years ago

The comparison of the v8g patch and the new v9 patch.

The commandline is

ffmpeg -i in.wav -c:a aac -strict experimental -q:a %v out.mp4

The v8g and v9 yield almost the same bitrate.
http://i59.tinypic.com/k2sfo.png

The v8g (the triangle plots) is about 2x faster than the v9.
http://i61.tinypic.com/qoeuds.png

by Kamedo2, 10 years ago

Attachment: FFmpeg_anmr_error4.flac added

This causes the assertion error on both -b:a 128k and -q:a 1. 4000Hz sine wave, stereo.

by Kamedo2, 10 years ago

Attachment: FFmpeg_anmr_error5.flac added

This causes the assertion error on both -b:a 128k and -q:a 1. 11000Hz sine wave, stereo.

comment:337 by Kamedo2, 10 years ago

The v9 anmr acts fine on most samples, but it is vulnerable to sine waves.

comment:338 by Kamedo2, 10 years ago

Please consider following cehoyos's suggestion in comment:292.

libavcodec/aaccoder.c line 344 368 394 714 971 1025 1274 1292
libavcodec/aacpsy.c line 989

comment:339 by Kamedo2, 10 years ago

I checked the aacenc.c:399 assertion error.

ffmpeg_r67961_v9_with_printf -y -i file.flac -c:a aac -strict experimental -aac_coder anmr -b:a 128k out.mp4

file                rate   w   i  sf_idx[w*16+i]  off  SCALE_DIFF_ZERO  diff
sine_tester         16k    0  10             186  118               60   128
                    128k   0  18              89  150               60    -1
                    320k   0   4              87  152               60    -5
                    q1.5   0  10             140   77               60   123
                    q4     0  12              94  156               60    -2
                    q8     0  12              86  150               60    -4
ffmpeg_anmr_error3  128k   0  12              99  163               60    -4
                    256k   0  22             105  173               60    -8
                    64k    0  32             117  180               60    -3
                    q0.5   0  22             119  180               60    -1
                    q1     7  10             104  171               60    -7
                    q2     0  12             109  173               60    -4
ffmpeg_anmr_error4  64k    0  23             185  116               60   129
                    128k   0  23             170  109               60   121
                    192k   0  23             164   98               60   126
                    320k   0  23             161   99               60   122
                    q0.5   0  23             179  118               60   121
                    q1     0  23             172  108               60   124
ffmpeg_anmr_error5  64k    0  34             182  114               60   128
                    128k   0  34             182  115               60   127
                    192k   0  34             169  108               60   121
                    384k   0  40             103  166               60    -3
                    q1     0  34             183  119               60   124
                    q2     0  34             181  120               60   121
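
The arithmetic behind each row can be checked with a few lines of C. This is only a sketch that mirrors the expression in encode_scale_factors(); the value 60 for SCALE_DIFF_ZERO is the one printed in the table above, while SCALE_DIFF_MAX and the two helper names are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>

#define SCALE_DIFF_ZERO 60    /* value printed in the table above */
#define SCALE_DIFF_MAX  120   /* hypothetical name for the assertion's upper bound */

/* Delta that encode_scale_factors() uses as the codebook index. */
static int scalefactor_delta(int sf_idx, int off)
{
    return sf_idx - off + SCALE_DIFF_ZERO;
}

/* True when the delta is encodable, i.e. the assertion would hold. */
static bool delta_is_encodable(int sf_idx, int off)
{
    int diff = scalefactor_delta(sf_idx, off);

    return diff >= 0 && diff <= SCALE_DIFF_MAX;
}
```

For the first sine_tester row, scalefactor_delta(186, 118) gives 128, and for the second, scalefactor_delta(89, 150) gives -1. Both fall outside 0..120, so consecutive non-zero bands may differ by at most 60 in either direction before the assertion fires.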

comment:340 by klaussfreire, 10 years ago

Sorry for the silence, the assertion error has been fixed already. Some bug I couldn't find with encode_window_bands_info, but I just made anmr use the other (codebook_trellis_rate? something like that) which is very similar but better (avoids adding holes or picking a codebook that cannot encode the coefficients, for instance).

I'm currently doing a frankenstein between v9 and v7, to get v9 close to v7 in stability only with better quality. I'm close to posting v9b, hopefully a version that can be sliced and committed (if v7 was acceptable v9b should be as well)

comment:341 by Kamedo2, 10 years ago

From the itCouldBeSweet and fatboy sample, and the diff of them, I have an impression that the instability only happens at short windows. I hope it's fixed in the next patch.

comment:342 by klaussfreire, 10 years ago

Actually, the instability happens on transient high bit demand (i.e., when a transient signal induces the bit allocator into allocating bits from the reservoir). That results in lower quality in the following windows, and twoloop fails to avoid holes in those situations, creating holes that shouldn't be there, hence the instability.

comment:343 by Kamedo2, 10 years ago

Klaussfreire, thank you for the explanation!

comment:344 by Kamedo2, 10 years ago

How is the development going?

comment:345 by klaussfreire, 10 years ago

Actually quite well. I might post a patch tonight.

by klaussfreire, 10 years ago

v9b version, based on v9, matched behavior against v7

comment:346 by klaussfreire, 10 years ago

Attached a v9b. I'm still doing testing on this, but preliminary comparisons against v7 are strong. The instability of v9 seems to be eliminated, although I haven't tested very low bitrates (96kbps on stereo being the lowest for now).

ANMR doesn't crash, but other than that I haven't worked much on it.

in reply to:  338 comment:347 by klaussfreire, 10 years ago

Replying to Kamedo2:

Please consider following cehoyos's suggestion in comment:292.

libavcodec/aaccoder.c line 344 368 394 714 971 1025 1274 1292
libavcodec/aacpsy.c line 989

Don't worry about this, I'll make sure to clean up everything when I make the incremental patches at the end.

comment:348 by Kamedo2, 10 years ago

The v9b patch is working just fine. I'm currently testing many diverse samples and settings.

http://i62.tinypic.com/1z4vpxv.png
http://i58.tinypic.com/2882q82.png

comment:349 by Kamedo2, 10 years ago

The -aac_coder options fast and faac crash with both -b:a and -q:a.

ffmpeg_r68337_v9b -y -i ffmpeg_anmr_error5.flac -c:a aac -strict experimental -q:a 1 -aac_coder faac out.wav
ffmpeg_r68337_v9b -y -i ffmpeg_anmr_error4.flac -c:a aac -strict experimental -b:a 128k -aac_coder fast out.wav

by Kamedo2, 10 years ago

Attachment: FFmpeg_anmr_error6.flac added

This causes the assertion error on -b:a 96k, 128k, 160k on v9b. -q:a is OK. 9000Hz sine wave, stereo.

comment:350 by klaussfreire, 10 years ago

Neither has been maintained, so I guess that's to be expected.

comment:351 by klaussfreire, 10 years ago

I'm not sure whether to fix or scrap fast and faac. Fast would be nice to rewrite using twoloop with quick and dirty parameters (far fewer iterations, for instance); as for faac, I'm not sure how it compares against the others.

Does anyone know what faac's rationale is? Is it discardable?

comment:352 by klaussfreire, 10 years ago

I have some nice improvements for VBR half-baked, just making sure there are no regressions. Just wanted to mention in case you notice VBR not being up to par with ABR.

by Kamedo2, 10 years ago

Attachment: FFmpeg_anmr_error7.flac added

This causes the assertion error on -b:a 192k on v9b. Dave Matthews Band - Crush, http://www.hydrogenaud.io/forums/index.php?showtopic=102079&hl=

in reply to:  351 comment:353 by Kamedo2, 10 years ago

Replying to klaussfreire:

I'm not sure whether to fix or scrap fast and faac. Fast would be nice to rewrite using twoloop with quick and dirty parameters (far fewer iterations, for instance); as for faac, I'm not sure how it compares against the others.

Does anyone know what faac's rationale is? Is it discardable?

Faac on v9 is extremely bad. I don't think we will need many -aac_coder variants in the final version. The overwhelming majority will use the default setting.

comment:354 by klaussfreire, 10 years ago

Alright, so the plan is to rewrite fast, and scrap faac.

I'll look into the assertion error. That's for anmr only... right?

Still, I'd like some validation on v9b twoloop if possible? If it's performing acceptably I'd like to start splitting and pushing the patch.

comment:355 by Kamedo2, 10 years ago

Klaussfreire, sorry for the slow response.

Yes, the assertion error only happens when the -aac_coder is anmr.
FFmpeg_anmr_error6 and FFmpeg_anmr_error7 crash at the beginning of the file, but I've found 4 music tracks (out of hundreds) that crash in the middle of the file.
I failed to reproduce the results after cutting to distributable short clips.

Also I have listened to hundreds of songs encoded by v9b anmr and twoloop and I have found no apparent problems so far.

comment:356 by Kamedo2, 10 years ago

I investigated the assertion error on the v9b patch with -c:a aac -strict experimental -aac_coder anmr:

Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:399

The instrumented function and the values it printed are below.

/**
 * Encode scalefactors.
 */
static void encode_scale_factors(AVCodecContext *avctx, AACEncContext *s,
                                 SingleChannelElement *sce)
{
    int off = sce->sf_idx[0], diff;
    int i, w;

    for (w = 0; w < sce->ics.num_windows; w += sce->ics.group_len[w]) {
        for (i = 0; i < sce->ics.max_sfb; i++) {
            if (!sce->zeroes[w*16 + i]) {
                diff = sce->sf_idx[w*16 + i] - off + SCALE_DIFF_ZERO;
                /* Debug aid: print one table row for any out-of-range delta. */
                if (diff < 0 || diff > 120)
                    fprintf(stderr, "|| || k|| %d|| %d|| %d|| %d|| %d|| %d|| %d|| %d||\n",
                            sce->ics.num_windows, sce->ics.max_sfb, w, i,
                            sce->sf_idx[w*16 + i], off, SCALE_DIFF_ZERO, diff);
                av_assert0(diff >= 0 && diff <= 120);
                off = sce->sf_idx[w*16 + i];
                put_bits(&s->pb, ff_aac_scalefactor_bits[diff], ff_aac_scalefactor_code[diff]);
            }
        }
    }
}
audio file          rate  wmax  imax   w   i  sf_idx[w*16+i]  off  SCALE_DIFF_ZERO  diff
FFmpeg_anmr_error6  82k      1    38   0  31             187  125               60   122
FFmpeg_anmr_error6  84k      1    38   0  33             124  185               60    -1
FFmpeg_anmr_error6  88k      1    39   0  31             187  125               60   122
FFmpeg_anmr_error6  96k      1    40   0  31             187  125               60   122
FFmpeg_anmr_error6  112k     1    41   0  31             172  111               60   121
FFmpeg_anmr_error6  128k     1    42   0  31             169  108               60   121
FFmpeg_anmr_error6  144k     1    43   0  31             169  107               60   122
FFmpeg_anmr_error6  160k     1    43   0  31             170  107               60   123
FFmpeg_anmr_error6  176k     1    44   0  31             173  111               60   122
FFmpeg_anmr_error6  184k     1    44   0  31             171  109               60   122
FFmpeg_anmr_error6  186k     1    44   0  32             181  120               60   121
FFmpeg_anmr_error7  192k     8    12   7   0             181  119               60   122
FFmpeg_anmr_error7  196k     8    12   7   0             181  119               60   122
FFmpeg_anmr_error7  200k     8    12   7   0             181  119               60   122
FFmpeg_anmr_error7  208k     8    12   7   0             181  119               60   122
FFmpeg_anmr_error7  212k     8    12   7   0             181  119               60   122
Koimusume no rondo  160k     8    12   6   0             183  122               60   121
Koimusume no rondo  192k     8    12   3   0             183  121               60   122
Koimusume no rondo  224k     8    13   5   0             183  122               60   121
Sphere no hane      160k     8    12   6   0             187  126               60   121

wmax : sce->ics.num_windows
imax : sce->ics.max_sfb
Sorry that I cannot distribute those last two large files.

comment:357 by klaussfreire, 10 years ago

Alright, I could reproduce the issue, I just need to find how to fix it.

comment:358 by Kamedo2, 10 years ago

Very rare, but foobar2000 v1.3.1 outputs this error on v9b anmr 320kbps. The track plays just fine.

File verification error: Decoding error: Unsupported format or corrupted file, frame: 576 of 14855

comment:359 by Hendrik, 10 years ago

Any recent developments? I was almost hoping we would start merging things by now after some previous comments. =)

Anything we can help with to expedite things?

comment:360 by klaussfreire, 10 years ago

Well, I don't like the broken state of anmr. Those errors mentioned upthread, I could reproduce them alright, but not fix them. I've been busy with RL stuff these days too, so I had very little time for real debugging work there.

Twoloop seems stable enough, so maybe if people don't mind anmr's breakage, we could start the merging.

I would prefer to fix that at least before merging. I had thought it would be a quick fix, but it's proving to be rather not.

comment:361 by Kamedo2, 10 years ago

I investigated the v9b assertion error that occurs when aac_coder is set to anmr:

ffmpeg.exe -y -i audio_file -c:a aac -strict experimental -aac_coder anmr -b:a xxk out.mp4

Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:399

This time I looked at the full behavior of encode_scale_factors().
I don't know why, but the numerical values differ slightly from those in comment:356.

sce->sf_idx[w*16 + i] spikes when the assertion error happens.
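
A spike like this can be located without running the encoder by replaying the delta computation over a dumped row. The sketch below assumes a single long window (w = 0) and uses a hypothetical helper name, first_bad_band(); it skips zeroed bands the same way encode_scale_factors() does.

```c
#define SCALE_DIFF_ZERO 60   /* value reported in comment:356 */

/*
 * Return the index of the first band whose scalefactor delta falls
 * outside 0..120, or -1 if every delta is encodable. Bands flagged in
 * zeroes[] are skipped and do not update the running offset.
 */
static int first_bad_band(const int *sf_idx, const unsigned char *zeroes,
                          int nb_bands)
{
    int off = sf_idx[0];

    for (int i = 0; i < nb_bands; i++) {
        int diff;

        if (zeroes[i])
            continue;
        diff = sf_idx[i] - off + SCALE_DIFF_ZERO;
        if (diff < 0 || diff > 120)
            return i;
        off = sf_idx[i];
    }
    return -1;
}
```

Fed the 39 sf_idx values printed for the 88k run of FFmpeg_anmr_error6 later in this comment, with all zero flags clear, it returns 31, the band where the value jumps from 124 to 186.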

audio file          rate  wmax  imax   w   i  sf_idx[w*16+i]  off  status
FFmpeg_anmr_error6  82k      1    38   0  31             187  125  reproduction failed
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "FFmpeg_anmr_error6.flac" -c:a aac -strict experimental -aac_coder anmr -b:a 82k "kfanmr.mp4"
ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers
  built on Dec  9 2014 23:03:19 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extra-cflags='-march=nocona' --optflags=-O2
  libavutil      54. 15.100 / 54. 15.100
  libavcodec     56. 14.100 / 56. 14.100
  libavformat    56. 15.103 / 56. 15.103
  libavdevice    56.  3.100 / 56.  3.100
  libavfilter     5.  2.103 /  5.  2.103
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Input #0, flac, from 'FFmpeg_anmr_error6.flac':
  Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'kfanmr.mp4':
  Metadata:
    encoder         : Lavf56.15.103
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 82 kb/s
    Metadata:
      encoder         : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
size=     103kB time=00:00:10.00 bitrate=  84.1kbits/s
video:0kB audio:100kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.416551%
audio file          rate  wmax  imax   w   i  sf_idx[w*16+i]  off  status
FFmpeg_anmr_error6  84k      1    38   0  33             124  185  reproduction failed
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "FFmpeg_anmr_error6.flac" -c:a aac -strict experimental -aac_coder anmr -b:a 84k "kfanmr.mp4"
Input #0, flac, from 'FFmpeg_anmr_error6.flac':
  Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'kfanmr.mp4':
  Metadata:
    encoder         : Lavf56.15.103
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 84 kb/s
    Metadata:
      encoder         : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
size=     105kB time=00:00:10.00 bitrate=  86.1kbits/s
video:0kB audio:103kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.359015%
audio file          rate  wmax  imax   w   i  sf_idx[w*16+i]  off  status
FFmpeg_anmr_error6  88k      1    39   0  31             187  125  reproduced
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "FFmpeg_anmr_error6.flac" -c:a aac -strict experimental -aac_coder anmr -b:a 88k "kfanmr.mp4"
Input #0, flac, from 'FFmpeg_anmr_error6.flac':
  Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'kfanmr.mp4':
  Metadata:
    encoder         : Lavf56.15.103
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 88 kb/s
    Metadata:
      encoder         : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
w    :  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
i    :  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38
zeros:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

sf_idx 99 99 97 97 97 96 96 96 96100100100117116116122116121120121120120120122120124121123124122124186156125125125123123118
diff :............................................................................................. +2.....................
     : 60 60 58 60 60 59 60 60 60 64 60 60 77 59 60 66 54 65 59 61 59 60 60 62 58 64 57 62 61 58 62122 30 29 60 60 58 60 55

|| || k|| 1|| 39|| 0|| 31|| 186|| 124|| 60|| 122||
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
audio file          rate  wmax  imax   w   i  sf_idx[w*16+i]  off  status
FFmpeg_anmr_error6  96k      1    40   0  31             187  125  reproduced
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "FFmpeg_anmr_error6.flac" -c:a aac -strict experimental -aac_coder anmr -b:a 96k "kfanmr.mp4"
Input #0, flac, from 'FFmpeg_anmr_error6.flac':
  Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'kfanmr.mp4':
  Metadata:
    encoder         : Lavf56.15.103
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 96 kb/s
    Metadata:
      encoder         : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
w    :  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
i    :  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
zeros:  0  0  1  0  1  0  1  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
              #     #     #                 #
sf_idx114107107106106106106105105112112112112111112115108107109111110110116117119119120121122125125187164129126124121121120120
diff :......   ...   ...   ...............   ...................................................... +2........................
     : 60 53    59    60    59 60 67 60 60    59 61 63 53 59 62 62 59 60 66 61 62 60 61 61 61 63 60122 37 25 57 58 57 60 59 60

|| || k|| 1|| 40|| 0|| 31|| 187|| 125|| 60|| 122||
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
audio file          rate  wmax  imax   w   i  sf_idx[w*16+i]  off  status
FFmpeg_anmr_error6  184k     1    44   0  31             171  109  reproduced
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "FFmpeg_anmr_error6.flac" -c:a aac -strict experimental -aac_coder anmr -b:a 184k "kfanmr.mp4"
Input #0, flac, from 'FFmpeg_anmr_error6.flac':
  Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'kfanmr.mp4':
  Metadata:
    encoder         : Lavf56.15.103
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 184 kb/s
    Metadata:
      encoder         : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
w    :  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
i    :  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43
zeros:  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
           #
sf_idx 98 98 98 97 97 96 96 97 97 99 99 99 98101100100 99 98100101 99 99 99 99102108110115113105109171156110106106110113115118117114113114
diff :...   ....................................................................................... +2....................................
     : 60    60 59 60 59 60 61 60 62 60 60 59 63 59 60 59 59 62 61 58 60 60 60 63 66 62 65 58 52 64122 45 14 56 60 64 63 62 63 59 57 59 61

|| || k|| 1|| 44|| 0|| 31|| 171|| 109|| 60|| 122||
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
audio file          rate  wmax  imax   w   i  sf_idx[w*16+i]  off  status
FFmpeg_anmr_error6  186k     1    44   0  32             181  120  reproduced
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "FFmpeg_anmr_error6.flac" -c:a aac -strict experimental -aac_coder anmr -b:a 186k "kfanmr.mp4"
Input #0, flac, from 'FFmpeg_anmr_error6.flac':
  Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'kfanmr.mp4':
  Metadata:
    encoder         : Lavf56.15.103
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 186 kb/s
    Metadata:
      encoder         : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
w    :  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
i    :  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43
zeros:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

sf_idx 60120120114114114120120 81118120120120120120120120120120120120120120120120120120120120120120120181120120120120120120120120120120120
diff :................................................................................................ +1 -1..............................
     : 60120 60 54 60 60 66 60 21 97 62 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60121 -1 60 60 60 60 60 60 60 60 60 60

|| || k|| 1|| 44|| 0|| 32|| 181|| 120|| 60|| 121||
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
||audio file||rate||wmax||imax||w||i||sf_idx[w*16+i]||off||status||
||FFmpeg_anmr_error7||192k||8||12||7||0||181||119||reproduced||
ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers
  built on Dec  9 2014 23:03:19 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extra-cflags='-march=nocona' --optflags=-O2
  libavutil      54. 15.100 / 54. 15.100
  libavcodec     56. 14.100 / 56. 14.100
  libavformat    56. 15.103 / 56. 15.103
  libavdevice    56.  3.100 / 56.  3.100
  libavfilter     5.  2.103 /  5.  2.103
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Input #0, flac, from 'FFmpeg_anmr_error7.flac':
  Metadata:
    ALBUM           : Before These Crowded Streets
    ARTIST          : Dave Matthews Band
    GENRE           : Rock
    TITLE           : Crush
    track           : 8
  Duration: 00:00:25.50, start: 0.000000, bitrate: 777 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'kfanmr.mp4':
  Metadata:
    ALBUM           : Before These Crowded Streets
    ARTIST          : Dave Matthews Band
    GENRE           : Rock
    TITLE           : Crush
    track           : 8
    encoder         : Lavf56.15.103
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 192 kb/s
    Metadata:
      encoder         : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
w    :  0  0  0  0  0  0  0  0  0  0  0  0  1  1  1  1  1  1  1  1  1  1  1  1  4  4  4  4  4  4  4  4  4  4  4  4  7  7  7  7  7  7  7  7  7  7  7  7
i    :  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11
zeros:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

sf_idx140141141138135135135133131129127127149124123123122116116126118120120118151129125121117116116115113120119119181153148143141137134131128125124121
diff :............................................................................................................ +2.................................
     : 60 61 60 57 57 60 60 58 58 58 58 60 82 35 59 60 59 54 60 70 52 62 60 58 93 38 56 56 56 59 60 59 58 67 59 60122 32 55 55 58 56 57 57 57 57 59 57

|| || k|| 8|| 12|| 7|| 0|| 181|| 119|| 60|| 122||
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
||audio file||rate||wmax||imax||w||i||sf_idx[w*16+i]||off||status||
||FFmpeg_anmr_error7||196k||8||12||7||0||181||119||reproduced||
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "FFmpeg_anmr_error7.flac" -c:a aac -strict experimental -aac_coder anmr -b:a 196k "kfanmr.mp4"
ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers
  built on Dec  9 2014 23:03:19 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extra-cflags='-march=nocona' --optflags=-O2
  libavutil      54. 15.100 / 54. 15.100
  libavcodec     56. 14.100 / 56. 14.100
  libavformat    56. 15.103 / 56. 15.103
  libavdevice    56.  3.100 / 56.  3.100
  libavfilter     5.  2.103 /  5.  2.103
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Input #0, flac, from 'FFmpeg_anmr_error7.flac':
  Metadata:
    ALBUM           : Before These Crowded Streets
    ARTIST          : Dave Matthews Band
    GENRE           : Rock
    TITLE           : Crush
    track           : 8
  Duration: 00:00:25.50, start: 0.000000, bitrate: 777 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'kfanmr.mp4':
  Metadata:
    ALBUM           : Before These Crowded Streets
    ARTIST          : Dave Matthews Band
    GENRE           : Rock
    TITLE           : Crush
    track           : 8
    encoder         : Lavf56.15.103
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 196 kb/s
    Metadata:
      encoder         : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
w    :  0  0  0  0  0  0  0  0  0  0  0  0  1  1  1  1  1  1  1  1  1  1  1  1  4  4  4  4  4  4  4  4  4  4  4  4  7  7  7  7  7  7  7  7  7  7  7  7
i    :  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11
zeros:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

sf_idx140141141138135135135133131129127127149124123123122116116126118120120118151129125121117116116115113121121119181153148143141137134131128125124121
diff :............................................................................................................ +2.................................
     : 60 61 60 57 57 60 60 58 58 58 58 60 82 35 59 60 59 54 60 70 52 62 60 58 93 38 56 56 56 59 60 59 58 68 60 58122 32 55 55 58 56 57 57 57 57 59 57

|| || k|| 8|| 12|| 7|| 0|| 181|| 119|| 60|| 122||
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
||audio file||rate||wmax||imax||w||i||sf_idx[w*16+i]||off||status||
||FFmpeg_anmr_error7||200k||8||12||7||0||181||119||reproduced||
ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers
  built on Dec  9 2014 23:03:19 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extra-cflags='-march=nocona' --optflags=-O2
  libavutil      54. 15.100 / 54. 15.100
  libavcodec     56. 14.100 / 56. 14.100
  libavformat    56. 15.103 / 56. 15.103
  libavdevice    56.  3.100 / 56.  3.100
  libavfilter     5.  2.103 /  5.  2.103
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Input #0, flac, from 'FFmpeg_anmr_error7.flac':
  Metadata:
    ALBUM           : Before These Crowded Streets
    ARTIST          : Dave Matthews Band
    GENRE           : Rock
    TITLE           : Crush
    track           : 8
  Duration: 00:00:25.50, start: 0.000000, bitrate: 777 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'kfanmr.mp4':
  Metadata:
    ALBUM           : Before These Crowded Streets
    ARTIST          : Dave Matthews Band
    GENRE           : Rock
    TITLE           : Crush
    track           : 8
    encoder         : Lavf56.15.103
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 200 kb/s
    Metadata:
      encoder         : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
w    :  0  0  0  0  0  0  0  0  0  0  0  0  1  1  1  1  1  1  1  1  1  1  1  1  4  4  4  4  4  4  4  4  4  4  4  4  7  7  7  7  7  7  7  7  7  7  7  7
i    :  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11
zeros:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

sf_idx136136136138135135135133131129127127149124123122122116116114112120120118151129125119117116116114113120121119181150148143141137134131128125124121
diff :............................................................................................................ +2.................................
     : 60 60 60 62 57 60 60 58 58 58 58 60 82 35 59 59 60 54 60 58 58 68 60 58 93 38 56 54 58 59 60 58 59 67 61 58122 29 58 55 58 56 57 57 57 57 59 57

|| || k|| 8|| 12|| 7|| 0|| 181|| 119|| 60|| 122||
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
||audio file||rate||wmax||imax||w||i||sf_idx[w*16+i]||off||status||
||FFmpeg_anmr_error7||208k||8||12||7||0||181||119||reproduced||
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "FFmpeg_anmr_error7.flac" -c:a aac -strict experimental -aac_coder anmr -b:a 208k "kfanmr.mp4"
ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers
  built on Dec  9 2014 23:03:19 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extra-cflags='-march=nocona' --optflags=-O2
  libavutil      54. 15.100 / 54. 15.100
  libavcodec     56. 14.100 / 56. 14.100
  libavformat    56. 15.103 / 56. 15.103
  libavdevice    56.  3.100 / 56.  3.100
  libavfilter     5.  2.103 /  5.  2.103
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Input #0, flac, from 'FFmpeg_anmr_error7.flac':
  Metadata:
    ALBUM           : Before These Crowded Streets
    ARTIST          : Dave Matthews Band
    GENRE           : Rock
    TITLE           : Crush
    track           : 8
  Duration: 00:00:25.50, start: 0.000000, bitrate: 777 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'kfanmr.mp4':
  Metadata:
    ALBUM           : Before These Crowded Streets
    ARTIST          : Dave Matthews Band
    GENRE           : Rock
    TITLE           : Crush
    track           : 8
    encoder         : Lavf56.15.103
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 208 kb/s
    Metadata:
      encoder         : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
w    :  0  0  0  0  0  0  0  0  0  0  0  0  1  1  1  1  1  1  1  1  1  1  1  1  4  4  4  4  4  4  4  4  4  4  4  4  7  7  7  7  7  7  7  7  7  7  7  7
i    :  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11
zeros:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

sf_idx136136137137134134134133131129127127149124123122118116116114112120120118148129125119117115116114113120121119181150148143141137134131128125124121
diff :............................................................................................................ +2.................................
     : 60 60 61 60 57 60 60 59 58 58 58 60 82 35 59 59 56 58 60 58 58 68 60 58 90 41 56 54 58 58 61 58 59 67 61 58122 29 58 55 58 56 57 57 57 57 59 57

|| || k|| 8|| 12|| 7|| 0|| 181|| 119|| 60|| 122||
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
||audio file||rate||wmax||imax||w||i||sf_idx[w*16+i]||off||status||
||FFmpeg_anmr_error7||212k||8||12||7||0||181||119||reproduced||
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "FFmpeg_anmr_error7.flac" -c:a aac -strict experimental -aac_coder anmr -b:a 212k "kfanmr.mp4"
ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers
  built on Dec  9 2014 23:03:19 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extra-cflags='-march=nocona' --optflags=-O2
  libavutil      54. 15.100 / 54. 15.100
  libavcodec     56. 14.100 / 56. 14.100
  libavformat    56. 15.103 / 56. 15.103
  libavdevice    56.  3.100 / 56.  3.100
  libavfilter     5.  2.103 /  5.  2.103
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Input #0, flac, from 'FFmpeg_anmr_error7.flac':
  Metadata:
    ALBUM           : Before These Crowded Streets
    ARTIST          : Dave Matthews Band
    GENRE           : Rock
    TITLE           : Crush
    track           : 8
  Duration: 00:00:25.50, start: 0.000000, bitrate: 777 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'kfanmr.mp4':
  Metadata:
    ALBUM           : Before These Crowded Streets
    ARTIST          : Dave Matthews Band
    GENRE           : Rock
    TITLE           : Crush
    track           : 8
    encoder         : Lavf56.15.103
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 212 kb/s
    Metadata:
      encoder         : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
w    :  0  0  0  0  0  0  0  0  0  0  0  0  1  1  1  1  1  1  1  1  1  1  1  1  4  4  4  4  4  4  4  4  4  4  4  4  7  7  7  7  7  7  7  7  7  7  7  7
i    :  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11
zeros:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

sf_idx136136137137135135135133131129127127149124123122118116116114112120120118151129125119117116116114113120121119181150148143141137134131128125124121
diff :............................................................................................................ +2.................................
     : 60 60 61 60 58 60 60 58 58 58 58 60 82 35 59 59 56 58 60 58 58 68 60 58 93 38 56 54 58 59 60 58 59 67 61 58122 29 58 55 58 56 57 57 57 57 59 57

|| || k|| 8|| 12|| 7|| 0|| 181|| 119|| 60|| 122||
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
||audio file||rate||wmax||imax||w||i||sf_idx[w*16+i]||off||status||
||Koimusume no rondo||160k||8||12||6||0||183||122||reproduced||
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "Koimusume_no_rondo.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 160k "kfanmr.mp4"
ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers
  built on Dec  9 2014 23:03:19 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extra-cflags='-march=nocona' --optflags=-O2
  libavutil      54. 15.100 / 54. 15.100
  libavcodec     56. 14.100 / 56. 14.100
  libavformat    56. 15.103 / 56. 15.103
  libavdevice    56.  3.100 / 56.  3.100
  libavfilter     5.  2.103 /  5.  2.103
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Guessed Channel Layout for  Input Stream #0.0 : stereo
Input #0, wav, from 'Koimusume_no_rondo.wav':
  Duration: 00:02:51.51, bitrate: 1411 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 2 channels, s16, 1411 kb/s
Output #0, mp4, to 'kfanmr.mp4':
  Metadata:
    encoder         : Lavf56.15.103
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 160 kb/s
    Metadata:
      encoder         : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> aac (native))
Press [q] to stop, [?] for help
w    :  0  0  0  0  0  0  0  0  0  0  0  0  3  3  3  3  3  3  3  3  3  3  3  3  6  6  6  6  6  6  6  6  6  6  6  6  7  7  7  7  7  7  7  7  7  7  7  7
i    :  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11
zeros:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

sf_idx158158158149151156147147151146123124169166164159154152155154149144122122183177168163161159155150150147123123178173168159164157158158170154144140
diff :........................................................................ +1.....................................................................
     : 60 60 60 51 62 65 51 60 64 55 37 61105 57 58 55 55 58 63 59 55 55 38 60121 54 51 55 58 58 56 55 60 57 36 60115 55 55 51 65 53 61 60 72 44 50 56

|| || k|| 8|| 12|| 6|| 0|| 183|| 122|| 60|| 121||
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
||audio file||rate||wmax||imax||w||i||sf_idx[w*16+i]||off||status||
||Koimusume no rondo||192k||8||12||3||0||183||121||reproduced||
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "Koimusume_no_rondo.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 192k "kfanmr.mp4"
ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers
  built on Dec  9 2014 23:03:19 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extra-cflags='-march=nocona' --optflags=-O2
  libavutil      54. 15.100 / 54. 15.100
  libavcodec     56. 14.100 / 56. 14.100
  libavformat    56. 15.103 / 56. 15.103
  libavdevice    56.  3.100 / 56.  3.100
  libavfilter     5.  2.103 /  5.  2.103
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Guessed Channel Layout for  Input Stream #0.0 : stereo
Input #0, wav, from 'Koimusume_no_rondo.wav':
  Duration: 00:02:51.51, bitrate: 1411 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 2 channels, s16, 1411 kb/s
Output #0, mp4, to 'kfanmr.mp4':
  Metadata:
    encoder         : Lavf56.15.103
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 192 kb/s
    Metadata:
      encoder         : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> aac (native))
Press [q] to stop, [?] for help
w    :  0  0  0  0  0  0  0  0  0  0  0  0  3  3  3  3  3  3  3  3  3  3  3  3  4  4  4  4  4  4  4  4  4  4  4  4  7  7  7  7  7  7  7  7  7  7  7  7
i    :  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11
zeros:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

sf_idx160153153153151151148143146144121121183166166163157155153156161156124123162161158157155151151160164151124120170168159159160161157161163154129123
diff :.................................... +2.........................................................................................................
     : 60 53 60 60 58 60 57 55 63 58 37 60122 43 60 57 54 58 58 63 65 55 28 59 99 59 57 59 58 56 60 69 64 47 33 56110 58 51 60 61 61 56 64 62 51 35 54

|| || k|| 8|| 12|| 3|| 0|| 183|| 121|| 60|| 122||
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
||audio file||rate||wmax||imax||w||i||sf_idx[w*16+i]||off||status||
||Koimusume no rondo||224k||8||13||5||0||183||122||reproduced||
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "Koimusume_no_rondo.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 224k "kfanmr.mp4"
ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers
  built on Dec  9 2014 23:03:19 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extra-cflags='-march=nocona' --optflags=-O2
  libavutil      54. 15.100 / 54. 15.100
  libavcodec     56. 14.100 / 56. 14.100
  libavformat    56. 15.103 / 56. 15.103
  libavdevice    56.  3.100 / 56.  3.100
  libavfilter     5.  2.103 /  5.  2.103
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Guessed Channel Layout for  Input Stream #0.0 : stereo
Input #0, wav, from 'Koimusume_no_rondo.wav':
  Duration: 00:02:51.51, bitrate: 1411 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 2 channels, s16, 1411 kb/s
Output #0, mp4, to 'kfanmr.mp4':
  Metadata:
    encoder         : Lavf56.15.103
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 224 kb/s
    Metadata:
      encoder         : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> aac (native))
Press [q] to stop, [?] for help
w    :  0  0  0  0  0  0  0  0  0  0  0  0  0  3  3  3  3  3  3  3  3  3  3  3  3  3  5  5  5  5  5  5  5  5  5  5  5  5  5  6  6  6  6  6  6  6  6  6  6  6  6  6
i    :  0  1  2  3  4  5  6  7  8  9 10 11 12  0  1  2  3  4  5  6  7  8  9 10 11 12  0  1  2  3  4  5  6  7  8  9 10 11 12  0  1  2  3  4  5  6  7  8  9 10 11 12
zeros:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

sf_idx149149146149146146143142135139134133125157164154149158154151145146133119119122183169169158163152151151145136121118127173162164153153152150149151151121117124
diff :.............................................................................. +1...........................................................................
     : 60 60 57 63 57 60 57 59 53 64 55 59 52 92 67 50 55 69 56 57 54 61 47 46 60 63121 46 60 49 65 49 59 60 54 51 45 57 69106 49 62 49 60 59 58 59 62 60 30 56 67

|| || k|| 8|| 13|| 5|| 0|| 183|| 122|| 60|| 121||
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
||audio file||rate||wmax||imax||w||i||sf_idx[w*16+i]||off||status||
||Sphere no hane||160k||8||12||6||0||187||126||reproduced||
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "Sphere_no_hane.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 160k "kfanmr.mp4"
ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers
  built on Dec  9 2014 23:03:19 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extra-cflags='-march=nocona' --optflags=-O2
  libavutil      54. 15.100 / 54. 15.100
  libavcodec     56. 14.100 / 56. 14.100
  libavformat    56. 15.103 / 56. 15.103
  libavdevice    56.  3.100 / 56.  3.100
  libavfilter     5.  2.103 /  5.  2.103
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Guessed Channel Layout for  Input Stream #0.0 : stereo
Input #0, wav, from 'Sphere_no_hane.wav':
  Duration: 00:04:14.48, bitrate: 1411 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 2 channels, s16, 1411 kb/s
Output #0, mp4, to 'kfanmr.mp4':
  Metadata:
    encoder         : Lavf56.15.103
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 160 kb/s
    Metadata:
      encoder         : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> aac (native))
Press [q] to stop, [?] for help
w    :  0  0  0  0  0  0  0  0  0  0  0  0  3  3  3  3  3  3  3  3  3  3  3  3  6  6  6  6  6  6  6  6  6  6  6  6  7  7  7  7  7  7  7  7  7  7  7  7
i    :  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11
zeros:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

sf_idx137149143142139137136130132128132127154157156155148146134133134127126126187164163163159159158155155154148150166166166164159154154153154153153149
diff :........................................................................ +1.....................................................................
     : 60 72 54 59 57 58 59 54 62 56 64 55 87 63 59 59 53 58 48 59 61 53 59 60121 37 59 60 56 60 59 57 60 59 54 62 76 60 60 58 55 55 60 59 61 59 60 56

|| || k|| 8|| 12|| 6|| 0|| 187|| 126|| 60|| 121||
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
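
A note for anyone reading the dumps above: the `sf_idx` row and the `     :` row pack one value per band into three-character fields with no separator, so `sf_idx 60120120114` reads as 60, 120, 120, 114. Below is a minimal Python sketch for splitting such a row. The six-character label and three-character field width are inferred from the dumps themselves, not from the encoder source, and `parse_row` is a hypothetical helper name.

```python
def parse_row(line, label_width=6, field_width=3):
    """Split one fixed-width dump row into per-band integers.

    Assumes a 6-character label ("sf_idx" or "     :") followed by
    3-character fields. Fields left blank in the dump become None.
    """
    body = line[label_width:].rstrip("\n")
    fields = [body[k:k + field_width] for k in range(0, len(body), field_width)]
    return [int(f) if f.strip() else None for f in fields]


if __name__ == "__main__":
    # Start of the sf_idx row from the FFmpeg_anmr_error6 dump at 186k.
    print(parse_row("sf_idx 60120120114"))  # [60, 120, 120, 114]
```

Rows with negative values such as ` -1` still fit the three-character field, so they parse the same way.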

comment:362 by Kamedo2, 10 years ago

Here are some other assertion errors. I hope they help.

$ ffmpeg_v9b -y -i "snippet_tai4.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 40k "kfanmr2.mp4"
w    :  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
i    :  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
zeros:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

sf_idx122122121121121126126124129126129133134136138141139200176146137131123122123123123124124
diff :................................................... +1.................................
     : 60 60 59 60 60 65 60 58 65 57 63 64 61 62 62 63 58121 36 30 51 54 52 59 61 60 60 61 60


$ ffmpeg_v9b -y -i "snippet_tai4.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 64k "kfanmr2.mp4"
w    :  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
i    :  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
zeros:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

sf_idx108108107112107118125125117120130133134136138140137200157145137131123122123123121124120121121122119
diff :................................................... +3.............................................
     : 60 60 59 65 55 71 67 60 52 63 70 63 61 62 62 62 57123 17 48 52 54 52 59 61 60 58 63 56 61 60 61 57


$ ffmpeg_v9b -y -i "snippet_tai4.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 80k "kfanmr2.mp4"
w    :  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
i    :  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
zeros:  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
           #
sf_idx158158120123118118120120120120120120122127130134140178187130132127129131184186125130160188136140188144122178
diff :...   ........................................................................ -1...........................
     : 60    22 63 55 60 62 60 60 60 60 60 62 65 63 64 66 98 69  3 62 55 62 62113 62 -1 65 90 88  8 64108 16 38116


$ ffmpeg_v9b -y -i "snippet_tai4.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 128k "kfanmr2.mp4"
w    :  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
i    :  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
zeros:  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
           #
sf_idx152161109103103103118119111105105107170163133123117128123174174133129174174138177186135163129168129163124166126163129169
diff :...   .............................. +3.................................................................................
     : 60    17 54 60 60 75 61 52 54 60 62123 53 30 50 54 71 55111 60 19 56105 60 24 99 69  9 88 26 99 21 94 21102 20 97 26100


$ ffmpeg_v9b -y -i "snippet_tai4.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 160k "kfanmr2.mp4"
w    :  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
i    :  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
zeros:  0  1  0  0  0  1  1  1  1  1  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
           #           #  #  #  #  #                                      #
sf_idx159159108108 88 88 88 88 88 88 88 98102102109111152145122125114109115173156127118159148130139148126130148120131148120135148
diff :...   .........               ....................................    +4...................................................
     : 60     9 60 40                60 70 64 60 67 62101 53 37 63 49 55   124 43 31 51101 49 42 69 69 38 64 78 32 71 77 32 75 73


$ ffmpeg_v9b -y -i "snippet_tai4.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 320k "kfanmr2.mp4"
w    :  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
i    :  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44
zeros:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

sf_idx102102 99104102104101103100100 99102102104 99103134165103102102103103101101101103102103101102100102101102100101 99100100101101109110101
diff :...................................................... -2..............................................................................
     : 60 60 57 65 58 62 57 62 57 60 59 63 60 62 55 64 91 91 -2 59 60 61 60 58 60 60 62 59 61 58 61 58 62 59 61 58 61 58 61 60 61 60 68 61 51


$ ffmpeg_v9b -y -i "snippet_tai4.wav" -c:a aac -strict experimental -aac_coder anmr -q:a 0.5 "kfanmr2.mp4"
w    :  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
i    :  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
zeros:  0  1  0  0  0  1  1  1  1  1  1  1  1  1  1  0  0  0  0  0  0  0  1  1  1  0  0  1  0  0  0  0  0  0  0  0
           #           #  #  #  #  #  #  #  #  #  #                       #  #  #        #
sf_idx186186121119119119119119119119119119119119119121124130179178129123123123123175176180123134188129122171182133
diff :...    -5......                              .....................         ......   ........................
     : 60    -5 58 60                               62 63 66109 59 11 54         112 61     7 71114  1 53109 71 11


$ ffmpeg_v9b -y -i "snippet_tai4.wav" -c:a aac -strict experimental -aac_coder anmr -q:a 1 "kfanmr2.mp4"
w    :  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
i    :  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
zeros:  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
           #
sf_idx180180119123120120121122124126121163157123119119119140166127127123176132156170136165120165160165171134171128168167166167138
diff :...    -1..................................................................................................................
     : 60    -1 64 57 60 61 61 62 62 55102 54 26 56 60 60 81 86 21 60 56113 16 84 74 26 89 15105 55 65 66 23 97 17100 59 59 61 31







$ ffmpeg_v9b -y -i "snippet_tai3.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 192k "kfanmr3.mp4"
w    :  0  0  0  0  0  0  0  0  0  0  0  0  1  1  1  1  1  1  1  1  1  1  1  1  4  4  4  4  4  4  4  4  4  4  4  4  7  7  7  7  7  7  7  7  7  7  7  7
i    :  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11
zeros:  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
        #
sf_idx 91 91 90 90 88 92 93 93 93100104109 85 83 82 78 76 76 84 84 95100111148 86 83 77 73 87 96102105106110117153 90 89 77 77 77 98 98102106108116156
diff :   ..................................................................... -2................................. -3.................................
     :    60 59 60 58 64 61 60 60 67 64 65 36 58 59 56 58 60 68 60 71 65 71 97 -2 57 54 56 74 69 66 63 61 64 67 96 -3 59 48 60 60 81 60 64 64 62 68100


$ ffmpeg_v9b -y -i "snippet_tai3.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 256k "kfanmr3.mp4"
w    :  0  0  0  0  0  0  0  0  0  0  0  0  1  1  1  1  1  1  1  1  1  1  1  1  4  4  4  4  4  4  4  4  4  4  4  4  7  7  7  7  7  7  7  7  7  7  7  7
i    :  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11
zeros:  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
        #
sf_idx 91 91 90 90 84 92 93 93 93100102105 85 83 81 78 76 76 84 84 95100111148 86 83 77 73 91 91102105106110117153 90 89 77 77 88 91102102106110116156
diff :   ..................................................................... -2................................. -3.................................
     :    60 59 60 54 68 61 60 60 67 62 63 40 58 58 57 58 60 68 60 71 65 71 97 -2 57 54 56 78 60 71 63 61 64 67 96 -3 59 48 60 71 63 71 60 64 64 66100


$ ffmpeg_v9b -y -i "snippet_tai3.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 320k "kfanmr3.mp4"
w    :  0  0  0  0  0  0  0  0  0  0  0  0  1  1  1  1  1  1  1  1  1  1  1  1  4  4  4  4  4  4  4  4  4  4  4  4  7  7  7  7  7  7  7  7  7  7  7  7
i    :  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11  0  1  2  3  4  5  6  7  8  9 10 11
zeros:  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
        #
sf_idx 91 91 90 90 84 92 93 93 93100102105 85 83 81 78 76 76 84 84 95 98111148 86 83 77 73 91 91102105106110117153 90 82 81 77 88 98102102106110116156
diff :   ..................................................................... -2................................. -3.................................
     :    60 59 60 54 68 61 60 60 67 62 63 40 58 58 57 58 60 68 60 71 63 73 97 -2 57 54 56 78 60 71 63 61 64 67 96 -3 52 59 56 71 70 64 60 64 64 66100

by Kamedo2, 10 years ago

v7 patch altered to reflect the latest changes. This should work for the git head.

by Kamedo2, 10 years ago

v9b patch altered to reflect the latest changes. This should work for the git head.

comment:363 by Kamedo2, 10 years ago

The -aac_coder anmr assertion error still happens after Michael Niedermayer's changes.

comment:364 by klaussfreire, 10 years ago

I tracked it all the way to codebook_trellis_rate and encode_window_bands_info with short windows. When the optimum allocation spans more than SCALE_MAX_DIFF scalefactor steps, anmr is careful not to create allocations whose encoded deltas exceed SCALE_MAX_DIFF, but codebook_trellis_rate and encode_window_bands_info both undo this. I have tried many ways to patch that, to no avail so far. The issue almost always happens near zero bands and at the switch from one window group to the next (which is usually where the greatest deltas occur and where the limit is hardest to enforce).

But I did just notice the assertion error happens on git head as well. It seems like a pre-existing bug. Feel free to check exactly under which conditions, I just saw the error logs on a run of tests and didn't get to comparing when it happens vs v9b just yet.
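
The delta constraint described above can be sketched as a standalone check. This is only an illustration, not the encoder's code: the value 60 for SCALE_MAX_DIFF, the helper name first_bad_sf_delta, and the flat array layout are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed value for illustration; the thread only names the constant. */
#define SCALE_MAX_DIFF 60

/*
 * Hypothetical helper: returns the index of the first coded band whose
 * scalefactor differs from the previous coded band by more than
 * SCALE_MAX_DIFF, or -1 if every delta is encodable.
 * zeroes[i] != 0 marks a band with no coded coefficients; such bands are
 * skipped when forming deltas, which is why large jumps tend to show up
 * right after a run of zero bands.
 */
static int first_bad_sf_delta(const int *sf_idx, const int *zeroes, int n)
{
    int prev = -1; /* index of the previous coded band, -1 if none yet */

    assert(sf_idx && zeroes && n >= 0);
    for (int i = 0; i < n; i++) {
        if (zeroes[i])
            continue;
        if (prev >= 0 && abs(sf_idx[i] - sf_idx[prev]) > SCALE_MAX_DIFF)
            return i;
        prev = i;
    }
    return -1;
}
```

Running such a check after each stage that rewrites scalefactors or band types would show which stage reintroduces an oversized delta.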

by Kamedo2, 10 years ago

Attachment: FFmpeg_anmr_error8.flac added

This causes the assertion error at -q:a 1 on v9b; -q:a 0.99 and 1.01 are safe. Suzanne Vega, Tom's Diner http://www.rarewares.org/test_samples/

comment:365 by Kamedo2, 10 years ago

A lot of good work is going on, such as the commits below, thanks to Claudio Freire and Michael Niedermayer.
The patch no longer applies to the git head.

avcodec/aacpsy: Fix AAC Psy PE reduction calculation when multiple iterations are required

This is a small change, but it does have a big impact on bit allocation.

all the regressions marked in the report have no audible
difference (I didn't check them all though), but the improvements can
be heard.

This affects mostly high bit rates. It's related to issue #2686.

In the report, A is the patched version, B is unpatched, all
comparisons show deltas in the form (A-B), so a positive pSNR delta
means a better quality in the patched version, and negative a
regression. Regressions are only considered for pSNR deltas below
-1db, they're considered serious below -6db.
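
The thresholds above can be written down as a small classifier. This is a sketch under assumptions: the enum and function names are hypothetical, and whether the report counts serious regressions as a subset of regressions is not stated, so the two classes are kept separate here. The thresholds for "improvements" are not given in the text, so they are not modelled.

```c
#include <assert.h>

/* Hypothetical labels; only the two regression thresholds come from the text. */
enum psnr_verdict {
    PSNR_NOT_REGRESSION,     /* delta >= -1 dB (includes improvements) */
    PSNR_REGRESSION,         /* -6 dB <= delta < -1 dB */
    PSNR_SERIOUS_REGRESSION  /* delta < -6 dB */
};

/* delta_db is the pSNR difference A - B in dB (patched minus unpatched). */
static enum psnr_verdict classify_psnr_delta(double delta_db)
{
    assert(delta_db == delta_db); /* reject NaN */
    if (delta_db < -6.0)
        return PSNR_SERIOUS_REGRESSION;
    if (delta_db < -1.0)
        return PSNR_REGRESSION;
    return PSNR_NOT_REGRESSION;
}
```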

All measurements were done with tiny_psnr.

The summary of the report inline for quick reading:

Files: 58
Bitrates: 6
Tests: 347
Serious Regressions: 0 (0%)
Regressions: 10 (2%)
Improvements: 54 (15%)
Big improvements: 26 (7%)
Worst regression - sine_tester.flac - 384k
  - StdDev: 1.68        pSNR: -3.05     maxdiff: -178.00
Best improvement - 07 - Bound.flac - 384k
  - StdDev: -1700.05    pSNR: 20.64     maxdiff: -29595.00
Average          - StdDev: -55.67       pSNR: 1.20      maxdiff: -1593.00

AAC: Fix M/S stereo encoding

This patch fixes a pointer arithmetic bug in adjust_frame_information that resulted in heavily corrupted audio when using M/S encoding. Also, a backup copy of untransformed coefficients has to be kept around or attempts at re-processing the frame (which happens when heavily overspending bits during transients) will result in re-encoding of the coefficients and subsequent corruption of the resulting stream.

A/B testing shows the bug as corrected, but still cannot prove that M/S coding is a win at least in numbers. Limited listening tests do show improvement on M/S encoded samples in lower bitrates, but they're hidden among the other artifacts that remain to be corrected in the encoder.

Some of the regressions flagged in the report do show poor stereo image (but not buggy), so M/S encoding is clearly not good enough yet to be defaulted to auto.

In numbers, Patched against Unpatched, stereo_mode auto:

  Files: 114
  Bitrates: 6
  Tests: 683

  Serious Regressions: 0 (0%)
  Regressions: 0 (0%)
  Improvements: 227 (33%)
  Big improvements: 92 (13%)
  Worst regression - mybloodrusts.wv - 256k
    - StdDev: 28.61       pSNR: -0.43     maxdiff: 1372.00
  Best improvement - 60.wv - 384k
    - StdDev: -369.57     pSNR: 45.02     maxdiff: -13322.00
  Average          - StdDev: -80.56       pSNR: 2.49      maxdiff: -8858.00

Patched against Unpatched stereo_mode ms_off shows no difference.

Patched stereo_mode auto vs Unpatched stereo_mode ms_off shows a small average improvement, just not too significant:

  Serious Regressions: 0 (0%)
  Regressions: 10 (1%)
  Improvements: 45 (6%)
  Big improvements: 2 (0%)
  Worst regression - Illinois.wv - 256k
    - StdDev: 33.20       pSNR: -2.03     maxdiff: 477.00
  Best improvement - song_of_circomstances.flac - 384k
    - StdDev: -3.97       pSNR: 7.61      maxdiff: -826.00
  Average          - StdDev: -10.25       pSNR: 0.20      maxdiff: -281.00

comment:366 by klaussfreire, 10 years ago

Yes, I'm picking apart v9b step by step. Rebasing v9b would be pointless until that's done (I'm going to push another set of small patches tonight for instance).

comment:367 by Kamedo2, 10 years ago

I've encoded many sounds, and listened to hours of them. Standard ABR and VBR were tested.

ffmpeg_aac320k_collapse4 at 96 kbps is quite bad on the current git head (r70520).
The sine wave warbling problem #2706 still appears.

comment:368 by Kamedo2, 10 years ago

You might be aware of this, but the recent push to the git head:
"AAC: Add support for 7350Hz sampling rates, no error on too hight bitrate."

-    ERROR_IF(i >= 12,
+    ERROR_IF(i == 16
+                || i >= (sizeof(swb_size_1024) / sizeof(*swb_size_1024))
+                || i >= (sizeof(swb_size_128) / sizeof(*swb_size_128)),

undid Michael Niedermayer's contribution:
"avcodec/aacenc: Fix sample rate check".

Fixes out of array read
Fixes CID1257803, CID1257797, CID1257789, CID1257786

-    ERROR_IF(i == 16,
+    ERROR_IF(i >= 12,
              "Unsupported sample rate %d\n", avctx->sample_rate);
     ERROR_IF(s->channels > AAC_MAX_CHANNELS,
              "Unsupported number of channels: %d\n", s->channels);

comment:369 by Hendrik, 10 years ago

Actually no, since swb_size_1024/128 contain 13 elements each, so any value above 12 will trigger the OR conditions. (13 >= 13)

So what he did was practically allow one further frequency.

I'm not sure what the additional i == 16 check is good for, however. I guess it's the maximum the spec allows, and the others are just rates we don't implement?

Last edited 10 years ago by Hendrik (previous) (diff)

comment:370 by klaussfreire, 10 years ago

Actually, the 12 was there because swb_size_N were of size 12. The patch added another entry, so it'd now be size 13. But instead of using 13, I replaced it by sizeof which is easier to maintain.

i == 16 is there so that the code doesn't depend on the size of swb_size_N. The above loop that searches for the samplerate_index ends at 16, so 16 means the search didn't find anything, and that's also an error independent of whether i is out of bounds for swb_size_N.
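
The two independent checks described above can be sketched as follows. This is an illustration, not the encoder's code: band_table is a hypothetical 13-entry stand-in for swb_size_1024 and swb_size_128, the function name is made up, and the rate table is assumed to be the standard 16-slot MPEG-4 sampling-frequency table with its last three slots unused.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_LEN(t) (sizeof(t) / sizeof(*(t)))

/* Assumed 16-slot sampling-frequency table; unused slots are 0. */
static const int sample_rates[16] = {
    96000, 88200, 64000, 48000, 44100, 32000, 24000, 22050,
    16000, 12000, 11025,  8000,  7350,     0,     0,     0
};

/* Hypothetical placeholder: one entry per rate the encoder has band tables for. */
static const int band_table[13] = { 0 };

/* Returns the sample-rate index, or -1 if the rate cannot be encoded. */
static int find_sample_rate_index(int sample_rate)
{
    size_t i;

    assert(sample_rate > 0);
    for (i = 0; i < 16; i++)
        if (sample_rates[i] == sample_rate)
            break;
    if (i == 16)                    /* the search found nothing */
        return -1;
    if (i >= TABLE_LEN(band_table)) /* found, but no band table for it */
        return -1;
    return (int)i;
}
```

Because the second check uses the table's own length, adding or removing a band-table entry changes the accepted range without touching the condition, which is the maintenance point made above.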

comment:371 by Kamedo2, 10 years ago

Klaussfreire, could you please post your future Todo list?

comment:372 by klaussfreire, 10 years ago

Sorry for the late reply. I meant to answer a while ago but other duties made me lose track.

Currently, I'm debugging the patch adding support for the 7350Hz sample rate, which doesn't pass the tests on MIPS, and I'm a bit at a loss. It seems to be a floating-point precision issue that is not in the MIPS-specific side of the code and that surfaces with hardware floating-point emulation. So... it doesn't look immediately fixable.

So if I don't find a fix, the TODO follows:

  • Push small fixes
    • ANMR bugs
  • Push improvements (in no particular order)
    • M/S search improvements
    • Clip avoidance
    • VBR support
    • R/C improvements, bit allocation improvements, etc. (I'm not sure I can split this last step into smaller steps, as they all interact, and extracting one is not only difficult but may also cause regressions)

The assertion error I will leave for GSoC students to try and fix (or will attack it later if GSoC fails to address it).

comment:373 by Hendrik, 10 years ago

Personally, I wouldn't worry too much about a MIPS-only issue at 7350Hz; that's an edge case of an edge case. It can be scheduled for later.

comment:374 by Kamedo2, 10 years ago

Is Rostislav Pehlivanov going to implement the Perceptual Noise Substitution?

comment:375 by Rostislav Pehlivanov, 10 years ago

Cc: atomnuker@gmail.com added

comment:376 by klaussfreire, 10 years ago

Well, that's really up to GSoC, but he's already done a PoC that's good enough for a starting point, so that's a probably.

comment:377 by Ridley Combs, 10 years ago

Cc: rodger.combs@gmail.com added

comment:378 by Ridley Combs, 10 years ago

Any progress on this?

comment:379 by Rostislav Pehlivanov, 10 years ago

klaussfreire is working on v9c of the patch and hopes to get it sent to the mailing list by the end of the month.

comment:380 by Kamedo2, 10 years ago

I am currently conducting a personal listening test of these 5 encoders at 96kbps. The progress is 23% (17 samples / 74 samples).

  • lame 3.99.5 --abr 98
  • opusenc opus-tools-0.1.9 --bitrate 91
  • NeroAACEnc -q 0.333
  • ffmpeg_r70351_v7_patch -c:a aac -strict experimental -b:a 96k
  • ffmpeg_r70351_v9b_patch -c:a aac -strict experimental -b:a 96k

comment:381 by klaussfreire, 10 years ago

Nice.

I have some improvements on v9b but RL has not been in the mood to let me polish the patches for the ML. I'm close to having one almost submittable, but every DB at work just decided to act up :(

Anyway, looking forward to hearing about the results of those tests.

comment:382 by klaussfreire, 10 years ago

BTW... on which revision are you applying v9b? (it doesn't apply on head anymore)

in reply to:  382 comment:383 by Kamedo2, 10 years ago

Replying to klaussfreire:

BTW... on which revision are you applying v9b? (it doesn't apply on head anymore)

ffmpeg version N-70351-g2b40416 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --optflags=-O2

  libavutil      54. 19.100 / 54. 19.100
  libavcodec     56. 26.100 / 56. 26.100
  libavformat    56. 23.106 / 56. 23.106
  libavdevice    56.  4.100 / 56.  4.100
  libavfilter     5. 11.102 /  5. 11.102
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100

By the way, -cutoff 124 or lower at a 44.1kHz sample rate stops the FFmpeg AAC encoder, regardless of the other settings and the bitrate.

comment:384 by klaussfreire, 10 years ago

Yes, I've been thinking about adding a lower limit, but I couldn't decide on the bound. For almost every candidate I could think of, I could also think of a reason why a lower cutoff might be useful.

I guess we have another candidate there. That's probably the cutoff for the first scalefactor window.

comment:385 by Kamedo2, 10 years ago

The progress of the personal listening test is 50% now. (37 samples done / 74 samples).

  • lame 3.99.5 --abr 98
  • opusenc opus-tools-0.1.9 --bitrate 91
  • NeroAACEnc -q 0.333
  • ffmpeg_r70351_v7_patch -c:a aac -strict experimental -b:a 96k
  • ffmpeg_r70351_v9b_patch -c:a aac -strict experimental -b:a 96k

comment:386 by Kamedo2, 10 years ago

I wonder whether someone might want to use -cutoff 120 to make an LFE channel for surround sound (although it won't work).
A cutoff lower than -cutoff 3000 makes no sense from a psychoacoustic point of view.

comment:387 by Kamedo2, 10 years ago

I would like to thank Rostislav Pehlivanov and Michael Niedermayer for committing the improvement.
Is the current git head similar to the N-70351-g2b40416+v9b patch I am testing?

comment:388 by Rostislav Pehlivanov, 10 years ago

@Kamedo2:
No, the current git master contains no changes from the previous v9b or the future v9c yet.
Claudio Freire is currently merging his v9c patch with the current git master and should send it off to the mailing list once he's done. Hopefully this won't take long; he only needs to tweak the PNS band marking.

comment:389 by Kamedo2, 10 years ago

@atomnuker
Thank you. Then I will need another listening test to confirm the progress on the MOS scale. The current test is 64% done now.

by Kamedo2, 9 years ago

Attachment: ffmpeg_aac_error1.flac added

FFmpeg doesn't stop when the sample rate is 8kHz and the bitrate is high. -ar 8000 -b:a 96k, -q:a 0.958 or higher. Fear Factory, Digimortal, Linchpin.

comment:390 by Kamedo2, 9 years ago

This rare sample, with either of these commands, causes an infinite loop on the current git head.

ffmpeg73505 -i ffmpeg_aac_error1.flac -c:a aac -strict experimental -q:a 1 -ar 8000 out.mp4
ffmpeg73505 -i ffmpeg_aac_error1.flac -c:a aac -strict experimental -b:a 96k -ar 8000 out.mp4

comment:391 by Rostislav Pehlivanov, 9 years ago

@Kamedo2
The only time I've had an infinite loop was when I deliberately broke the trellis algorithm, so it's either that or the twoloop function. I'll take a look at it.

comment:392 by Kamedo2, 9 years ago

Any other options I should test?

comment:393 by Kamedo2, 9 years ago

This also causes an infinite loop on the current git head.

ffmpeg73515 -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 13000 out.mp4
ffmpeg73515 -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 15000 out.mp4

comment:394 by Kamedo2, 9 years ago

The results of the 96kbps listening test

The v9b patch beat the stable v7 patch. The quality of LAME and Nero was higher than that of FFmpeg's native encoder.
http://i57.tinypic.com/11l37s2.png

Encoders

  • lame 3.99.5 --abr 98
  • opusenc opus-tools-0.1.9 --bitrate 91
  • NeroAACEnc -q 0.333
  • ffmpeg_r70351_v7_patch -c:a aac -strict experimental -b:a 96k
  • ffmpeg_r70351_v9b_patch -c:a aac -strict experimental -b:a 96k

Used Sound Samples

There were no significant differences between corpora.

http://i61.tinypic.com/30be92c.png
http://i58.tinypic.com/2qt8w1i.png

comment:395 by Kamedo2, 9 years ago

Further discussions of this listening test at 96kbps may be posted on Hydrogenaudio. http://www.hydrogenaud.io/forums/index.php?showtopic=109716

comment:396 by klaussfreire, 9 years ago

Isn't it a bit apples to oranges comparing vbr vs abr? Did you try v9b vbr?

Btw, I'm slowly pushing v9c (not posted here, but similar to v9b), once all is done and merged with GSoC stuff it should improve tenfold.

comment:397 by Kamedo2, 9 years ago

For state-of-the-art encoders, vbr is more advantageous than abr. But for Nero, vbr may not be noticeably superior to abr. http://d.hatena.ne.jp/kamedo2/20110430/1304181738

I tried Nero abr, but I could not set it to 96kbps.

by Kamedo2, 9 years ago

Attachment: SinceAlways.flac added

This is one exceptional case that degrades on v9b.

by Kamedo2, 9 years ago

Attachment: mybloodrusts.flac added

This is one exceptional case that degrades on v9b.

by Kamedo2, 9 years ago

Attachment: castanets.flac added

This is one exceptional case that degrades on v9b.

comment:398 by Kamedo2, 9 years ago

Other exceptional cases where v7 is better than v9b include "Can't Wait Until Tonight (Dry Wurlitzer Mix).flac", "41_30sec.flac", etc.

Can the infinite loop problem on the git head be solved?

comment:399 by klaussfreire, 9 years ago

About the 41_30sec, I believe v9c fixes that, but I'll double-check just in case.

Re. the infinite loop, I'll take a look too when I get the time.

comment:400 by Kamedo2, 9 years ago

I posted more graphs in the discussion thread of the personal listening test at 96kbps.
http://www.hydrogenaud.io/forums/index.php?showtopic=109716

FFmpeg_anmr_error7.flac still stops FFmpeg with -aac_coder faac and -aac_coder fast.

comment:401 by Kamedo2, 9 years ago

-aac_coder faac causes an infinite loop whenever the bitrate is clamped to the maximum. It never loops when the bitrate is below the maximum.
This bug is reproducible with any sample at any channel count and sampling frequency.

ffmpeg74294 -y -i Whitenoise.flac -c:a aac -strict experimental -b:a 530k -ac 2 -ar 44100 -aac_coder faac whitenoise.mp4
ffmpeg version N-74294-g45d9d16 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      54. 30.100 / 54. 30.100
  libavcodec     56. 57.100 / 56. 57.100
  libavformat    56. 40.101 / 56. 40.101
  libavdevice    56.  4.100 / 56.  4.100
  libavfilter     5. 32.100 /  5. 32.100
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  2.101 /  1.  2.101
  libpostproc    53.  3.100 / 53.  3.100
Input #0, flac, from 'Whitenoise.flac':
  Duration: 00:00:05.00, start: 0.000000, bitrate: 1550 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
[aac @ 01e37760] Too many bits per frame requested, clamping to max
Output #0, mp4, to 'whitenoise.mp4':
  Metadata:
    encoder         : Lavf56.40.101
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16
 bit), 529 kb/s
    Metadata:
      encoder         : Lavc56.57.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
size=      86kB time=00:00:01.48 bitrate= 475.6kbits/s

The message "[aac @ 01e37760] Too many bits per frame requested, clamping to max" is the sign that it will fail.

ffmpeg74294 -y -i Whitenoise.flac -c:a aac -strict experimental -b:a 133k -ac 1 -ar 22050 -aac_coder faac whitenoise.mp4
ffmpeg version N-74294-g45d9d16 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      54. 30.100 / 54. 30.100
  libavcodec     56. 57.100 / 56. 57.100
  libavformat    56. 40.101 / 56. 40.101
  libavdevice    56.  4.100 / 56.  4.100
  libavfilter     5. 32.100 /  5. 32.100
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  2.101 /  1.  2.101
  libpostproc    53.  3.100 / 53.  3.100
Input #0, flac, from 'Whitenoise.flac':
  Duration: 00:00:05.00, start: 0.000000, bitrate: 1550 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
[aac @ 01997760] Too many bits per frame requested, clamping to max
Output #0, mp4, to 'whitenoise.mp4':
  Metadata:
    encoder         : Lavf56.40.101
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 22050 Hz, mono, fltp (16 b
it), 132 kb/s
    Metadata:
      encoder         : Lavc56.57.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help

in reply to:  390 ; comment:402 by Rostislav Pehlivanov, 9 years ago

Replying to Kamedo2:

This rare sample, with either of these commands, causes an infinite loop on the current git head.

ffmpeg73505 -i ffmpeg_aac_error1.flac -c:a aac -strict experimental -q:a 1 -ar 8000 out.mp4
ffmpeg73505 -i ffmpeg_aac_error1.flac -c:a aac -strict experimental -b:a 96k -ar 8000 out.mp4

I cannot replicate this bug anymore so it's probably fixed, could you test with the newest git master to see if it causes problems?

I'll look into what causes the faac coder to get stuck at high bitrates.

in reply to:  402 comment:403 by Kamedo2, 9 years ago

Replying to atomnuker:

I cannot replicate this bug anymore so it's probably fixed, could you test with the newest git master to see if it causes problems?

OK ffmpeg74678-g6701c92 -y -i ffmpeg_aac_error1.flac -c:a aac -strict experimental -b:a 96k -ar 8000 out.mp4
OK ffmpeg74678-g6701c92 -y -i ffmpeg_aac_error1.flac -c:a aac -strict experimental -q:a 1 -ar 8000 out.mp4
OK ffmpeg74721-g4bd99f7 -y -i ffmpeg_aac_error1.flac -c:a aac -strict experimental -b:a 96k -ar 8000 out.mp4
OK ffmpeg74721-g4bd99f7 -y -i ffmpeg_aac_error1.flac -c:a aac -strict experimental -q:a 1 -ar 8000 out.mp4

Yes, it was fixed.

I'll look into what causes the faac coder to get stuck at high bitrates.

OK ffmpeg74678-g6701c92 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 19000 out.mp4
OK ffmpeg74678-g6701c92 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 17000 out.mp4
NG ffmpeg74678-g6701c92 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 15000 out.mp4
NG ffmpeg74678-g6701c92 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 13000 out.mp4
NG ffmpeg74678-g6701c92 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 11000 out.mp4
OK ffmpeg74678-g6701c92 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 9000 out.mp4
OK ffmpeg74721-g4bd99f7 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 19000 out.mp4
OK ffmpeg74721-g4bd99f7 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 17000 out.mp4
NG ffmpeg74721-g4bd99f7 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 15000 out.mp4
NG ffmpeg74721-g4bd99f7 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 13000 out.mp4
NG ffmpeg74721-g4bd99f7 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 11000 out.mp4
OK ffmpeg74721-g4bd99f7 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 9000 out.mp4

The bug still exists in the git master. It causes an infinite loop when the cutoff is 15, 13, or 11kHz, but not at 9kHz. When the cutoff is 7kHz, the initial setup takes 16 seconds, which is unnaturally slow, but the encoded sound is OK. ffmpeg_aac_lead_voice.flac is 44.1kHz mono.

by Kamedo2, 9 years ago

Attachment: ffmpeg_96k_error.flac added

Low Freq. Sine Sweep Stereo with right channel inverted; inaudible on mono.

comment:404 by Kamedo2, 9 years ago

This sound crashes FFmpeg when the sampling rate is 96kHz.

ffmpeg74961-g61009a7 -y -i ffmpeg_96k_error.flac -c:a aac -strict experimental -b:a 96k -ac 2 -ar 96000 out.mp4
ffmpeg74961-g61009a7 -y -i ffmpeg_96k_error.flac -c:a aac -strict experimental -b:a 160k -ac 2 -ar 96000 out.mp4
ffmpeg74961-g61009a7 -y -i ffmpeg_96k_error.flac -c:a aac -strict experimental -q:a 1 -ac 2 -ar 96000 out.mp4

I tried many sine sweeps, but the bug seems to happen only when one of the channels is inverted.

comment:405 by Rostislav Pehlivanov, 9 years ago

There's something weird happening in the search_for_is for one of the phases.
Will submit a patch in a few hours and reply here for you to test.

comment:406 by Kamedo2, 9 years ago

Thank you. With your devotion, the sound is getting great, and I have heard no apparent problem on over 20 hours of music and speech tracks on common settings.

in reply to:  406 comment:407 by Rostislav Pehlivanov, 9 years ago

Replying to Kamedo2:

Thank you. With your devotion, the sound is getting great, and I have heard no apparent problem on over 20 hours of music and speech tracks on common settings.

Thanks, nice to know someone's using the encoder.
Make sure to reencode them once the encoder's ready :-)

Fixed the bug. Probably fixes quite a lot of IS artifacts too.

comment:408 by Kamedo2, 9 years ago

I think the sound deteriorated on ffmpeg75016-g50d9121, compared to 74961-g61009a7, after fixing the bug. Tested on music tracks on 128k, 192k, 320k, q1, q2. Stereo 192k 32000Hz is especially worsened.

comment:409 by Rostislav Pehlivanov, 9 years ago

Huh, that's odd. The changes I made to PNS today (at 12:39 UTC, commit b6cc8ec7ec) brought PNS closer to what it used to be, but fixed the warbling artifacts at lower frequencies (it's used a lot more now). The changes to IS which fixed the bug yesterday (1956cfbaedd36) shouldn't really have affected the quality much at all, and I didn't hear a difference.

74961-g61009a7 is before I made my PNS changes from yesterday, so are you sure that the last current git master still sounds worse? The PNS commit I made yesterday did reduce PNS usage too much (before I fixed that today).

Either way, I'll take a listen to what the encoder sounded like before and try to see if it's better in the current master.

by Kamedo2, 9 years ago

mybloodrusts.flac encoded at -b:a 128k by ffmpeg74961-g61009a7.

by Kamedo2, 9 years ago

mybloodrusts.flac encoded at -b:a 128k by ffmpeg75043-gb31041a.

comment:410 by Kamedo2, 9 years ago

I tried both the 74961-g61009a7 and the current git head(75043-gb31041a), and the 74961-g61009a7 was noticeably better than the current git head. The S/N was significantly better on all L, R, M, and S, and the difference was more pronounced on the tonal tracks than the transient blocks.

comment:411 by Rostislav Pehlivanov, 9 years ago

Well, thanks for the feedback.
I'll add back the 4+ quantization factor for the PNS energy tomorrow morning. I've tested other decoders (to make sure it's not the ffmpeg decoder causing issues), so I have no idea why there's such an energy difference, but apparently that 4+ was enough to make PNS sound right.

As for the L, R, M and S signal/noise ratio, did you test that without PNS? That could have interfered. Could you tell me if IS sounded better before or after without PNS?

in reply to:  411 comment:412 by Kamedo2, 9 years ago

Confirmed that the current git head 75147-g9d742d2 fixed the regression.

Replying to atomnuker:

As for the L, R, M and S signal/noise ratio, did you test that without PNS? That could have interfered. Could you tell me if IS sounded better before or after without PNS?

I have tested -b:a 128k -ar 44100, -b:a 192k -ar 32000, -b:a 320k -ar 48000, -q:a 1 -ar 44100, and -q:a 2 -ar 48000, without any additional -aac_pns enable or -aac_is enable settings.

What optional settings should I test?

comment:413 by Hendrik, 9 years ago

PNS and IS are enabled by default, so your tests would've included them in any case.
Pass -aac_pns 0 to disable it, and test IS alone.

Last edited 9 years ago by Hendrik (previous) (diff)

comment:414 by Kamedo2, 9 years ago

I have tested 75156-gfd8b90f. At 128kbps with IS enabled, the sound is better with PNS than without it.

http://wiki.hydrogenaud.io/index.php?title=Joint_stereo#Intensity_Stereo
Intensity stereo is by definition a lossy coding method thus it is primarily useful at low bitrates. For coding at higher bitrates only mid-side stereo should be used.

comment:415 by klaussfreire, 9 years ago

In my tests, it has usually been the case that as you increase the bitrate, IS is used less while MS usage increases, naturally due to the R/D model used. If you see otherwise, it would be useful to know and have a sample to better tweak the model.

comment:416 by Kamedo2, 9 years ago

Klaussfreire, Thank you for the explanation! At 240kbps, it was hard to spot the difference between -aac_is 0 and -aac_is enable.

comment:417 by Rostislav Pehlivanov, 9 years ago

Hmm, perhaps it's best I add a print for PNS/IS/Prediction/MS/TNS usage when the verbose level has been increased.
Anyway, Kamedo2: I pushed some PNS patches yesterday which should have fixed the drop in quality. Did it improve?

in reply to:  417 comment:418 by Kamedo2, 9 years ago

Replying to atomnuker:

Anyway, Kamedo2: I pushed some PNS patches yesterday which should have fixed the drop in quality. Did it improve?

Yes, 75156-gfd8b90f is great.

comment:419 by Kamedo2, 9 years ago

I tested 75268-g3f9fa2d and the quality was a bit worse than the 74961-g61009a7 and 75147-g9d742d2 on -b:a 128k 44.1kHz stereo, but the encoding speed was very fast.

comment:420 by Kamedo2, 9 years ago

libavcodec / aaccoder_twoloop.h line 172: 60 - qstep

 166             if (tbits > destbits) {
 167                 for (i = 0; i < 128; i++)
 168                     if (sce->sf_idx[i] < 218 - qstep)
 169                         sce->sf_idx[i] += qstep;
 170             } else {
 171                 for (i = 0; i < 128; i++)
 172                     if (sce->sf_idx[i] > 60 - qstep)
 173                         sce->sf_idx[i] -= qstep;
 174             }

might have been meant as

 166             if (tbits > destbits) {
 167                 for (i = 0; i < 128; i++)
 168                     if (sce->sf_idx[i] < 218 - qstep)
 169                         sce->sf_idx[i] += qstep;
 170             } else {
 171                 for (i = 0; i < 128; i++)
 172                     if (sce->sf_idx[i] > 60 + qstep)
 173                         sce->sf_idx[i] -= qstep;
 174             }

or

 166             if (tbits > destbits) {
 167                 for (i = 0; i < 128; i++)
 168                     sce->sf_idx[i] = FFMIN(sce->sf_idx[i]+qstep, 217);
 169             } else if (destbits > tbits){
 170                 for (i = 0; i < 128; i++)
 171                     sce->sf_idx[i] = FFMAX(sce->sf_idx[i]-qstep, 61);
 172             } else{
 173                 break;
 174             }
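
To make the difference between the three variants concrete, the lower-bound handling can be modelled in a few lines of Python. This is only a sketch: it assumes, as the constants in the snippet suggest, that scalefactor indices are meant to stay above 60, and the function names are made up for illustration rather than taken from libavcodec.

```python
SF_FLOOR = 60  # assumed exclusive lower bound, read off the constants in the snippet


def lower_original(sf_idx, qstep):
    """Line 172 as posted: the guard compares against 60 - qstep."""
    return sf_idx - qstep if sf_idx > SF_FLOOR - qstep else sf_idx


def lower_guarded(sf_idx, qstep):
    """First proposed variant: the guard compares against 60 + qstep."""
    return sf_idx - qstep if sf_idx > SF_FLOOR + qstep else sf_idx


def lower_clamped(sf_idx, qstep):
    """Second proposed variant: subtract, then clamp like FFMAX(..., 61)."""
    return max(sf_idx - qstep, SF_FLOOR + 1)


if __name__ == "__main__":
    # Columns: input index, posted check, guarded variant, clamped variant.
    for sf in (61, 64, 70):
        print(sf, lower_original(sf, 4), lower_guarded(sf, 4), lower_clamped(sf, 4))
```

With qstep = 4, the posted check turns an index of 61 into 57, which is below the assumed floor, while both proposed variants leave it at 61.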

comment:421 by Hendrik, 9 years ago

I'm confused about what your post is meant to say. This code was just copied from its old position in aaccoder.c.

in reply to:  420 comment:422 by klaussfreire, 9 years ago

Replying to Kamedo2:

libavcodec / aaccoder_twoloop.h line 172: 60 - qstep

You're right; v9c has it fixed (probably earlier versions too).

comment:423 by Kamedo2, 9 years ago

I have encoded over 200 GB of diverse sounds on diverse settings without apparent problems.

comment:424 by Rostislav Pehlivanov, 9 years ago

I've been bugging Claudio almost daily to push his work to git master so that finally we can move on with testing it out and nailing any last bugs left.
Hopefully this will happen in a day or two, if there are no further setbacks.

comment:425 by Hendrik, 9 years ago

We're all on the edge of our seats here, waiting. ;)

comment:426 by Kamedo2, 9 years ago

The -stereo_mode option described in the FFmpeg Codecs Documentation was removed, and -aac_ms 1 (Force M/S stereo coding) is to be used instead. Am I right?

comment:427 by Hendrik, 9 years ago

Yes, that is correct.

comment:428 by Rostislav Pehlivanov, 9 years ago

Replying to Kamedo2:
Yes, all encoder options now start with "-aac_":

ffmpeg -help encoder=aac

AAC encoder AVOptions:
  -aac_coder         <int>        E...A... Coding algorithm (from -1 to 3) (default 2)
     faac                         E...A... FAAC-inspired method
     anmr                         E...A... ANMR method
     twoloop                      E...A... Two loop searching method
     fast                         E...A... Constant quantizer
  -aac_ms            <boolean>    E...A... Force M/S stereo coding (default false)
  -aac_is            <boolean>    E...A... Intensity stereo coding (default auto)
  -aac_pns           <boolean>    E...A... Perceptual noise substitution (default auto)
  -aac_tns           <boolean>    E...A... Temporal noise shaping (default auto)
  -aac_pred          <boolean>    E...A... AAC-Main prediction (default auto)

For any option whose default is 'auto', the profile determines the setting unless it is set on the command line. Any option whose default is not 'auto' uses the default value indicated. Also, "-aac_ms" is not strictly boolean as indicated: it can also be set to '-1', which means M/S is used automatically whenever it gives an encoding gain.

Keep in mind that the psychoacoustic system currently doesn't account for the cutoff which the new coder introduced, so bits are wasted and quality decreases. Claudio will be pushing a patch to fix that. The problem only affects heavy synth samples, but the fix should also resolve a lot of bugs that may be related. This is also what's currently blocking us from removing the 'experimental' flag.
Also, I still have to merge my LTP patches, which will happen later today.

Kamedo2: Not sure how but I got an email invitation from Shion to Slack (Audio Video Encoding Community) which you are apparently a member of. I understand enough Japanese to kinda understand the email and I'd love to join but I'm still learning, let alone understanding technical jargon. Sorry :|
Maybe after I know a little more.

comment:429 by Hendrik, 9 years ago

aac_tns "auto" is a bit misleading though; it's not actually turned on for any of the profiles.

in reply to:  429 comment:430 by Rostislav Pehlivanov, 9 years ago

Replying to heleppkes:

aac_tns "auto" is a bit misleading though; it's not actually turned on for any of the profiles.

Not yet, no. I'll change it to false when I commit my LTP changes.

comment:431 by Kamedo2, 9 years ago

All combinations of the options below have now been extensively tested. Rate, speed, and error codes were monitored.

["-aac_coder faac", "-aac_coder fast", "-aac_coder twoloop"],
["-aac_ms 0","-aac_ms 1"],
["-aac_is 0","-aac_is 1"],
["-aac_tns 0","-aac_tns 1"],
["-profile:a aac_main -aac_pred 1 -aac_pns 0","-profile:a aac_low -aac_pns 1","-profile:a mpeg2_aac_low"],
["-ar 8000", "-ar 44100", "-ar 48000", "-ar 96000"],
["-b:a 16k", "-b:a 96k", "-b:a 128k", "-q:a 1", "-b:a 240k", "-b:a 320k", "-b:a 512k", "-q:a 0.25"]

(2304 combinations total)
-aac_coder anmr seems to be unstable and prone to crashing.
-aac_coder faac and -aac_coder fast often blatantly ignore the requested bitrate.
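
For anyone who wants to rerun a matrix like this, a minimal sketch of how the combinations could be enumerated is shown below. The option lists are copied from above; the input and output file names and the `build_commands` helper are placeholders, and the script only builds argument lists without running ffmpeg.

```python
from itertools import product

# Option axes copied from the list above; each entry may hold several flags.
OPTION_AXES = [
    ["-aac_coder faac", "-aac_coder fast", "-aac_coder twoloop"],
    ["-aac_ms 0", "-aac_ms 1"],
    ["-aac_is 0", "-aac_is 1"],
    ["-aac_tns 0", "-aac_tns 1"],
    ["-profile:a aac_main -aac_pred 1 -aac_pns 0",
     "-profile:a aac_low -aac_pns 1",
     "-profile:a mpeg2_aac_low"],
    ["-ar 8000", "-ar 44100", "-ar 48000", "-ar 96000"],
    ["-b:a 16k", "-b:a 96k", "-b:a 128k", "-q:a 1",
     "-b:a 240k", "-b:a 320k", "-b:a 512k", "-q:a 0.25"],
]


def build_commands(infile, outfile):
    """Return one ffmpeg argument list per combination of the option axes."""
    commands = []
    for combo in product(*OPTION_AXES):
        args = ["ffmpeg", "-y", "-i", infile, "-c:a", "aac", "-strict", "experimental"]
        for group in combo:
            args.extend(group.split())
        args.append(outfile)
        commands.append(args)
    return commands


if __name__ == "__main__":
    cmds = build_commands("input.flac", "out.mp4")  # placeholder file names
    print(len(cmds))
    print(" ".join(cmds[0]))
```

Each argument list could then be passed to `subprocess.run` to record the exit code and wall-clock time per combination.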

atomnuker: Yes, I am a member of the Fueru Wakame, the audio video encoding community on Slack. Glad to hear that you understand Japanese to that point.

comment:432 by klaussfreire, 9 years ago

None of anmr, faac, or fast was modified.

It is possible that they need to be updated to avoid the crashing, although I don't see how exactly. You could try confirming whether earlier revisions also exhibit that behavior, and how far back.

faac will probably be scrapped, fast will have to be rewritten, and anmr is a big question mark at present. I've been working on ANMR, and some problems have surfaced that don't seem easy to resolve, or even possible to resolve, with ANMR's approach. Surely it can be made not to crash, but beyond that I'm unsure how far we can push ANMR.

For now, the priority is twoloop.

comment:433 by Kieran Kunhya, 9 years ago

Since this appears to be the AAC encoder development thread: I have been fuzzing the encoder and get this crash a lot:

http://pastie.org/private/xlnfw9vfkn7dbxgwfaurug

comment:434 by klaussfreire, 9 years ago

Yeah, I asked on IRC but you seemed to be away: did you build with assertion_level=2? Can you share a sample that reproduces the crash? (or add the fuzzer as a fate test?)

I don't see many ways in which that crash could happen. The only way I can think of has an assert that should have tripped earlier.

comment:435 by Kieran Kunhya, 9 years ago

Yes, same with assertion-level=2

Sample can be found here: http://obe.tv/Downloads/fuzz1.wav

./ffmpeg_g -i "fuzz1.wav" -strict -2 -y out.aac

comment:436 by Kamedo2, 9 years ago

The fuzz1.wav file seems to have been damaged in delivery. It appears to be a stereo 48kHz 16-bit linear PCM WAV file, but the header starts with 52 C9 46 46 instead of the usual 52 49 46 46 (RIFF), and the 'fmt ' chunk declares a length of 10 02 00 00 (528) where 10 00 00 00 (16) is normal and expected from context.

Kierank, is the wav file playable in your environment?
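
The two header checks described above can be reproduced with a short Python sketch. It assumes the common layout in which the 'fmt ' chunk directly follows the 12-byte RIFF/WAVE header and has a length of 16 for plain PCM; other valid layouts exist, so a reported problem is a hint rather than proof of corruption. The function name is illustrative.

```python
import struct


def check_wav_header(data):
    """Return a list of problems found in the first 20 bytes of a WAV file."""
    if len(data) < 20:
        return ["file too short to hold a RIFF/WAVE header"]
    problems = []
    if data[0:4] != b"RIFF":
        found = " ".join("%02X" % b for b in data[0:4])
        problems.append("magic is %s, expected 52 49 46 46 (RIFF)" % found)
    if data[8:12] != b"WAVE":
        problems.append("form type is not WAVE")
    if data[12:16] == b"fmt ":
        # Chunk sizes in RIFF files are little-endian 32-bit integers.
        (fmt_len,) = struct.unpack("<I", data[16:20])
        if fmt_len != 16:
            problems.append("'fmt ' chunk length is %d, expected 16" % fmt_len)
    else:
        problems.append("'fmt ' chunk does not directly follow the header")
    return problems


if __name__ == "__main__":
    # Synthetic header carrying the two anomalies reported for fuzz1.wav.
    broken = (bytes([0x52, 0xC9, 0x46, 0x46]) + struct.pack("<I", 36)
              + b"WAVE" + b"fmt " + struct.pack("<I", 528))
    for problem in check_wav_header(broken):
        print(problem)
```

To run it against a real file, read the first 20 bytes in binary mode and pass them to `check_wav_header`.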

comment:437 by Kieran Kunhya, 9 years ago

Hi,
No, the file is not meant to be playable; it's the output from a tool designed to make crazy inputs in order to crash decoders (or in this case encoders).
Kieran

comment:438 by klaussfreire, 9 years ago

An update: I have a fix (fixes, in fact) for the assertion error. I'll push it as soon as I can confirm it causes no regressions (it did; I fixed a few).

comment:439 by Kamedo2, 9 years ago

-q:a values of 1100k or more often crash the encoder.

comment:440 by klaussfreire, 9 years ago

That's probably an arithmetic overflow.

comment:441 by Rostislav Pehlivanov, 9 years ago

Kamedo2:
I've improved TNS and have made it the default (-aac_tns 1). Also MS coding is now automatic by default (-aac_ms -1).
I've also added LTP support for voice or piano music encoding, use -aac_ltp 1 (or -profile:a aac_ltp) to test it. All features are currently in git master.

Claudio has 2 small fixes to merge. Hopefully won't take long.

comment:442 by Kamedo2, 9 years ago

I am testing the combination of these options on N-76111-g8c9c8fd.

["-b:a 8k", "-b:a 80k", "-b:a 160k", "-b:a 320k", "-b:a 530k", "-q:a 0.1", "-q:a 2", "-q:a 320k"],
["-profile:a aac_ltp", "-profile:a aac_main -aac_pred 1 -aac_pns 0", "-profile:a aac_low -aac_pns 1", "-profile:a mpeg2_aac_low"],
["-ar 8000", "-ar 11025", "-ar 24000", "-ar 44100", "-ar 48000", "-ar 96000"],
["", "-cutoff 15000", "-cutoff 22050"]

Test tracks are ffmpeg_aacvbr_pulse1.flac(12.12sec), ffmpeg_anmr_error.flac(2.32sec), ffmpeg_96k_error.flac(2.01sec).

-profile:a aac_ltp encoding is currently slower than real time, about 0.9x at 44.1kHz. Is this the intended behavior?

comment:443 by Kamedo2, 9 years ago

ffmpeg76324 -y -i ffmpeg_anmr_error2.flac -c:a aac -strict experimental -ar 44100 -profile:a aac_ltp -b:a 80k out.mp4

or

ffmpeg76324 -y -i ffmpeg_anmr_error5.flac -c:a aac -strict experimental -ar 44100 -profile:a aac_ltp -b:a 80k out.mp4

crash the encoder.

comment:444 by Kamedo2, 9 years ago

ffmpeg76851 -y -i ffmpeg_anmr_error2.flac -c:a aac -strict experimental -ar 44100 -profile:a aac_ltp -b:a 80k out.mp4

still crashes the encoder.

ffmpeg version N-76851-ga330430 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55.  9.100 / 55.  9.100
  libavcodec     57. 16.100 / 57. 16.100
  libavformat    57. 19.100 / 57. 19.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 15.100 /  6. 15.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'ffmpeg_anmr_error2.flac':
  Duration: 00:00:17.95, start: 0.000000, bitrate: 504 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder         : Lavf57.19.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16
 bit), 80 kb/s
    Metadata:
      encoder         : Lavc57.16.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
size=      38kB time=00:00:03.85 bitrate=  81.8kbits/s

comment:445 by klaussfreire, 9 years ago

ANMR isn't getting any love yet.

It will take some time, I discovered some nasty roadblocks in ANMR's approach.

Twoloop keeps giving away lessons that are useful for ANMR too, so my objective is to get twoloop to its full potential before I start massaging ANMR.

in reply to:  444 comment:446 by Rostislav Pehlivanov, 9 years ago

Replying to Kamedo2:

ffmpeg76851 -y -i ffmpeg_anmr_error2.flac -c:a aac -strict experimental -ar 44100 -profile:a aac_ltp -b:a 80k out.mp4

still crashes the encoder.

Just pushed a commit which improves LTP and fixes the crash. Give it a try. New version is N-76858-g1e5dbb3.

comment:447 by Kamedo2, 9 years ago

Thank you, but it still crashes at 80kbps. This build, N-76863-g8000d48, is from after the aac_tablegen speed-up.

ffmpeg76863 -i ffmpeg_aacvbr_pulse1.flac -c:a aac -strict experimental -profile:a aac_ltp -b:a 80k out.mp4
ffmpeg76863 -i ffmpeg_aac320k_collapse3.flac -c:a aac -strict experimental -profile:a aac_ltp -b:a 80k out.mp4

in reply to:  447 comment:448 by Rostislav Pehlivanov, 9 years ago

Replying to Kamedo2:

Thank you, but it still crashes on 80kbps. This N-76863-g8000d48 is after the aac_tablegen speed up.

ffmpeg76863 -i ffmpeg_aacvbr_pulse1.flac -c:a aac -strict experimental -profile:a aac_ltp -b:a 80k out.mp4
ffmpeg76863 -i ffmpeg_aac320k_collapse3.flac -c:a aac -strict experimental -profile:a aac_ltp -b:a 80k out.mp4

Hm, I can't seem to replicate either. Does it crash at any other bitrate for you?

comment:449 by Kamedo2, 9 years ago

ffmpeg76877 -y -i ffmpeg_aac320k_collapse3.flac -c:a aac -strict experimental -ar 44100 -profile:a aac_ltp -b:a 128k out.mp4

ffmpeg version N-76877-g861f2b2 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55.  9.100 / 55.  9.100
  libavcodec     57. 16.100 / 57. 16.100
  libavformat    57. 19.100 / 57. 19.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 15.100 /  6. 15.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'ffmpeg_aac320k_collapse3.flac':
  Duration: 00:00:12.56, start: 0.000000, bitrate: 684 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder         : Lavf57.19.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16
 bit), 128 kb/s
    Metadata:
      encoder         : Lavc57.16.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
av_interleaved_write_frame(): Visual C++ CRT: Not enough memory to complete call
 to strerror.
size=       1kB time=00:00:02.02 bitrate=   5.0kbits/s

128kbps leads to an av_interleaved_write_frame error; 80kbps just crashes.

by llogan, 9 years ago

Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363

comment:450 by llogan, 9 years ago

$ ffmpeg -i assertion_diff_shimoseka.m4a -strict experimental -c:a aac -f null -
ffmpeg version N-76947-gec494e6 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --disable-doc
  libavutil      55.  9.100 / 55.  9.100
  libavcodec     57. 16.101 / 57. 16.101
  libavformat    57. 19.100 / 57. 19.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 17.100 /  6. 17.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'assertion_diff_shimoseka.m4a':
  Metadata:
    major_brand     : M4A 
    minor_version   : 512
    compatible_brands: isomiso2
    encoder         : Lavf57.19.100
  Duration: 00:00:50.13, start: 0.000000, bitrate: 129 kb/s
    Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Output #0, null, to 'pipe:':
  Metadata:
    major_brand     : M4A 
    minor_version   : 512
    compatible_brands: isomiso2
    encoder         : Lavf57.19.100
    Stream #0:0(und): Audio: aac, 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      encoder         : Lavc57.16.101 aac
Stream mapping:
  Stream #0:0 -> #0:0 (aac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
Aborted (core dumped)

Appears to be a regression, but I did not run a bisect. Attached sample input file from Zeranoe forum user Zaoshi.

in reply to:  450 comment:451 by Rostislav Pehlivanov, 9 years ago

Replying to llogan:

Appears to be a regression, but I did not run a bisect. Attached sample input file from Zeranoe forum user Zaoshi.

Bug seems to only happen with intensity stereo enabled. The newest patch by klaussfreire fixes the bug. Might look into a quick fix but it's a lower priority than reviewing that patch, considering how many artifacts it fixes.

comment:452 by Kamedo2, 9 years ago

Two crashes at 240kbps, one with -profile:a aac_ltp and one with the default profile.

ffmpeg77126 -i FFmpeg_anmr_error6.flac -c:a aac -profile:a aac_ltp -b:a 240k out.mp4
ffmpeg version N-77126-g357c626 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 10.100 / 55. 10.100
  libavcodec     57. 17.100 / 57. 17.100
  libavformat    57. 19.100 / 57. 19.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 20.100 /  6. 20.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'FFmpeg_anmr_error6.flac':
  Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
File 'out.mp4' already exists. Overwrite ? [y/N] y
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder         : Lavf57.19.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16
 bit), 240 kb/s
    Metadata:
      encoder         : Lavc57.17.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

ffmpeg77126 -y -i FFmpeg_anmr_error6.flac -c:a aac -b:a 240k out.mp4
ffmpeg version N-77126-g357c626 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 10.100 / 55. 10.100
  libavcodec     57. 17.100 / 57. 17.100
  libavformat    57. 19.100 / 57. 19.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 20.100 /  6. 20.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'FFmpeg_anmr_error6.flac':
  Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder         : Lavf57.19.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16
 bit), 240 kb/s
    Metadata:
      encoder         : Lavc57.17.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

comment:453 by Kamedo2, 9 years ago

Another crash example, probably another arithmetic overflow like the one in comment:440. I think anything above -q:a 3 should be clipped to -q:a 3, because I can't think of any practical use for higher values, and because it would simplify the testing procedure.
http://listening-test.coresv.net/img2/noexp1.png
http://listening-test.coresv.net/img2/noexp2.png
http://listening-test.coresv.net/img2/noexp3.png

ffmpeg77126 -y -i FFmpeg_anmr_error6.flac -c:a aac -q:a 1280k out.mp4
ffmpeg version N-77126-g357c626 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 10.100 / 55. 10.100
  libavcodec     57. 17.100 / 57. 17.100
  libavformat    57. 19.100 / 57. 19.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 20.100 /  6. 20.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'FFmpeg_anmr_error6.flac':
  Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16

comment:454 by Rostislav Pehlivanov, 9 years ago

Fixed the FFmpeg_anmr_error6.flac crashes in git master, version N-77158-g4c5136a. Give it a test.

comment:455 by Kamedo2, 9 years ago

These three still crash.

ffmpeg77171 -y -i FFmpeg_aacvbr_pulse2.flac -c:a aac -b:a 1 out.mp4
ffmpeg version N-77171-g89bbf01 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 10.100 / 55. 10.100
  libavcodec     57. 17.100 / 57. 17.100
  libavformat    57. 19.100 / 57. 19.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 20.100 /  6. 20.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'FFmpeg_aacvbr_pulse2.flac':
  Duration: 00:00:16.10, start: 0.000000, bitrate: 1167 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
[aac @ 001b5ba0] Bitrate 1 is extremely low, maybe you mean 1k
The bitrate parameter is set too low. It takes bits/s as argument, not kbits/s
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder         : Lavf57.19.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 48000 Hz, stereo, fltp (16
 bit), 0 kb/s
    Metadata:
      encoder         : Lavc57.17.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

ffmpeg77171 -y -i FFmpeg_aacvbr_pulse2.flac -c:a aac -ar 11025 -cutoff 5000 -profile:a aac_main -b:a 8k out.mp4
ffmpeg version N-77171-g89bbf01 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 10.100 / 55. 10.100
  libavcodec     57. 17.100 / 57. 17.100
  libavformat    57. 19.100 / 57. 19.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 20.100 /  6. 20.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'FFmpeg_aacvbr_pulse2.flac':
  Duration: 00:00:16.10, start: 0.000000, bitrate: 1167 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder         : Lavf57.19.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 11025 Hz, stereo, fltp (16
 bit), 8 kb/s
    Metadata:
      encoder         : Lavc57.17.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

ffmpeg77171 -y -i FFmpeg_anmr_error6.flac -c:a aac -q:a 1280k out.mp4
ffmpeg version N-77171-g89bbf01 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 10.100 / 55. 10.100
  libavcodec     57. 17.100 / 57. 17.100
  libavformat    57. 19.100 / 57. 19.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 20.100 /  6. 20.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'FFmpeg_anmr_error6.flac':
  Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16

comment:456 by Kamedo2, 9 years ago

ffmpeg77171 -y -i ffmpeg_aacvbr_pulse2.flac -c:a aac -ar 11025 -cutoff 5000 -profile:a aac_main -b:a 8k out.mp4
ffmpeg version N-77171-g89bbf01 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 10.100 / 55. 10.100
  libavcodec     57. 17.100 / 57. 17.100
  libavformat    57. 19.100 / 57. 19.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 20.100 /  6. 20.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'ffmpeg_aacvbr_pulse2.flac':
  Duration: 00:00:16.10, start: 0.000000, bitrate: 1167 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder         : Lavf57.19.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 11025 Hz, stereo, fltp (16
 bit), 8 kb/s
    Metadata:
      encoder         : Lavc57.17.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

ffmpeg77171 -y -i sine_tester.flac -c:a aac -ar 11025 -cutoff 5000 -b:a 8k out.mp4
ffmpeg version N-77171-g89bbf01 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 10.100 / 55. 10.100
  libavcodec     57. 17.100 / 57. 17.100
  libavformat    57. 19.100 / 57. 19.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 20.100 /  6. 20.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'sine_tester.flac':
  Duration: 00:00:28.00, start: 0.000000, bitrate: 294 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s32 (24 bit)
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder         : Lavf57.19.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 11025 Hz, stereo, fltp (24
 bit), 8 kb/s
    Metadata:
      encoder         : Lavc57.17.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

comment:457 by Hendrik, 9 years ago

Wasn't one of claudio's changes supposed to get rid of this particular assert for good? The commit message suggested as much.

in reply to:  457 comment:458 by klaussfreire, 9 years ago

Replying to heleppkes:

Wasn't one of claudio's changes supposed to get rid of this particular assert for good? The commit message suggested as much.

It should have. It did in all the cases I tested. I'll have to try and reproduce this particular case later.

comment:459 by Kamedo2, 9 years ago

The new AAC output cannot be decoded properly by the faad decoder.

ffmpeg77208 -y -i abc\compilation2.wav -c:a aac -b:a 96k out.mp4
faad -b 1 out.mp4 out.wav

results in collapsed sounds.

ffmpeg77208 -y -i abc\compilation2.wav -c:a aac -b:a 96k out.mp4
faad -q -b 4 out.mp4 out.wav

decoding to 32bit float also results in collapsed sounds.

ffmpeg77208 -y -i abc\compilation2.wav -c:a aac -b:a 96k out.mp4
ffmpeg77208 -y -i out.mp4 -c:a pcm_s16le out.wav

The same AAC output decoded by the new FFmpeg is OK.

ffmpeg77208 -y -i abc\compilation2.wav -c:a aac -b:a 96k out.mp4
ffmpeg72585 -y -i out.mp4 -c:a pcm_s16le out.wav

The same AAC output decoded by older FFmpeg is also OK.

ffmpeg76735 -y -i abc\compilation2.wav -c:a aac -b:a 96k -strict -2 out.mp4
faad -b 1 out.mp4 out.wav

Builds 72585 and 76735 were OK, but 76851, 76877, 76976, and 77208 all suffer from the same problem.

by Kamedo2, 9 years ago

Attachment: ffmpeg_aac_error2.flac added

This causes an error with -profile:a aac_ltp -b:a 96k. The error messages are "av_interleaved_write_frame(): Not enough space" or "Audio encoding failed (avcodec_encode_audio2)". The source sound is 08._Sarah_McLachlan_Ice_ringing.flac

comment:460 by Kamedo2, 9 years ago

ffmpeg77208 -y -i ffmpeg_aac_error2.flac -c:a aac -profile:a aac_ltp -b:a 96k out.mp4
ffmpeg version N-77208-gb4f1636 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 10.100 / 55. 10.100
  libavcodec     57. 17.100 / 57. 17.100
  libavformat    57. 20.100 / 57. 20.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 20.100 /  6. 20.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'ffmpeg_aac_error2.flac':
  Metadata:
    REPLAYGAIN_TRACK_GAIN: -0.57 dB
    REPLAYGAIN_TRACK_PEAK: 0.474701
  Duration: 00:00:15.00, start: 0.000000, bitrate: 658 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
    Side data:
      replaygain: track gain - -0.570000, track peak - 0.000011, album gain - un
known, album peak - unknown,
Output #0, mp4, to 'out.mp4':
  Metadata:
    REPLAYGAIN_TRACK_GAIN: -0.57 dB
    REPLAYGAIN_TRACK_PEAK: 0.474701
    encoder         : Lavf57.20.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16
 bit), 96 kb/s
    Metadata:
      encoder         : Lavc57.17.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Audio encoding failed (avcodec_encode_audio2)

comment:461 by Kamedo2, 9 years ago

I am considering a new listening test of -c:a aac and -c:a libfdk_aac at 64k, 96k, and 128kbps. Is the bug comment:459 easy to solve?

comment:462 by Kamedo2, 9 years ago

The bug in comment:459 doesn't reproduce with the -profile:a mpeg2_aac_low option.
It does reproduce with the -profile:a aac_main, -profile:a aac_low, and -profile:a aac_ltp options.

in reply to:  461 comment:463 by Hendrik, 9 years ago

Replying to Kamedo2:

I am considering a new listening test of -c:a aac and -c:a libfdk_aac at 64k, 96k, and 128kbps. Is the bug comment:459 easy to solve?

First, someone would need to determine whether it's maybe faad that's broken.
No software is ever perfect.

If it doesn't reproduce with mpeg2_aac_low, it's likely related to PNS, as that's the only feature that gets turned off relative to aac_low.

comment:464 by Kamedo2, 9 years ago

Apparently, lower bitrates trigger the assertion failure.

ffmpeg77223 -y -i FFmpeg_anmr_error5.flac -c:a aac -b:a 16k -cutoff 15000 -ar 48000 out.mp4
ffmpeg version N-77233-g28e9b7e Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 10.100 / 55. 10.100
  libavcodec     57. 17.100 / 57. 17.100
  libavformat    57. 20.100 / 57. 20.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 21.100 /  6. 21.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'FFmpeg_anmr_error5.flac':
  Duration: 00:00:05.00, start: 0.000000, bitrate: 229 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder         : Lavf57.20.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 48000 Hz, stereo, fltp (16
 bit), 16 kb/s
    Metadata:
      encoder         : Lavc57.17.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
ffmpeg77223 -y -i ffmpeg_96k_error.flac -c:a aac -profile:a aac_main -b:a 16k -cutoff 20000 -ar 44100 out.mp4
ffmpeg version N-77233-g28e9b7e Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 10.100 / 55. 10.100
  libavcodec     57. 17.100 / 57. 17.100
  libavformat    57. 20.100 / 57. 20.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 21.100 /  6. 21.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'ffmpeg_96k_error.flac':
  Duration: 00:00:02.01, start: 0.000000, bitrate: 238 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder         : Lavf57.20.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16
 bit), 16 kb/s
    Metadata:
      encoder         : Lavc57.17.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

comment:465 by Rostislav Pehlivanov, 9 years ago

All of the asserts happen because of PNS. Disable it with -aac_pns 0 and you'll see you won't get any more. Claudio and I are working on a fix, but it's a hard problem to solve.
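
For readers wondering what the quoted assertion guards: AAC codes each scalefactor as a difference from the previous one, and that difference has to land inside a 121-entry codebook. The following is only a minimal sketch of that range check, assuming an offset of 60 so that differences in [-60, +60] map to indices [0, 120]. The names `SF_DIFF_ZERO`, `sf_delta_is_codable` and `first_uncodable_band` are made up for illustration and are not taken from aacenc.c; the sketch does not model how PNS leads to an out-of-range difference.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed offset added to a scalefactor difference before codebook lookup. */
#define SF_DIFF_ZERO 60

/* True if the step from prev_sf to cur_sf fits the 0..120 index range. */
static bool sf_delta_is_codable(int prev_sf, int cur_sf)
{
    int diff = cur_sf - prev_sf + SF_DIFF_ZERO;
    return diff >= 0 && diff <= 120;
}

/* Index of the first band whose difference cannot be coded, or -1 if all fit. */
static int first_uncodable_band(const int *sf, int nb_bands)
{
    assert(nb_bands >= 0);
    for (int i = 1; i < nb_bands; i++)
        if (!sf_delta_is_codable(sf[i - 1], sf[i]))
            return i;
    return -1;
}
```

An encoder that wants to avoid the abort would have to clamp or re-plan the scalefactors before the bitstream writer reaches such a band; whether the eventual fix works that way is not stated in this thread.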

in reply to:  459 comment:466 by klaussfreire, 9 years ago

Replying to Kamedo2:

The new AAC output cannot be decoded properly by the faad decoder.

ffmpeg77208 -y -i abc\compilation2.wav -c:a aac -b:a 96k out.mp4
faad -b 1 out.mp4 out.wav

results in collapsed sounds.

Can you attach or email the compilation2.wav file or something that helps reproduce this?

I've tried a few samples with faad and couldn't yet reproduce it.

comment:467 by Kamedo2, 9 years ago

http://downloads.xiph.org/websites/xiph.org/vorbis/listen/compilation2.wav

The encoded sound collapses on FAAD2 ( Ahead Software MPEG-4 AAC Decoder V2.7 ).

I failed to reproduce it on NeroAACDec 1.5.1.0.

comment:468 by klaussfreire, 9 years ago

I believe the assertion failure in comment:456 and comment:464 has been fixed by the last commit.

I managed to reproduce comment:459, but I'm still investigating it. I'm suspecting it is indeed a bug in faad related to either M/S coding or I/S coding.

comment:469 by Kamedo2, 9 years ago

Sadly, it still crashes with some samples.

ffmpeg77436 -y -i "FFmpeg_aacvbr_pulse2.flac" -c:a aac -ar 11025 -cutoff 5000 -profile:a aac_main -b:a 8k out.mp4
ffmpeg version N-77436-g18cd789 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 11.100 / 55. 11.100
  libavcodec     57. 20.100 / 57. 20.100
  libavformat    57. 20.100 / 57. 20.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 21.100 /  6. 21.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'FFmpeg_aacvbr_pulse2.flac':
  Duration: 00:00:16.10, start: 0.000000, bitrate: 1167 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder         : Lavf57.20.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 11025 Hz, stereo, fltp (16
 bit), 8 kb/s
    Metadata:
      encoder         : Lavc57.20.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
ffmpeg77436 -y -i "FFmpeg_anmr_error6.flac" -c:a aac -q:a 1280k out.mp4
ffmpeg version N-77436-g18cd789 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 11.100 / 55. 11.100
  libavcodec     57. 20.100 / 57. 20.100
  libavformat    57. 20.100 / 57. 20.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 21.100 /  6. 21.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'FFmpeg_anmr_error6.flac':
  Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
ffmpeg77436 -y -i "ffmpeg_aacvbr_pulse2.flac" -c:a aac -ar 11025 -cutoff 5000 -b:a 8k -profile:a aac_main out.mp4
ffmpeg version N-77436-g18cd789 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 11.100 / 55. 11.100
  libavcodec     57. 20.100 / 57. 20.100
  libavformat    57. 20.100 / 57. 20.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 21.100 /  6. 21.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'ffmpeg_aacvbr_pulse2.flac':
  Duration: 00:00:16.10, start: 0.000000, bitrate: 1167 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder         : Lavf57.20.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 11025 Hz, stereo, fltp (16
 bit), 8 kb/s
    Metadata:
      encoder         : Lavc57.20.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
ffmpeg77436 -y -i "sine_tester.flac" -c:a aac -ar 11025 -cutoff 5000 -b:a 8k out.mp4
ffmpeg version N-77436-g18cd789 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 11.100 / 55. 11.100
  libavcodec     57. 20.100 / 57. 20.100
  libavformat    57. 20.100 / 57. 20.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 21.100 /  6. 21.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'sine_tester.flac':
  Duration: 00:00:28.00, start: 0.000000, bitrate: 294 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s32 (24 bit)
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder         : Lavf57.20.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 11025 Hz, stereo, fltp (24
 bit), 8 kb/s
    Metadata:
      encoder         : Lavc57.20.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
ffmpeg77436 -y -i "ffmpeg_96k_error.flac" -c:a aac -profile:a aac_main -b:a 16k -ar 44100 -cutoff 20000 out.mp4
ffmpeg version N-77436-g18cd789 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 11.100 / 55. 11.100
  libavcodec     57. 20.100 / 57. 20.100
  libavformat    57. 20.100 / 57. 20.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 21.100 /  6. 21.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'ffmpeg_96k_error.flac':
  Duration: 00:00:02.01, start: 0.000000, bitrate: 238 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder         : Lavf57.20.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16
 bit), 16 kb/s
    Metadata:
      encoder         : Lavc57.20.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

comment:470 by klaussfreire, 9 years ago

Oh, I hadn't seen the -profile:a aac_main in comment:456. Without it, it doesn't crash anymore, but with it, it does.

AFAIK, the only difference there is main prediction (aac_main enables main prediction).

I'll look into it later.

comment:471 by Kamedo2, 9 years ago

I'm going to start a new listening test of -c:a aac and -c:a libfdk_aac at 64k, 96k, and 128kbps on 2016/01/05. I hope this encoder will be stable by then.

in reply to:  471 comment:472 by Rostislav Pehlivanov, 9 years ago

Replying to Kamedo2:

I'm going to start a new listening test of -c:a aac and -c:a libfdk_aac at 64k, 96k, and 128kbps on 2016/01/05. I hope this encoder will be stable by then.

It's perfectly stable under normal operating bitrates and settings until you start testing it to the extremes with variable bit rate (please don't use it) and aac_main (don't use this either).
Considering it's already used in professional broadcasting (with aac_pns 0 since that's what causes the instability) I say it's stable. It survived a whole week of fuzzing after all.

comment:473 by klaussfreire, 9 years ago

Well, time for an update...

I did some tests, and faad's problem is with correlated PNS bands (PNS + ms_mask). It seems to be applying the M/S transform even though the specs clearly state that when PNS is used in conjunction with ms_mask bits, it should not.

I'd consider that a faad bug, but we are indeed producing "weird" bitstreams (we signal correlated PNS when only one side uses PNS, which makes the ms_mask unnecessary). Avoiding that weirdness works around faad's bug (and avoids possibly triggering similar bugs in other decoders).

I'm working on that patch now (thoroughly testing now).
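
To make the rule above concrete, here is a rough sketch of both sides as described in this comment. It is not the actual patch and not faad's code; `encoder_should_set_ms_bit` and `decoder_applies_ms_transform` are hypothetical helpers that only restate the two conditions: the encoder stops setting the ms_mask bit when only one channel uses PNS in a band, and a decoder must not run the M/S transform on a band where PNS is in use.

```c
#include <stdbool.h>

/* Encoder side: keep the ms_mask bit only when it carries meaning.
 * With PNS on exactly one channel the bit is unnecessary, so clear it. */
static bool encoder_should_set_ms_bit(bool wants_ms, bool left_pns, bool right_pns)
{
    if (left_pns != right_pns)
        return false;
    return wants_ms;
}

/* Decoder side: the M/S transform applies only to bands without PNS;
 * on PNS bands the bit must not trigger the transform. */
static bool decoder_applies_ms_transform(bool ms_bit, bool left_pns, bool right_pns)
{
    return ms_bit && !left_pns && !right_pns;
}
```

With the encoder-side rule in place, a decoder that wrongly applies M/S to PNS bands no longer sees the one-sided case at all, which is the workaround idea described above.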

comment:474 by Kamedo2, 9 years ago

Is the workaround for the faad bug ready?
I have thoroughly tested ffmpeg77652. In LC profile above 22kHz and 32kbps, the native encoder seems to be stable.

in reply to:  474 comment:475 by klaussfreire, 9 years ago

Replying to Kamedo2:

Is the workaround for the faad bug ready?
I have thoroughly tested ffmpeg77652. In LC profile above 22kHz and 32kbps, the native encoder seems to be stable.

It's in the pipeline.

Just doing some regression ABX testing, since the objective (PSNR) A/B script pointed out some seemingly significant regressions (so far I couldn't confirm any of them with ABX, but I'm not done testing yet).

comment:476 by Kamedo2, 9 years ago

Klaussfreire, thank you for the explanation.

http://listening-test.coresv.net/img2/encodespeed.png

comment:477 by Kamedo2, 9 years ago

https://www.ffmpeg.org/ffmpeg-codecs.html#Options-5
'aac_pred'

Main-type prediction profile, is enabled by and will enable the aac_pred option. Introduced in MPEG2.

I believe it should be 'aac_main'.

comment:478 by Hendrik, 9 years ago

aac_pred enables the prediction feature, the profile is controlled by "-profile aac_main".
So yes, the docs seem buggy.

in reply to:  477 comment:479 by Rostislav Pehlivanov, 9 years ago

Replying to Kamedo2:

https://www.ffmpeg.org/ffmpeg-codecs.html#Options-5
'aac_pred'

Main-type prediction profile, is enabled by and will enable the aac_pred option. Introduced in MPEG2.

I believe it should be 'aac_main'.

It's correct as it is. You can enable AAC-Main (and thus prediction) in two ways: set the profile via -profile:a aac_main or set the prediction flag via -aac_pred 1. Setting one will set the other as well, since you can't have prediction without the profile being set and you can't have the profile set without prediction (well you can but it would be a hack as you'd just set all scalefactor bands to disable prediction).

comment:480 by Hendrik, 9 years ago

I think he is referring to the aac_pred bullet point under the "profiles" section, which is not quite correct, since aac_main is the name of the profile.

in reply to:  480 comment:481 by Rostislav Pehlivanov, 9 years ago

Replying to heleppkes:

I think he is referring to the aac_pred bullet point under the "profiles" section, which is not quite correct, since aac_main is the name of the profile.

It is correct since the option aac_pred will enable AAC-Main prediction, even though it's not the name of the profile. Hence why it's listed there.

-aac_pred 1 enables -profile:a aac_main and -profile:a aac_main enables -aac_pred 1
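
The coupling can be pictured roughly like this. The struct and function are hypothetical and much simpler than the real encoder's option handling; they only illustrate that setting either one switches the other on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified option set; not FFmpeg's real structures. */
enum sketch_profile { SKETCH_PROFILE_LOW, SKETCH_PROFILE_MAIN };

struct sketch_opts {
    enum sketch_profile profile; /* stands in for -profile:a */
    bool pred;                   /* stands in for -aac_pred  */
};

/* Make the two settings agree: either one being set enables the other. */
static void sketch_reconcile(struct sketch_opts *o)
{
    assert(o);
    if (o->pred || o->profile == SKETCH_PROFILE_MAIN) {
        o->pred    = true;
        o->profile = SKETCH_PROFILE_MAIN;
    }
}
```

Note that aac_pred is an option name, not a profile name, so it is not a valid value for -profile:a.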

comment:482 by Kamedo2, 9 years ago

ffmpeg77758 -i in.wav -c:a aac -profile:a aac_pred out.mp4
It fails.
The documentation at https://www.ffmpeg.org/ffmpeg-codecs.html should read:

'aac_main'

Main-type prediction profile, is enabled by and will enable the aac_pred option. Introduced in MPEG2.

in reply to:  482 comment:483 by Rostislav Pehlivanov, 9 years ago

Replying to Kamedo2:

ffmpeg77758 -i in.wav -c:a aac -profile:a aac_pred out.mp4
It fails.
The documentation at https://www.ffmpeg.org/ffmpeg-codecs.html should read:

'aac_main'

Main-type prediction profile, is enabled by and will enable the aac_pred option. Introduced in MPEG2.

I fixed it 3 hours ago. I didn't understand where the typo was.

comment:484 by Kamedo2, 9 years ago

Another assertion failure.

ffmpeg77758 -y -i ffmpeg_aac_error2.flac -c:a aac -profile:a aac_ltp -cutoff 15000 out.mp4
ffmpeg version N-77758-g6e24946 Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 13.100 / 55. 13.100
  libavcodec     57. 22.100 / 57. 22.100
  libavformat    57. 21.101 / 57. 21.101
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 23.100 /  6. 23.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'ffmpeg_aac_error2.flac':
  Metadata:
    REPLAYGAIN_TRACK_GAIN: -0.57 dB
    REPLAYGAIN_TRACK_PEAK: 0.474701
  Duration: 00:00:15.00, start: 0.000000, bitrate: 658 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
    Side data:
      replaygain: track gain - -0.570000, track peak - 0.000011, album gain - un
known, album peak - unknown,
Output #0, mp4, to 'out.mp4':
  Metadata:
    REPLAYGAIN_TRACK_GAIN: -0.57 dB
    REPLAYGAIN_TRACK_PEAK: 0.474701
    encoder         : Lavf57.21.101
    Stream #0:0: Audio: aac (LTP) ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fl
tp (16 bit), 128 kb/s
    Metadata:
      encoder         : Lavc57.22.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion afq->remaining_samples == afq->remaining_delay failed at libavcodec/au
dio_frame_queue.c:106

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

comment:485 by Kamedo2, 9 years ago

I reproduced the aacenc.c assertion errors on ARM, but not the audio_frame_queue.c assertion error on comment:484.

pi@raspberrypi:~/ffmpeg160112 $ time ./ffmpeg -y -i ffmpeg_96k_error.flac -c:a aac -profile:a aac_main -b:a 16k -ar 44100 -cutoff 20000 out.mp4
ffmpeg version N-77804-gd64d6ed Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 4.9.2 (Raspbian 4.9.2-10)
  configuration: 
  libavutil      55. 13.100 / 55. 13.100
  libavcodec     57. 22.100 / 57. 22.100
  libavformat    57. 21.101 / 57. 21.101
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 23.100 /  6. 23.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
Input #0, flac, from 'ffmpeg_96k_error.flac':
  Duration: 00:00:02.01, start: 0.000000, bitrate: 238 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder         : Lavf57.21.101
    Stream #0:0: Audio: aac (Main) ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 16 kb/s
    Metadata:
      encoder         : Lavc57.22.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363   
Aborted

real	0m8.021s
user	0m7.960s
sys	0m0.060s
pi@raspberrypi:~/ffmpeg160112 $ time ./ffmpeg -y -i ffmpeg_aacvbr_pulse2.flac -c:a aac -ar 11025 -cutoff 5000 -b:a 8k -profile:a aac_main out.mp4
ffmpeg version N-77804-gd64d6ed Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 4.9.2 (Raspbian 4.9.2-10)
  configuration: 
  libavutil      55. 13.100 / 55. 13.100
  libavcodec     57. 22.100 / 57. 22.100
  libavformat    57. 21.101 / 57. 21.101
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 23.100 /  6. 23.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
Input #0, flac, from 'ffmpeg_aacvbr_pulse2.flac':
  Duration: 00:00:16.10, start: 0.000000, bitrate: 1167 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder         : Lavf57.21.101
    Stream #0:0: Audio: aac (Main) ([64][0][0][0] / 0x0040), 11025 Hz, stereo, fltp (16 bit), 8 kb/s
    Metadata:
      encoder         : Lavc57.22.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363  
Aborted

real	0m11.663s
user	0m11.660s
sys	0m0.100s

comment:486 by Kamedo2, 9 years ago

Oops. I tested the old version. I will test the latest version later.

comment:487 by Kamedo2, 9 years ago

Now the faad decode and the low bitrate encodes above are properly working on x86.
But this will fail.

ffmpeg77827 -y -i short_block_test_2.flac -c:a aac -b:a 8k -cutoff 15000 -ar 48000 out.mp4
ffmpeg version N-77827-g9006567 Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration:
  libavutil      55. 13.100 / 55. 13.100
  libavcodec     57. 22.100 / 57. 22.100
  libavformat    57. 21.101 / 57. 21.101
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 23.100 /  6. 23.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
Input #0, flac, from 'short_block_test_2.flac':
  Duration: 00:00:15.00, start: 0.000000, bitrate: 91 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder         : Lavf57.21.101
    Stream #0:0: Audio: aac (LC) ([64][0][0][0] / 0x0040), 48000 Hz, stereo, flt
p (16 bit), 8 kb/s
    Metadata:
      encoder         : Lavc57.22.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

comment:488 by klaussfreire, 9 years ago

It's never ending.

Can you attach the short_block_test_2.flac?

by Kamedo2, 9 years ago

Attachment: short_block_test_2.flac added

comment:489 by Kamedo2, 9 years ago

I started a blind listening test of -c:a aac and -c:a libfdk_aac at 64k, 96k, and 128kbps. The progress is 33% now.

comment:490 by klaussfreire, 9 years ago

I just pushed a fix for the assertion failure on short_block_test_2, and a few other artifacts that were exposed by that sample. There are some artifacts remaining still, but I'm having a hard time pinpointing where they come from, so I thought I should push before you're done with your listening test ;)

comment:491 by Kamedo2, 9 years ago

This outputs errors and stops if the bitrate is 8k, 16k, 24k, 32k, or 48k.
40k, 64k, 72k, 80k, 88k, 96k, 104k, 112k, 120k, and 128k encode fine.

ffmpeg77914 -y -i ffmpeg_aac_error2.flac -c:a aac -ac 1 -profile:a aac_ltp -b:a 8k -cutoff 15000 out.mp4
ffmpeg version N-77914-g03d83ba Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 13.100 / 55. 13.100
  libavcodec     57. 22.100 / 57. 22.100
  libavformat    57. 21.101 / 57. 21.101
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 25.100 /  6. 25.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'ffmpeg_aac_error2.flac':
  Metadata:
    REPLAYGAIN_TRACK_GAIN: -0.57 dB
    REPLAYGAIN_TRACK_PEAK: 0.474701
  Duration: 00:00:15.00, start: 0.000000, bitrate: 658 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
    Side data:
      replaygain: track gain - -0.570000, track peak - 0.000011, album gain - un
known, album peak - unknown,
Output #0, mp4, to 'out.mp4':
  Metadata:
    REPLAYGAIN_TRACK_GAIN: -0.57 dB
    REPLAYGAIN_TRACK_PEAK: 0.474701
    encoder         : Lavf57.21.101
    Stream #0:0: Audio: aac (LTP) ([64][0][0][0] / 0x0040), 44100 Hz, mono, fltp
 (16 bit), 8 kb/s
    Metadata:
      encoder         : Lavc57.22.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
[mp4 @ 005c48e0] Application provided duration: 4539201763687527334 / timestamp:
 4539201763687792550 is out of range for mov/mp4 format
[mp4 @ 005c48e0] pts has no value
[aac @ 005c5940] Queue input is backward in time
[mp4 @ 005c48e0] Non-monotonous DTS in output stream 0:0; previous: 453920176368
7792550, current: 4535156856773993315; changing to 4539201763687792551. This may
 result in incorrect timestamps in the output file.
[mp4 @ 005c48e0] Application provided duration: 4539201763687527334 / timestamp:
 4539201763687792551 is out of range for mov/mp4 format
[mp4 @ 005c48e0] pts has no value

ffmpeg_aacvbr_pulse1.flac also triggers this error.

comment:492 by Kamedo2, 9 years ago

Sadly, this will also fail.

ffmpeg77914 -y -i ffmpeg_aac_error2.flac -c:a aac -profile:a aac_ltp out.mp4
ffmpeg version N-77914-g03d83ba Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3
lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32th
reads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil      55. 13.100 / 55. 13.100
  libavcodec     57. 22.100 / 57. 22.100
  libavformat    57. 21.101 / 57. 21.101
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 25.100 /  6. 25.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, flac, from 'ffmpeg_aac_error2.flac':
  Metadata:
    REPLAYGAIN_TRACK_GAIN: -0.57 dB
    REPLAYGAIN_TRACK_PEAK: 0.474701
  Duration: 00:00:15.00, start: 0.000000, bitrate: 658 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
    Side data:
      replaygain: track gain - -0.570000, track peak - 0.000011, album gain - un
known, album peak - unknown,
Output #0, mp4, to 'out.mp4':
  Metadata:
    REPLAYGAIN_TRACK_GAIN: -0.57 dB
    REPLAYGAIN_TRACK_PEAK: 0.474701
    encoder         : Lavf57.21.101
    Stream #0:0: Audio: aac (LTP) ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fl
tp (16 bit), 128 kb/s
    Metadata:
      encoder         : Lavc57.22.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion afq->remaining_samples == afq->remaining_delay failed at libavcodec/au
dio_frame_queue.c:106

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

comment:493 by klaussfreire, 9 years ago

Clearly ltp has problems, I haven't gotten around to solving them yet.

I did find the last of the TNS issues. I'll push after some further testing, but it looks good.

comment:494 by Rostislav Pehlivanov, 9 years ago

Flagged the LTP profile as experimental. We have enough bug reports to work on to fix LTP. Crashes on the aac_low profile are the main priority right now.
Also removed the FAAC-like coder, since it has been marked for removal for over a month.

comment:495 by klaussfreire, 9 years ago

I thought the timescale of those things (removal) was measured in releases.

comment:496 by Hendrik, 9 years ago

Generally yes, however aacenc was experimental in the last release, which kind of exempts it from stability rules.

comment:497 by Kamedo2, 9 years ago

Bad news.
FDK-AAC still beats FFmpeg's native AAC encoder by a significant margin.
http://listening-test.coresv.net/img2/ffaac_fdk_compare_en.png
http://listening-test.coresv.net/img2/ffaac_fdk_compare_en2.png
http://listening-test.coresv.net/img2/ffaac_fdk_compare_en3.png

comment:498 by klaussfreire, 9 years ago

Which version?

The bugs on TNS that were fixed recently make a big difference in perceived quality (not so much on PSNR)

comment:499 by RiCON, 9 years ago

757248e, which is before the TNS fixes.

comment:500 by klaussfreire, 9 years ago

I suggest then you revalidate the results. No need to do the whole test again, start with a canary (a sample that was particularly troublesome), to see if it now compares better.

comment:501 by klaussfreire, 9 years ago

Update: I found the main reason why fdk is so far ahead. Basically, I/S is way too conservative, to the point where it barely gets used. I'm toying with making it far more aggressive.

in reply to:  501 comment:502 by Kamedo2, 9 years ago

Replying to klaussfreire:

Update: I found the main reason why fdk is so far ahead. Basically, I/S is way too conservative, to the point where it barely gets used. I'm toying with making it far more aggressive.

OK, I will retest the new one.

comment:503 by klaussfreire, 9 years ago

Hold your horses, I haven't pushed anything for I/S yet, and it will take a while (it's a big change that I want to properly test first)

comment:504 by Kamedo2, 9 years ago

In this commit http://git.videolan.org/?p=ffmpeg.git;a=commit;h=66edd8656b851a0c85ba25ec293cc66192c363ae
I guess libavcodec/lpc.c line 179 is meant to be i < len / 2;.

 170 double ff_lpc_calc_ref_coefs_f(LPCContext *s, const float *samples, int len,
 171                                int order, double *ref)
 172 {
 173     int i;
 174     double signal = 0.0f, avg_err = 0.0f;
 175     double autoc[MAX_LPC_ORDER+1] = {0}, error[MAX_LPC_ORDER+1] = {0};
 176     const double a = 0.5f, b = 1.0f - a;
 177 
 178     /* Apply windowing */
 179     for (i = 0; i <= len / 2; i++) {
 180         double weight = a - b*cos((2*M_PI*i)/(len - 1));
 181         s->windowed_samples[i] = weight*samples[i];
 182         s->windowed_samples[len-1-i] = weight*samples[len-1-i];
 183     }
 184 
 185     s->lpc_compute_autocorr(s->windowed_samples, len, order, autoc);
 186     signal = autoc[0];
 187     compute_ref_coefs(autoc, order, ref, error);
 188     for (i = 0; i < order; i++)
 189         avg_err = (avg_err + error[i])/2.0f;
 190     return signal/avg_err;
 191 }
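
One way to see what the loop bound changes is to count how often each index of windowed_samples is written under either condition. The following is only a sketch: unwritten_samples and MAX_LEN are hypothetical names for this illustration, not part of libavcodec, and the thread does not say which values of len the encoder actually passes.

```c
#include <assert.h>
#include <string.h>

#define MAX_LEN 4096 /* hypothetical cap, used by this sketch only */

/* Replays the mirrored write pattern of the windowing loop and returns the
 * number of indices that are never written. If inclusive is nonzero the loop
 * condition is i <= len / 2 (as in the quoted code), otherwise i < len / 2
 * (as suggested). *max_writes receives the highest per-index write count. */
static int unwritten_samples(int len, int inclusive, int *max_writes)
{
    int writes[MAX_LEN];
    int i, missing = 0, most = 0;
    int last = inclusive ? len / 2 : len / 2 - 1;

    assert(len >= 2 && len <= MAX_LEN);
    memset(writes, 0, sizeof(writes));

    for (i = 0; i <= last; i++) {
        writes[i]++;           /* front half */
        writes[len - 1 - i]++; /* mirrored back half */
    }
    for (i = 0; i < len; i++) {
        if (writes[i] == 0)
            missing++;
        if (writes[i] > most)
            most = writes[i];
    }
    *max_writes = most;
    return missing;
}
```

For an even len, both conditions cover every sample; the inclusive form only rewrites the two centre samples, and because the window is symmetric about the centre they get the same weight up to rounding. For an odd len, the exclusive form would leave the middle sample unwritten, so the suggested change assumes len is always even.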

We can also get a 1.2% speedup by taking the cos call out of the loop and updating the cosine incrementally:

    /* Apply windowing */
    double cos_onestep = cos((2*M_PI)/(len - 1));
    double sin_onestep = sin((2*M_PI)/(len - 1));
    double cos_isteps = b;
    double sin_isteps = 0;
    for (i = 0; i < len / 2; i++) {
        double sin_newsteps;
        double weight = a - cos_isteps;
        s->windowed_samples[i] = weight*samples[i];
        s->windowed_samples[len-1-i] = weight*samples[len-1-i];
        sin_newsteps = sin_isteps*cos_onestep + cos_isteps*sin_onestep;
        cos_isteps = cos_isteps*cos_onestep - sin_isteps*sin_onestep;
        sin_isteps = sin_newsteps;
    }

comment:505 by Kamedo2, 9 years ago

This command crashes FFmpeg after the commit "AAC encoder: fix undefined behavior".

cores\ffmpeg79177 -i "ffmpeg_aac320k_collapse3.flac" -c:a aac -strict experimental -b:a 4k out.mp4

Versions before that commit, such as N-79171-ga35a4a5, were safe.

comment:506 by klaussfreire, 9 years ago

I could never understand those commit numbers. Which repo are they referencing? My checkout has no such commit hash.

There are two commits about undefined behavior. I'm guessing you're referring to the second one. I have actually checked that same sample with an automated script, albeit not at 4kbps. That bitrate is a bit low to be included in standard A/B tests, but I'll add it and re-run.

in reply to:  506 comment:507 by Carl Eugen Hoyos, 9 years ago

Replying to klaussfreire:

I could never understand those commit numbers. Which repo are they referencing? My checkout has no such commit hash.

a677121cc568db7c101ebf3a797a779a983fc668: N-79177-ga677121
a35a4a5774a196f8eefc8ef2994979a6c563e0c2: N-79171-ga35a4a5

comment:508 by klaussfreire, 9 years ago

I see, silly me, g isn't a hex digit.
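
For reference, these build identifiers follow the git describe layout: a tag name (N), the number of commits since that tag, then the letter g followed by the abbreviated commit hash. A small sketch of splitting such a string; parse_build_id is a hypothetical helper written for this note, not an FFmpeg function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Splits an "N-<count>-g<hash>" string into the commit count and the
 * abbreviated hash. hash must have room for 40 hex digits plus the
 * terminating NUL. Returns 1 on success, 0 if the layout does not match. */
static int parse_build_id(const char *id, int *count, char hash[41])
{
    /* The literal 'g' is consumed by the format; only hex digits are stored. */
    if (sscanf(id, "N-%d-g%40[0-9a-f]", count, hash) != 2)
        return 0;
    return 1;
}
```

With "N-79177-ga677121" this yields the count 79177 and the hash prefix a677121, which can then be looked up in any checkout, for example with git show a677121.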

comment:509 by Rostislav Pehlivanov, 8 years ago

Analyzed by developer: set
Resolution: fixed
Status: open → closed

Thanks everyone, I think it's time to close this ticket.
Hopefully soon there'll be an Opus encoder to keep Kamedo2 busy :)
