Opened 11 years ago
Closed 8 years ago
#2686 closed defect (fixed)
Native AAC encoder collapses at high bitrates on some samples
Reported by: | Kamedo2 | Owned by: | klaussfreire |
---|---|---|---|
Priority: | normal | Component: | avcodec |
Version: | git-master | Keywords: | aac regression |
Cc: | klaussfreire@gmail.com, timothygu99@gmail.com, atomnuker@gmail.com, rodger.combs@gmail.com | Blocked By: | |
Blocking: | | Reproduced by developer: | yes |
Analyzed by developer: | yes | | |
Description
Summary of the bug:
The FFmpeg native AAC encoder produces badly degraded sound on particular samples at about 256kbps and above. The quality gets worse as I increase the bitrate, and is worst at 320-400kbps.
How to reproduce:
ffmpeg -i ffmpeg_aac320k_collapse.flac -vn -c:a aac -strict experimental -b:a 320k ffmpeg_aac320k_collapse.mp4
I couldn't reproduce the problem after trimming the most problematic sample down to 8 seconds, but it came back once I added 10 seconds of silence before the sample. So I'm going to upload the sample with 10 seconds of silence attached. The native AAC encoder was fine on many music clips at 320kbps; only some clips produce AAC files bad enough that I'd call it a bug.
Console Output:
ffmpeg version N-54096-ge41bf19 Copyright (c) 2000-2013 the FFmpeg developers
  built on Jun 19 2013 00:20:06 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-libmp3lame --enable-libvorbis --enable-nonfree --enable-libfdk-aac --enable-libvo_aacenc --enable-libfaac --extra-ldflags=-static --extra-cflags='-march=nocona -mfpmath=sse' --optflags=-O2
  libavutil 52. 37.101 / 52. 37.101
  libavcodec 55. 16.100 / 55. 16.100
  libavformat 55. 9.100 / 55. 9.100
  libavdevice 55. 2.100 / 55. 2.100
  libavfilter 3. 77.101 / 3. 77.101
  libswscale 2. 3.100 / 2. 3.100
  libswresample 0. 17.102 / 0. 17.102
  libpostproc 52. 3.100 / 52. 3.100
[flac @ 0003f160] max_analyze_duration 5000000 reached at 5015510 microseconds
Input #0, flac, from '05-true_my_heart_2m50s.flac':
  Duration: 00:00:18.01, bitrate: 573 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to '05-true_my_heart_2m50s_320k.mp4':
  Metadata:
    encoder : Lavf55.9.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 320 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (flac -> aac)
Press [q] to stop, [?] for help
size= 331kB time=00:00:18.01 bitrate= 150.4kbits/s
video:0kB audio:327kB subtitle:0 global headers:0kB muxing overhead 1.151111%
Attachments (68)
Change History (577)
by , 11 years ago
Attachment: | ffmpeg_aac320k_collapse.flac added |
---|
by , 11 years ago
Attachment: | ffmpeg_aac320k_collapse2.flac added |
---|
A sound that degrades on FFmpeg native aac encoder. Sounds like a spray can. Billie Holiday : I'm A Fool To Want You (trimmed to 20sec, first and last)
comment:1 by , 11 years ago
Component: | FFmpeg → avcodec |
---|---|
Keywords: | native encoder sound quality 256kbps 320kbps removed |
Version: | 1.0.7 → git-master |
Did the output (aac) files sound better with the (original!) release 1.2?
(Not a later release of the 1.2 series.)
comment:2 by , 11 years ago
Yes, the output aac files sounded better with release 1.2.1 I've downloaded from
http://www.ffmpeg.org/releases/ffmpeg-1.2.1.tar.bz2
Still, the quality of the native aac at 320kbps is poorer than the native aac 256kbps.
ffmpeg version 1.2.1 Copyright (c) 2000-2013 the FFmpeg developers
  built on Jun 19 2013 12:38:13 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-libmp3lame --enable-libvorbis --enable-nonfree --enable-libfdk-aac --enable-libvo_aacenc --enable-libfaac --extra-ldflags=-static --extra-cflags='-march=nocona -mfpmath=sse' --optflags=-O2
  libavutil 52. 18.100 / 52. 18.100
  libavcodec 54. 92.100 / 54. 92.100
  libavformat 54. 63.104 / 54. 63.104
  libavdevice 54. 3.103 / 54. 3.103
  libavfilter 3. 42.103 / 3. 42.103
  libswscale 2. 2.100 / 2. 2.100
  libswresample 0. 17.102 / 0. 17.102
  libpostproc 52. 2.100 / 52. 2.100
[flac @ 01405c20] max_analyze_duration 5000000 reached at 5015510 microseconds
Input #0, flac, from 'ffmpeg_aac320k_collapse.flac ':
  Duration: 00:00:18.01, bitrate: 573 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'ffmpeg_aac320k_collapse.mp4':
  Metadata:
    encoder : Lavf54.63.104
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 320 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (flac -> aac)
Press [q] to stop, [?] for help
size= 289kB time=00:00:18.01 bitrate= 131.3kbits/s
video:0kB audio:285kB subtitle:0 global headers:0kB muxing overhead 1.321136%
comment:3 by , 11 years ago
Oops, you said original release 1.2.
Releases 1.2 and 1.2.1 behave the same: the first sample collapses at 432-464kbps.
With N-54096-ge41bf19, which I got from git, the first sample collapses at 256-432kbps.
The two groups have distinct degradation ranges. Releases 1.2 and 1.2.1 have a much narrower degradation range, and the degradation within it is less severe. N-54096-ge41bf19 has its worst quality at 352kbps.
ffmpeg version 1.2 Copyright (c) 2000-2013 the FFmpeg developers
  built on Jun 20 2013 03:06:34 with gcc 4.8.1 (GCC)
  configuration: --enable-version3 --enable-nonfree --enable-libfdk-aac --extra-ldflags=-static --extra-cflags='-march=native' --optflags=-O2
  libavutil 52. 18.100 / 52. 18.100
  libavcodec 54. 92.100 / 54. 92.100
  libavformat 54. 63.104 / 54. 63.104
  libavdevice 54. 3.103 / 54. 3.103
  libavfilter 3. 42.103 / 3. 42.103
  libswscale 2. 2.100 / 2. 2.100
  libswresample 0. 17.102 / 0. 17.102
[flac @ 03295c20] max_analyze_duration 5000000 reached at 5015510 microseconds
Input #0, flac, from 'C:\Users\PCC\Documents\ABC-HR\ffmpeg_aac320k_collapse.flac':
  Duration: 00:00:18.01, bitrate: 573 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'C:\Users\PCC\Documents\ABC-HR\05-true_my_heart_2m50s_320k_12.mp4':
  Metadata:
    encoder : Lavf54.63.104
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 320 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (flac -> aac)
Press [q] to stop, [?] for help
size= 289kB time=00:00:18.01 bitrate= 131.3kbits/s
video:0kB audio:285kB subtitle:0 global headers:0kB muxing overhead 1.321136%
comment:4 by , 11 years ago
This patch I'm going to attach fixes both issues. But I must warn that it's a WIP, I still have to split it into individual issues and fix a bug it exhibits in rare circumstances when working in VBR mode.
by , 11 years ago
Attachment: | aac-improvements-wip.patch added |
---|
AAC native encoder improvements, work in progress
comment:5 by , 11 years ago
Keywords: | regression added |
---|---|
Reproduced by developer: | set |
Status: | new → open |
comment:6 by , 11 years ago
I appreciate your effort, klaussfreire.
I want to test the aac-improvements-wip.patch, but how can I do that?
/c/mingw/ffmpeg/ffmpeg-1.2 $ patch -u -p1 < aac-improvements-wip.patch
patching file libavcodec/aaccoder.c
Hunk #3 FAILED at 711.
Hunk #4 succeeded at 776 (offset -5 lines).
Hunk #5 succeeded at 818 (offset -5 lines).
Hunk #6 FAILED at 845.
Hunk #7 FAILED at 1055.
Hunk #8 FAILED at 1068.
Hunk #9 FAILED at 1092.
Hunk #10 FAILED at 1110.
6 out of 10 hunks FAILED -- saving rejects to file libavcodec/aaccoder.c.rej
patching file libavcodec/aacenc.c
Hunk #3 FAILED at 622.
1 out of 3 hunks FAILED -- saving rejects to file libavcodec/aacenc.c.rej
patching file libavcodec/aacpsy.c
Hunk #1 succeeded at 293 (offset -4 lines).
Hunk #2 succeeded at 385 (offset -4 lines).
Hunk #3 succeeded at 646 (offset -33 lines).
patching file libavcodec/psymodel.h
comment:7 by , 11 years ago
Without trying myself, I would bet that the patch only applies to current git head.
comment:8 by , 11 years ago
I tried $ git clone git://source.ffmpeg.org/ffmpeg.git, but still, the patch fails.
comment:10 by , 11 years ago
I tried the wip patch again. No good. I think the patch is broken.
$ patch -p1 < aac-improvements-wip.patch
patching file libavcodec/aaccoder.c
Hunk #3 FAILED at 711.
Hunk #4 succeeded at 776 (offset -5 lines).
Hunk #5 succeeded at 818 (offset -5 lines).
Hunk #6 FAILED at 845.
Hunk #7 FAILED at 1055.
Hunk #8 FAILED at 1068.
Hunk #9 FAILED at 1092.
Hunk #10 FAILED at 1110.
6 out of 10 hunks FAILED -- saving rejects to file libavcodec/aaccoder.c.rej
patching file libavcodec/aacenc.c
Hunk #1 FAILED at 591.
Hunk #2 FAILED at 609.
Hunk #3 FAILED at 621.
3 out of 3 hunks FAILED -- saving rejects to file libavcodec/aacenc.c.rej
patching file libavcodec/aacpsy.c
Hunk #1 succeeded at 299 (offset 2 lines).
Hunk #2 succeeded at 391 (offset 2 lines).
Hunk #3 succeeded at 681 (offset 2 lines).
patching file libavcodec/psymodel.h
by , 11 years ago
Attachment: | ffmpeg_aac320k_collapse3.flac added |
---|
A sound that degrades on FFmpeg native aac encoder. Euphoria - Yui Makino [VTCL-35073][06.4.26] Track04 Amefuribana(inst.) 2:45~2:55
comment:11 by , 11 years ago
I successfully applied the patch. klaussfreire's repository is in here. http://ffmpeg.org/pipermail/ffmpeg-devel/2013-May/143216.html
Or, you can use https://dl.dropboxusercontent.com/u/81238453/aac.patch (Thank you Takuan @K4095) to patch from current git head.
However, it still has a distinct bug: the sound partially disappears when the input is white-noise-like.
Bug #2706 was that the sound warbled when the input was a sine wave. This patch solves that, but it creates a new problem.
ffmpeg54292 -v 9 -loglevel 99 -filter_complex "aevalsrc=-0.5+random(0)" -c:a aac -strict experimental -ar 44100 -ac 2 -b:a 256k -t 4 "C:\Users\PCC\Documents\ABC-HR\whitenoise_256k.mp4"
ffmpeg version N-54292-g97947d9 Copyright (c) 2000-2013 the FFmpeg developers
  built on Jun 30 2013 20:34:13 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libfdk-aac --extra-ldflags=-static --extra-cflags='-march=nocona -mfpmath=sse' --optflags=-O2
  libavutil 52. 38.100 / 52. 38.100
  libavcodec 55. 18.100 / 55. 18.100
  libavformat 55. 10.100 / 55. 10.100
  libavdevice 55. 2.100 / 55. 2.100
  libavfilter 3. 77.101 / 3. 77.101
  libswscale 2. 3.100 / 2. 3.100
  libswresample 0. 17.102 / 0. 17.102
  libpostproc 52. 3.100 / 52. 3.100
Splitting the commandline.
Reading option '-v' ... matched as option 'v' (set logging level) with argument '9'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument '99'.
Reading option '-filter_complex' ... matched as option 'filter_complex' (create a complex filtergraph) with argument 'aevalsrc=-0.5+random(0)'.
Reading option '-c:a' ... matched as option 'c' (codec name) with argument 'aac'.
Reading option '-strict' ... matched as AVOption 'strict' with argument 'experimental'.
Reading option '-ar' ... matched as option 'ar' (set audio sampling rate (in Hz)) with argument '44100'.
Reading option '-ac' ... matched as option 'ac' (set number of audio channels) with argument '2'.
Reading option '-b:a' ... matched as option 'b' (video bitrate (please use -b:v)) with argument '256k'.
Reading option '-t' ... matched as option 't' (record or transcode "duration" seconds of audio/video) with argument '4'.
Reading option 'C:\Users\PCC\Documents\ABC-HR\whitenoise_256k.mp4' ... matched as output file.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 9.
Applying option filter_complex (create a complex filtergraph) with argument aevalsrc=-0.5+random(0).
Successfully parsed a group of options.
Parsing a group of options: output file C:\Users\PCC\Documents\ABC-HR\whitenoise_256k.mp4.
Applying option c:a (codec name) with argument aac.
Applying option ar (set audio sampling rate (in Hz)) with argument 44100.
Applying option ac (set number of audio channels) with argument 2.
Applying option b:a (video bitrate (please use -b:v)) with argument 256k.
Applying option t (record or transcode "duration" seconds of audio/video) with argument 4.
Successfully parsed a group of options.
Opening an output file: C:\Users\PCC\Documents\ABC-HR\whitenoise_256k.mp4.
detected 8 logical cores
[Parsed_aevalsrc_0 @ 0140bea0] compat: called with args=[-0.5+random(0)]
[Parsed_aevalsrc_0 @ 0140bea0] Setting 'exprs' to value '-0.5+random(0)'
[audio format for output stream 0:0 @ 01412880] Setting 'sample_fmts' to value 'fltp'
[audio format for output stream 0:0 @ 01412880] Setting 'sample_rates' to value '44100'
[audio format for output stream 0:0 @ 01412880] Setting 'channel_layouts' to value '0x3'
Successfully opened the file.
[audio format for output stream 0:0 @ 01412880] auto-inserting filter 'auto-inserted resampler 0' between the filter 'Parsed_aevalsrc_0' and the filter 'audio format for output stream 0:0'
[AVFilterGraph @ 0039f3c0] query_formats: 3 queried, 6 merged, 3 already done, 0 delayed
[Parsed_aevalsrc_0 @ 0140bea0] sample_rate:44100 chlayout:mono duration:-1.000000
[auto-inserted resampler 0 @ 0039f2a0] [SWR @ 00393160] Using double precision mode
0.707107 0.707107
[auto-inserted resampler 0 @ 0039f2a0] ch:1 chl:mono fmt:dblp r:44100Hz -> ch:2 chl:stereo fmt:fltp r:44100Hz
Output #0, mp4, to 'C:\Users\PCC\Documents\ABC-HR\whitenoise_256k.mp4':
  Metadata:
    encoder : Lavf55.10.100
    Stream #0:0, 0, 1/44100: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 256 kb/s
Stream mapping:
  aevalsrc -> Stream #0:0 (aac)
Press [q] to stop, [?] for help
No more output streams to write to, finishing.
size= 141kB time=00:00:04.01 bitrate= 288.4kbits/s
video:0kB audio:140kB subtitle:0 global headers:0kB muxing overhead 1.001409%
0 frames successfully decoded, 0 decoding errors
[AVIOContext @ 0141b640] Statistics: 30 seeks, 197 writeouts
The output mp4 I'm going to post sounds nothing like white noise.
by , 11 years ago
Attachment: | whitenoise_256k.mp4 added |
---|
White noise, encoded by native aac encoder at 256kbps. The sound is obviously collapsed.
comment:12 by , 11 years ago
Another bug typically happens when hi-hats are present: the sound disappears for about 20ms.
It is short, but still audible, and it sounds like an annoying pulse.
When these problems are solved, I'm going to conduct an extensive blind listening test, to assess sound quality of AAC encoders available from FFmpeg.
comment:13 by , 11 years ago
follow-up: 15 comment:14 by , 11 years ago
Sorry, I expected to get email notifications, but got none.
That bug is probably a ratecontrol bug I thought I had eradicated. I'll try to test with white noise, but just in case the exact input matters, can you attach a flac version?
comment:15 by , 11 years ago
Replying to klaussfreire:
Sorry, I expected to get email notifications, but got none.
You will get them if you add yourself to CC.
follow-up: 19 comment:16 by , 11 years ago
Cc: | added |
---|
comment:17 by , 11 years ago
In aacenc.c, replacing
s->lambda *= ratio;
with
s->lambda *= sqrtf(sqrtf(ratio));
fixes the white noise problem, so it is indeed a rate control (RC) mess-up.
But that brings some other trouble with more normal signals, so I guess I'll have to play with RC a little bit more.
by , 11 years ago
Attachment: | Whitenoise.flac added |
---|
White noise, created by SoundEngine Free ver.4.59. Using aevalsrc as in comment:11 does the same job.
comment:19 by , 11 years ago
Replying to klaussfreire:
You may also want to look at ticket #2706.
(Is it a duplicate of this ticket?)
comment:20 by , 11 years ago
Replying to klaussfreire:
I think AAC's ratecontrol needs a lookahead buffer.
Can you implement the feature by July 13th?
I'm going to be free and have time to do some double-blind listening tests of the codec.
Results will be like this: http://www.hydrogenaudio.org/forums/index.php?showtopic=100896
follow-up: 22 comment:21 by , 11 years ago
Maybe a very simple one-block one. I've been thinking such a simple lookahead might be enough to fix the bugs, with a better one perhaps for a further patch.
I'll give this high priority, but we're only 3 days away from that deadline you know...
comment:22 by , 11 years ago
Replying to klaussfreire:
Maybe a very simple one-block one. I've been thinking such a simple lookahead might be enough to fix the bugs, with a better one perhaps for a further patch.
I'll give this high priority, but we're only 3 days away from that deadline you know...
Thank you very much! A delay of some days is acceptable.
follow-up: 24 comment:23 by , 11 years ago
Alright, attaching another version. This seems to work better, but it's a bit rushed. I'll try to improve on it, but if I delay, feel free to test this version.
by , 11 years ago
Attachment: | aac-improvements-wip-v2-rclookahead.patch added |
---|
Second version of AAC improvements, with improvements on rate control, hopefully gets rid of all remaining "collapsations on high bit rates". Tested various music tracks on 64k, 128k, 256k and 384k.
comment:24 by , 11 years ago
Replying to klaussfreire:
Alright, attaching another version.
The patch does not apply here to current git head.
comment:25 by , 11 years ago
The patch does not apply for me either. I read http://ffmpeg.org/pipermail/ffmpeg-devel/2013-May/143216.html and http://ffmpeg.org/pipermail/ffmpeg-devel/2013-May/143222.html and guessed what I should do, but it still fails.
by , 11 years ago
Attachment: | aac-improvements-wip-v2-rclookahead.2.patch added |
---|
Second version of AAC improvements, with improvements on rate control, hopefully gets rid of all remaining "collapsations on high bit rates". Tested various music tracks on 64k, 128k, 256k and 384k.
comment:26 by , 11 years ago
Yes, sorry, I'm not working on a clean checkout.
I should move to a clean checkout.
There I attached a rebased patch.
comment:27 by , 11 years ago
Very good one! The only serious artifact I've heard so far is whitenoise.flac at 8, 16, 24, 32kbps and 192kbps.
follow-up: 29 comment:28 by , 11 years ago
Whitenoise.flac at 384kbps and ffmpeg_aac320k_collapse.flac at 320kbps sound strange, too.
comment:29 by , 11 years ago
Replying to Kamedo2:
Whitenoise.flac at 384kbps and ffmpeg_aac320k_collapse.flac at 320kbps sound strange, too.
I didn't try the collapse samples at 320k, though I tried them at 384k and they sounded nice. I'll try again when I have a chance.
However, whitenoise at 384k gives me an error; it seems 384kbps is too much for mono. The white noise I mentioned is generated with the random generator. I'll try with the flac the first chance I get.
comment:30 by , 11 years ago
Isn't the lower spreading function applied too strongly? The quality of the lower frequencies is bad when a higher-frequency bin is strong. And what makes 320kbps particularly bad? The quality degrades precisely when we have more than enough ('overkill') bits. I think something fatal is happening, such as an integer overflow.
by , 11 years ago
Attachment: | ffmpeg_aac320k_collapse4.flac added |
---|
A sound that degrades on FFmpeg native aac encoder.
comment:31 by , 11 years ago
Isn't line 334 of libavcodec/aacpsy.c:
    for (g = 0; g < ctx->num_bands[j]-1; g++) {
        AacPsyCoeffs *coeff = &coeffs[g];
        float bark_width = coeffs[g+1].barks - coeffs->barks;
        coeff->spread_low[0] = pow(10.0, -bark_width * PSY_3GPP_THR_SPREAD_LOW);
        coeff->spread_hi [0] = pow(10.0, -bark_width * PSY_3GPP_THR_SPREAD_HI);
        coeff->spread_low[1] = pow(10.0, -bark_width * en_spread_low);
        coeff->spread_hi [1] = pow(10.0, -bark_width * en_spread_hi);
        pe_min = bark_pe * bark_width;
        minsnr = exp2(pe_min / band_sizes[g]) - 1.5f;
        coeff->min_snr = av_clipf(1.0f / minsnr, PSY_SNR_25DB, PSY_SNR_1DB);
    }
strange? I doubt the sanity of the lower spreading function at the highest band, because the -cutoff 18000 option improves the quality on the problematic samples, and these samples always include strong 20-22kHz content. (The default cutoff is 18k at 192kbps, 20k at 256kbps, and 22k at 320kbps.)
by , 11 years ago
Attachment: | 18.6_22kHz_noise.flac added |
---|
Partial white noise, clipped by 256th-order lanczos function, to include only signals between 18.6 and 22kHz. the signal wanders around the freq.
comment:32 by , 11 years ago
I've got it. When the native AAC encoder calculates a masking curve, almost inaudible sounds at 18kHz, 20kHz or 22kHz are taken into account, and audible sound at, say, 14kHz ends up masked by the inaudible ones. Add the inaudible noise attached above to a source sound and the encoded sound will be significantly degraded. I recommend that psychoacoustic engines disregard any signal above 16kHz.
comment:33 by , 11 years ago
Alright. Good catch.
I'd recommend not ignoring, because masking within that band will still be important for bit allocation purposes. Rather, back-spreading rolloff (towards the lower frequencies) should be tweaked a bit.
comment:34 by , 11 years ago
Things start to make sense.
Could you tweak the back-spreading and provide the patch for me? I'd like to test that.
by , 11 years ago
Attachment: | ffmpeg_aac320k_collapse5.flac added |
---|
A sound that degrades on FFmpeg native aac encoder.
comment:36 by , 11 years ago
-cutoff 18000 seems to work, but the lowpass filter is too gentle compared to many practical encoders. libavcodec/psymodel.c has the constant FILT_ORDER, and changing the order from 4 to 8 sharpens the filter. But orders 12 and 16 fail somehow.
comment:37 by , 11 years ago
I hope you're testing with good headphones. HF quality is hard to gauge with speakers, especially since good speakers cost a fortune.
follow-up: 40 comment:39 by , 11 years ago
Replying to Kamedo2:
Yes, I'm testing with good headphones.
The reason I mention this is because, from my experience, FAAC tends to have a low cutoff for some bitrates, that seem optimal with speakers, but sound noticeably dull with headphones.
comment:40 by , 11 years ago
Replying to klaussfreire:
The reason I mention this is because, from my experience, FAAC tends to have a low cutoff for some bitrates, that seem optimal with speakers, but sound noticeably dull with headphones.
Exactly. The FAAC cutoff is annoyingly low at 96kbps, 64kbps, and 32kbps, and the filter is the major reason why FAAC never beats Nero.
BTW, any prospects for fixing samples 1, 4, 5, and white noise? Samples 4 and 5 are bad at 320kbps and whitenoise.flac is bad at 384kbps. Both regain quality with -cutoff 18000.
comment:41 by , 11 years ago
Changing line 300 from:
const int chan_bitrate = ctx->avctx->bit_rate / ((ctx->avctx->flags & CODEC_FLAG_QSCALE) ? 2.0f : ctx->avctx->channels);
to:
const int chan_bitrate = FFMIN(ctx->avctx->bit_rate, 240000) / ((ctx->avctx->flags & CODEC_FLAG_QSCALE) ? 2.0 : ctx->avctx->channels);
significantly improves the quality. Output bitrates remain relatively high with this change.
I have not tested all cases, but it works at 256kbps, 320kbps, and 384kbps on many sounds.
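For clarity, the capped expression can be written as a stand-alone function. This is only a sketch: the function name and the explicit `cap` and `qscale` parameters are hypothetical, where `qscale` stands for the CODEC_FLAG_QSCALE test in the original line.

```c
#include <assert.h>

/* Per-channel bitrate handed to the psy model, with the total capped first. */
static int chan_bitrate_capped(int bit_rate, int channels, int qscale, int cap)
{
    assert(channels > 0);
    int total = bit_rate < cap ? bit_rate : cap;
    /* In qscale (VBR) mode the expression divides by 2 regardless of layout. */
    return (int)(total / (qscale ? 2.0f : (float)channels));
}
```

With a cap of 240000, a 320k stereo encode gives the psy model 120000 per channel instead of 160000, while a 192k stereo encode is unaffected at 96000.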
comment:42 by , 11 years ago
I've listened to over 100 samples of diverse music and speech records. No problem so far. It works on 96, 112, 128,... 256kbps, but hangs on 288kbps.
follow-up: 44 comment:43 by , 11 years ago
Yeah, but that's because you're capping psy's bitrate target to non-problematic rates. I don't think that's ideal, though it does prove the problem lies in psy.
comment:44 by , 11 years ago
Replying to klaussfreire:
Yeah, but that's because you're capping psy's bitrate target to non-problematic rates. I don't think that's ideal, though it does prove the problem lies in psy.
Rates go up even after capping. So it's not merely a cap. I think we're close to the solution.
comment:45 by , 11 years ago
They go up because twoloop will push all scalefactors down uniformly until it achieves the desired bitrate, but:
- It won't work with VBR, VBR almost wholly depends on psy to dictate scalefactor band noise floors. Twoloop will push scalefactors down a bit more I think but not much at those high bitrates
- It's still suboptimal, it's better to let psy decide, since psy understands perceptual entropy better
Sadly, I didn't have time today to work on it. Let's hope I can do so tomorrow. With your analysis I'm confident I can patch psy without having to cap anything.
comment:47 by , 11 years ago
Reading the specs right now. I had a hunch that the spec might say something about this.
comment:48 by , 11 years ago
There. Line 308:
pctx->frame_bits = chan_bitrate * AAC_BLOCK_SIZE_LONG / ctx->avctx->sample_rate;
Must be
pctx->frame_bits = FFMIN(3000, chan_bitrate * AAC_BLOCK_SIZE_LONG / ctx->avctx->sample_rate);
That is indeed stated in the spec.
Step 15 of subpart 4, "Steps in threshold calculation", says that bit allocation is then limited to 0 < bit_allocation < 3000. It seems they thought of it all.
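A minimal sketch of the clamp, written as a function so the numbers can be checked. The function name and the `max_bits` parameter are hypothetical, and AAC_BLOCK_SIZE_LONG is assumed to be 1024 samples.

```c
#include <assert.h>
#include <stdint.h>

#define AAC_BLOCK_SIZE_LONG 1024 /* assumed long-block size in samples */

/* Bits available to one channel per long frame, limited to max_bits. */
static int psy_frame_bits(int chan_bitrate, int sample_rate, int max_bits)
{
    assert(sample_rate > 0);
    int64_t bits = (int64_t)chan_bitrate * AAC_BLOCK_SIZE_LONG / sample_rate;
    return bits < max_bits ? (int)bits : max_bits;
}
```

At 44.1kHz, 3000 bits per frame corresponds to about 129kbps per channel (3000 * 44100 / 1024), so under these assumptions the clamp only engages above roughly 258kbps stereo. That is in the same region where the collapse was reported.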
comment:49 by , 11 years ago
Great! I'll have time to test that improvement in about 5 hours, and I'm going to test it extensively. I also think I have to look for ways to sharpen the LPF by using a higher order, at the cost of more computation time. Currently the cutoff is not very sharp.
comment:50 by , 11 years ago
2560 (the number you found) works better for us though. That's certainly related to some deficiency in twoloop, but hey. Let's just document that this should be 3000 but can't be, and be done.
comment:51 by , 11 years ago
The LPF could be accomplished by zeroing the coefficients in the FFT. To get the lowest possible ripple, the boundary coefficient needs some care, but AFAIR it's the best method, and it's free for something that's already doing FFT.
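A sketch of the idea, under the assumption that the frame is already available as `n` spectral coefficients covering 0 to sample_rate/2 uniformly. The function name is hypothetical, and this is a plain brick-wall version: it does not apply the extra care at the boundary coefficient that the comment above mentions.

```c
#include <assert.h>
#include <stddef.h>

/* Zero every coefficient whose bin starts at or above cutoff_hz.
 * Bin k is assumed to cover [k, k + 1) * sample_rate / (2 * n) Hz.
 * Returns the index of the first zeroed bin (n if nothing was zeroed). */
static size_t lowpass_zero_bins(float *coef, size_t n, int sample_rate, int cutoff_hz)
{
    assert(coef && sample_rate > 0 && cutoff_hz >= 0);
    unsigned long long num = 2ULL * n * (unsigned long long)cutoff_hz;
    unsigned long long sr = (unsigned long long)sample_rate;
    size_t first = (size_t)((num + sr - 1) / sr); /* round up */
    if (first > n)
        first = n;
    for (size_t k = first; k < n; k++)
        coef[k] = 0.0f;
    return first;
}
```

For 1024 coefficients at 44.1kHz and an 18kHz cutoff, the first zeroed bin is 836, since 18000 * 2048 / 44100 is about 835.9.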
comment:52 by , 11 years ago
It's not a regression, but the surround bitrate seems to be capped and does not change with -b:a 256k, 320k, 384k.
A surround sample file is here: http://people.xiph.org/~xiphmont/demo/opus/demo3.shtml
I'm currently using pctx->frame_bits = FFMIN(3000,...
No obvious bugs so far.
comment:53 by , 11 years ago
I used pctx->frame_bits = FFMIN(2560, and psymodel.h line 32:
#define AAC_CUTOFF(s) (s->bit_rate ? FFMIN3(FFMIN3(s->bit_rate/s->channels/2, 4000 + s->bit_rate/s->channels/4, 12000 + s->bit_rate/s->channels/16), 20000, s->sample_rate / 2): (s->sample_rate / 2))
This is better on mono, on surround, and at very low bitrates (such as 32kbps stereo).
truncut.wav has little HF content, so the bitrate saturates at 172kbps.
comment:54 by , 11 years ago
In 4 hours of listening to more than 100 musical, vocal, ambient and artificial sounds, at 64-480kbps, 44.1kHz and 48kHz, stereo and surround, I have found no problematic samples. This solution is great. Thank you for fixing this, klaussfreire.
Tomorrow I plan to test mono, 32kHz or lower sample rates, and the VBR modes, and to collect more surround samples to test.
comment:55 by , 11 years ago
comment:56 by , 11 years ago
Should I use ffmpeg_g to spot the bug? I am now encoding thousands of diverse sound files to check that the encoder does not freeze or fail.
follow-up: 60 comment:57 by , 11 years ago
Recommended cutoff frequency for FFmpeg AAC.
psymodel.h line 32:
#define AAC_CUTOFF(s) (s->bit_rate ? FFMIN3(FFMIN3(s->bit_rate/s->channels/2, 3000 + s->bit_rate/s->channels/4, 12000 + s->bit_rate/s->channels/16), 20000, s->sample_rate / 2): (s->sample_rate / 2))
The LPF is not applied in VBR now, resulting in noticeably poor quality.
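To see what cutoffs the recommended macro yields, here it is rewritten as a plain function. This is a sketch with hypothetical function names. The arithmetic follows the macro term by term, using the same integer divisions.

```c
#include <assert.h>

static int min_int(int a, int b)
{
    return a < b ? a : b;
}

/* Function form of the recommended CBR/ABR cutoff, in Hz. */
static int aac_cutoff_cbr(int bit_rate, int channels, int sample_rate)
{
    assert(channels > 0);
    if (!bit_rate)
        return sample_rate / 2;
    int per_ch = bit_rate / channels;
    int c = min_int(per_ch / 2,
                    min_int(3000 + per_ch / 4, 12000 + per_ch / 16));
    return min_int(c, min_int(20000, sample_rate / 2));
}
```

For 44.1kHz stereo this gives 7000 Hz at 32kbps, 11000 Hz at 64kbps, 16000 Hz at 128kbps, 18000 Hz at 192kbps, and 20000 Hz from 256kbps upward.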
comment:58 by , 11 years ago
songs: 5 min snippets of pops and jazz, 44.1kHz, stereo
non-music sounds: 16 min of artificial sounds, difficult samples, speech, etc, 48kHz, stereo
| LAME equivalent | Bitrate (kbps) | VBR number |
|---|---|---|
| | 16 | 0.029 |
| -V9.9 | 32 | 0.053 |
| | 48 | 0.097 |
| -V9 | 64 | 0.23 |
| -V8 | 80 | 0.43 |
| -V7 | 96 | 0.55 |
| -V6 | 112 | 0.66 |
| -V5 | 128 | 0.86 |
| | 144 | 1.06 |
| -V4 | 160 | 1.17 |
| -V3 | 176 | 1.29 |
| -V2 | 192 | 1.43 |
| -V1 | 224 | 2.2 |
| -V0 | 256 | 4.3 |
| | 288 | 6.2 |
| | 320 | 7 |
| | 352 | 7.7 |
| | 384 | 10 |
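If one wanted to pick a VBR number for an arbitrary target bitrate from these measurements, a simple table lookup with linear interpolation would do. The sketch below is only an illustration: the function and array names are hypothetical, the interpolation is my assumption and not something the encoder does, and the points are just the values measured above on these particular test sets.

```c
#include <assert.h>
#include <stddef.h>

/* Measured (stereo bitrate in kbps, VBR number) pairs from the table above. */
static const struct { int kbps; double q; } vbr_points[] = {
    { 16, 0.029}, { 32, 0.053}, { 48, 0.097}, { 64, 0.23}, { 80, 0.43},
    { 96, 0.55},  {112, 0.66},  {128, 0.86},  {144, 1.06}, {160, 1.17},
    {176, 1.29},  {192, 1.43},  {224, 2.2},   {256, 4.3},  {288, 6.2},
    {320, 7.0},   {352, 7.7},   {384, 10.0},
};

/* Linearly interpolate a VBR number for a target bitrate, clamped to the table. */
static double q_for_bitrate(int kbps)
{
    size_t n = sizeof vbr_points / sizeof vbr_points[0];
    assert(kbps > 0);
    if (kbps <= vbr_points[0].kbps)
        return vbr_points[0].q;
    for (size_t i = 1; i < n; i++) {
        if (kbps <= vbr_points[i].kbps) {
            double span = vbr_points[i].kbps - vbr_points[i - 1].kbps;
            double t = (kbps - vbr_points[i - 1].kbps) / span;
            return vbr_points[i - 1].q + t * (vbr_points[i].q - vbr_points[i - 1].q);
        }
    }
    return vbr_points[n - 1].q;
}
```

For example, a 240kbps target falls halfway between the 224 and 256 rows, which gives a value of about 3.25.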
comment:59 by , 11 years ago
How about the subjective quality of the various VBR modes, as compared to CBR (actually ABR, since a CBR setting in AAC produces ABR)?
I worked hard to get good results, but there are still problematic samples that sound better with the equivalent ABR setting than with VBR.
follow-ups: 61 69 comment:60 by , 11 years ago
Replying to Kamedo2:
psymodel.h line 32:
#define AAC_CUTOFF(s) (s->bit_rate ? FFMIN3(FFMIN3(s->bit_rate/s->channels/2, 3000 + s->bit_rate/s->channels/4, 12000 + s->bit_rate/s->channels/16), 20000, s->sample_rate / 2): (s->sample_rate / 2))
The LPF is not applied in VBR now, resulting in noticeably poor quality.
Try this cutoff:
#define _AAC_CUTOFF(bit_rate,channels,sample_rate) (bit_rate ? FFMIN3(FFMIN3( \
    bit_rate/channels, \
    3000 + bit_rate/channels/2, \
    16000 + bit_rate/channels/8), \
    20000, \
    sample_rate / 2): (sample_rate / 2))
#define AAC_CUTOFF(s) ( \
    (s->flags & CODEC_FLAG_QSCALE) \
    ? _AAC_CUTOFF(s->bit_rate, s->channels, s->sample_rate) \
    : _AAC_CUTOFF((int)(s->bit_rate * (s->global_quality ? s->global_quality : 120) / 120.0), 2, s->sample_rate) \
)
I find it works better; the other one was pretty dull for 64k/ch, which ought to be transparent for AAC. This one also works on VBR.
by , 11 years ago
Attachment: | ffmpeg_aacvbr_pulse1.flac added |
---|
Sound disappears for about 20ms in VBR mode -q:a 5, -q:a 10. Sounds like an annoying pulse.
comment:61 by , 11 years ago
Replying to klaussfreire:
Try this cutoff:
#define _AAC_CUTOFF(bit_rate,channels,sample_rate) (bit_rate ? FFMIN3(FFMIN3( \
    bit_rate/channels, \
    3000 + bit_rate/channels/2, \
    16000 + bit_rate/channels/8), \
    20000, \
    sample_rate / 2): (sample_rate / 2))
#define AAC_CUTOFF(s) ( \
    (s->flags & CODEC_FLAG_QSCALE) \
    ? _AAC_CUTOFF(s->bit_rate, s->channels, s->sample_rate) \
    : _AAC_CUTOFF((int)(s->bit_rate * (s->global_quality ? s->global_quality : 120) / 120.0), 2, s->sample_rate) \
)
I tried, but isn't this cutoff strange? It sounds like the lowpass is always 20kHz.
The problem of ffmpeg_aacvbr_pulse1.flac is solved by this.
I'm using current git head 54813 + aac-improvements-wip-v2-rclookahead.2.patch + aacpsy.c Line 308
pctx->frame_bits = FFMIN(2560, chan_bitrate * AAC_BLOCK_SIZE_LONG / ctx->avctx->sample_rate);
comment:62 by , 11 years ago
LOL, sorry, the VBR condition is backwards. An old idiocy of mine, I always reverse if conditions. Kinda like coding dyslexia.
It should be
#define _AAC_CUTOFF(bit_rate,channels,sample_rate) (bit_rate ? FFMIN3(FFMIN3( \
    bit_rate/channels, \
    3000 + bit_rate/channels/2, \
    12000 + bit_rate/channels/8), \
    20000, \
    sample_rate / 2): (sample_rate / 2))
#define AAC_CUTOFF(s) ( \
    (s->flags & CODEC_FLAG_QSCALE) \
    ? _AAC_CUTOFF((int)(s->bit_rate * (s->global_quality ? s->global_quality : 120) / 120.0), 2, s->sample_rate) \
    : _AAC_CUTOFF(s->bit_rate, s->channels, s->sample_rate) \
)
Though I'm getting some weird results with very low quality settings.
follow-up: 65 comment:63 by , 11 years ago
Aren't you trying to access s->bit_rate when it's VBR? Or am I missing something?
follow-up: 66 comment:64 by , 11 years ago
Is s->global_quality different from VBR number -q:a x?
| LAME equivalent | Stereo Bitrate (kbps) | VBR number | Recommended cutoff (Hz) |
|---|---|---|---|
| | 16 | 0.029 | 4000 |
| -V9.9 | 32 | 0.053 | 7000 |
| | 48 | 0.097 | 9000 |
| -V9 | 64 | 0.23 | 11000 |
| -V8 | 80 | 0.43 | 13000 |
| -V7 | 96 | 0.55 | 15000 |
| -V6 | 112 | 0.66 | 15500 |
| -V5 | 128 | 0.86 | 16000 |
| | 144 | 1.06 | 16500 |
| -V4 | 160 | 1.17 | 17000 |
| -V3 | 176 | 1.29 | 17500 |
| -V2 | 192 | 1.43 | 18000 |
| -V1 | 224 | 2.2 | 19000 |
| -V0 | 256 | 4.3 | 20000 |
| | 288 | 6.2 | 20000 |
| | 320 | 7 | 20000 |
| | 352 | 7.7 | 20000 |
| | 384 | 10 | 20000 |
follow-up: 68 comment:65 by , 11 years ago
Replying to Kamedo2:
Aren't you trying to access s->bit_rate when it's VBR? Or am I missing something?
Yes, bit_rate in that case holds the default of 128kbps. Psy does the same, but it works well since that's considered to be AAC's transparent rate. So, for VBR, you make psy work at transparent settings, and compensate bit allocation based on RD scaling.
comment:66 by , 11 years ago
comment:67 by , 11 years ago
I think I finally got VBR to talk to psy.
It's looking good. I'll post an updated patch with all this in a while (still lots of tests to perform)
comment:68 by , 11 years ago
Replying to klaussfreire:
Yes, bit_rate in that case holds the default of 128kbps. Psy does the same, but it works well since that's considered to be AAC's transparent rate.
AAC is not transparent at 128kbps stereo, although Apple used to advertise it that way. http://d.hatena.ne.jp/kamedo2/20111029/1319840519
comment:69 by , 11 years ago
Replying to klaussfreire:
#define _AAC_CUTOFF(bit_rate,channels,sample_rate) (bit_rate ? FFMIN3(FFMIN3( \
    bit_rate/channels, \
    3000 + bit_rate/channels/2, \
    16000 + bit_rate/channels/8), \
    20000, \
    sample_rate / 2): (sample_rate / 2))
#define AAC_CUTOFF(s) ( \
    (s->flags & CODEC_FLAG_QSCALE) \
    ? _AAC_CUTOFF(s->bit_rate, s->channels, s->sample_rate) \
    : _AAC_CUTOFF((int)(s->bit_rate * (s->global_quality ? s->global_quality : 120) / 120.0), 2, s->sample_rate) \
)
I find it works better, the other one was pretty dull for 64k/ch, which ought to be transparent for AAC. This one also works on VBR.
The high cutoff causes trouble for whitenoise.flac below 55kbps.
And I'm almost certain 16kHz is optimal at 128kbps stereo.
http://d.hatena.ne.jp/kamedo2/20120221/1329845124
http://d.hatena.ne.jp/kamedo2/20120729/1343545890
comment:70 by , 11 years ago
I recommend psymodel.h line 24 to be:
#include "libavutil/libm.h"
#include "avcodec.h"

/** maximum possible number of bands */
#define PSY_MAX_BANDS 128
/** maximum number of channels */
#define PSY_MAX_CHANS 24

#define _AAC_CUTOFF(bit_rate,channels,sample_rate) (bit_rate ? FFMIN3(FFMIN3( \
    bit_rate/channels/2, \
    3000 + bit_rate/channels/4, \
    12000 + bit_rate/channels/16), \
    20000, \
    sample_rate / 2): (sample_rate / 2))
#define AAC_CUTOFF(s) ( \
    (s->flags & CODEC_FLAG_QSCALE) \
    ? _AAC_CUTOFF(((int)(135000.0f*sqrtf(s->global_quality ? s->global_quality/120.0f : 1.0f))), 2, s->sample_rate) \
    : _AAC_CUTOFF(s->bit_rate, s->channels, s->sample_rate) \
)
In this way, I can set cutoff to VBR modes as well.
PSY_MAX_CHANS 24 is to accommodate NHK 22.2ch.
I notice that in -q:a 0.2 and -q:a 0.4, the lower freq is in trouble. It sounds like a thunder far away.
comment:71 by , 11 years ago
Yes, I'm fixing the lower frequencies right now. It's an issue with tonal band prioritization, which doesn't really work as intended in VBR. I'm preparing a better patch now. I'll test your cutoffs.
comment:72 by , 11 years ago
After applying the new LPF from comment:70, the resulting bitrate for music changed a bit, so I think I have to replot the graph. One more problem: -q:a 0.029 or -q:a 10 is unfriendly for an average user. I think the value should be roughly equivalent to LAME's. I mean, if one uses -q:a 2, the result for average material should be roughly 96kbps/channel, which is the same behavior as LAME -V2. Is applying the new LPF method from comment:51 easy?
comment:73 by , 11 years ago
After two days of toying around, I've found that the Butterworth filter used in psy is actually counterproductive. Keeping all things equal, lowering the cutoff actually increases bitrate if a fixed RD is forced. So, for VBR, it's a no-no.
I'm trying an FFT-based LP by simply zeroing coeffs, with care at the boundary to minimize ripple, and it seems to work a lot better, at least for VBR.
Right now, the implementation is just a POC. It's very dirty. But I'm getting convinced this is the way for VBR... and maybe for ABR too. I'm not sure.
Edit: And, to boot, an FFT is phase-linear. I can actually hear group delay with the Butterworth. Ugly.
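The "zero the coefficients, with care at the boundary" idea can be sketched as follows. This is not the code from the patch: `mdct_lowpass` and its parameters are hypothetical, the linear ramp is just one simple way to avoid a hard spectral edge, and the coefficients are assumed to cover 0 to sample_rate/2 evenly.

```c
/*
 * Zero the coefficients from the cutoff upward, and fade out the last
 * ramp_bins coefficients below it so the spectrum does not end in a step.
 * coefs holds n coefficients covering 0 .. sample_rate/2.
 */
void mdct_lowpass(float *coefs, int n, int sample_rate,
                  int cutoff_hz, int ramp_bins)
{
    /* index of the bin that contains cutoff_hz; it and all above are zeroed */
    int stop = (int)((long long)cutoff_hz * 2 * n / sample_rate);
    int start, k;

    if (stop > n)
        stop = n;
    if (stop < 0)
        stop = 0;
    start = stop - ramp_bins;
    if (start < 0)
        start = 0;

    /* linear fade: gain drops step by step as k approaches stop */
    for (k = start; k < stop; k++)
        coefs[k] *= (float)(stop - k) / (float)(stop - start + 1);

    /* hard zero from the cutoff bin upward */
    for (k = stop; k < n; k++)
        coefs[k] = 0.0f;
}
```

Because the gain is applied per coefficient inside the transform domain, no time-domain filter state is involved, which is the property being contrasted with the Butterworth above.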
comment:74 by , 11 years ago
Is that FFT, not MDCT?
I'm guessing that lowering the cutoff increasing the bitrate is the effect described in comment:32. It's very strange, as HF content usually takes up more bits, but it makes sense.
comment:75 by , 11 years ago
You're right, the one I have done right now is MDCT, because it's done within the bit allocator. But I've been meaning to implement an actual FFT filter later on, if not too hard, and if the technique pans out.
comment:76 by , 11 years ago
Thing is, the butterworth doesn't really remove that much content, and it changes the masking thresholds in a way that actually requires more bits to encode. A higher-order butterworth might work, but it would have way too much group delay.
comment:77 by , 11 years ago
BTW, wait before you redo that graph, I have a much better VBR patch almost ready.
comment:78 by , 11 years ago
Alright, i'm attaching a new VBR patch. CBR/ABR shouldn't have changed (shouldn't, but might). I will probably want to apply the same logic to CBR/ABR as well, since it works very well (ie: cutoff not with a filter but with the bit allocator, stop spending bits on HF if we're starving for bits).
A heads-up: VBR's q-to-kbps curve has changed, and there's some artifacts that sound like scratchy noises (especially audible in the sine sample), that are due to clipping. I think it's not specific to this patch, but I just noticed it. I'm not sure how to attack it. Normally, I'd apply compression on the IMDCT stage, but since that's on the decoder side, I'll probably have to find a clever way to predict clipping on the encoder and compensate. Craptastic.
Anyway, I do think VBR has been greatly improved on this patch. Let me know what you think.
by , 11 years ago
Attachment: | aac-improvements-wip-v3-vbr.patch added |
---|
VBR improvements over wip-v2-rclookahead
comment:79 by , 11 years ago
I believe your latest patch contains trailing whitespace (that cannot be committed to FFmpeg git), consider running tools/patcheck over the diff.
comment:80 by , 11 years ago
I successfully applied the patch from latest git head N-54889-g47d57f2.
comment:81 by , 11 years ago
comment:82 by , 11 years ago
Yeah it seems to have an anomaly around 1. I had only tested whitenoise up to 0.7. I'll try to patch it up.
comment:83 by , 11 years ago
Ah, yeah, I know. It's probably the scaler offset. It must be unpredictable in whitenoise because of how flat the envelope is.
follow-up: 86 comment:84 by , 11 years ago
I don't recommend ambitiously trying to preserve the HF content above 18kHz when there are enough bits. It sounds unstable. Some early 1990s MP3 encoders used that tactic, but none of them were good. Rather, a clean, fixed LPF should be applied at all times. Avoid the situation where one can hear the 12-20kHz content in some parts of the music and a dull, 12kHz-LPF-like sound in other parts.
As for
pctx->frame_bits = FFMIN(2560, chan_bitrate * AAC_BLOCK_SIZE_LONG / ctx->avctx->sample_rate);
do we get more stable results when the number 2560 is lowered?
(240kbps is a 'megadose' or 'overkill' bitrate for AAC, so slight degradation is not a major problem.)
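For reference, the arithmetic behind the 2560 cap can be sketched like this. `frame_bits` and `cap_bitrate` are hypothetical helper names, and AAC_BLOCK_SIZE_LONG is assumed to be 1024 samples (the long block length mentioned elsewhere in this ticket).

```c
#define AAC_BLOCK_SIZE_LONG 1024   /* samples per long block (assumed) */

/* Per-channel bit budget for one frame, capped at 2560 bits as in the line above. */
int frame_bits(int chan_bitrate, int sample_rate)
{
    int bits = chan_bitrate * AAC_BLOCK_SIZE_LONG / sample_rate;
    return bits < 2560 ? bits : 2560;
}

/* Per-channel bitrate (bits/s) at which a cap of cap_bits per frame starts to bind. */
int cap_bitrate(int cap_bits, int sample_rate)
{
    return cap_bits * sample_rate / AAC_BLOCK_SIZE_LONG;
}
```

Under these assumptions the cap binds at 120000 bits/s per channel for 48 kHz input (240 kbps stereo) and at 110250 bits/s per channel for 44.1 kHz input.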
comment:85 by , 11 years ago
follow-up: 87 comment:86 by , 11 years ago
Replying to Kamedo2:
I don't recommend ambitiously trying to preserve the HF content above 18kHz when there are enough bits. It sounds unstable. Some early 1990s MP3 encoders used that tactic, but none of them were good. Rather, a clean, fixed LPF should be applied at all times. Avoid the situation where one can hear the 12-20kHz content in some parts of the music and a dull, 12kHz-LPF-like sound in other parts.
I just want to preserve the HF component of transients. There might be better ways of doing that. I guess I'll keep iterating on it. However, I believe the way it's being done now works well. If you check, the LP cutoff is chosen from the allocation given by psy. Psy contains bit reservoir logic, which means it will momentarily increase bits (and cutoff) for some difficult transients. Right now, it works wonders for hi-hats.
I will probably have to be stricter about the cutoff, though. As you say, when the signal by itself (not by psy's indication, but signal strength alone) suddenly jumps in HF content, the result is unpleasant. I think I have cleaned up most of those cases, but who knows. It's hard to discern those from actual transients.
As for
pctx->frame_bits = FFMIN(2560, chan_bitrate * AAC_BLOCK_SIZE_LONG / ctx->avctx->sample_rate);
do we get more stable results when the number 2560 is lowered?
(240kbps is a 'megadose' or 'overkill' bitrate for AAC, so slight degradation is not a major problem.)
If it doesn't limit the ability to increase allocation for transients, it might. I'll look into it.
follow-up: 88 comment:87 by , 11 years ago
Replying to klaussfreire:
I just want to preserve the HF component of transients. There might be better ways of doing that. I guess I'll keep iterating on it. However, I believe the way it's being done now works well. If you check, the LP cutoff is chosen from the allocation given by psy. Psy contains bit reservoir logic, which means it will momentarily increase bits (and cutoff) for some difficult transients. Right now, it works wonders for hi-hats.
So, if there is a group of beat sounds that is on the threshold of tonal/transients, the LPF is sometimes on and sometimes off? Currently, the on/off switch itself is audible and is quite annoying. It sounds like a stopwatch.
I will probably have to be stricter about the cutoff, though. As you say, when the signal by itself (not by psy's indication, but signal strength alone) suddenly jumps in HF content, the result is unpleasant. I think I have cleaned up most of those cases, but who knows. It's hard to discern those from actual transients.
ffmpeg_aacvbr_pulse1.flac at -q:a 0.25 produces strange HF sounds.
by , 11 years ago
Attachment: | ffmpeg_aacvbr_pulse2.flac added |
---|
Partial white noise, split by a 256th lanczos filter. An HF pulse noise that sounds like a stopwatch is added in VBR around -q:a 0.3
comment:88 by , 11 years ago
Replying to Kamedo2:
Replying to klaussfreire:
I just want to preserve the HF component of transients. There might be better ways of doing that. I guess I'll keep iterating on it. However, I believe the way it's being done now works well. If you check, the LP cutoff is chosen from the allocation given by psy. Psy contains bit reservoir logic, which means it will momentarily increase bits (and cutoff) for some difficult transients. Right now, it works wonders for hi-hats.
So, if there is a group of beat sounds that is on the threshold of tonal/transients, the LPF is sometimes on and sometimes off? Currently, the on/off switch itself is audible and is quite annoying. It sounds like a stopwatch.
No, the cutoff moves up and down, but the LP remains on.
I'll have to check the sample
follow-up: 90 comment:89 by , 11 years ago
You seem to be using the heuristic that transient HF components are loud and tonal HF components are quiet.
comment:90 by , 11 years ago
Replying to Kamedo2:
You seem to be using the heuristic that transient HF components are loud and tonal HF components are quiet.
No, I let psy detect the transients. The only heuristic, is that I attempt to encode a little bit more of the HF with decreased quality.
Ie, from 0-cutoff, normal quantization. From cutoff-cutoff * 1.2, coarse (progressively coarser in fact) quantization. Now, I let bit allocation zero out beyond 1.2. I may have to force it to avoid the artifacts you mention.
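The three zones described above can be sketched as a small classifier. This is illustrative only: `hf_zone` and `classify_band` are hypothetical names, and the HF_ZERO case models the forced zeroing beyond cutoff * 1.2 that is only being considered here, not what the patch currently does (it leaves that to bit allocation).

```c
enum hf_zone {
    HF_NORMAL,   /* below the cutoff: normal quantization */
    HF_COARSE,   /* cutoff .. cutoff * 1.2: progressively coarser quantization */
    HF_ZERO      /* beyond cutoff * 1.2: band is dropped */
};

/* Classify a band by its start frequency relative to the cutoff. */
enum hf_zone classify_band(int band_start_hz, int cutoff_hz)
{
    int upper = cutoff_hz + cutoff_hz / 5;   /* cutoff * 1.2 in integer math */

    if (band_start_hz < cutoff_hz)
        return HF_NORMAL;
    if (band_start_hz < upper)
        return HF_COARSE;
    return HF_ZERO;
}
```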
comment:91 by , 11 years ago
Seeing the spectrogram, sometimes, up to 22kHz is encoded. No way we can hear that high. However, because of your algorithm, the cutoff seems to be much higher than it actually is, and the sound is much clearer in typical cases. But we have to be careful of exceptions. I think I feel strange when the encoded_highest_sound - normal_cutoff is more than 3kHz. Sounds something like plip, plip. Is coarse quantization at cutoff~cutoff*1.2 applied only to transients?
comment:92 by , 11 years ago
No, that's applied to tonal signals as well. A way to squeeze a little extra bandwidth. It proved to be a winning move for music, though I didn't test that much with noise.
follow-up: 95 comment:93 by , 11 years ago
Is that included in a wip-v3-vbr.patch, or a new feature? It sounds like the extra HF content encode is only on transients. And some transients are indeed encoded up to 22kHz.
Are HF contents over cutoff*1.2 totally discarded? (I believe this is the best move.)
comment:94 by , 11 years ago
The LAME sometimes acts like your algorithm, but within 2kHz or so. It's related to -Y switch, and LAME sometimes encodes 16~18kHz contents.
comment:95 by , 11 years ago
Replying to Kamedo2:
Is that included in a wip-v3-vbr.patch
Yes
Are HF contents over cutoff*1.2 totally discarded? (I believe this is the best move.)
No, and maybe that's the problem. 1.2 just happens to be the point at which the increased quantization floor starts zeroing out all components. Until that, RD optimization brings down the quantization floor to maintain acceptable quality, so you don't notice the floor rising (and in fact it doesn't for fully tonal bands, that's what RD optimization is about, whereas it does rise for noisy ones).
So, in essence, up to cutoff * 1.2, tonal components are retained at the expense of HF noise, which seems like a sensible tradeoff.
What must be happening, is that, on some signals, the zeroing point happens above 1.2, significantly above. So it's perhaps wise to hardcode that 1.2 value, and force a zero on those bands instead.
comment:96 by , 11 years ago
I think we should hardcode min(cutoff+2500, cutoff*1.2). When cutoff is 18kHz, cutoff*1.2 is 21.6kHz which is too high. Could you provide the relation between -q:a value and cutoff so we can have better grasp on what's happening?
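The proposed limit is easy to check numerically. A minimal sketch, with `hf_limit` as a made-up name and cutoff * 1.2 computed as cutoff + cutoff/5 in integer math:

```c
/* Highest frequency (Hz) to encode: min(cutoff + 2500, cutoff * 1.2). */
int hf_limit(int cutoff_hz)
{
    int by_offset = cutoff_hz + 2500;
    int by_ratio  = cutoff_hz + cutoff_hz / 5;

    return by_offset < by_ratio ? by_offset : by_ratio;
}
```

The two terms cross at a cutoff of 12500 Hz: below that the 1.2 factor is the tighter bound, above it the +2500 Hz offset takes over, so an 18 kHz cutoff is limited to 20.5 kHz rather than 21.6 kHz.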
comment:97 by , 11 years ago
So, I tracked the anomaly near -q:a 1 to the ESC_BT codebook. It seems when noise floors are too low, the coefficients can't be properly encoded, and all kinds of bad things ensue. I'll see how to fix it.
comment:98 by , 11 years ago
I noticed that this new VBR encoder has zero delay. The ABR encoder at 64kbps stereo has a 1-sample delay, probably because of the lack of the Butterworth LPF.
comment:99 by , 11 years ago
That's why I want to get rid of the butterworth. It's good, but FFT is better, since it's phase-linear. With all the quantization noise I don't think we care that much about ripple, but even if we did, FFTs can be made to minimize it.
comment:100 by , 11 years ago
I think I can start the blind test from August 3rd. With the results, we can overwrite the outdated FFmpeg AAC Encoding Guide. https://trac.ffmpeg.org/wiki/AACEncodingGuide
comment:101 by , 11 years ago
Is the comment:97 fixable? I think it will contribute to higher quality in 160kbps and 192kbps. Currently, it is still worse than the mighty Apple AAC.
I assume most blocks are long(1024 samples) tonal blocks, and short, transient blocks are rare, that are apparently causing problems, am I right?
comment:102 by , 11 years ago
Yes, I have a fix in the works. That limitation is the reason the standard limits allocation to 3000 bits, most likely.
comment:103 by , 11 years ago
Isn't aaccoder.c line 787~795 strange? I believe somewhere making cutoff value or using cutoff value should be the source of the trouble, which causes weird sounds in low bitrates such as -q:a 0.25.
comment:104 by , 11 years ago
So, I tried a whole new approach, and it seems vastly superior.
I modified psy's "Rate control" to work differently for VBR. Instead of using the bit reservoir, it just computes the optimum PE and scales it by quality. And it works nicely. I still had to push scalers a bit more on the allocator and do the LP filtering to reach the very low bit rates with VBR, but it's sounding a lot better.
I'll do some more testing and then upload the updated patch.
comment:106 by , 11 years ago
I inserted
av_log(NULL, AV_LOG_DEBUG, "\n cutoff=%d, lambda=%f, frame_bit_rate=%d, bandwidth=%d\n",cutoff,lambda,frame_bit_rate,bandwidth);
in aaccoder.c twoloop line 795, and found that the cutoff differs between frames. I used -q:a 0.4, stereo 44.1kHz. I assume cutoffs below 99 are the short blocks and cutoffs above 500 are the long tonal blocks. The cutoff varies throughout the same piece of music: 11.7k~13.6k for the short blocks, 11.5k~13.2k for the long blocks. (Calculated from the 25 raw examples below.)
cutoff=77, lambda=47.000000, frame_bit_rate=46034, bandwidth=14508
cutoff=614, lambda=47.000000, frame_bit_rate=45648, bandwidth=14412
cutoff=76, lambda=47.000000, frame_bit_rate=45648, bandwidth=14412
cutoff=612, lambda=47.000000, frame_bit_rate=45417, bandwidth=14354
cutoff=76, lambda=47.000000, frame_bit_rate=45417, bandwidth=14354
cutoff=532, lambda=47.000000, frame_bit_rate=37937, bandwidth=12484
Last message repeated 1 times
cutoff=538, lambda=47.000000, frame_bit_rate=38477, bandwidth=12619
Last message repeated 1 times
size= 242kB time=00:00:15.80 bitrate= 125.2kbits/s
cutoff=68, lambda=47.000000, frame_bit_rate=39017, bandwidth=12754
cutoff=544, lambda=47.000000, frame_bit_rate=39017, bandwidth=12754
cutoff=548, lambda=47.000000, frame_bit_rate=39402, bandwidth=12850
Last message repeated 1 times
cutoff=551, lambda=47.000000, frame_bit_rate=39711, bandwidth=12927
Last message repeated 1 times
cutoff=554, lambda=47.000000, frame_bit_rate=39942, bandwidth=12985
Last message repeated 1 times
cutoff=69, lambda=47.000000, frame_bit_rate=40173, bandwidth=13043
cutoff=556, lambda=47.000000, frame_bit_rate=40173, bandwidth=13043
cutoff=69, lambda=47.000000, frame_bit_rate=40405, bandwidth=13101
cutoff=558, lambda=47.000000, frame_bit_rate=40405, bandwidth=13101
cutoff=561, lambda=47.000000, frame_bit_rate=40636, bandwidth=13159
Last message repeated 1 times
cutoff=562, lambda=47.000000, frame_bit_rate=40713, bandwidth=13178
Last message repeated 1 times
cutoff=71, lambda=47.000000, frame_bit_rate=41870, bandwidth=13467
cutoff=574, lambda=47.000000, frame_bit_rate=41870, bandwidth=13467
cutoff=79, lambda=47.000000, frame_bit_rate=47653, bandwidth=14913
Last message repeated 1 times
cutoff=78, lambda=47.000000, frame_bit_rate=46651, bandwidth=14662
Last message repeated 1 times
cutoff=76, lambda=47.000000, frame_bit_rate=45031, bandwidth=14257
Last message repeated 1 times
[output stream 0:0 @ 04adab60] EOF on sink link output stream 0:0:default.
No more output streams to write to, finishing.
cutoff=75, lambda=47.000000, frame_bit_rate=44337, bandwidth=14084
Last message repeated 1 times
cutoff=68, lambda=47.000000, frame_bit_rate=39711, bandwidth=12927
Last message repeated 1 times
[aac @ 04aaf580] Trying to remove 504 more samples than there are in the queue
size= 253kB time=00:00:16.10 bitrate= 128.9kbits/s
video:0kB audio:250kB subtitle:0 global headers:0kB muxing overhead 1.475195%
755 frames successfully decoded, 0 decoding errors
[AVIOContext @ 04ad0440] Statistics: 30 seeks, 779 writeouts
[AVIOContext @ 04d6f8a0] Statistics: 3123324 bytes read, 2 seeks
ffmpeg54890g.exe -v 9 -loglevel 99 -i ffmpeg_aacvbr_pulse2.wav -c:a aac -strict experimental -q:a 0.4 ffmpeg_aacvbr_pulse2.mp4
I tried to automate it by batch script, including preserving the av_log output but somehow it freezes.
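The conversion from a logged cutoff (a coefficient index) to a frequency can be sketched as below, assuming the coefficients of one window cover 0 to sample_rate/2 evenly, with 1024 coefficients for long blocks and 128 for short ones. `bin_to_hz` is a hypothetical helper, not a function from aaccoder.c.

```c
/* Frequency (Hz) at the lower edge of coefficient `bin` in a window of window_len coefficients. */
int bin_to_hz(int bin, int window_len, int sample_rate)
{
    return (int)((long long)bin * sample_rate / (2 * window_len));
}
```

At 44100 Hz this reproduces the ranges quoted above: bins 68 and 79 of a short block land at about 11.7 kHz and 13.6 kHz, bins 532 and 614 of a long block at about 11.5 kHz and 13.2 kHz. The logged bandwidth values are somewhat higher than these figures; under the same assumed formula they would match a 48000 Hz input more closely, though that is only an inference from the numbers.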
comment:107 by , 11 years ago
Don't worry, for the new patch I'm using refbits instead of destbits, refbits is a direct derivation of lambda, so it won't change. I couldn't make the changing bandwidth work in a stable fashion without a lot more work, so I'll reserve that for a further patch, maybe.
comment:110 by , 11 years ago
Patience. Later today, or perhaps tomorrow, depending on your time zone
comment:111 by , 11 years ago
Damn. The patch works wonderfully well in VBR, but breaks CBR. I'll have to look into it during the weekend.
Patience indeed.
comment:112 by , 11 years ago
Yes, the VBR sounds dull and is currently(at v3) poorer than CBR, and it should have a lot of room to improve.
comment:113 by , 11 years ago
I've encoded weeks of AACs using the v3 patch, with diverse samples and diverse bitrates, and there were no problems (empty files, error returns, freezes).
comment:114 by , 11 years ago
klaussfreire, could you provide the VBR-only patch? I'd like to test it. I may be able to detect the problem(s).
by , 11 years ago
Attachment: | aac-improvements-wip-v4-vbr.patch added |
---|
Improved VBR, fixed psy threshold reduction bug
comment:115 by , 11 years ago
Attached the current WIP.
An explanation of what caused the bug for high q values: there was a bug in psy's threshold reduction for hole avoidance. When a second pass was needed, it would accumulate errors due to a simple typo (reduction += instead of reduction =).
I don't have the 3GPP spec to check, but I just noticed the code made no sense with the +=, but did with =.
Then there's the ESC_BT thing.
I think most serious anomalies have been fixed in this bug. I haven't had time to properly test CBR, but it seems to mostly work now. That was a very subtle bit reservoir bug in my "lookahead" patch that didn't surface until I fixed psy.
Anyway, I still would like to make VBR achieve lower bitrates without having to resort to LP filtering. I somehow sense it should be possible. In any case, I made CBR also use the same scalefactor-band-based LP filtering to remove the need for the butterworth that didn't save many bits anyway, and now it responds to the -cutoff argument, so if you don't like the default cutoff you can override yourself. It seemed worth parameterizing since I've found some sources that sound better at low bit rates with higher cutoffs, and some that don't. So it's source-dependent.
Anyway, enjoy the patch, I'm not sure I'll have time to work on a more permanent (one that I'd push to trunk) one till next weekend.
comment:116 by , 11 years ago
Yes, the cutoff is quite source-dependent, and listener-dependent too. Older people may prefer lower cutoffs. BTW, I'm 25 yrs old.
comment:118 by , 11 years ago
Changing
? (refbits * 1.6f * avctx->sample_rate / 1024)
to
? (refbits * 2.5f * avctx->sample_rate / 1024)
raises the LPF cutoff and the sound is much clearer (at the cost of more noise, but it's certainly better per real bitrate).
I feel the sound is bad in only tonal part of the music in VBR. And this encoder uses fewer bits, sometimes nearly half less, for the tonal part, unlike Opus, which has a distinctive tonality boost function.
comment:119 by , 11 years ago
Yes, I was in the middle of tweaking rdlambda scale for VBR (which is what gives the tonality boost). It seems way off target for VBR, since a lambda that in VBR results in 64kbps, in CBR it will give you about 32 or less.
With that properly tweaked, we can save lots of bits from noisy bands and put them to better use on tonal bands. For VBR, that means lower bitrates for the same quality level.
Increasing cutoff like you did there has the unwanted side effect of lowering quality a bit too much on tonal bands, for a set file size. I do my tests by searching through -q:a until I get a file roughly the same size as a reference CBR-encoded version, and comparing quality among those. With higher cutoffs, that procedure resulted in noticeable distortion on the HF bands, which is why I left it at 1.6, and it's what I believe will be fixed by tweaking rdlambda for VBR.
It can also be fixed by implementing codebook 13. But that's for another (future, way future) patch, since I see no easy way to implement CB 13 with twoloop, so I'll have to rewrite it.
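The size-matching procedure described above (search -q:a until the file is about as large as a CBR reference) could be automated along these lines. This is a sketch under assumptions: bisection is a choice made here, it only works if file size grows monotonically with q, and `encode_size_fn`, `match_quality` and `toy_size` are hypothetical names. `toy_size` is a stand-in model, not a real encode.

```c
/* Callback that encodes at quality q and returns the output size in bytes. */
typedef long (*encode_size_fn)(double q, void *opaque);

/* Bisect q in [q_lo, q_hi] until the encoded size is close to target_bytes. */
double match_quality(encode_size_fn encode, void *opaque, long target_bytes,
                     double q_lo, double q_hi, int iterations)
{
    int i;

    for (i = 0; i < iterations; i++) {
        double q_mid = 0.5 * (q_lo + q_hi);

        if (encode(q_mid, opaque) < target_bytes)
            q_lo = q_mid;   /* file too small: raise quality */
        else
            q_hi = q_mid;   /* file large enough: lower quality */
    }
    return 0.5 * (q_lo + q_hi);
}

/* Stand-in for a real encode: pretends size grows linearly with q. */
long toy_size(double q, void *opaque)
{
    (void)opaque;
    return (long)(100000.0 * q);
}
```

In practice the callback would run the encoder and stat the output file, so a handful of iterations is usually all one would want to pay for.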
comment:120 by , 11 years ago
This paper, fig. 6 shows bit allocation curves, although this is Opus.
http://jmvalin.ca/papers/aes135_opus_celt.pdf
comment:122 by , 11 years ago
Is aaccoder.c line 829:
if (start >= cutoff || band->energy <= (band->threshold * zeroscale) || band->threshold == 0.0) {
correct? Not start >= cutoff+cutoff/5?
comment:123 by , 11 years ago
Yep, the cutoff is used as-is in this patch, the offset is already accounted for in its computation above that.
follow-up: 130 comment:124 by , 11 years ago
I've encoded weeks of AACs using the v4 patch, with diverse samples and diverse bitrates, and there were no problems (empty files, error returns, freezes).
Is 'tweaking rdlambda for VBR' ready? If not, I think I should test v4 ABR first, because it's stable and has fewer artifacts on tonal samples. The blind test will be conducted with the ABC/HR methodology, and there should be some opponents. I'm thinking of...
- current git head with no patch, abr
- v4 patch(or anything latest), abr
- fdk-aac, abr
The bitrate will be 96kbps and 128kbps.
comment:125 by , 11 years ago
Or, I can drop fdk-aac and instead test on 3 bitrates. Do you have any idea?
comment:127 by , 11 years ago
Replying to cehoyos:
Comparing with libfaac would be useful...
Is comment:69 not enough? (The test was in 2012 July.)
comment:128 by , 11 years ago
I thought that additional improvements were made since (and if ffaac does not beat libfaac and assuming fdk-aac beats libfaac, it might make more sense to compare with libfaac) but please don't let me misguide you.
comment:129 by , 11 years ago
I don't think many people will use libfaac. Both libfaac and libfdk_aac are non-free, and if many people prefer fdk-aac over faac, new results for the new fdk-aac are more interesting than yet another set of results for the old faac. (As far as I know, there is no blind test of fdk-aac.)
comment:130 by , 11 years ago
comment:131 by , 11 years ago
This is not my last test, and for a desire to compare this encoder with other encoders, I can do so later. By that time, I hope the new VBR is the state-of-the-art encoder.
comment:132 by , 11 years ago
I'm going to use these 20 samples below. There are six opponents(the first 3 are 96kbps, and the last 3 are 128kbps), so I have to score 6*20=120 sounds. The test is ready.
http://www.hydrogenaudio.org/forums/index.php?showtopic=98003
comment:133 by , 11 years ago
Hi All,
Great to see that the native AAC encoder is getting some attention, and trying to make it mainstream. Using Windows 7 and Zeranoe's FFmpeg builds, I only get a choice of "The Native Encoder" or "libvo_aacenc".
From what I have read, "libvo_aacenc" only seems to support stereo, not 5.1 or higher.
I am no audiophile and a little hard of hearing so I cannot find fault with the Native Encoder but I can tell the difference between 2 and 6 channels :-)
Keep up the good work on a great piece of software.
Regards,
Mark
comment:134 by , 11 years ago
ffmpeg55212 -y -i input.wav -c:a aac -strict experimental -b:a 96k output.mp4
ffmpeg55212_patchv4 -y -i input.wav -c:a aac -strict experimental -b:a 96k output.mp4
ffmpeg55212 -y -i input.wav -c:a libfdk_aac -b:a 96k -afterburner 1 output.mp4
ffmpeg55212 -y -i input.wav -c:a aac -strict experimental -b:a 128k output.mp4
ffmpeg55212_patchv4 -y -i input.wav -c:a aac -strict experimental -b:a 128k output.mp4
ffmpeg55212 -y -i input.wav -c:a libfdk_aac -b:a 128k -afterburner 1 output.mp4
faad -b 4 -o output.float.wav output.mp4
The ABC/HR test is ongoing. These six outputs were shuffled and I listen to them without knowing which is which. I've done 2 samples out of 20. 10% done.
by , 11 years ago
Attachment: | fdkaac_10_12.zip added |
---|
by , 11 years ago
Attachment: | fdkaac_13_16.zip added |
---|
samples # 10 - # 12 encoded by fdkaac. *2.mp4 are the 128kbps samples, the others are the 96kbps samples.
comment:138 by , 11 years ago
I think I've found the source of most of the "annoying" artifacts. With the recent fix to psy's hole avoidance, lots of the rate control hacks in the lookahead code are no longer necessary, since the bit reservoir now actually works. Though if I do completely disable them, the target bit rate is largely missed, so some RC stuff is still needed.
In short, RC hacks screw up on transients. I guess I'll have to explicitly limit RC hacks to non-transients (with perhaps some hysteresis). I'm working on a v5 fixing that.
Still, to get to fdk quality, I think we'll need to fix M/S encoding (which still has some artifacts; if it didn't, it could be a big efficiency boost) and implement codebook 13 (which fdk seems to use, though I haven't confirmed this). That's a much bigger project though.
comment:139 by , 11 years ago
Great, I'm guessing it's the reason why some samples got much poorer results than the fdk. Should I abort the v4 abr test and instead test on v5 after the release of 5? I'm on holiday now, but after August 26th, I'll move to more quiet place, so I can test more effectively.
comment:140 by , 11 years ago
I think I'll get you the v5 soonish, but I have an office to move this weekend so it may not be as soon as you'd like. In any case, soonish.
comment:142 by , 11 years ago
Replying to Kamedo2:
How is the development of v5?
Sorry, urgent personal issues prevented me from reaching my self-imposed deadline. I'll try to dedicate some time to it as soon as I'm able, though. Next post ought to be a patch.
follow-up: 144 comment:143 by , 11 years ago
I resumed the ABC/HR test, and I've done 13 samples out of 20. How is the development going?
comment:144 by , 11 years ago
Replying to Kamedo2:
I resumed the ABC/HR test, and I've done 13 samples out of 20. How is the development going?
Stalled for now, but I'll be able to resume soon
comment:145 by , 11 years ago
Cc: | added |
---|
comment:147 by , 11 years ago
Yes, please do. I'll make sure to address those concerns as well, and we'll save one round trip
comment:148 by , 11 years ago
You can download the original sound here. http://www.hydrogenaudio.org/forums/index.php?showtopic=98003
comment:149 by , 11 years ago
Oops, -b:a 128k, not -b:a 96k in the 128kbps exp+v4 column.
By the way, why is an FFT used in the LPF? Couldn't it use the MDCT and simply zero the higher coefficients? Maybe I am missing something.
follow-up: 151 comment:150 by , 11 years ago
I'll finish the test soon(16/20, 80%). What should be the next opponents in the next blind listening test including the newer patch? I'm thinking of...
- current git head with no patch, abr
- next patch, abr
- next patch, vbr
- fdk-aac, abr
and possibly...
- libopus, vbr
- libmp3lame, vbr
Do you have any idea?
follow-up: 152 comment:151 by , 11 years ago
Replying to Kamedo2:
and possibly...
- libopus, vbr
- libmp3lame, vbr
Do you have any idea?
If you have time, it would be interesting to compare to the quality of other FFmpeg audio encoders, ie ac3, eac3 and mp2.
follow-up: 153 comment:152 by , 11 years ago
Replying to cehoyos:
If you have time, it would be interesting to compare to the quality of other FFmpeg audio encoders, ie ac3, eac3 and mp2.
It may be wrong, but I guess the ac3 is the most used variant. The bitrate will be around 128kbps, so the extremely high bitrate of eac3 will not fit the frame, I think. Are there some important use of eac3 and mp2, other than the BD and VCD encoding? (For BD the space is huge and quality at lower bitrate is insignificant.)
comment:153 by , 11 years ago
Replying to Kamedo2:
Replying to cehoyos:
If you have time, it would be interesting to compare to the quality of other FFmpeg audio encoders, ie ac3, eac3 and mp2.
It may be wrong, but I guess the ac3 is the most used variant. The bitrate will be around 128kbps, so the extremely high bitrate of eac3 will not fit the frame,
I am not sure I understand you.
Afaik, nobody ever made a listening test using different internal FFmpeg encoders (not even a very cursory one). It would be interesting to know that "96kb eac3 ~ 128 kb ac3 ~ 128kb aac ~ 256kb mp2" (I assume this isn't the case, just as an example). Even if done with much less effort than your above tests (if you just mention your impression of each encoder after a few tests), I believe this would be interesting information.
It was sometimes claimed that the wma encoders produce abysmal quality, so your comment on them (possibly with higher bitrates) would also be welcome.
I think. Are there some important use of eac3 and mp2, other than the BD and VCD encoding? (For BD the space is huge and quality at lower bitrate is insignificant.)
I believe that ac3 is a very important codec (WMP plays it out-of-the-box in different containers), knowing if eac3 beats it would be interesting.
comment:154 by , 11 years ago
I don't think of any good use of eac3, other than for BD. BD can have 32Mbps, and eac3 can have up to 6144kbps. If audio quality matters, simply use the maximum bitrate. And having more opponents in parallel slow down the test. However, we need a low anchor and possibly a high anchor. I think libopus will act as a high anchor and aac without patch act as a low anchor.
There are some good uses of wma, such as encoding for an old car stereo that plays MP3/WMA, but WMAEncode 0.2.9b is far more usable. The quality is in between LAME and Apple AAC.
follow-up: 204 comment:155 by , 11 years ago
This document recommends using the -cutoff 15000 option. That is outdated: the cutoff has been applied automatically since July 2012.
http://ffmpeg.org/ffmpeg-codecs.html#aac
This is the data I sent in 2012.
By the way, the progress of the listening test is 95%(19/20) now.
comment:156 by , 11 years ago
I finished the test and I uploaded the results.
http://www.hydrogenaudio.org/forums/index.php?showtopic=102699
by , 11 years ago
Attachment: | aac-improvements-wip-v5.patch added |
---|
V5 patch, twoloop RD fixed (I think)
comment:158 by , 11 years ago
So, I attached a patch that moves in the right direction (I think).
Most of the worse-performing samples, I noticed, had to do with hole avoidance being quickly violated when using low bit rates. So I re-did twoloop's RD improvement step to better respect hole avoidance, to be asymmetric in its scale manipulation (ie: to avoid adding all 1 or all 2, which would be quickly undone by the bitrate adjustment step), and everything seemed to work a lot better.
However, on the "asymmetric" little word, there's a huge hack involved. I wouldn't want to waste your time without a warning: this hack can most assuredly be improved. But I don't think I'll waste time improving a hack, since the real solution is to implement a dynamic programming coder, which I intend to do in the future. So while hackish and probably suboptimal, I'll probably leave it as-is since it works well enough.
I haven't tested VBR much. From what I tested, it seems mostly unharmed, but it still needs a better calibrated cutoff. That will take time (lets say it'll be v6).
So, this patch should be good enough for ABR. VBR will need a v6, and some day (time permitting) I'll post the patch with the dynamic coder.
I couldn't quite match FDK performance, but I suspect there are two reasons for this. First, M/S coding isn't as good as it should be. Second, FDK probably uses a dynamic coder. So I think we'll catch FDK with the dynamic coder (which can also do the M/S part, so it'll fix both with one shot).
However, I tested most of the samples in your session, and they've all improved. Some more than others, of course. So, if not all the samples, you might want to retest the worst offenders.
Edit: I also haven't tested higher bit rates. I will tomorrow.
follow-up: 162 comment:159 by , 11 years ago
The v5 patch is encoding at 15-50x realtime, depending on bitrate and type of music encoded.
comment:160 by , 11 years ago
I changed aaccoder.c line 806 from
? (refbits * 1.6f * avctx->sample_rate / 1024)
to
? (refbits * 2.4f * avctx->sample_rate / 1024)
This is certainly better, although the exact optimal value is debatable.
I encoded 2 days' worth of diverse sounds with many settings, and listened to 2 hours of them. This encoder does a relatively good job even at abr 96kbps. It's not a blind test, but I can hear the improvement. Also, I compared abr 128kbps vs vbr -q 0.3, and abr is still better. The vbr exposes its weak point in relatively quiet, tonal sections: low S/N and a stronger LPF effect.
comment:161 by , 11 years ago
I listened to about 8 hours of songs, movies, sine and white noise, and 5.1ch surround source. I'd say that abr is mature.
klaussfreire, could you add a "redirect" feature so that when the set bitrate is too high, the encoder falls back to the maximum possible bitrate rather than printing the error message and stopping? This would simplify many batch encodes, including encoding from hundreds of videos that have various audio sample rates and numbers of channels. Currently I get:
[aac @ 013efa60] Too many bits per frame requested
Also, I notice that this commandline
ffmpeg -i ffmpeg_aacvbr_pulse1.wav -c:a aac -strict experimental -q:a 0.1 -ar 8000 -ac 1 ffmpeg_aacvbr_pulse1.mp4
gets the same "Too many bits" error, and lowering the quality with -q:a doesn't help. It only works when using -b:a, or when setting a higher sample rate such as -ar 22050. This could be a problem when encoding from a video taken by some old digital cameras with 8kHz pcm audio attached.
The error message:
    ffmpeg56470.exe -y -i ffmpeg_aacvbr_pulse1.wav -c:a aac -strict experimental -q:a 0.3 -ar 8000 ffmpeg_aacvbr_pulse1.mp4
    ffmpeg version N-56469-gf6622f9 Copyright (c) 2000-2013 the FFmpeg developers
      built on Sep 20 2013 15:29:55 with gcc 4.8.1 (GCC)
      configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libfdk-aac --extra-ldflags=-static --extra-cflags='-march=native -mfpmath=sse' --optflags=-O2
      libavutil 52. 45.100 / 52. 45.100
      libavcodec 55. 33.100 / 55. 33.100
      libavformat 55. 18.100 / 55. 18.100
      libavdevice 55. 3.100 / 55. 3.100
      libavfilter 3. 86.102 / 3. 86.102
      libswscale 2. 5.100 / 2. 5.100
      libswresample 0. 17.103 / 0. 17.103
      libpostproc 52. 3.100 / 52. 3.100
    Guessed Channel Layout for Input Stream #0.0 : stereo
    Input #0, wav, from 'ffmpeg_aacvbr_pulse1.wav':
      Metadata:
        encoder : Coderium SoundEngine 4.59
      Duration: 00:00:12.12, bitrate: 1411 kb/s
        Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s
    [aac @ 030cbf00] Too many bits per frame requested
    Output #0, mp4, to 'ffmpeg_aacvbr_pulse1.mp4':
      Metadata:
        encoder : Coderium SoundEngine 4.59
        Stream #0:0: Audio: aac, 8000 Hz, stereo, fltp, 128 kb/s
    Stream mapping:
      Stream #0:0 -> #0:0 (pcm_s16le -> aac)
    Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height
I think it's about time we remove the -strict experimental flag.
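To illustrate the "redirect" request in this comment, here is a minimal sketch of clamping a requested bit rate instead of failing. It assumes the error comes from the AAC ceiling of 6144 bits per channel per 1024-sample frame being compared against the nominal bit rate; that the encoder at this revision checks exactly this way is an assumption, and `max_aac_bitrate` and `clamp_bitrate` are hypothetical helper names, not FFmpeg functions.

```python
# Sketch only: the constants are AAC-LC properties; the helpers are hypothetical.
MAX_BITS_PER_CHANNEL_PER_FRAME = 6144
SAMPLES_PER_FRAME = 1024


def max_aac_bitrate(sample_rate, channels):
    """Highest bit rate (bits/s) that still fits the per-frame ceiling."""
    return MAX_BITS_PER_CHANNEL_PER_FRAME * channels * sample_rate // SAMPLES_PER_FRAME


def clamp_bitrate(requested, sample_rate, channels):
    """Return (bit_rate, was_clamped) instead of rejecting a too-high request."""
    limit = max_aac_bitrate(sample_rate, channels)
    if requested > limit:
        return limit, True   # a caller would log a warning here
    return requested, False


if __name__ == "__main__":
    # The 128 kb/s shown in the log above, at 8000 Hz stereo.
    print(clamp_bitrate(128000, 8000, 2))
```

Under that assumption the ceiling at 8000 Hz stereo is 96 kb/s, which is below the 128 kb/s shown for the output stream in the log. That would also explain why changing -q:a alone makes no difference while -b:a or a higher -ar does.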
comment:162 by , 11 years ago
Replying to Kamedo2:
The v5 patch is encoding at 15-50x realtime, depending on bitrate and type of music encoded.
I believe I may have to disappoint you there. One of the optimizations responsible for that speed is acting up on ABR. I noticed improved quality by restricting it, so the v6 with optimized VBR will have it disabled as well (and thus be a tad slower).
I thought that optimization was result-neutral, but it seems it isn't.
comment:163 by , 11 years ago
15x speed is 'tolerable' :)
I've encoded more than 50GB of mp4s, including surround 5.1ch with more than 1Mbps etc... and listened to 12 hours of mainly Pop music. v5 seems to be stable. Is fixing "Too many bits per frame requested" error easy?
comment:164 by , 11 years ago
I can make it only applicable when using ABR, but I think it's a useful message.
I could also turn it into a warning, I think.
comment:165 by , 11 years ago
I prefer a warning to an error message followed by a stop. It's kinder and easier to use.
By the way, I'll be free from September 28th, and I'm considering a listening test of
- v4 abr
- v6 abr
- v6 vbr
- fdk-aac vbr
- ac3 abr
- libmp3lame vbr
I've got a request to test libfaac, mp2, and eac3, but I'm running out of "slots".
From my normal non-blind listening of average music, my current impression is:
fdk-aac > libmp3lame > v5 abr >> v4 abr > v5 vbr > ac3
comment:166 by , 11 years ago
v5 vbr is still considerably worse than the abr. I feel that whenever tonal sounds are present, the frequency bins around the tone degrade. Tones are poorer than noise at masking other sounds, which is why the harpsichord remains one of the most critical and hardest instruments to code. http://wiki.hydrogenaudio.org/index.php?title=Perceptual_Noise_Substitution
follow-up: 168 comment:167 by , 11 years ago
Well, v6 is almost ready. I just need to clean it up a bit. I'll probably do that tonight.
In v6, my non-blind tests make me believe that v6 vbr > v6 abr > v5 abr.
Not sure how you compare abr vs vbr, what I do is pick a file or set of files, do a binary search of the quality level that results in the same overall file size, and then compare. In that kind of test, v6 vbr sometimes requires lots more bits for some pathological files (techno seems to drive it crazy, can't blame it). I exclude those, since they're pathological.
When I push the patches to the ML, I'll make most of what makes v6 vbr go crazy on techno (the relatively high peak bit rate allowance) configurable anyway.
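The binary search over the quality level described in this comment could look roughly like the sketch below. It is only an illustration: `encoded_size` is a hypothetical callback that would run the encoder at a given q and return the output size in bytes, the search bounds are arbitrary defaults, and the size is assumed to grow monotonically with q.

```python
from typing import Callable


def find_quality_for_size(encoded_size: Callable[[float], int],
                          target_size: int,
                          q_low: float = 0.01,
                          q_high: float = 2.0,
                          tolerance: float = 0.01,
                          max_steps: int = 30) -> float:
    """Bisect q until encoded_size(q) is within `tolerance` (relative) of target_size.

    Assumes encoded_size(q) increases monotonically with q.
    """
    q_mid = (q_low + q_high) / 2
    for _ in range(max_steps):
        q_mid = (q_low + q_high) / 2
        size = encoded_size(q_mid)
        if abs(size - target_size) <= tolerance * target_size:
            break
        if size < target_size:
            q_low = q_mid    # output too small: raise quality
        else:
            q_high = q_mid   # output too large: lower quality
    return q_mid


if __name__ == "__main__":
    # Stand-in for a real encoder run: size proportional to q.
    fake_encoder = lambda q: int(1_000_000 * q)
    print(find_quality_for_size(fake_encoder, 700_000))
```

With a real encoder each step costs a full encode, so a loose tolerance keeps the number of runs small.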
follow-up: 169 comment:168 by , 11 years ago
Replying to klaussfreire:
Not sure how you compare abr vs vbr, what I do is pick a file or set of files, do a binary search of the quality level that results in the same overall file size, and then compare. In that kind of test, v6 vbr sometimes requires lots more bits for some pathological files (techno seems to drive it crazy, can't blame it). I exclude those, since they're pathological.
I compare abr vs vbr by a graph. I plot a "q vs bitrate" graph over a large "standard" set of sounds I extracted from diverse CDs. Then I search for the q value that gives the desired bitrate, and make sure that the average bitrate of the tested samples isn't very far from the "standard" bitrate. This method is common on hydrogenaudio.
http://listening-tests.hydrogenaudio.org/sebastian/mp3-128-1/index.htm
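As a rough sketch of the graph-based calibration just described: given measured (q, average bitrate) pairs from the standard set, the q for a target bitrate can be read off by linear interpolation. The function name `q_for_bitrate` and the calibration points in the usage line are made up for illustration; they are not measurements from this ticket.

```python
def q_for_bitrate(calibration, target_kbps):
    """Interpolate the q value that yields target_kbps.

    calibration: iterable of (q, average_kbps) pairs measured on the
    standard set, with bitrate increasing as q increases.
    """
    points = sorted(calibration)
    for (q0, b0), (q1, b1) in zip(points, points[1:]):
        if b0 <= target_kbps <= b1:
            if b1 == b0:
                return q0
            return q0 + (q1 - q0) * (target_kbps - b0) / (b1 - b0)
    raise ValueError("target bitrate is outside the calibrated range")


if __name__ == "__main__":
    # Purely synthetic calibration points, for illustration only.
    print(q_for_bitrate([(0.5, 100.0), (1.0, 200.0)], 128.0))
```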
When I push the patches to the ML, I'll make most of what makes v6 vbr go crazy on techno (the relatively high peak bit rate allowance) configurable anyway.
I think it's a good idea to automatically "cap" the bitrate based on the q number. 3x of the "standard" bitrate of the q or something.
Also, I think it would be beneficial for end users to set the -q:a value and typically get a file with a bitrate around the set value. If one sets -q:a 256k, one gets a file of roughly 256kbps. (Or 210kbps, 289kbps, etc. depending on the sound content, but that's fine.) iTunes has that interface, and it's easier to use. This can be controversial, as people may refer to old documents about the -q:a option and try to do the same, but the problem can be avoided by switching to a "classic mode" when the value is very small, like -q:a 0.3.
follow-ups: 170 175 comment:169 by , 11 years ago
Replying to Kamedo2:
Replying to klaussfreire:
Not sure how you compare abr vs vbr, what I do is pick a file or set of files, do a binary search of the quality level that results in the same overall file size, and then compare. In that kind of test, v6 vbr sometimes requires lots more bits for some pathological files (techno seems to drive it crazy, can't blame it). I exclude those, since they're pathological.
I compare abr vs vbr by a graph. I plot a "q vs bitrate" graph over a "standard" set of large set of sounds I extracted from diverse CDs.
Yeah, I've seen those
Then, search a number of q that have the desired bitrate. Then, make sure that average tested sample bitrate isn't very far from the "standard" bitrate.
Just how do you check bit rate? Because I've noticed ffmpeg -i file tends to give bogus rates when used on VBR-encoded files (not even average).
Also, I think it's beneficial for the end users to set the -q:a value and typically gets a file with the bitrate around the set value. If one sets -q:a 256k, one gets a file of roughly 256kbps.
That's not doable without refactoring ffmpeg. -q:a sets the global_quality parameter, which is specified to have a somewhat standardized interpretation (1.0 = 100%, what 100% means is what some other codec means by it, can't remember which OTOMH).
However, you can get (I think) a similar result by specifying both -q:a and -b:a, like so:
ffmpeg -i somefile.flac -c:a aac -b:a 256k -q:a 1 -strict experimental somefile.aac
Although that seldom gives you 256k. The bitrate there is like a lower bound (aim for 256k, spend more if needed).
follow-ups: 171 173 comment:170 by , 11 years ago
Then, search a number of q that have the desired bitrate. Then, make sure that average tested sample bitrate isn't very far from the "standard" bitrate.
Just how do you check bit rate? Because I've noticed ffmpeg -i file tends to give bogus rates when used on VBR-encoded files (not even average).
filesize[Byte]*8/Sample_length[Sec]. But be careful with very short files; the result can be bogus too.
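The formula above, written out as a small sketch. The function names are hypothetical, and the duration is assumed to come from the original source file rather than from probing the encoded file.

```python
import os


def average_bitrate_kbps(size_bytes, duration_seconds):
    """filesize[Byte] * 8 / Sample_length[Sec], expressed in kbps."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return size_bytes * 8 / duration_seconds / 1000


def file_bitrate_kbps(path, duration_seconds):
    """Same calculation, taking the size from a file on disk."""
    return average_bitrate_kbps(os.path.getsize(path), duration_seconds)


if __name__ == "__main__":
    print(average_bitrate_kbps(4_000_000, 250.0))
```

The file size includes container overhead, which is one reason very short files give misleading numbers.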
Also, I think it's beneficial for the end users to set the -q:a value and typically gets a file with the bitrate around the set value. If one sets -q:a 256k, one gets a file of roughly 256kbps.
That's not doable without refactoring ffmpeg. -q:a sets the global_quality parameter, which is specified to have a somewhat standardized interpretation (1.0 = 100%, what 100% means is what some other codec means by it, can't remember which OTOMH).
Is LAME breaking the convention?
https://trac.ffmpeg.org/wiki/Encoding%20VBR%20%28Variable%20Bit%20Rate%29%20mp3%20audio
However, you can get (I think) a similar result by specifying both -q:a and -b:a, like so:
ffmpeg -i somefile.flac -c:a aac -b:a 256k -q:a 1 -strict experimental somefile.aac
Although that seldom gives you 256k. The bitrate there is like a lower bound (aim for 256k, spend more if needed).
Thank you for the info. This behavior seems much like the cvbr mode (the most used mode) of Apple iTunes.
comment:171 by , 11 years ago
Replying to Kamedo2:
Then, search a number of q that have the desired bitrate. Then, make sure that average tested sample bitrate isn't very far from the "standard" bitrate.
Just how do you check bit rate? Because I've noticed ffmpeg -i file tends to give bogus rates when used on VBR-encoded files (not even average).
filesize[Byte]*8/Sample_length[Sec]. But be careful with very short files; the result can be bogus too.
As long as you're not also estimating sample_length with ffmpeg, which will also give you bogus, it should be fine ;)
Also, I think it's beneficial for the end users to set the -q:a value and typically gets a file with the bitrate around the set value. If one sets -q:a 256k, one gets a file of roughly 256kbps.
That's not doable without refactoring ffmpeg. -q:a sets the global_quality parameter, which is specified to have a somewhat standardized interpretation (1.0 = 100%, what 100% means is what some other codec means by it, can't remember which OTOMH).
Is LAME breaking the convention?
https://trac.ffmpeg.org/wiki/Encoding%20VBR%20%28Variable%20Bit%20Rate%29%20mp3%20audio
I think so. At least, it seems to be backwards (higher q should mean higher quality, but lame does it backwards).
comment:172 by , 11 years ago
libvorbis and libfaac break the convention, too. neroAacEnc.exe takes a float quality value where 0 is lowest and 1 is highest, so if left unchanged, the native encoder acts much like Nero.
follow-up: 174 comment:173 by , 11 years ago
Replying to Kamedo2:
Also, I think it's beneficial for the end users to set the -q:a value and typically gets a file with the bitrate around the set value. If one sets -q:a 256k, one gets a file of roughly 256kbps.
That's not doable without refactoring ffmpeg. -q:a sets the global_quality parameter, which is specified to have a somewhat standardized interpretation (1.0 = 100%, what 100% means is what some other codec means by it, can't remember which OTOMH).
Is LAME breaking the convention?
https://trac.ffmpeg.org/wiki/Encoding%20VBR%20%28Variable%20Bit%20Rate%29%20mp3%20audio
However, you can get (I think) a similar result by specifying both -q:a and -b:a, like so:
ffmpeg -i somefile.flac -c:a aac -b:a 256k -q:a 1 -strict experimental somefile.aac
Although that seldom gives you 256k. The bitrate there is like a lower bound (aim for 256k, spend more if needed).
Thank you for the info. This behavior seems much like the cvbr mode (the most used mode) of Apple iTunes.
If someone is to implement cvbr, I suggest to do it like the libopus encoder wrapper, where users are allowed to choose a "vbr" option like this http://ffmpeg.org/ffmpeg-codecs.html#Option-Mapping.
comment:174 by , 11 years ago
If someone is to implement cvbr, I suggest to do it like the libopus encoder wrapper, where users are allowed to choose a "vbr" option like this http://ffmpeg.org/ffmpeg-codecs.html#Option-Mapping.
Timothy_Gu, Thank you for the informative link. I'd like to use options like -b:a 256k -vbr.
comment:175 by , 11 years ago
However, you can get (I think) a similar result by specifying both -q:a and -b:a, like so:
ffmpeg -i somefile.flac -c:a aac -b:a 256k -q:a 1 -strict experimental somefile.aac
Although that seldom gives you 256k. The bitrate there is like a lower bound (aim for 256k, spend more if needed).
I tried it over 128 different songs and the result was:
-b:a 256k -q:a 1
- Average 247kbps
- SD +/-33kbps
- Min 161kbps
- Max 300kbps
-q:a 1
- Average 235kbps
- SD +/-30kbps
- Min 154kbps
- Max 287kbps
(comment:160 change is not applied in this test.)
comment:176 by , 11 years ago
I'm preparing for the next listening test.
    # Native aac patch v4 abr
    ffmpeg55212 -y -i in.wav -c:a aac -strict experimental -b:a 128k out.mp4
    ffmpeg56470 -y -i out.mp4 -c:a pcm_s32le out.32bit.wav
    # Native aac patch v5 abr
    ffmpeg56470 -y -i in.wav -c:a aac -strict experimental -b:a 128k out.mp4
    ffmpeg56470 -y -i out.mp4 -c:a pcm_s32le out.32bit.wav
    # Native aac patch v5 vbr
    ffmpeg56470 -y -i in.wav -c:a aac -strict experimental -q:a 0.3 out.mp4
    ffmpeg56470 -y -i out.mp4 -c:a pcm_s32le out.32bit.wav
    # FDK-AAC vbr 3
    ffmpeg56470 -y -i in.wav -c:a libfdk_aac -vbr 3 out.mp4
    ffmpeg56470 -y -i out.mp4 -c:a pcm_s32le out.32bit.wav
    # LAME vbr -V5
    ffmpeg55010 -y -i in.wav -c:a libmp3lame -q:a 5 out.mp3
    ffmpeg56470 -y -i out.mp3 -c:a pcm_s32le out.32bit.wav
    # FFmpeg ac3 cbr
    ffmpeg56470 -y -i in.wav -c:a ac3 -b:a 128k out.ac3
    ffmpeg56470 -y -i out.ac3 -c:a pcm_s32le out.32bit.wav
I thought of using 32-bit float as the intermediate format, but FFmpeg's pcm_f32le output had half the gain it should have, and even after adjusting the gain, a large error (average of |lossy-original|) remained, unlike with faad or madplay.
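The error figure mentioned here, the average of |lossy-original|, can be sketched as below. This is only an illustration of the metric, not the tool actually used: `mean_abs_error` is a hypothetical name, the inputs are plain sequences of sample values, and the two signals are assumed to be already time-aligned and of equal length.

```python
def mean_abs_error(original, decoded, gain=1.0):
    """Average of |gain * decoded - original| over all samples."""
    if len(original) != len(decoded):
        raise ValueError("signals must have the same number of samples")
    if not original:
        raise ValueError("signals must not be empty")
    total = sum(abs(gain * d - o) for o, d in zip(original, decoded))
    return total / len(original)


if __name__ == "__main__":
    original = [0.5, -0.5, 0.25]
    half_gain = [0.25, -0.25, 0.125]   # same signal at half the gain
    print(mean_abs_error(original, half_gain))            # error from the gain alone
    print(mean_abs_error(original, half_gain, gain=2.0))  # after compensating
```

If a residual stays large after gain compensation, a leftover time offset (for example encoder delay) is one thing worth ruling out before blaming the decoder.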
These are the statistics of the 25 samples I'm going to use in the test.
Statistic | v4 abr | v5 abr | v5 vbr | FDK vbr | lame V5 | ac3 |
---|---|---|---|---|---|---|
25 Average | 129 | 129 | 151 | 122 | 135 | 128 |
25 Std.Dev | 5 | 5 | 39 | 20 | 18 | 0 |
25 Min | 107 | 108 | 89 | 86 | 87 | 128 |
25 Max | 131 | 133 | 257 | 173 | 172 | 128 |
Max sample | 25.Reunion Blues | 26.French | 26.French | 10. | 14. | 29. |
Std.Average | 128 | 128 | 127 | 127 | 130 | 128 |
The unit is kbps. Std.Average is the average bitrate over my large collection of encoded CDs.
I've found that v5 vbr boosts the bitrate on speech samples. The speech sample 26.French was encoded at 257kbps, more than twice the average bitrate of the large set of diverse CD sounds. Another speech sample reached 216kbps. It's a problem, hopefully fixed in the next v6 patch.
comment:177 by , 11 years ago
Wait a little bit, I'll get you the v6 patch asap, even if not as clean as I'd like it to be.
comment:178 by , 11 years ago
Yes, I noticed the speech bug; it happens because VBR was unconstrained. v6 uses constrained VBR (loosely constrained) and performs much better. That's why I'd prefer that you test v6.
by , 11 years ago
Attachment: | aac-improvements-wip-v6.patch added |
---|
Improved (mostly constrained) VBR, fixed RC bug from v5. There's some dead code that begs to be removed, but it's better to start testing before cleaning.
comment:179 by , 11 years ago
So... latest patch attached. It's not final yet, mostly because it needs some polish. But its performance I find quite acceptable.
comment:180 by , 11 years ago
Thank you. I'm encoding successfully. An extensive stability test is ongoing. I noticed that the same -q:a results in almost half the file size it gave in v4.
comment:181 by , 11 years ago
These are the statistics of the 25 samples I'm going to use in the test.
Statistic | v4 abr | v6 abr | v6 vbr q0.7 | FDK vbr3 | lame V5 | ac3 |
---|---|---|---|---|---|---|
25 Average | 129 | 129 | 144 | 122 | 135 | 128 |
25 Std.Dev | 5 | 5 | 24 | 20 | 18 | 0 |
25 Min | 107 | 108 | 114 | 86 | 87 | 128 |
25 Max | 131 | 133 | 218 | 173 | 172 | 128 |
Max sample | 25.Reunion Blues | 26.French | 26.French | 10. | 14. | 29. |
Std.Average | 128 | 128 | 127 | 127 | 130 | 128 |
The unit is kbps. Std.Average is the average bitrate over my large collection of encoded CDs.
follow-up: 184 comment:182 by , 11 years ago
follow-up: 186 comment:184 by , 11 years ago
Replying to Kamedo2:
My current impression, from non-blind test of non-samples:
v5 abr = v6 abr > v6 vbr >> v4 abr
That's weird (v6 abr > v6 vbr), because my tests showed the opposite, and you yourself said v6 vbr had decreased bit usage considerably (so it should imply higher efficiency, which is what I noticed).
Do you have an example? Could you describe in that example what you feel is inferior compared to abr?
Also, do you perhaps have something that results in abnormally large bitrates in your std calibration sample? That could be forcing you to pick a lower q to match the 128k average, and thus decrease overall quality. I did fix a few of those in v6, but maybe there's some left, or maybe it has to be further constrained.
comment:185 by , 11 years ago
PS: In aaccoder.c:1007, you can change
//if (mb >= ESC_BT) break;
Into
if (mb >= ESC_BT && sce->sf_idx[w*16+g] <= minscaler) break;
I think that could help, because I believe those bitrate peaks are due to abuse of ESC_BT bands. That could also be the reason why some faint sounds get lost even at high Q: AAC enforces a maximum dynamic range in scalers, and abuse of ESC_BT bands pushes that dynamic range to the detriment of faint sounds.
comment:186 by , 11 years ago
Replying to klaussfreire:
Do you have an example? Could you describe in that example what you feel is inferior compared to abr?
In tonal parts of the music, the vbr suffers from a lower S/N ratio and more LPF effect, because the cutoff frequency is lower. The difference is more pronounced at 96kbps vs -q:a 0.52.
Also, do you perhaps have something that results in abnormally large bitrates in your std calibration sample? That could be forcing you to pick a lower q to match the 128k average, and thus decrease overall quality. I did fix a few of those in v6, but maybe there's some left, or maybe it has to be further constrained.
No, the std calibration sample is from CDs, and it lacks speech samples. I tried these:
http://www.rarewares.org/test_samples/
and death2, KMFDM-Dogma and male_speech were particularly high at -q:a 0.7 (128kbps), at more than 200kbps. Male speech uses more bitrate than female speech.
comment:187 by , 11 years ago
Well, I've got something that might work, but since it means unconstraining vbr, it'll need some testing at lower bitrates. It'll take me time.
comment:190 by , 11 years ago
comment:185 was tested, along with qaac(Apple AAC). Apple AAC is a very good encoder, so the bitrate it uses must be close to the optimal.
-q:a 0.3 vs -q:a 0.7 vs -q:a 0.7 vs -vbr 3 vs --tvbr 63 vs -V5
Std.bitrate is 127.5, 127.0, 126.9, 126.9, 126.2, 129.9, all of them are very close to 128k.
comment:191 by , 11 years ago
Which version should I test? With or without comment:185? Or should I wait for something? I'm doing a batch encode, so I can redo anything without much effort.
comment:192 by , 11 years ago
My current plan:
    bin\ffmpeg55212_v4patch -y -i in.wav -c:a aac -strict experimental -b:a 128k out.mp4
    bin\ffmpeg56667_v6patch_c185 -y -i in.wav -c:a aac -strict experimental -b:a 128k out.mp4
    bin\ffmpeg56667_v6patch_c185 -y -i in.wav -c:a aac -strict experimental -q:a 0.7 out.mp4
    bin\ffmpeg56667 -y -i in.wav -c:a libfdk_aac -vbr 3 out.mp4
    bin\ffmpeg56667 -y -i in.wav -c:a libfaac -q:a 97 out.mp4
    bin\ffmpeg56667 -y -i in.wav -c:a libmp3lame -q:a 5 out.mp3
    bin\ffmpeg56667 -y -i in.wav -c:a ac3 -b:a 128k out.ac3
Many people were asking about the quality of faac. Weird, but maybe I should include it if many people are wondering. Expect the test to slow down.
The bitrate distribution of the vbr encoders: even with the comment:185 change applied, the speech samples take up a lot of space.
My current FFmpeg configuration:
$ ./configure --enable-gpl --enable-version3 --enable-nonfree --enable-libfdk-aac --enable-libfaac --enable-libmp3lame --extra-ldflags=-static --extra-cflags='-march=nocona -mfpmath=sse' --optflags=-O2
Is there a way to expose more bugs so that we can fix them before the test?
comment:193 by , 11 years ago
Sorry about the delay. I wanted to give you a 6b patch, since I was making good progress, but I got stalled yesterday.
I managed to remove (or, rather, ameliorate) those bitrate outliers, without constraining VBR. I noticed they were related to silence parts. It seems in the absence of any significant signal, it will try to encode the noise, and being noise, it's quite hard to encode.
What I did is I modeled the absolute hearing threshold in aacpsy, and now that's performing better. But there's still a tendency to waste bits on noisy transients. I couldn't quite confirm yet that it's a waste: all my attempts at saving bits on those ran afoul quality-wise, as if those bits were really needed. But I suspect there's still some work to be done in that regard.
So in essence, I achieved some extra efficiency by modeling absolute hearing thresholds. Since you never know what SPL the sample will be played at, I matched the masking curve's lowest point to 16-bit quantization noise. That should correctly match most playback situations, but I'd like people to comment if there was an explicit reason why absolute thresholds haven't been accounted for.
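For readers unfamiliar with the idea, here is a minimal sketch of anchoring an absolute-threshold curve to the 16-bit noise floor. It is not the code from the patch: it assumes Terhardt's well-known approximation of the threshold in quiet and the textbook 6.02·N + 1.76 dB figure for quantization noise relative to a full-scale sine, and all function names are hypothetical.

```python
import math


def ath_db_spl(freq_hz):
    """Terhardt's approximation of the threshold in quiet, in dB SPL (assumed model)."""
    f = freq_hz / 1000.0
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


def quantization_noise_dbfs(bits=16):
    """Noise floor of an ideal uniform quantizer relative to a full-scale sine."""
    return -(6.02 * bits + 1.76)


def ath_dbfs(freq_hz, bits=16, grid=range(20, 20001, 10)):
    """Shift the curve so that its lowest point sits on the quantization noise floor."""
    lowest = min(ath_db_spl(f) for f in grid)
    return ath_db_spl(freq_hz) - lowest + quantization_noise_dbfs(bits)


if __name__ == "__main__":
    for freq in (100, 1000, 3400, 10000, 16000):
        print(freq, round(ath_dbfs(freq), 1))
```

The shift sidesteps the unknown playback level: the most sensitive frequency region is pinned to the quietest level a 16-bit source can carry, and every other band gets a correspondingly higher threshold.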
As for your testing plans, you tell me. I can give you the current state of the encoder (I don't expect to make any more progress quickly; I tried lots of things and failed to improve it, so unless I get some kind of inspiration the encoder will remain as is for a while), or you can test the v6 you already have. Bitrate-wise, they're similar. The newest one performs a little better since it's unconstrained, but it still has a disadvantage against ABR regarding tonal, quiet passages.
comment:194 by , 11 years ago
I think it's not a major problem to boost rate on noisy transients, they're rare in actual encoding situations, unlike speech samples.
The v6 patch is already a very good one, so I'm very satisfied with the current quality, but fairness can be a problem, so if you have the version that reduces bitrate on speech samples, I'd like to test the one with reduced bitrate.
I'd like to test a version that is worth committing, so please pay attention to stability issues like memory leaks and such, rather than to quality. I'll try my best to find problems before the test.
BTW, I decided to use cbr for the fdk-aac. The cbr sounds clearer.
comment:197 by , 11 years ago
Just for convenience: the biggest outlier among the samples below is death2 (for v6 + comment:185, q=0.7 and q=1). It has the highest bitrate and encodes the slowest.
http://www.rarewares.org/test_samples/
by , 11 years ago
Attachment: | ffmpeg_aacvbr_degrade1.flac added |
---|
A sound that degrades on VBR. from GIZA studio Masterpiece BLEND 2001 Disc2 Track3 Stand Up (Mai Kuraki)
comment:198 by , 11 years ago
I encoded over 10,000 AAC mp4s, including really weird samples as the input, very small to very large volumes, totally odd settings, from 8kHz to 48kHz frequencies, many cutoff settings, and -q:a from -130 to 240. No apparent problem.
comment:199 by , 11 years ago
It doesn't encode the 7.1ch surround file from here.
http://www-mmsp.ece.mcgill.ca/documents/AudioFormats/WAVE/Samples.html
I used cbr and other 7.1ch files and the results are the same.
    ffmpeg_v6c185.exe -v 9 -loglevel 99 -y -i "8_Channel_ID.wav" -c:a aac -strict -2 -q:a 1 8chwav_q1.mp4
    ffmpeg version N-56667-g32cde96 Copyright (c) 2000-2013 the FFmpeg developers
      built on Sep 29 2013 23:03:27 with gcc 4.8.1 (GCC)
      configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libfdk-aac --enable-libfaac --enable-libmp3lame --extra-ldflags=-static --extra-cflags='-march=nocona -mfpmath=sse' --optflags=-O2
      libavutil 52. 46.100 / 52. 46.100
      libavcodec 55. 33.100 / 55. 33.100
      libavformat 55. 18.102 / 55. 18.102
      libavdevice 55. 3.100 / 55. 3.100
      libavfilter 3. 87.100 / 3. 87.100
      libswscale 2. 5.100 / 2. 5.100
      libswresample 0. 17.103 / 0. 17.103
      libpostproc 52. 3.100 / 52. 3.100
    Splitting the commandline.
    Reading option '-v' ... matched as option 'v' (set logging level) with argument '9'.
    Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument '99'.
    Reading option '-y' ... matched as option 'y' (overwrite output files) with argument '1'.
    Reading option '-i' ... matched as input file with argument '8_Channel_ID.wav'.
    Reading option '-c:a' ... matched as option 'c' (codec name) with argument 'aac'.
    Reading option '-strict' ... matched as AVOption 'strict' with argument '-2'.
    Reading option '-q:a' ... matched as option 'q' (use fixed quality scale (VBR)) with argument '1'.
    Reading option '8chwav_q1.mp4' ... matched as output file.
    Finished splitting the commandline.
    Parsing a group of options: global .
    Applying option v (set logging level) with argument 9.
    Applying option y (overwrite output files) with argument 1.
    Successfully parsed a group of options.
    Parsing a group of options: input file 8_Channel_ID.wav.
    Successfully parsed a group of options.
    Opening an input file: 8_Channel_ID.wav.
    [wav @ 016bf2e0] Format wav probed with size=2048 and score=99
    [wav @ 016bf2e0] File position before avformat_find_stream_info() is 128
    [wav @ 016bf2e0] parser not found for codec pcm_s24le, packets or times may be invalid.
    [pcm_s24le @ 02c666e0] Channel layout '5.1' with 6 channels does not match specified number of channels 8: ignoring specified channel layout
    [wav @ 016bf2e0] parser not found for codec pcm_s24le, packets or times may be invalid.
    [wav @ 016bf2e0] Probe buffer size limit of 5000000 bytes reached
    [wav @ 016bf2e0] File position after avformat_find_stream_info() is 5002208
    Guessed Channel Layout for Input Stream #0.0 : 7.1
    Input #0, wav, from '8_Channel_ID.wav':
      Duration: 00:00:08.05, bitrate: 9216 kb/s
        Stream #0:0, 1226, 1/48000: Audio: pcm_s24le ([1][0][0][0] / 0x0001), 48000 Hz, 7.1, s32, 9216 kb/s
    Successfully opened the file.
    Parsing a group of options: output file 8chwav_q1.mp4.
    Applying option c:a (codec name) with argument aac.
    Applying option q:a (use fixed quality scale (VBR)) with argument 1.
    Successfully parsed a group of options.
    Opening an output file: 8chwav_q1.mp4.
    Successfully opened the file.
    detected 8 logical cores
    [graph 0 input from stream 0:0 @ 02cd9ac0] Setting 'time_base' to value '1/48000'
    [graph 0 input from stream 0:0 @ 02cd9ac0] Setting 'sample_rate' to value '48000'
    [graph 0 input from stream 0:0 @ 02cd9ac0] Setting 'sample_fmt' to value 's32'
    [graph 0 input from stream 0:0 @ 02cd9ac0] Setting 'channel_layout' to value '0x63f'
    [graph 0 input from stream 0:0 @ 02cd9ac0] tb:1/48000 samplefmt:s32 samplerate:48000 chlayout:0x63f
    [audio format for output stream 0:0 @ 037acf80] Setting 'sample_fmts' to value 'fltp'
    [audio format for output stream 0:0 @ 037acf80] Setting 'sample_rates' to value '96000|88200|64000|48000|44100|32000|24000|22050|16000|12000|11025|8000|7350'
    [audio format for output stream 0:0 @ 037acf80] auto-inserting filter 'auto-inserted resampler 0' between the filter 'Parsed_anull_0' and the filter 'audio format for output stream 0:0'
    [AVFilterGraph @ 02d4fee0] query_formats: 4 queried, 6 merged, 3 already done, 0 delayed
    [auto-inserted resampler 0 @ 03672580] ch:8 chl:7.1 fmt:s32 r:48000Hz -> ch:8 chl:7.1 fmt:fltp r:48000Hz
    [aac @ 02c66ae0] Unsupported number of channels: 8
    Output #0, mp4, to '8chwav_q1.mp4':
        Stream #0:0, 0, 1/90000: Audio: aac, 48000 Hz, 7.1, fltp, 128 kb/s
    Stream mapping:
      Stream #0:0 -> #0:0 (pcm_s24le -> aac)
    Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height
    [AVIOContext @ 0390e260] Statistics: 0 seeks, 0 writeouts
    [AVIOContext @ 016bf900] Statistics: 5013504 bytes read, 0 seeks
by , 11 years ago
Attachment: | ffmpeg_aac_lead_voice.flac added |
---|
Degrades on FFmpeg aac encoder, both on vbr and abr. The original sound is very odd and may not be worthy to put a lot of effort improving it.
comment:200 by , 11 years ago
The sound sample above is from here.
http://www.hydrogenaudio.org/forums/index.php?showtopic=50056
I've uploaded 2 problematic samples for the v6 aac, but they are extreme exceptions, rarely happens in the real encoding situations. Generally, the v6 patch is a very good patch, the quality is satisfactory. If you upload the patch that prevents the speech bitrate bloat, I'll start testing. The opponents are here.
- v4 abr
- v7 abr
- v7 vbr
- fdk-aac cbr
- libfaac vbr
- ac3 cbr
- libmp3lame vbr
BTW, I prefer the name v7 rather than the v6b, even if the revision is minor. It's easier to explain.
comment:201 by , 11 years ago
FFmpeg crashes when the sampling rate is 7350 Hz, in both VBR and ABR.
It is a minor issue, but I'm reporting it in case you missed it.
It worked at 96000, 88200, 64000, 48000, 44100, 32000, 24000, 22050, 16000 and 8000 Hz.
ffmpeg56668.exe -v 9 -loglevel 99 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict -2 -ar 7350 -b:a 128k ffmpeg_aac_lead_voiceb128.mp4
ffmpeg version N-56667-g32cde96 Copyright (c) 2000-2013 the FFmpeg developers
  built on Sep 29 2013 23:03:27 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libfdk-aac --enable-libfaac --enable-libmp3lame --extra-ldflags=-static --extra-cflags='-march=nocona -mfpmath=sse' --optflags=-O2
  libavutil 52. 46.100 / 52. 46.100
  libavcodec 55. 33.100 / 55. 33.100
  libavformat 55. 18.102 / 55. 18.102
  libavdevice 55. 3.100 / 55. 3.100
  libavfilter 3. 87.100 / 3. 87.100
  libswscale 2. 5.100 / 2. 5.100
  libswresample 0. 17.103 / 0. 17.103
  libpostproc 52. 3.100 / 52. 3.100
Splitting the commandline.
Reading option '-v' ... matched as option 'v' (set logging level) with argument '9'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument '99'.
Reading option '-y' ... matched as option 'y' (overwrite output files) with argument '1'.
Reading option '-i' ... matched as input file with argument '\ffmpeg_aac_lead_voice.flac'.
Reading option '-c:a' ... matched as option 'c' (codec name) with argument 'aac'.
Reading option '-strict' ... matched as AVOption 'strict' with argument '-2'.
Reading option '-ar' ... matched as option 'ar' (set audio sampling rate (in Hz)) with argument '7350'.
Reading option '-b:a' ... matched as option 'b' (video bitrate (please use -b:v)) with argument '128k'.
Reading option 'ffmpeg_aac_lead_voiceb128.mp4' ... matched as output file.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 9.
Applying option y (overwrite output files) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input file ffmpeg_aac_lead_voice.flac.
Successfully parsed a group of options.
Opening an input file: ffmpeg_aac_lead_voice.flac.
[flac @ 0159f2e0] Format flac probed with size=2048 and score=50
[flac @ 0159f2e0] File position before avformat_find_stream_info() is 4374
[flac @ 030866e0] sample/frame number mismatch in adjacent frames
    Last message repeated 114 times
[flac @ 0159f2e0] max_analyze_duration 5000000 reached at 5015510 microseconds
[flac @ 0159f2e0] File position after avformat_find_stream_info() is 377856
Input #0, flac, from 'ffmpeg_aac_lead_voice.flac':
  Metadata:
    REPLAYGAIN_TRACK_PEAK: 0.67306519
    REPLAYGAIN_TRACK_GAIN: -4.18 dB
    REPLAYGAIN_ALBUM_PEAK: 0.67306519
    REPLAYGAIN_ALBUM_GAIN: -4.18 dB
    COMMENT : Encoded by FLAC v1.1.2a with FLAC Frontend v1.7.1
  Duration: 00:00:24.68, bitrate: 471 kb/s
    Stream #0:0, 50, 1/44100: Audio: flac, 44100 Hz, mono, s16
Successfully opened the file.
Parsing a group of options: output file ffmpeg_aac_lead_voiceb128.mp4.
Applying option c:a (codec name) with argument aac.
Applying option ar (set audio sampling rate (in Hz)) with argument 7350.
Applying option b:a (video bitrate (please use -b:v)) with argument 128k.
Successfully parsed a group of options.
Opening an output file: ffmpeg_aac_lead_voiceb128.mp4.
Successfully opened the file.
detected 8 logical cores
[graph 0 input from stream 0:0 @ 0159f100] Setting 'time_base' to value '1/44100'
[graph 0 input from stream 0:0 @ 0159f100] Setting 'sample_rate' to value '44100'
[graph 0 input from stream 0:0 @ 0159f100] Setting 'sample_fmt' to value 's16'
[graph 0 input from stream 0:0 @ 0159f100] Setting 'channel_layout' to value '0x4'
[graph 0 input from stream 0:0 @ 0159f100] tb:1/44100 samplefmt:s16 samplerate:44100 chlayout:0x4
[audio format for output stream 0:0 @ 03142120] Setting 'sample_fmts' to value 'fltp'
[audio format for output stream 0:0 @ 03142120] Setting 'sample_rates' to value '7350'
[audio format for output stream 0:0 @ 03142120] auto-inserting filter 'auto-inserted resampler 0' between the filter 'Parsed_anull_0' and the filter 'audio format for output stream 0:0'
[AVFilterGraph @ 030df400] query_formats: 4 queried, 6 merged, 3 already done, 0 delayed
[auto-inserted resampler 0 @ 0307a240] ch:1 chl:mono fmt:s16 r:44100Hz -> ch:1 chl:mono fmt:fltp r:7350Hz
[aac @ 030ed9a0] Too many bits per frame requested, clamping to max
How is the development going?
comment:202 by , 11 years ago
I think the tables at aacenc.c lines 106 and 133 lack an entry for 7350 Hz, which is the 13th sample rate. Each table has only 12 entries:
static const uint8_t *swb_size_1024[] = {
    swb_size_1024_96, swb_size_1024_96, swb_size_1024_64,
    swb_size_1024_48, swb_size_1024_48, swb_size_1024_32,
    swb_size_1024_24, swb_size_1024_24, swb_size_1024_16,
    swb_size_1024_16, swb_size_1024_16, swb_size_1024_8
};
static const uint8_t *swb_size_128[] = {
    /* the last entry on the following row is swb_size_128_64 but is a duplicate of swb_size_128_96 */
    swb_size_128_96, swb_size_128_96, swb_size_128_96,
    swb_size_128_48, swb_size_128_48, swb_size_128_48,
    swb_size_128_24, swb_size_128_24, swb_size_128_16,
    swb_size_128_16, swb_size_128_16, swb_size_128_8
};
The tables at aactab.c lines 40 to 48, on the other hand, do have a 13th entry for 7350 Hz:
const uint8_t ff_aac_num_swb_1024[] = {
    41, 41, 47, 49, 49, 51, 47, 47, 43, 43, 43, 40, 40
};
const uint8_t ff_aac_num_swb_512[] = {
    0, 0, 0, 36, 36, 37, 31, 31, 0, 0, 0, 0, 0
};
const uint8_t ff_aac_num_swb_128[] = {
    12, 12, 12, 14, 14, 14, 15, 15, 15, 15, 15, 15, 15
};
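To make the suspected out-of-range access concrete, here is a minimal sketch. It assumes the encoder picks a table entry by the position of the sample rate in the 'sample_rates' list printed in the log above. The names `aac_sample_rates`, `sample_rate_index` and `table_covers_rate` are made up for illustration and are not FFmpeg symbols.

```c
#include <stddef.h>

/* Sample rates in the order printed in the encoder's 'sample_rates' list;
 * the position in this array is used as the table index (assumption). */
static const int aac_sample_rates[13] = {
    96000, 88200, 64000, 48000, 44100, 32000, 24000,
    22050, 16000, 12000, 11025, 8000, 7350
};

/* Return the index of `rate` in the list, or -1 if it is not listed. */
int sample_rate_index(int rate)
{
    size_t n = sizeof(aac_sample_rates) / sizeof(aac_sample_rates[0]);
    for (size_t i = 0; i < n; i++)
        if (aac_sample_rates[i] == rate)
            return (int)i;
    return -1;
}

/* A per-rate table with `table_len` entries can serve `rate` only if
 * the rate's index falls inside the table. */
int table_covers_rate(size_t table_len, int rate)
{
    int idx = sample_rate_index(rate);
    return idx >= 0 && (size_t)idx < table_len;
}
```

Under this assumption, 7350 Hz maps to index 12, so a 12-entry table is read one element past its end, while a 13-entry table covers it.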
follow-up: 205 comment:203 by , 11 years ago
Sorry, I've been having trouble making good, provable progress with v6b/v7, especially since I've got a few deadlines coming that require a lot of my time.
I'll try to fix the crashing, and get performance comparable to v6 without the VBR constraints, which I noticed are hurting quality at high bit rates.
However, it seems the tendency to spend lots of bits on speech stems from psy itself, not anything else. It seems to estimate that speech has a lot of perceptual entropy. It may be true, but there is some inefficiency that's hard to fix without deviating from the standards.
So v7 will probably just address the bugs, crashing, some bugs in tonal band prioritization in short window blocks, and stuff like that.
About 7.1ch, there's another ticket for that, IIRC. While I will still take a look at the report in this ticket, I'd suggest you post the relevant bits on that ticket as well (for traceability, you know).
follow-up: 207 comment:204 by , 11 years ago
Replying to Kamedo2:
This document recommends using the -cutoff 15000 option. That is outdated: the cutoff has been applied automatically since July 2012.
http://ffmpeg.org/ffmpeg-codecs.html#aac
Hi, I wrote the documentation. Thanks for the report for that. However, I currently don't have any time to fix the doc. So it would be very kind of you to send a patch to ffmpeg-devel mailing list. Thanks.
comment:205 by , 11 years ago
Replying to klaussfreire:
However, it seems the tendency to spend lots of bits on speech stems from psy itself, not anything else. It seems to estimate that speech has a lot of perceptual entropy. It may be true, but there is some inefficiency that's hard to fix without deviating from the standards.
Allocating fewer bits to short frames may help. It should increase the quality, although a strong tonality estimator would be ideal.
comment:206 by , 11 years ago
With the following change to aacenc.c lines 106 and 133, encoding worked properly, and the resulting 7350 Hz AAC MP4 files were playable in FFmpeg, foobar2000 v1.2.9 and Media Player Classic. Decoding failed in faad and WMP.
static const uint8_t *swb_size_1024[] = {
    swb_size_1024_96, swb_size_1024_96, swb_size_1024_64,
    swb_size_1024_48, swb_size_1024_48, swb_size_1024_32,
    swb_size_1024_24, swb_size_1024_24, swb_size_1024_16,
    swb_size_1024_16, swb_size_1024_16, swb_size_1024_8,
    swb_size_1024_8
};
static const uint8_t *swb_size_128[] = {
    /* the last entry on the following row is swb_size_128_64 but is a duplicate of swb_size_128_96 */
    swb_size_128_96, swb_size_128_96, swb_size_128_96,
    swb_size_128_48, swb_size_128_48, swb_size_128_48,
    swb_size_128_24, swb_size_128_24, swb_size_128_16,
    swb_size_128_16, swb_size_128_16, swb_size_128_8,
    swb_size_128_8
};
follow-up: 208 comment:207 by , 11 years ago
Replying to Timothy_Gu:
Replying to Kamedo2:
This document recommends using the -cutoff 15000 option. That is outdated: the cutoff has been applied automatically since July 2012.
http://ffmpeg.org/ffmpeg-codecs.html#aac
Hi, I wrote the documentation. Thanks for the report for that. However, I currently don't have any time to fix the doc. So it would be very kind of you to send a patch to ffmpeg-devel mailing list. Thanks.
Patch sent. http://ffmpeg.org/pipermail/ffmpeg-devel/2013-October/149225.html
comment:208 by , 11 years ago
Patch sent. http://ffmpeg.org/pipermail/ffmpeg-devel/2013-October/149225.html
Thank you very much.
comment:209 by , 11 years ago
I'll be free from October 16th. klaussfreire, could you provide the current state of the encoder on the 16th? I'd be happier if it reduced the bitrate on speech samples, but even if it doesn't, I'll start the test.
comment:211 by , 11 years ago
Sorry, I had the intention of posting it this weekend, but I've been up to my ears in deadlines. Will see about posting it tonight.
by , 11 years ago
Attachment: | aac-improvements-wip-v7.patch added |
---|
v7 patch - mostly bugfixing on v6, but quite significant bugs - still incomplete (needs sample rate fixes and Mahler still sounds weird)
follow-up: 215 comment:212 by , 11 years ago
v7 patch is attached.
It's not committable or complete yet. I know that was the idea, but I ran out of time, and since you'll be free and needing improvements to test... well...
This patch fixes some important bugs regarding RD limit computation in transients. It also has a more robust tonality boost (form factor in this patch) method, which accounts for what psy already does (I noticed it does its own bit). In essence, it was necessary to actually count nonzero lines. This, I believe, is mostly what was wasting bits on speech. Speech still takes more bits, but less.
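As a rough illustration of what "counting nonzero lines" could mean, here is a minimal sketch. The patch itself is not reproduced here, so `count_nonzero_lines` and its `threshold` parameter are hypothetical, and treating "nonzero" as "magnitude at or above a threshold" is an assumption.

```c
#include <stddef.h>

/* Count the spectral lines in one band whose magnitude is at least
 * `threshold`; lines below it are treated as zero (hypothetical helper). */
size_t count_nonzero_lines(const float *coefs, size_t len, float threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        float mag = coefs[i] < 0.0f ? -coefs[i] : coefs[i];
        if (mag >= threshold)
            count++;
    }
    return count;
}
```

A band where only a few lines survive would then get a smaller count than its nominal width, which is the kind of information a tonality boost could take into account.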
Mahler (and, I guess, brass in general) still has audible artifacts. I think I know how to fix it (and it might indeed fix some other stuff), but I haven't had time to dig into it.
I haven't had time to look into the sample rate issue either. So maybe later; you've got it mostly done anyway ;)
comment:213 by , 11 years ago
klaussfreire,
Thank you for all the effort to improve FFmpeg. I've successfully patched and configured it.
comment:214 by , 11 years ago
comment:215 by , 11 years ago
Replying to klaussfreire:
v7 patch is attached.
It's not committable or complete yet. I know that was the idea, but I ran out of time, and since you'll be free and needing improvements to test... well...
Will it take long to make the committable version? If so, maybe I should start testing now. Otherwise, I can wait for a more complete version.
comment:216 by , 11 years ago
It'll take a while. I'm 1 week away from a conference that demands my full attention.
comment:217 by , 11 years ago
Then I'll start the test. BTW, the sine warbling problem #2706 reappeared in -q:a 0.1, 0.2, and 0.7. 50Hz and 7000Hz sine warbles with v7. Should I test v7, or v6? (+comment:185+comment:206)
comment:218 by , 11 years ago
I'll retest the sine when I've got time. But try to confirm it's the same problem as before, because what I noticed at random bitrates is clipping, not warbling. That's a different (harder) problem.
comment:219 by , 11 years ago
Seems like a hard problem. If you commit v7, people will have trouble with the unexpectedly large file sizes of speech content such as university lectures, and you probably can't fix that before the conference. Testing v6 may make more sense, probably with comment:185 and comment:206.
comment:220 by , 11 years ago
When encoding sine waves, v7 uses less than one third of the bitrate of v6 at -q:a 0.4 and -q:a 0.7.
Option | v6+c185+c206 | v7 |
---|---|---|
-q:a 0.4 | 79 | 22 |
-q:a 0.7 | 107 | 32 |
-b:a 64k | 52 | 55 |
-b:a 128k | 87 | 105 |
The result of sine_tester.flac in kbps.
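The "less than one third" claim can be checked directly from the table. The helper name `bitrate_ratio` below is only for illustration.

```c
/* Ratio of the v7 bitrate to the v6 bitrate for one row of the table. */
double bitrate_ratio(double v7_kbps, double v6_kbps)
{
    return v7_kbps / v6_kbps;
}
```

For the VBR rows this gives 22/79 ≈ 0.28 and 32/107 ≈ 0.30, both below one third. For the ABR rows v7 is slightly higher than v6: 55/52 ≈ 1.06 and 105/87 ≈ 1.21.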
by , 11 years ago
Attachment: | sine_tester.flac added |
---|
Sine waves for a warbling test. 50 440 1000 3000 7000 10000 20000Hz. 24bit 48kHz PCM.
follow-up: 222 comment:221 by , 11 years ago
What were the "quite significant bugs" of v6? I didn't find any problems in a non-blind listening test, and v6 was extensively tested on many songs, speech samples, TV sources, and artificial sounds, so I believe v6 is safe and stable.
comment:222 by , 11 years ago
Replying to Kamedo2:
What were the "quite significant bugs" of v6? I didn't find any problems in a non-blind listening test, and v6 was extensively tested on many songs, speech samples, TV sources, and artificial sounds, so I believe v6 is safe and stable.
Well, for one, holes (bands below the hearing threshold) would break a tonality boost loop, creating all sorts of issues, most notably when using short transform lengths, since the breakage would get carried over to other windows.
In essence, transients were broken. They still sounded alright most of the time, probably by pure chance (i.e. maybe there were no holes). But signals like Mahler, castanets and harpsichords tended to expose the bugs at lower bit rates.
comment:224 by , 11 years ago
Feel free to choose. I don't think I can go back to v6, but if v6 performs better than v7, it's something I'll have to account for.
comment:225 by , 11 years ago
follow-up: 228 comment:226 by , 11 years ago
Sorry to interrupt this awesome thread but... are there any known issues with the current patch that are not present in the current encoder?
Wouldn't it be nice to apply the current patch and move on that basis?
The patch is getting huge, and the more you wait, the less relevant the review on ffmpeg-devel will be.
comment:227 by , 11 years ago
Also, the patch should be split into self-contained fixes (1 issue == 1 patch) when it's submitted to ffmpeg-devel.
We can apply a huge monolithic patch too if that's the only option, but it will give everyone working on AAC headaches (that is, all of you) when there's a regression and git bisect ends up just pointing to a huge all-in-one change.
comment:228 by , 11 years ago
Replying to ubitux:
Sorry to interrupt this awesome thread but... are there any known issues with the current patch that are not present in the current encoder?
Not many, but I can think of:
- In VBR encoding, speech takes many bits and music takes fewer. It should be the music that gets the bits, since music quality is what more people care about.
- The encoding is a bit slower.
According to klaussfreire, v7, the version currently being evaluated in a blind test, is "not committable or complete yet."
follow-up: 230 comment:229 by , 11 years ago
I always intended to push this forward, when testing is done, as a series of smaller patches. I'm not sure I can split all of it, but there are quite a few worthy split points.
I think the main issue with the current patch is overall untidiness, with dead code left over from earlier attempts at some solutions, for instance. That's what I think makes it not committable yet; it has to be cleaned up.
Plus, there are quite a few magic numbers that ought to be tunables.
The slowness can be fixed later, encoding quality I believe being more important than speed, if needed. It's not that much slower anyway, still faster than realtime. VBR wasn't even working before, so even if imperfect, any improvements to VBR are commitable.
comment:230 by , 11 years ago
Replying to klaussfreire:
I always intended to push this forward, when testing is done, as a series of smaller patches. I'm not sure I can split all of it, but there are quite a few worthy split points.
sounds good, that resolves my concerns
thanks
comment:231 by , 11 years ago
Woops. I just fixed M/S encoding. Had to tell ;)
I was wondering whether your tests with faac/fdk used M/S encoding?
comment:232 by , 11 years ago
I used faac 1.28 and fdk 0.1.2 but how should I check? I didn't use any extra options.
comment:233 by , 11 years ago
Well, faac has a --no-midside, so I'd venture to guess that the default is to do M/S coding.
comment:234 by , 11 years ago
I checked the document (fdk-aac-0.1.2/documentation/aacEncoder.pdf), and it said:
3.3 Encoder Tools
The AAC encoder supports TNS, PNS, MS, Intensity and activates these tools depending on the audio signal and the encoder configuration (i.e. bitrate or AOT). It is not required to configure these tools manually.
comment:235 by , 11 years ago
I'm close to posting a v8. Main improvement in v8 is, besides various subtle but significant bug fixes, that M/S coding properly works and, I believe, is robust enough to be on by default.
But v8 won't have it on by default just yet. So, when I post it (I'm performing a last round of listening tests), be sure to run it with -stereo_mode auto to get results comparable to the other contenders.
I just wanted to post the progress report early since this ticket has been silent for a while ;)
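For readers unfamiliar with M/S (mid/side) coding, here is a minimal sketch of the underlying transform on a single sample pair. It shows only the general idea, not the decision logic or the band-wise processing in the patch; the function names are made up for illustration.

```c
/* Convert one left/right sample pair to mid/side. */
void lr_to_ms(float l, float r, float *m, float *s)
{
    *m = 0.5f * (l + r);
    *s = 0.5f * (l - r);
}

/* Inverse transform: recover left/right from mid/side. */
void ms_to_lr(float m, float s, float *l, float *r)
{
    *l = m + s;
    *r = m - s;
}
```

When the two channels are nearly identical, the side signal is close to zero and cheap to code, which is why an encoder without working M/S is at a disadvantage on such material.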
comment:236 by , 11 years ago
The listening test of v4 abr, v7 abr, v7 vbr, libfaac vbr, fdk-aac cbr, libmp3lame vbr, ac3 cbr (7 encoders) at 128kbps is ongoing, and I've done 10 samples out of 25 samples (40%).
Will your patch v8 properly address the sine warbling problem #2706?
comment:237 by , 11 years ago
Sadly, no. I thought so, but further tests (at various bitrates) show saturation. Not the same issue as the original in the OP, but an issue nevertheless.
I have only attacked M/S coding and some bit allocation inefficiencies. In initial tests this seemed to help with those issues, since tonal band encoding improved significantly, but it seems it's still not enough to avoid clipping due to quantization noise.
I may have a relatively easy fix for it. Since the original signal is already saturated, any quantization noise risks the same artifacts, but I believe there is a simple fix (and probably the only one): tweak the rounding of strong signals so that they round towards zero instead of to nearest.
That needs careful calibration, however, to avoid changing behavior on non-clipping signals (since rounding towards zero generally increases quantization noise, i.e. lowers SNR). I'll be travelling soon and won't be able to work much on it until Jan 1st.
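The difference between the two rounding modes can be sketched with a plain uniform quantizer. This is a simplification: the real AAC quantizer is nonuniform, and the function names here are hypothetical.

```c
/* Quantize x to an integer multiple of `step`, rounding to nearest. */
int quantize_nearest(float x, float step)
{
    float q = x / step;
    return (int)(q >= 0.0f ? q + 0.5f : q - 0.5f);
}

/* Quantize x, rounding toward zero: the reconstructed magnitude
 * (result * step) never exceeds the magnitude of x. */
int quantize_toward_zero(float x, float step)
{
    return (int)(x / step); /* float-to-int conversion truncates in C */
}
```

With x = 0.9 and step = 0.25, rounding to nearest reconstructs 1.0, which overshoots the input, while rounding toward zero reconstructs 0.75. The second never overshoots, at the cost of a larger worst-case error.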
comment:238 by , 11 years ago
You mean the clipping of the output PCM? LAME solves it by reducing the gain to around 98%.
comment:239 by , 11 years ago
The listening test of v4 abr, v7 abr, v7 vbr, libfaac vbr, fdk-aac cbr, libmp3lame vbr, ac3 cbr (7 encoders) at 128kbps is ongoing, and I've done 18 samples out of 25 samples (72%).
comment:240 by , 11 years ago
I have just finished the blind listening test and the result is here. http://www.hydrogenaudio.org/forums/index.php?showtopic=104471
comment:241 by , 11 years ago
Alright. I'm still not done with the clipping issue (it's proving to be more challenging than I thought). Though I do believe v8 will change things, because of M/S encoding (which all the other encoders you're testing I think use, except ac3, putting ffmpeg's AAC at a significant disadvantage).
comment:242 by , 11 years ago
How about artificially lowering the noise level of the loudest bin a bit?
comment:243 by , 11 years ago
No, it's not the loudest bin. It's a group of bins. The sine isn't represented in the MDCT by a single bin, but rather by a ripple pattern that has to be accurately encoded, or the resulting sine fluctuates in amplitude (hence the clipping).
But I've got a solution. I'm doing the listening tests now, but it sounds good. Basically, a two-front approach: lower the volume of near-clipping windows (and only those, to make it idempotent), and tweak tonal band prioritization (because the sine wave was getting too few bits after all).
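The first half of that approach might look roughly like the sketch below. It is not the code from the patch: `window_peak`, `attenuate_if_near_clipping` and the `limit` parameter are hypothetical names, and the patch may well detect and scale windows differently.

```c
#include <stddef.h>

/* Largest absolute sample value in the window. */
float window_peak(const float *w, size_t len)
{
    float peak = 0.0f;
    for (size_t i = 0; i < len; i++) {
        float a = w[i] < 0.0f ? -w[i] : w[i];
        if (a > peak)
            peak = a;
    }
    return peak;
}

/* Scale the window down only if its peak exceeds `limit`.
 * Returns 1 if the window was attenuated, 0 if it was left untouched. */
int attenuate_if_near_clipping(float *w, size_t len, float limit)
{
    float peak = window_peak(w, len);
    if (peak <= limit)
        return 0;
    float gain = limit / peak;
    for (size_t i = 0; i < len; i++)
        w[i] *= gain;
    return 1;
}
```

Because windows at or below the limit are never touched, running the function again on an already attenuated window is a no-op, which is the idempotence mentioned above. In floating point the rescaled peak can land a hair above the limit, so a real implementation would probably want a small safety margin.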
comment:246 by , 11 years ago
Patience. I just want to make sure there's no serious regression before posting it.
comment:247 by , 11 years ago
Good thing I checked for regressions, because I found a huge one (regarding the bitrate curves that are so common in this ticket)
comment:249 by , 11 years ago
I've got this error:
$ make
CC libavcodec/aaccoder.o
libavcodec/aaccoder.c: In function 'search_for_quantizers_twoloop':
libavcodec/aaccoder.c:867:40: error: 'AACEncContext' has no member named 'cur_type'
 if (s->options.stereo_mode && s->cur_type == TYPE_CPE)
 ^
libavcodec/aaccoder.c:814:44: warning: variable 'energies' set but not used [-Wunused-but-set-variable]
 float dists[128] = { 0 }, uplims[128], energies[128];
 ^
libavcodec/aaccoder.c: At top level:
libavcodec/aaccoder.c:1527:9: warning: initialization from incompatible pointer type [enabled by default]
 quantize_and_encode_band,
 ^
libavcodec/aaccoder.c:1527:9: warning: (near initialization for 'ff_aac_coders[0].quantize_and_encode_band') [enabled by default]
libavcodec/aaccoder.c:1533:9: warning: initialization from incompatible pointer type [enabled by default]
 quantize_and_encode_band,
 ^
libavcodec/aaccoder.c:1533:9: warning: (near initialization for 'ff_aac_coders[1].quantize_and_encode_band') [enabled by default]
libavcodec/aaccoder.c:1539:9: warning: initialization from incompatible pointer type [enabled by default]
 quantize_and_encode_band,
 ^
libavcodec/aaccoder.c:1539:9: warning: (near initialization for 'ff_aac_coders[2].quantize_and_encode_band') [enabled by default]
libavcodec/aaccoder.c:1545:9: warning: initialization from incompatible pointer type [enabled by default]
 quantize_and_encode_band,
 ^
libavcodec/aaccoder.c:1545:9: warning: (near initialization for 'ff_aac_coders[3].quantize_and_encode_band') [enabled by default]
libavcodec/aaccoder.c:366:14: warning: 'find_max_absval' defined but not used [-Wunused-function]
 static float find_max_absval(int group_len, int swb_size, const float *scaled) {
 ^
make: *** [libavcodec/aaccoder.o] Error 1
by , 11 years ago
Attachment: | aac-improvements-wip-v8.patch added |
---|
v8 patch - tweaked tonal band prioritization, especially in transients, fixed M/S encoding and made it the default, and fixed other assorted bugs. Added missing include changes.
comment:252 by , 11 years ago
by , 11 years ago
Attachment: | Whitenoise_left.flac added |
---|
Whitenoise.flac without the sound of right channel. A strange noise appears in the center in v8.
follow-up: 254 comment:253 by , 11 years ago
The problem of sound partially disappearing, like comment:12, reappeared in v8 in 320kbps ABR.
comment:254 by , 11 years ago
Replying to Kamedo2:
The problem of sound partially disappearing, like comment:12, reappeared in v8 in 320kbps ABR.
On which sample?
I tried most, although only up to 256kbps
comment:255 by , 11 years ago
ExitMusic at 256 kbps ABR
http://web.archive.org/web/*/ff123.net/samples/ExitMusic.flac
Greatest_Love_of_All_2min57.flac at 256 kbps ABR
http://www.hydrogenaudio.org/forums/index.php?showtopic=103989&st=0
by , 11 years ago
Attachment: | ffmpeg_aac256k_degrade.flac added |
---|
The sound degrades with v8 at around 256kbps. The right channel suffers most. From Kohmi Hirose, GIFT/Ai wa tokkoyaku, Track 3.
comment:257 by , 11 years ago
Oh, it's not the same bug at all. It's a remaining bug in M/S encoding. I'll see what I can do about it, but it's not a regression by any means (phew).
comment:258 by , 11 years ago
The stereo images of Mama.wv and ItCouldBeSweet.wv are strange in v8 ABR at 192kbps. http://www.rarewares.org/test_samples/
comment:259 by , 11 years ago
It's probably the same bug, it happens sometimes with short windows I noticed.
comment:260 by , 11 years ago
I've successfully encoded over 10GB of AAC files, but the stereo images of some vocal content are strange at 192kbps. The bug rarely happens at 160kbps.
by , 11 years ago
Attachment: | ItCouldBeSweet.ffv8_128k.diff.flac added |
---|
The diff of the ItCouldBeSweet, before and after the v8 AAC encode, 128kbps.
by , 11 years ago
Attachment: | ItCouldBeSweet.ffv8_192k.diff.flac added |
---|
The diff of the ItCouldBeSweet, before and after the v8 AAC encode, 192kbps.
by , 11 years ago
Attachment: | ItCouldBeSweet.ffv8_320k.diff.flac added |
---|
The diff of the ItCouldBeSweet, before and after the v8 AAC encode, 320kbps.
by , 11 years ago
Attachment: | ItCouldBeSweet.ffv8_q1.5.diff.flac added |
---|
The diff of the ItCouldBeSweet, before and after the v8 AAC encode, quality option -q:a 1.5
comment:262 by , 11 years ago
I uploaded the diffs (errors) of the v8 encoder. The original ItCouldBeSweet.wv is available here: http://www.rarewares.org/test_samples/
comment:263 by , 11 years ago
I'd like to see the patch committed, but is there any progress? If the remaining bug of v8 is hard to fix, maybe we should consider pushing v7. It's stable.
comment:264 by , 11 years ago
It was a very simple thing, but very hard to find.
Anyway, fixed (I hope). I'm testing a v8b patch now; we'll see if it passes the test.
by , 11 years ago
Attachment: | aac-improvements-wip-v8-fix.patch added |
---|
Cumulative patch over v8 to fix M/S coding
comment:266 by , 11 years ago
I just added a cumulative patch to fix v8's M/S coding, you're very welcome to test it.
comment:267 by , 11 years ago
The stereo images are still somewhat buggy, especially in ItCouldBeSweet.wv
I'll check the source code to see if there is anything I can do.
comment:268 by , 11 years ago
I tested ItCouldBeSweet, and it sounds ok here.
Which parameters did you use for the encoding?
comment:269 by , 11 years ago
The bug is most obvious at -b:a 224k. The file is 672,362 Bytes, 267 kbps.
ffmpeg_2.1v8fix -y -i ItCouldBeSweet.wav -vn -c:a aac -strict -2 -b:a 224k ItCouldBeSweet_224k.mp4
ffmpeg version 2.1.git Copyright (c) 2000-2014 the FFmpeg developers
  built on May 3 2014 15:51:52 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --extra-ldflags=-static --extra-cflags='-march=native -mfpmath=sse' --optflags=-O2
  libavutil 52. 63.101 / 52. 63.101
  libavcodec 55. 52.101 / 55. 52.101
  libavformat 55. 32.101 / 55. 32.101
  libavdevice 55. 9.101 / 55. 9.101
  libavfilter 4. 1.102 / 4. 1.102
  libswscale 2. 5.101 / 2. 5.101
  libswresample 0. 17.104 / 0. 17.104
  libpostproc 52. 3.100 / 52. 3.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'ItCouldBeSweet.wav':
  Duration: 00:00:20.02, bitrate: 1411 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s
Output #0, mp4, to 'ItCouldBeSweet.ffv8f_224k.mp4':
  Metadata:
    encoder : Lavf55.32.101
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 224 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le -> aac)
Press [q] to stop, [?] for help
by , 11 years ago
Attachment: | ItCouldBeSweet.qaac_cvbr128k.diff.flac added |
---|
Just for comparison. The diff of the ItCouldBeSweet, between the original and qaac encode, 128kbps.
by , 11 years ago
Attachment: | ItCouldBeSweet.fdk_128k.diff.flac added |
---|
Just for comparison. The diff of the ItCouldBeSweet, between the original and FDK-AAC encode, 128kbps.
comment:270 by , 11 years ago
You seem to have either a bad patch or a misapplied patch, because that's the bug I solved, and in my build it works fine. Let me test the patch...
comment:271 by , 11 years ago
The aac-improvements-wip-v8-fix.patch is very short. Please confirm that the patch is everything I need.
I checked the source and the patch seems to be properly applied.
The v8 diff and the v8_fix diff sound different, and the artifact in v8_fix is less severe.
comment:272 by , 11 years ago
I tried from the current git head, and the v8 patch seems to fail.
v8-fix patch is alright.
$ patch -p1 < aac-improvements-wip-v8.patch
patching file libavcodec/aac.h
patching file libavcodec/aaccoder.c
patching file libavcodec/aacenc.c
patching file libavcodec/aacenc.h
patching file libavcodec/aacpsy.c
patching file libavcodec/psymodel.c
Hunk #1 FAILED at 101.
1 out of 1 hunk FAILED -- saving rejects to file libavcodec/psymodel.c.rej
patching file libavcodec/psymodel.h
$ patch -p1 < aac-improvements-wip-v8-fix.patch
patching file libavcodec/aaccoder.c
$ ./configure --enable-gpl --enable-version3 --enable-nonfree --enable-libfdk-aac --enable-libmp3lame --enable-libfaac --enable-libvo-aacenc --extra-ldflags=-static --extra-cflags='-march=native -mfpmath=sse' --optflags=-O2
comment:273 by , 11 years ago
Sounds like you're supposed to apply v8 first and then v8-fix.
The sources may have changed so that v8 needs a bit of fixing to apply cleanly again.
comment:274 by , 11 years ago
That's the problem, you had to apply both. I'm going to upload a combined and rebased patch (v8 indeed doesn't apply cleanly on git head).
comment:275 by , 11 years ago
Alright, uploaded the rebased and combined v8f patch.
I noticed, there's an issue with CBR now, that needs a bigger refactoring to be fixable. Will have to do that as a further patch down the line.
comment:276 by , 11 years ago
A new issue, or just an existing problem? CBR/ABR is IMHO far more important than the VBR modes, since it's much more commonly used. If the patch improves VBR but causes issues for CBR, that's bad.
follow-up: 280 comment:277 by , 11 years ago
It's not a serious one. It's just that it won't target the bitrate so accurately as before, only when using M/S coding (which didn't even work before). So you could say it's not a regression, because the case that doesn't properly target the bitrate didn't even work before, it's overall better.
I can fix it (and I might post a patch fixing it), but I'm not sure how long it will take, so I don't want to delay committing these advances even more just for this that isn't even a regression.
comment:280 by , 11 years ago
Replying to klaussfreire:
I don't want to delay committing these advances even more just for this that isn't even a regression.
Then please send your patch to the developer mailing list so Michael can apply it (or point to your public git repository so the change can be merged). You both have done an enormous amount of work, I believe there is no reason to delay the results anymore.
Please make a reference to this ticket part of the commit message.
follow-up: 284 comment:281 by , 11 years ago
Results of a quick analysis, using 17 music tracks:
Default ABR 128kbps
Min / Average / Max bitrate: 129 / 130 / 131 kbps
Average Speed: 12.9x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -b:a 128k %o
This sounds great for 128 kbps.
Default VBR q1
Min / Average / Max bitrate: 169 / 192 / 211 kbps
Average Speed: 14.1x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -q:a 1 %o
Sounds very clear, but not something expected from 192 kbps.
ABR with ms_off
Min / Average / Max bitrate: 128 / 129 / 131 kbps
Average Speed: 14.1x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -b:a 128k -aac_coder anmr %o
Sounds bad when transients exist.
ABR with -aac_coder fast
Min / Average / Max bitrate: 128 / 134 / 141 kbps
Average Speed: 28.0x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -b:a 128k -aac_coder fast %o
This is fast, but with very bad quality. I don't know if there's good use of this option.
ABR with -aac_coder anmr
Min / Average / Max bitrate: 130 / 134 / 141 kbps
Average Speed: 8.0x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -b:a 128k -aac_coder anmr %o
Sounds bad when transients exist. Noticeably worse than the default ABR, and it's slow.
VBR with -aac_coder anmr
Min / Average / Max bitrate: 172 / 191 / 213 kbps
Average Speed: 7.5x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -q:a 1 -aac_coder anmr %o
Sounds slightly bad when transients exist. It's slow.
follow-up: 283 comment:282 by , 11 years ago
I think we should redirect users to 8k if the -b:a value is less than 8k.
Old versions of FFmpeg took the bitrate in units of kbps, so a novice user of modern FFmpeg may set the bitrate like -b:a 128. If the output is at least audible, the user may notice that the bitrate is set too low.
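A minimal sketch of the suggested behaviour follows. It is only an illustration of the proposal, not existing FFmpeg code; `MIN_AUDIO_BITRATE` and `redirect_low_bitrate` are hypothetical names.

```c
#define MIN_AUDIO_BITRATE 8000 /* 8k, the floor suggested above (bits per second) */

/* Return the bitrate to use: requests below the floor are raised to it,
 * everything else is passed through unchanged. */
int redirect_low_bitrate(int requested_bps)
{
    return requested_bps < MIN_AUDIO_BITRATE ? MIN_AUDIO_BITRATE
                                             : requested_bps;
}
```

With this, a mistaken -b:a 128 would be treated as 8000 bits per second, while -b:a 128k would be left alone.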
comment:283 by , 11 years ago
Replying to Kamedo2:
I think we should redirect users to 8k if the -b:a value is less than 8k.
This is not done for any other codec.
The ancient FFmpeg set bitrate by the unit of kbps
I just tested a five year old version and the unit is bps.
so a novice user of modern FFmpeg may set bitrate like -b:a 128. If something sounds, the user may notice that the bitrate set is too low.
Current FFmpeg prints a warning.
Since this is unrelated, please don't let this delay the patch.
comment:284 by , 11 years ago
Replying to Kamedo2:
ABR with -aac_coder fast
Min / Average / Max bitrate: 128 / 134 / 141 kbps
Average Speed: 28.0x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -b:a 128k -aac_coder fast %o
This is fast, but with very bad quality. I don't know if there's good use of this option.
It is supposed to sound bad: http://ffmpeg.org/ffmpeg-codecs.html#Options-2:
This method sets a constant quantizer for all bands. This is the fastest of all the methods, yet produces the worst quality.
Also, can you test -coder twoloop?
comment:285 by , 11 years ago
ABR with -aac_coder twoloop
Min / Average / Max bitrate: 129 / 130 / 131 kbps
Average Speed: 13.6x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -b:a 128k -aac_coder twoloop %o
This sounds great.
ABR with -aac_coder faac
Min / Average / Max bitrate: 133 / 157 / 210 kbps
Average Speed: 12.0x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -b:a 128k -aac_coder faac %o
This has the worst quality. Long windows seem to have a 2~3kHz LPF and short windows seem to have no LPF.
VBR with -aac_coder twoloop
Min / Average / Max bitrate: 111 / 129 / 146 kbps
Average Speed: 15.7x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -q:a 0.7 -aac_coder twoloop %o
This sounds great.
VBR with -aac_coder faac
Min / Average / Max bitrate: 213 / 259 / 311 kbps
Average Speed: 11.1x
ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -q:a 0.7 -aac_coder faac %o
Abysmal quality, comparable to ABR with -aac_coder faac, but with more bitrate.
by , 11 years ago
Attachment: | ItCouldBeSweet.ffv8f_128k.diff.flac added |
---|
The diff of the ItCouldBeSweet, between the original and the patch v8f AAC encode, 128kbps.
by , 11 years ago
Attachment: | ItCouldBeSweet.ffv8f_192k.diff.flac added |
---|
The diff of the ItCouldBeSweet, between the original and the patch v8f AAC encode, 192kbps.
by , 11 years ago
Attachment: | ItCouldBeSweet.ffv8f_320k.diff.flac added |
---|
The diff of the ItCouldBeSweet, between the original and the patch v8f AAC encode, 320kbps.
comment:287 by , 11 years ago
I've only worked on twoloop (which is the default).
My next step is working on anmr, which can outperform twoloop if done well.
comment:288 by , 11 years ago
I'm close to finishing testing of a better patch for ABR's M/S bug. All that remains is confirming it fixes the above diffs, and I'll upload it and we can move on.
comment:289 by , 11 years ago
Please consider adding support for 7350Hz as in comment:206.
libavcodec/aacenc.c line 107 & 134 should be
static const uint8_t *swb_size_1024[] = {
    swb_size_1024_96, swb_size_1024_96, swb_size_1024_64,
    swb_size_1024_48, swb_size_1024_48, swb_size_1024_32,
    swb_size_1024_24, swb_size_1024_24, swb_size_1024_16,
    swb_size_1024_16, swb_size_1024_16, swb_size_1024_8,
    swb_size_1024_8
};
static const uint8_t *swb_size_128[] = {
    /* the last entry on the following row is swb_size_128_64 but is a duplicate of swb_size_128_96 */
    swb_size_128_96, swb_size_128_96, swb_size_128_96,
    swb_size_128_48, swb_size_128_48, swb_size_128_48,
    swb_size_128_24, swb_size_128_24, swb_size_128_16,
    swb_size_128_16, swb_size_128_16, swb_size_128_8,
    swb_size_128_8
};
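Each table in this comment has 13 entries, one per MPEG-4 sampling-frequency index, and 7350 Hz is the last index (12). As a minimal sketch of that mapping, assuming a hypothetical helper named `sample_rate_index` (not a function in libavcodec):

```c
#include <stddef.h>

/* MPEG-4 Audio sampling-frequency table, indices 0..12. */
static const int sample_rates[13] = {
    96000, 88200, 64000, 48000, 44100, 32000, 24000,
    22050, 16000, 12000, 11025,  8000,  7350
};

/* Hypothetical helper: return the table index for an exact sample rate,
 * or -1 if the rate is not one of the 13 standard values. */
static int sample_rate_index(int rate)
{
    for (size_t i = 0; i < sizeof(sample_rates) / sizeof(sample_rates[0]); i++)
        if (sample_rates[i] == rate)
            return (int)i;
    return -1;
}
```

With a 12-entry swb table, an index of 12 would read past the end, which is presumably why both tables need the extra entry.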
comment:292 by , 10 years ago
I am assuming that this patch is meant to be applied to FFmpeg git master soon, the following is partly necessary, partly just a suggestion:
- The patch contains trailing white space, this cannot be pushed to our git repository, please remove it. tools/patcheck can help you find such issues.
- Please either remove all printf's or make them av_log's.
- The function sqrf() is duplicated iiuc, please move it to a header (if the function is necessary).
- There are three or four blocks where you just reindent existing code. It makes reading your patch (in the future) easier if you don't reindent them right now in the same commit, just leave them where they are. I can do the reindent for you (or you can send a followup patch).
- And finally (purely optional):
Using the following
if (condition) {
    do1;
} else {
    do2;
}
instead of
if (condition)
    do1;
else
    do2;
has the advantage that future changes are smaller and easier to read (this point is of course up to you, it is your code).
If you want me to make any of these changes and attach the result here, please say so!
comment:293 by , 10 years ago
follow-ups: 295 297 comment:294 by , 10 years ago
@cehoyos: this patch makes ANMR worse than default twoloop, even it is theoretically better and takes more time. While Claudio expresses interest to work on ANMR later, I don't think committing a patch that makes something worse than they should be is a good idea.
Other than that, @klaussfreire, if the patch was to be applied to master, you could split this patch to at least two patches. The change of default to M/S encoding should also be documented somehow.
follow-up: 296 comment:295 by , 10 years ago
Replying to Timothy_Gu:
@cehoyos: this patch makes ANMR worse than default twoloop, even it is theoretically better and takes more time. While Claudio expresses interest to work on ANMR later, I don't think committing a patch that makes something worse than they should be is a good idea.
IMHO, at this point, the question is not whether the ANMR coding is worse than it should be but whether it makes it worse than it currently is.
If we blocked patches because something could be done even better, then the only acceptable patch series would be “[PATCH 0/85042] Make FFmpeg the ultimate multimedia software”.
As I understand, this patch makes some modes work much better than now, with very little or no degradation on the little that did work: in my book, this is very good for inclusion. Knowing ways of making even better is good too, but for later patches.
comment:296 by , 10 years ago
Replying to Cigaes:
Replying to Timothy_Gu:
@cehoyos: this patch makes ANMR worse than default twoloop, even it is theoretically better and takes more time. While Claudio expresses interest to work on ANMR later, I don't think committing a patch that makes something worse than they should be is a good idea.
IMHO, at this point, the question is not whether the ANMR coding is worse than it should be but whether it makes it worse than it currently is.
If we blocked patches because something could be done even better, then the only acceptable patch series would be “[PATCH 0/85042] Make FFmpeg the ultimate multimedia software”.
As I understand, this patch makes some modes work much better than now, with very little or no degradation on the little that did work: in my book, this is very good for inclusion. Knowing ways of making even better is good too, but for later patches.
I agree with you. However this behavior now contradicts the behavior originally in the documentation, which should be either fixed in the code or documented.
On a side note, can anyone check if ANMR with patch is better than without? If so then I have no problem landing the patch (except its nits) with the documentation changes.
comment:297 by , 10 years ago
Replying to cehoyos:
I am assuming that this patch is meant to be applied to FFmpeg git master soon, the following is partly necessary, partly just a suggestion:
- The patch contains trailing white space, this cannot be pushed to our git repository, please remove it.
tools/patcheck
can help you finding such issues.
Forgot about that. Will do. But this patch is only a POC, I'll re-do the changes (verbatim, but in steps) and post them as separate, progressive patches.
- Please either remove all printf's or make them av_log's.
Of course.
- The function sqrf() is duplicated iiuc, please move it to a header (if the function is necessary).
Ok.
- There are three or four blocks where you just reindent existing code. It makes reading your patch (in the future) easier if you don't reindent them right now in the same commit, just leave them where they are. I can do the reindent for you (or you can send a followup patch).
I could do the patch with -w to make it not diff whitespace-only. Does that work equally well? (there's a lot of code that really needs reindenting or it becomes unreadable). I could post both (with and without -w).
- And finally (purely optional):
Using the following
if (condition) {
    do1;
} else {
    do2;
}
instead of
if (condition)
    do1;
else
    do2;
has the advantage that future changes are smaller and easier to read (this point is of course up to you, it is your code).
Surely. Easy enough to change.
Replying to Timothy_Gu:
@cehoyos: this patch makes ANMR worse than default twoloop, even it is theoretically better and takes more time. While Claudio expresses interest to work on ANMR later, I don't think committing a patch that makes something worse than they should be is a good idea.
What I could do is try to get rid of a common hole-avoidance bug that makes all the other coders much worse. It shouldn't be hard. Other than that, I think only updating the doc is pertinent to this patch set.
Other than that, @klaussfreire, if the patch was to be applied to master, you could split this patch to at least two patches. The change of default to M/S encoding should also be documented somehow.
Of course. Bugfixes first, improvements next. I need to update my A/B testing script, so I can run all patches through it - especially bugfixes, where automated A/B testing does work.
comment:298 by , 10 years ago
It seems that ANMR just needs a larger search space. Expanding TRELLIS_STATES to 121, and fixing path construction to respect SCALE_MAX_DIFF of course, doubles the time taken (ouch), but it does improve the quality considerably. I think it's now just a matter of coding the same tonal prioritizations and making sure it works well with VBR, and ANMR is probably as good as done.
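To illustrate what "respecting SCALE_MAX_DIFF" means for a path of scalefactors, here is a minimal sketch. It assumes SCALE_MAX_DIFF is 60 (as I believe it is in libavcodec/aac.h) and uses a hypothetical helper, `clamp_sf_path`. It clamps a finished path after the fact. A real trellis would instead restrict transitions during the search, so this is not what the patch does.

```c
#define SCALE_MAX_DIFF 60 /* assumption: same value as in libavcodec/aac.h */

/* Hypothetical helper: walk the bands in coding order and pull each
 * scalefactor to within +/-SCALE_MAX_DIFF of the previous one.
 * Returns the number of bands that had to be adjusted. */
static int clamp_sf_path(int *sf_idx, int n)
{
    int adjusted = 0;
    for (int i = 1; i < n; i++) {
        int lo = sf_idx[i - 1] - SCALE_MAX_DIFF;
        int hi = sf_idx[i - 1] + SCALE_MAX_DIFF;
        if (sf_idx[i] < lo) {
            sf_idx[i] = lo;
            adjusted++;
        } else if (sf_idx[i] > hi) {
            sf_idx[i] = hi;
            adjusted++;
        }
    }
    return adjusted;
}
```

For the sequence 124, 186, 156, 125 only the second band is out of reach (a jump of 62), so it is pulled down to 184 and the rest is left alone.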
comment:299 by , 10 years ago
I'm about to start a preliminary listening test of:
- FAAC abr 96k.mp4
- FAAC vbr q30(~48kbps).mp4
- FFmpeg native mp2 encoder 96k.mp2
- vo-aacenc 0.1.3 abr 96k.mp4
- Bladeenc 96k.mp3
- FFmpeg native AAC encoder 96k.mp4
- FFmpeg native AAC encoder+v8g patch 96k.mp4
using first 15 samples of a 2011 public multiformat listening test (30 samples).
I guess the v8g at 96kbps beats the FAAC at 96kbps, because the FAAC is quite bad at lower bitrates.
comment:300 by , 10 years ago
Because of my simple mistake, I'm testing these 6 encoders with a duplicate:
- FAAC abr 96k.mp4
- FAAC vbr q30(~48kbps).mp4
- FAAC vbr q30(~48kbps).mp4
- vo-aacenc 0.1.3 abr 96k.mp4
- Bladeenc 96k.mp3
- FFmpeg native AAC encoder 96k.mp4
- FFmpeg native AAC encoder+v8g patch 96k.mp4
I've done 11 samples out of 15 x 2 samples. (37% done)
follow-up: 302 comment:301 by , 10 years ago
I don't understand. How is -q:a 30 48kbps?
Or is it 0.30?
comment:302 by , 10 years ago
Replying to klaussfreire:
I don't understand. How is -q:a 30 48kbps?
Or is it 0.30?
faac-1.28-mod\faac -q 30 -o out.mp4 in.44k.wav
encoder | rate | min | average | max |
---|---|---|---|---|
FAAC | 96k | 97 | 98 | 98 |
FAAC | q30 | 43 | 51 | 59 |
mp2 | 96k | 96 | 96 | 96 |
vo-aacenc | 96k | 98 | 98 | 99 |
Bladeenc | 96k | 96 | 96 | 96 |
Native | 96k | 98 | 98 | 100 |
Native+v8g | 96k | 98 | 99 | 101 |
follow-up: 307 comment:304 by , 10 years ago
@Kamedo2 Just to make sure, the problem described in your recent edit to GuidelinesHighQualityAudio is fixed in v8g, right?
comment:305 by , 10 years ago
Owner: | set to |
---|
comment:306 by , 10 years ago
It needs updating of course. I know 96k and 64k both work reasonably well.
comment:307 by , 10 years ago
Replying to Timothy_Gu:
@Kamedo2 Just to make sure, the problem described in your recent edit to GuidelinesHighQualityAudio is fixed in v8g, right?
Yes.
comment:308 by , 10 years ago
I finished the test and the result is here.
http://www.hydrogenaud.io/forums/index.php?showtopic=105959
The v8g patch beat both unpatched AAC encoder and FAAC at 96k.
comment:310 by , 10 years ago
Any news on getting closer to submitting the improvements?
We're all looking forward to that, more every day!
by , 10 years ago
Attachment: | ItCouldBeSweet.ffv8g_128k.diff.flac added |
---|
by , 10 years ago
Attachment: | ItCouldBeSweet.ffv8g_192k.diff.flac added |
---|
The diff of the ItCouldBeSweet, between the original and the patch v8g AAC encode, 192kbps.
by , 10 years ago
Attachment: | ItCouldBeSweet.ffv8g_320k.diff.flac added |
---|
The diff of the ItCouldBeSweet, between the original and the patch v8g AAC encode, 320kbps.
comment:312 by , 10 years ago
The glitches of v8g, stereo 48kHz diff. The black bar is 256 samples.
The sound is the intro of Fatboy Slim - Kalifornia.
http://www.hydrogenaud.io/forums/index.php?showtopic=19682
comment:314 by , 10 years ago
This last issue I'm not sure how to fix.
I've been sick lately so no progress, but soon I'll re-engage and I'll prioritize sending simple bugfixes to the ML first.
From what I can tell, this is another issue with bit allocation related to constantly and quickly repeating transients.
comment:315 by , 10 years ago
If it's not easy to fix the bit allocation, maybe we should commit the patches without the default M/S encoding.
comment:316 by , 10 years ago
I'm thinking of conducting a personal listening test of the stable v7 or the experimental M/S enabled v8g (or anything latest). I'd like to hear your opinion.
comment:317 by , 10 years ago
Ticket #3816 describes an apparently problematic sample that improves with the patch(es) attached here:
http://samples.ffmpeg.org/ffmpeg-bugs/trac/ticket3816/
follow-up: 320 comment:318 by , 10 years ago
RL has been throwing obstacles at me lately, so I couldn't make any progress here.
I did manage to find a few low-hanging-bugs in ANMR, but, and this is quaint, fixing them makes ANMR 10x slower.
Anyway, things are coming back to normal in RL so I'll be investing some time soon into patch-submitting the small bugfixes. I'll surely have lots of rebasing to do.
I still couldn't fix "Fatboy Slim - Kalifornia", but since the issue has been eluding me, I'm going to leave this for later. I'm only going to check whether it's an issue with M/S coding (doesn't seem to be), because I'd like the patch set to end up making M/S coding the default.
by , 10 years ago
Attachment: | aac-improvements-wip-v7-new.patch added |
---|
v7 patch altered to reflect the latest change by Michael Niedermayer at 20140525. This should work for the git head.
by , 10 years ago
Attachment: | aac-improvements-wip-v8g-new.patch added |
---|
v8g patch altered to reflect the latest change by Michael Niedermayer at 20140525. This should work for the git head.
comment:319 by , 10 years ago
I will start a listening test of the stable v7 patch, latest v8g patch, mp3lame, opus1.1 at 96 kbps in a week.
I will also use 40 samples from here.(See the Track download: section.)
http://listening-test.coresv.net/results.htm
comment:320 by , 10 years ago
Replying to klaussfreire:
I did manage to find a few low-hanging-bugs in ANMR, but, and this is quaint, fixing them makes ANMR 10x slower.
Is that
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:399
? Rare, but it happens on some tracks.
libavcodec/aacenc.c
/**
 * Encode scalefactors.
 */
static void encode_scale_factors(AVCodecContext *avctx, AACEncContext *s,
                                 SingleChannelElement *sce)
{
    int off = sce->sf_idx[0], diff;
    int i, w;

    for (w = 0; w < sce->ics.num_windows; w += sce->ics.group_len[w]) {
        for (i = 0; i < sce->ics.max_sfb; i++) {
            if (!sce->zeroes[w*16 + i]) {
                diff = sce->sf_idx[w*16 + i] - off + SCALE_DIFF_ZERO;
                av_assert0(diff >= 0 && diff <= 120);
                off = sce->sf_idx[w*16 + i];
                put_bits(&s->pb, ff_aac_scalefactor_bits[diff], ff_aac_scalefactor_code[diff]);
            }
        }
    }
}
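The assertion fires when two consecutive coded bands differ by more than 60 scalefactor steps, since diff is the delta plus SCALE_DIFF_ZERO. For example, a jump from 124 to 186 gives 186 - 124 + 60 = 122. A minimal standalone sketch of the same check, assuming a single window group and a hypothetical helper named `first_bad_band` (not part of libavcodec):

```c
#define SCALE_DIFF_ZERO 60

/* Hypothetical helper: mirror the inner loop of encode_scale_factors()
 * for one window group. Bands flagged in zeroes[] are skipped, exactly
 * as in the encoder. Returns the index of the first band whose diff
 * falls outside 0..120 (storing that diff in *bad_diff), or -1. */
static int first_bad_band(const int *sf_idx, const unsigned char *zeroes,
                          int n, int *bad_diff)
{
    int off = sf_idx[0];
    for (int i = 0; i < n; i++) {
        if (zeroes[i])
            continue;
        int diff = sf_idx[i] - off + SCALE_DIFF_ZERO;
        if (diff < 0 || diff > 120) {
            if (bad_diff)
                *bad_diff = diff;
            return i;
        }
        off = sf_idx[i];
    }
    return -1;
}
```

Running a check like this on the coder's output before bitstream writing would report the offending band instead of aborting.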
by , 10 years ago
Attachment: | ffmpeg_anmr_error.flac added |
---|
It causes the assertion error at aacenc.c line 399 by -aac_coder anmr on all -b:a and -q:a 0.1695 or bigger.
follow-up: 322 comment:321 by , 10 years ago
anmr is somewhat broken yes, I have some fixes, but they cause a huge performance regression so I don't consider them submittable yet.
It doesn't cause the same assertion failure with twoloop does it?
comment:322 by , 10 years ago
Replying to klaussfreire:
It doesn't cause the same assertion failure with twoloop does it?
It doesn't cause the same assertion failure with -aac_coder twoloop.
by , 10 years ago
Attachment: | ffmpeg_anmr_error2.flac added |
---|
EBU–TECH 3253 Sound Quality Assessment Material recordings for subjective tests, 50 Male speech, English.
comment:324 by , 10 years ago
Not yet. I will try to hunt for that fatboy issue again tonight, and if no progress is made, I will separate the patch into progressive improvements and test each separately. Maybe the process itself yields some insight.
In any case, which do you think is the best patch yet?
comment:326 by , 10 years ago
I don't want to be too dramatic, but I think I finally fixed fatboy. Subtle bugs being compounded. This version is much better (at least in fatboy).
I'm starting a thorough testing session and then I'll patch a rebased v9.
by , 10 years ago
Attachment: | aac-improvements-wip-v9.patch added |
---|
Hopefully final version of the AAC patch
comment:328 by , 10 years ago
Attached a new version of the patch, v9. v9 ABR performs much better than v7 in all the samples I tried, including fatboy. v9 VBR performs better too except in fatboy, I'll analyze the differences to v7 next to see why that's so. But for the time being, I think it'll be a good idea to test v9.
comment:329 by , 10 years ago
Thank you. The v9 is successfully running on many settings and samples.
by , 10 years ago
Attachment: | ItCouldBeSweet.ffv9_128k.diff.flac added |
---|
The diff of the ItCouldBeSweet, between the original and the patch v9 AAC encode, 128kbps.
by , 10 years ago
Attachment: | ItCouldBeSweet.ffv9_192k.diff.flac added |
---|
The diff of the ItCouldBeSweet, between the original and the patch v9 AAC encode, 192kbps.
by , 10 years ago
Attachment: | ItCouldBeSweet.ffv9_320k.diff.flac added |
---|
The diff of the ItCouldBeSweet, between the original and the patch v9 AAC encode, 320kbps.
comment:331 by , 10 years ago
Well, it is based on it, but it sounded very different to me, as it has important bugfixes. The encoded versions I mean, not the differences.
Namely, in codebook_trellis_rate/encode_window_bands_info, it was using the wrong scalefactors and that's a major bug when encoding transients.
And on the RD-reduction step, both v7 and v8g were assuming decreasing scalefactors had a predictable effect on distortion, and v9 just recomputes distortion, which proved to be a big improvement on VBR.
That pretty much singlehandedly fixed the biggest issues in fatboy.
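A toy illustration of why recomputing matters: with a quantizer, lowering the scalefactor does not always lower the distortion. The sketch below uses a plain uniform quantizer whose step grows by 2^(1/4) per scalefactor index. It ignores AAC's power-law companding and any psychoacoustic weighting, and the function names are hypothetical, so treat it as an assumption-laden model and not as the encoder's actual distortion measure.

```c
/* Quantizer step for a non-negative scalefactor index: 2^(sf/4). */
static double sf_step(int sf)
{
    const double root4 = 1.189207115002721; /* 2^(1/4) */
    double step = 1.0;
    for (int i = 0; i < sf; i++)
        step *= root4;
    return step;
}

/* Recompute the squared error of quantizing coef[0..n) at index sf. */
static double band_distortion(const double *coef, int n, int sf)
{
    double step = sf_step(sf);
    double dist = 0.0;
    for (int i = 0; i < n; i++) {
        double a = coef[i] < 0 ? -coef[i] : coef[i];
        long q = (long)(a / step + 0.5); /* round to nearest level */
        double err = a - q * step;
        dist += err * err;
    }
    return dist;
}
```

For a single coefficient of 4.0, index 8 (step 4) and index 4 (step 2) both reconstruct it almost exactly, while index 6 (step of about 2.83) sits between them and leaves a squared error of about 1.37.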
comment:332 by , 10 years ago
Though I am considering rolling back one of v8g changes to the RD-reduction step compared to v7, since I believe the fixes in v9 make that change obsolete. I'll do that and a round of testing and let you know.
comment:333 by , 10 years ago
The v9 anmr still crashes on some rare samples. I will provide details later.
ffmpeg_r67961_v9 -i in.flac -c:a aac -strict experimental -q:a 1 -aac_coder anmr out.mp4
comment:334 by , 10 years ago
I don't get the crashes on the earlier samples (ffmpeg_anmr_errorX). If you can attach the rare samples I can debug.
comment:335 by , 10 years ago
sine_tester.flac causes the assertion error at all -b:a and at -q:a 2.839 or bigger.
This http://clt.odu.edu/dropbox/ffmpeg/input.mp4 also causes the assertion error at -b:a 177k or bigger.
They all cause
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:399
by , 10 years ago
Attachment: | ffmpeg_anmr_error3.flac added |
---|
EBU–TECH 3253 Sound Quality Assessment Material recordings for subjective tests, 3 Electronic gong 100 Hz.(sine wave)
comment:336 by , 10 years ago
by , 10 years ago
Attachment: | FFmpeg_anmr_error4.flac added |
---|
This causes the assertion error on both -b:a 128k and -q:a 1. 4000Hz sine wave, stereo.
by , 10 years ago
Attachment: | FFmpeg_anmr_error5.flac added |
---|
This causes the assertion error on both -b:a 128k and -q:a 1. 11000Hz sine wave, stereo.
follow-up: 347 comment:338 by , 10 years ago
Please consider following cehoyos's suggestion in comment:292.
libavcodec/aaccoder.c line 344 368 394 714 971 1025 1274 1292
libavcodec/aacpsy.c line 989
comment:339 by , 10 years ago
I checked the aacenc.c:399 assertion error.
ffmpeg_r67961_v9_with_printf -y -i file.flac -c:a aac -strict experimental -aac_coder anmr -b:a 128k out.mp4
file | rate | w | i | sf_idx[w*16+i] | off | SCALE_DIFF_ZERO | diff |
---|---|---|---|---|---|---|---|
sine_tester | 16k | 0 | 10 | 186 | 118 | 60 | 128 |
sine_tester | 128k | 0 | 18 | 89 | 150 | 60 | -1 |
sine_tester | 320k | 0 | 4 | 87 | 152 | 60 | -5 |
sine_tester | q1.5 | 0 | 10 | 140 | 77 | 60 | 123 |
sine_tester | q4 | 0 | 12 | 94 | 156 | 60 | -2 |
sine_tester | q8 | 0 | 12 | 86 | 150 | 60 | -4 |
ffmpeg_anmr_error3 | 128k | 0 | 12 | 99 | 163 | 60 | -4 |
ffmpeg_anmr_error3 | 256k | 0 | 22 | 105 | 173 | 60 | -8 |
ffmpeg_anmr_error3 | 64k | 0 | 32 | 117 | 180 | 60 | -3 |
ffmpeg_anmr_error3 | q0.5 | 0 | 22 | 119 | 180 | 60 | -1 |
ffmpeg_anmr_error3 | q1 | 7 | 10 | 104 | 171 | 60 | -7 |
ffmpeg_anmr_error3 | q2 | 0 | 12 | 109 | 173 | 60 | -4 |
ffmpeg_anmr_error4 | 64k | 0 | 23 | 185 | 116 | 60 | 129 |
ffmpeg_anmr_error4 | 128k | 0 | 23 | 170 | 109 | 60 | 121 |
ffmpeg_anmr_error4 | 192k | 0 | 23 | 164 | 98 | 60 | 126 |
ffmpeg_anmr_error4 | 320k | 0 | 23 | 161 | 99 | 60 | 122 |
ffmpeg_anmr_error4 | q0.5 | 0 | 23 | 179 | 118 | 60 | 121 |
ffmpeg_anmr_error4 | q1 | 0 | 23 | 172 | 108 | 60 | 124 |
ffmpeg_anmr_error5 | 64k | 0 | 34 | 182 | 114 | 60 | 128 |
ffmpeg_anmr_error5 | 128k | 0 | 34 | 182 | 115 | 60 | 127 |
ffmpeg_anmr_error5 | 192k | 0 | 34 | 169 | 108 | 60 | 121 |
ffmpeg_anmr_error5 | 384k | 0 | 40 | 103 | 166 | 60 | -3 |
ffmpeg_anmr_error5 | q1 | 0 | 34 | 183 | 119 | 60 | 124 |
ffmpeg_anmr_error5 | q2 | 0 | 34 | 181 | 120 | 60 | 121 |
comment:340 by , 10 years ago
Sorry for the silence, the assertion error has been fixed already. Some bug I couldn't find with encode_window_bands_info, but I just made anmr use the other (codebook_trellis_rate? something like that) which is very similar but better (avoids adding holes or picking a codebook that cannot encode the coefficients, for instance).
I'm currently doing a frankenstein between v9 and v7, to get v9 close to v7 in stability only with better quality. I'm close to posting v9b, hopefully a version that can be sliced and committed (if v7 was acceptable v9b should be as well)
comment:341 by , 10 years ago
From the itCouldBeSweet and fatboy sample, and the diff of them, I have an impression that the instability only happens at short windows. I hope it's fixed in the next patch.
comment:342 by , 10 years ago
Actually, the instability happens on transient high bit demand (i.e. when a transient signal induces the bit allocator into allocating bits from the reservoir). That results in lower quality for the following windows, and twoloop fails to avoid holes in those situations, creating holes that shouldn't be there, hence the instability.
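The mechanism can be pictured with a toy bit-reservoir model. This is only a sketch under simplifying assumptions: the helper `grant_bits` and all numbers are hypothetical, and the real allocator in the encoder is more involved.

```c
/* Hypothetical toy model: a frame may spend its nominal budget plus
 * whatever is saved in the reservoir. Unspent bits are saved for later
 * frames, up to reservoir_max. Returns the bits actually granted. */
static int grant_bits(int *reservoir, int reservoir_max,
                      int frame_budget, int demand)
{
    int available = frame_budget + *reservoir;
    int granted   = demand < available ? demand : available;

    *reservoir = available - granted;
    if (*reservoir > reservoir_max)
        *reservoir = reservoir_max;
    return granted;
}
```

Starting with 2000 bits saved and a 1000-bit budget, a transient frame asking for 2800 bits gets all of them and leaves 200 in the reservoir. The next frame asks for 1500 but can only get 1200, which is the starvation of following windows described above.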
by , 10 years ago
Attachment: | aac-improvements-wip-v9b.patch added |
---|
v9b version, based on v9, matched behavior against v7
comment:346 by , 10 years ago
Attached v9b. I'm still testing it, but preliminary comparisons against v7 are strong. The instability of v9 seems to be eliminated, although I haven't tested very low bitrates (96kbps on stereo being the lowest for now).
ANMR doesn't crash, but other than that I haven't worked much on it.
comment:347 by , 10 years ago
Replying to Kamedo2:
Please consider following cehoyos's suggestion in comment:292.
libavcodec/aaccoder.c line 344 368 394 714 971 1025 1274 1292
libavcodec/aacpsy.c line 989
Don't worry about this, I'll make sure to clean up everything when I make the incremental patches at the end.
comment:348 by , 10 years ago
comment:349 by , 10 years ago
The option -aac_coder fast and faac crash on both -b:a and -q:a.
ffmpeg_r68337_v9b -y -i ffmpeg_anmr_error5.flac -c:a aac -strict experimental -q:a 1 -aac_coder faac out.wav
ffmpeg_r68337_v9b -y -i ffmpeg_anmr_error4.flac -c:a aac -strict experimental -b:a 128k -aac_coder fast out.wav
by , 10 years ago
Attachment: | FFmpeg_anmr_error6.flac added |
---|
This causes the assertion error on -b:a 96k, 128k, 160k on v9b. -q:a is OK. 9000Hz sine wave, stereo.
follow-up: 353 comment:351 by , 10 years ago
I'm not sure whether to fix or scrap fast and faac. Fast would be nice to rewrite using twoloop with quick and dirty parameters (far fewer iterations, for instance); as for faac, I'm not sure how it compares against the others.
Does anyone know faac's rationale? Is it discardable?
comment:352 by , 10 years ago
I have some nice improvements for VBR half-baked, just making sure there are no regressions. Just wanted to mention in case you notice VBR not being up to par with ABR.
by , 10 years ago
Attachment: | FFmpeg_anmr_error7.flac added |
---|
This causes the assertion error on -b:a 192k on v9b. Dave Matthews Band - Crush, http://www.hydrogenaud.io/forums/index.php?showtopic=102079&hl=
comment:353 by , 10 years ago
Replying to klaussfreire:
I'm not sure whether to fix or scrap fast and faac. Fast would be nice to rewrite using twoloop with quick and dirty parameters (far fewer iterations, for instance); as for faac, I'm not sure how it compares against the others.
Does anyone know faac's rationale? Is it discardable?
Faac on v9 is extremely bad. I don't think we will need many -aac_coder variants in the final version. The overwhelming majority will use the default setting.
comment:354 by , 10 years ago
Alright, so the plan is to rewrite fast, and scrap faac.
I'll look into the assertion error. That's for anmr only... right?
Still, I'd like some validation on v9b twoloop if possible? If it's performing acceptably I'd like to start splitting and pushing the patch.
comment:355 by , 10 years ago
Klaussfreire, sorry for the slow response.
Yes, the assertion error only happens when the -aac_coder is anmr.
FFmpeg_anmr_error6 and FFmpeg_anmr_error7 crash at the beginning of the file, but I've found 4 music tracks (out of hundreds) that crash in the middle of the file.
I failed to reproduce the results after cutting them down to distributable short clips.
Also, I have listened to hundreds of songs encoded by v9b anmr and twoloop and I have found no apparent problems so far.
comment:356 by , 10 years ago
I investigated the assertion error on the v9b patch with -c:a aac -strict experimental -aac_coder anmr:
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:399
/**
 * Encode scalefactors.
 */
static void encode_scale_factors(AVCodecContext *avctx, AACEncContext *s,
                                 SingleChannelElement *sce)
{
    int off = sce->sf_idx[0], diff;
    int i, w;

    for (w = 0; w < sce->ics.num_windows; w += sce->ics.group_len[w]) {
        for (i = 0; i < sce->ics.max_sfb; i++) {
            if (!sce->zeroes[w*16 + i]) {
                diff = sce->sf_idx[w*16 + i] - off + SCALE_DIFF_ZERO;
                if (diff < 0 || diff > 120)
                    fprintf(stderr, "|| || k|| %d|| %d|| %d|| %d|| %d|| %d|| %d|| %d||\n",
                            sce->ics.num_windows, sce->ics.max_sfb, w, i,
                            sce->sf_idx[w*16+i], off, SCALE_DIFF_ZERO, diff);
                av_assert0(diff >= 0 && diff <= 120);
                off = sce->sf_idx[w*16 + i];
                put_bits(&s->pb, ff_aac_scalefactor_bits[diff], ff_aac_scalefactor_code[diff]);
            }
        }
    }
}
audio file | rate | wmax | imax | w | i | sf_idx[w*16+i] | off | SCALE_DIFF_ZERO | diff |
---|---|---|---|---|---|---|---|---|---|
FFmpeg_anmr_error6 | 82k | 1 | 38 | 0 | 31 | 187 | 125 | 60 | 122 |
FFmpeg_anmr_error6 | 84k | 1 | 38 | 0 | 33 | 124 | 185 | 60 | -1 |
FFmpeg_anmr_error6 | 88k | 1 | 39 | 0 | 31 | 187 | 125 | 60 | 122 |
FFmpeg_anmr_error6 | 96k | 1 | 40 | 0 | 31 | 187 | 125 | 60 | 122 |
FFmpeg_anmr_error6 | 112k | 1 | 41 | 0 | 31 | 172 | 111 | 60 | 121 |
FFmpeg_anmr_error6 | 128k | 1 | 42 | 0 | 31 | 169 | 108 | 60 | 121 |
FFmpeg_anmr_error6 | 144k | 1 | 43 | 0 | 31 | 169 | 107 | 60 | 122 |
FFmpeg_anmr_error6 | 160k | 1 | 43 | 0 | 31 | 170 | 107 | 60 | 123 |
FFmpeg_anmr_error6 | 176k | 1 | 44 | 0 | 31 | 173 | 111 | 60 | 122 |
FFmpeg_anmr_error6 | 184k | 1 | 44 | 0 | 31 | 171 | 109 | 60 | 122 |
FFmpeg_anmr_error6 | 186k | 1 | 44 | 0 | 32 | 181 | 120 | 60 | 121 |
FFmpeg_anmr_error7 | 192k | 8 | 12 | 7 | 0 | 181 | 119 | 60 | 122 |
FFmpeg_anmr_error7 | 196k | 8 | 12 | 7 | 0 | 181 | 119 | 60 | 122 |
FFmpeg_anmr_error7 | 200k | 8 | 12 | 7 | 0 | 181 | 119 | 60 | 122 |
FFmpeg_anmr_error7 | 208k | 8 | 12 | 7 | 0 | 181 | 119 | 60 | 122 |
FFmpeg_anmr_error7 | 212k | 8 | 12 | 7 | 0 | 181 | 119 | 60 | 122 |
Koimusume no rondo | 160k | 8 | 12 | 6 | 0 | 183 | 122 | 60 | 121 |
Koimusume no rondo | 192k | 8 | 12 | 3 | 0 | 183 | 121 | 60 | 122 |
Koimusume no rondo | 224k | 8 | 13 | 5 | 0 | 183 | 122 | 60 | 121 |
Sphere no hane | 160k | 8 | 12 | 6 | 0 | 187 | 126 | 60 | 121 |
wmax : sce->ics.num_windows
imax : sce->ics.max_sfb
Sorry that I cannot distribute those last two large files.
comment:357 by , 10 years ago
Alright, I could reproduce the issue, I just need to find how to fix it.
comment:358 by , 10 years ago
Very rare, but foobar2000 v1.3.1 outputs this error on v9b anmr 320kbps. The track plays just fine.
File verification error: Decoding error: Unsupported format or corrupted file, frame: 576 of 14855
comment:359 by , 10 years ago
Any recent developments? I was almost hoping we would start merging things by now after some previous comments. =)
Anything we can help with to expedite things?
comment:360 by , 10 years ago
Well, I don't like the broken state of anmr. Those errors mentioned upthread, I could reproduce them alright, but not fix them. I've been busy with RL stuff these days too, so I had very little time for real debugging work there.
Twoloop seems stable enough, so maybe if people don't mind anmr's breakage, we could start the merging.
I would prefer to fix that at least before merging. I had thought it would be a quick fix, but it's proving to be rather not.
comment:361 by , 10 years ago
The assertion error of the v9b patch when aac_coder is set to anmr,
ffmpeg.exe -y -i audio_file -c:a aac -strict experimental -aac_coder anmr -b:a xxk out.mp4
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:399
was investigated again by looking into the full behavior of static void encode_scale_factors().
I don't know why, but the numerical values are slightly different from those in comment:356.
sce->sf_idx[w*16 + i] spikes when the assertion error happens.
audio file | rate | wmax | imax | w | i | sf_idx[w*16+i] | off | status |
---|---|---|---|---|---|---|---|---|
FFmpeg_anmr_error6 | 82k | 1 | 38 | 0 | 31 | 187 | 125 | reproduction failed |
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "FFmpeg_anmr_error6.flac" -c:a aac -strict experimental -aac_coder anmr -b:a 82k "kfanmr.mp4"
ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers
  built on Dec 9 2014 23:03:19 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extra-cflags='-march=nocona' --optflags=-O2
  libavutil 54. 15.100 / 54. 15.100
  libavcodec 56. 14.100 / 56. 14.100
  libavformat 56. 15.103 / 56. 15.103
  libavdevice 56. 3.100 / 56. 3.100
  libavfilter 5. 2.103 / 5. 2.103
  libswscale 3. 1.101 / 3. 1.101
  libswresample 1. 1.100 / 1. 1.100
  libpostproc 53. 3.100 / 53. 3.100
Input #0, flac, from 'FFmpeg_anmr_error6.flac':
  Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'kfanmr.mp4':
  Metadata:
    encoder : Lavf56.15.103
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 82 kb/s
    Metadata:
      encoder : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
size= 103kB time=00:00:10.00 bitrate= 84.1kbits/s
video:0kB audio:100kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.416551%
audio file | rate | wmax | imax | w | i | sf_idx[w*16+i] | off | status |
---|---|---|---|---|---|---|---|---|
FFmpeg_anmr_error6 | 84k | 1 | 38 | 0 | 33 | 124 | 185 | reproduction failed |
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "FFmpeg_anmr_error6.flac" -c:a aac -strict experimental -aac_coder anmr -b:a 84k "kfanmr.mp4" ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers built on Dec 9 2014 23:03:19 with gcc 4.8.1 (GCC) configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extr a-cflags='-march=nocona' --optflags=-O2 libavutil 54. 15.100 / 54. 15.100 libavcodec 56. 14.100 / 56. 14.100 libavformat 56. 15.103 / 56. 15.103 libavdevice 56. 3.100 / 56. 3.100 libavfilter 5. 2.103 / 5. 2.103 libswscale 3. 1.101 / 3. 1.101 libswresample 1. 1.100 / 1. 1.100 libpostproc 53. 3.100 / 53. 3.100 Input #0, flac, from 'FFmpeg_anmr_error6.flac': Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s Stream #0:0: Audio: flac, 44100 Hz, stereo, s16 Output #0, mp4, to 'kfanmr.mp4': Metadata: encoder : Lavf56.15.103 Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 84 kb/s Metadata: encoder : Lavc56.14.100 aac Stream mapping: Stream #0:0 -> #0:0 (flac (native) -> aac (native)) Press [q] to stop, [?] for help size= 105kB time=00:00:10.00 bitrate= 86.1kbits/s video:0kB audio:103kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.359015%
audio file | rate | wmax | imax | w | i | sf_idx[w*16+i] | off | status |
---|---|---|---|---|---|---|---|---|
FFmpeg_anmr_error6 | 88k | 1 | 39 | 0 | 31 | 187 | 125 | reproduced |
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "FFmpeg_anmr_error6.flac" -c:a aac -strict experimental -aac_coder anmr -b:a 88k "kfanmr.mp4" ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers built on Dec 9 2014 23:03:19 with gcc 4.8.1 (GCC) configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extr a-cflags='-march=nocona' --optflags=-O2 libavutil 54. 15.100 / 54. 15.100 libavcodec 56. 14.100 / 56. 14.100 libavformat 56. 15.103 / 56. 15.103 libavdevice 56. 3.100 / 56. 3.100 libavfilter 5. 2.103 / 5. 2.103 libswscale 3. 1.101 / 3. 1.101 libswresample 1. 1.100 / 1. 1.100 libpostproc 53. 3.100 / 53. 3.100 Input #0, flac, from 'FFmpeg_anmr_error6.flac': Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s Stream #0:0: Audio: flac, 44100 Hz, stereo, s16 Output #0, mp4, to 'kfanmr.mp4': Metadata: encoder : Lavf56.15.103 Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 88 kb/s Metadata: encoder : Lavc56.14.100 aac Stream mapping: Stream #0:0 -> #0:0 (flac (native) -> aac (native)) Press [q] to stop, [?] for help w : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 i : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 zeros: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 sf_idx 99 99 97 97 97 96 96 96 96100100100117116116122116121120121120120120122120124121123124122124186156125125125123123118 diff :............................................................................................. +2..................... 
: 60 60 58 60 60 59 60 60 60 64 60 60 77 59 60 66 54 65 59 61 59 60 60 62 58 64 57 62 61 58 62122 30 29 60 60 58 60 55 || || k|| 1|| 39|| 0|| 31|| 186|| 124|| 60|| 122|| Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
audio file | rate | wmax | imax | w | i | sf_idx[w*16+i] | off | status |
---|---|---|---|---|---|---|---|---|
FFmpeg_anmr_error6 | 96k | 1 | 40 | 0 | 31 | 187 | 125 | reproduced |
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "FFmpeg_anmr_error6.flac" -c:a aac -strict experimental -aac_coder anmr -b:a 96k "kfanmr.mp4" ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers built on Dec 9 2014 23:03:19 with gcc 4.8.1 (GCC) configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extr a-cflags='-march=nocona' --optflags=-O2 libavutil 54. 15.100 / 54. 15.100 libavcodec 56. 14.100 / 56. 14.100 libavformat 56. 15.103 / 56. 15.103 libavdevice 56. 3.100 / 56. 3.100 libavfilter 5. 2.103 / 5. 2.103 libswscale 3. 1.101 / 3. 1.101 libswresample 1. 1.100 / 1. 1.100 libpostproc 53. 3.100 / 53. 3.100 Input #0, flac, from 'FFmpeg_anmr_error6.flac': Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s Stream #0:0: Audio: flac, 44100 Hz, stereo, s16 Output #0, mp4, to 'kfanmr.mp4': Metadata: encoder : Lavf56.15.103 Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 96 kb/s Metadata: encoder : Lavc56.14.100 aac Stream mapping: Stream #0:0 -> #0:0 (flac (native) -> aac (native)) Press [q] to stop, [?] for help w : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 i : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 zeros: 0 0 1 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 # # # # sf_idx114107107106106106106105105112112112112111112115108107109111110110116117119119120121122125125187164129126124121121120120 diff :...... ... ... ............... ...................................................... +2........................ 
: 60 53 59 60 59 60 67 60 60 59 61 63 53 59 62 62 59 60 66 61 62 60 61 61 61 63 60122 37 25 57 58 57 60 59 60 || || k|| 1|| 40|| 0|| 31|| 187|| 125|| 60|| 122|| Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
audio file | rate | wmax | imax | w | i | sf_idx[w*16+i] | off | status |
---|---|---|---|---|---|---|---|---|
FFmpeg_anmr_error6 | 184k | 1 | 44 | 0 | 31 | 171 | 109 | reproduced |
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "FFmpeg_anmr_error6.flac" -c:a aac -strict experimental -aac_coder anmr -b:a 184k "kfanmr.mp4" ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers built on Dec 9 2014 23:03:19 with gcc 4.8.1 (GCC) configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extr a-cflags='-march=nocona' --optflags=-O2 libavutil 54. 15.100 / 54. 15.100 libavcodec 56. 14.100 / 56. 14.100 libavformat 56. 15.103 / 56. 15.103 libavdevice 56. 3.100 / 56. 3.100 libavfilter 5. 2.103 / 5. 2.103 libswscale 3. 1.101 / 3. 1.101 libswresample 1. 1.100 / 1. 1.100 libpostproc 53. 3.100 / 53. 3.100 Input #0, flac, from 'FFmpeg_anmr_error6.flac': Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s Stream #0:0: Audio: flac, 44100 Hz, stereo, s16 Output #0, mp4, to 'kfanmr.mp4': Metadata: encoder : Lavf56.15.103 Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 184 kb/s Metadata: encoder : Lavc56.14.100 aac Stream mapping: Stream #0:0 -> #0:0 (flac (native) -> aac (native)) Press [q] to stop, [?] for help w : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 i : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 zeros: 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 # sf_idx 98 98 98 97 97 96 96 97 97 99 99 99 98101100100 99 98100101 99 99 99 99102108110115113105109171156110106106110113115118117114113114 diff :... ....................................................................................... +2.................................... 
: 60 60 59 60 59 60 61 60 62 60 60 59 63 59 60 59 59 62 61 58 60 60 60 63 66 62 65 58 52 64122 45 14 56 60 64 63 62 63 59 57 59 61 || || k|| 1|| 44|| 0|| 31|| 171|| 109|| 60|| 122|| Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
audio file | rate | wmax | imax | w | i | sf_idx[w*16+i] | off | status |
---|---|---|---|---|---|---|---|---|
FFmpeg_anmr_error6 | 186k | 1 | 44 | 0 | 32 | 181 | 120 | reproduced |
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "FFmpeg_anmr_error6.flac" -c:a aac -strict experimental -aac_coder anmr -b:a 186k "kfanmr.mp4" ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers built on Dec 9 2014 23:03:19 with gcc 4.8.1 (GCC) configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extr a-cflags='-march=nocona' --optflags=-O2 libavutil 54. 15.100 / 54. 15.100 libavcodec 56. 14.100 / 56. 14.100 libavformat 56. 15.103 / 56. 15.103 libavdevice 56. 3.100 / 56. 3.100 libavfilter 5. 2.103 / 5. 2.103 libswscale 3. 1.101 / 3. 1.101 libswresample 1. 1.100 / 1. 1.100 libpostproc 53. 3.100 / 53. 3.100 Input #0, flac, from 'FFmpeg_anmr_error6.flac': Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s Stream #0:0: Audio: flac, 44100 Hz, stereo, s16 Output #0, mp4, to 'kfanmr.mp4': Metadata: encoder : Lavf56.15.103 Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 186 kb/s Metadata: encoder : Lavc56.14.100 aac Stream mapping: Stream #0:0 -> #0:0 (flac (native) -> aac (native)) Press [q] to stop, [?] for help w : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 i : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 zeros: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 sf_idx 60120120114114114120120 81118120120120120120120120120120120120120120120120120120120120120120120181120120120120120120120120120120120 diff :................................................................................................ +1 -1.............................. 
: 60120 60 54 60 60 66 60 21 97 62 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60121 -1 60 60 60 60 60 60 60 60 60 60 || || k|| 1|| 44|| 0|| 32|| 181|| 120|| 60|| 121|| Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
audio file | rate | wmax | imax | w | i | sf_idx[w*16+i] | off | status |
---|---|---|---|---|---|---|---|---|
FFmpeg_anmr_error7 | 192k | 8 | 12 | 7 | 0 | 181 | 119 | reproduced |
ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers built on Dec 9 2014 23:03:19 with gcc 4.8.1 (GCC) configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extr a-cflags='-march=nocona' --optflags=-O2 libavutil 54. 15.100 / 54. 15.100 libavcodec 56. 14.100 / 56. 14.100 libavformat 56. 15.103 / 56. 15.103 libavdevice 56. 3.100 / 56. 3.100 libavfilter 5. 2.103 / 5. 2.103 libswscale 3. 1.101 / 3. 1.101 libswresample 1. 1.100 / 1. 1.100 libpostproc 53. 3.100 / 53. 3.100 Input #0, flac, from 'FFmpeg_anmr_error7.flac': Metadata: ALBUM : Before These Crowded Streets ARTIST : Dave Matthews Band GENRE : Rock TITLE : Crush track : 8 Duration: 00:00:25.50, start: 0.000000, bitrate: 777 kb/s Stream #0:0: Audio: flac, 44100 Hz, stereo, s16 Output #0, mp4, to 'kfanmr.mp4': Metadata: ALBUM : Before These Crowded Streets ARTIST : Dave Matthews Band GENRE : Rock TITLE : Crush track : 8 encoder : Lavf56.15.103 Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 192 kb/s Metadata: encoder : Lavc56.14.100 aac Stream mapping: Stream #0:0 -> #0:0 (flac (native) -> aac (native)) Press [q] to stop, [?] for help w : 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 4 4 4 4 4 4 4 4 4 4 4 4 7 7 7 7 7 7 7 7 7 7 7 7 i : 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 zeros: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 sf_idx140141141138135135135133131129127127149124123123122116116126118120120118151129125121117116116115113120119119181153148143141137134131128125124121 diff :............................................................................................................ +2................................. 
: 60 61 60 57 57 60 60 58 58 58 58 60 82 35 59 60 59 54 60 70 52 62 60 58 93 38 56 56 56 59 60 59 58 67 59 60122 32 55 55 58 56 57 57 57 57 59 57 || || k|| 8|| 12|| 7|| 0|| 181|| 119|| 60|| 122|| Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
audio file | rate | wmax | imax | w | i | sf_idx[w*16+i] | off | status |
---|---|---|---|---|---|---|---|---|
FFmpeg_anmr_error7 | 196k | 8 | 12 | 7 | 0 | 181 | 119 | reproduced |
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "FFmpeg_anmr_error7.flac" -c:a aac -strict experimental -aac_coder anmr -b:a 196k "kfanmr.mp4" ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers built on Dec 9 2014 23:03:19 with gcc 4.8.1 (GCC) configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extr a-cflags='-march=nocona' --optflags=-O2 libavutil 54. 15.100 / 54. 15.100 libavcodec 56. 14.100 / 56. 14.100 libavformat 56. 15.103 / 56. 15.103 libavdevice 56. 3.100 / 56. 3.100 libavfilter 5. 2.103 / 5. 2.103 libswscale 3. 1.101 / 3. 1.101 libswresample 1. 1.100 / 1. 1.100 libpostproc 53. 3.100 / 53. 3.100 Input #0, flac, from 'FFmpeg_anmr_error7.flac': Metadata: ALBUM : Before These Crowded Streets ARTIST : Dave Matthews Band GENRE : Rock TITLE : Crush track : 8 Duration: 00:00:25.50, start: 0.000000, bitrate: 777 kb/s Stream #0:0: Audio: flac, 44100 Hz, stereo, s16 Output #0, mp4, to 'kfanmr.mp4': Metadata: ALBUM : Before These Crowded Streets ARTIST : Dave Matthews Band GENRE : Rock TITLE : Crush track : 8 encoder : Lavf56.15.103 Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 196 kb/s Metadata: encoder : Lavc56.14.100 aac Stream mapping: Stream #0:0 -> #0:0 (flac (native) -> aac (native)) Press [q] to stop, [?] for help w : 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 4 4 4 4 4 4 4 4 4 4 4 4 7 7 7 7 7 7 7 7 7 7 7 7 i : 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 zeros: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 sf_idx140141141138135135135133131129127127149124123123122116116126118120120118151129125121117116116115113121121119181153148143141137134131128125124121 diff :............................................................................................................ 
+2................................. : 60 61 60 57 57 60 60 58 58 58 58 60 82 35 59 60 59 54 60 70 52 62 60 58 93 38 56 56 56 59 60 59 58 68 60 58122 32 55 55 58 56 57 57 57 57 59 57 || || k|| 8|| 12|| 7|| 0|| 181|| 119|| 60|| 122|| Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
audio file | rate | wmax | imax | w | i | sf_idx[w*16+i] | off | status |
---|---|---|---|---|---|---|---|---|
FFmpeg_anmr_error7 | 200k | 8 | 12 | 7 | 0 | 181 | 119 | reproduced |
ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers built on Dec 9 2014 23:03:19 with gcc 4.8.1 (GCC) configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extr a-cflags='-march=nocona' --optflags=-O2 libavutil 54. 15.100 / 54. 15.100 libavcodec 56. 14.100 / 56. 14.100 libavformat 56. 15.103 / 56. 15.103 libavdevice 56. 3.100 / 56. 3.100 libavfilter 5. 2.103 / 5. 2.103 libswscale 3. 1.101 / 3. 1.101 libswresample 1. 1.100 / 1. 1.100 libpostproc 53. 3.100 / 53. 3.100 Input #0, flac, from 'FFmpeg_anmr_error7.flac': Metadata: ALBUM : Before These Crowded Streets ARTIST : Dave Matthews Band GENRE : Rock TITLE : Crush track : 8 Duration: 00:00:25.50, start: 0.000000, bitrate: 777 kb/s Stream #0:0: Audio: flac, 44100 Hz, stereo, s16 Output #0, mp4, to 'kfanmr.mp4': Metadata: ALBUM : Before These Crowded Streets ARTIST : Dave Matthews Band GENRE : Rock TITLE : Crush track : 8 encoder : Lavf56.15.103 Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 200 kb/s Metadata: encoder : Lavc56.14.100 aac Stream mapping: Stream #0:0 -> #0:0 (flac (native) -> aac (native)) Press [q] to stop, [?] for help w : 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 4 4 4 4 4 4 4 4 4 4 4 4 7 7 7 7 7 7 7 7 7 7 7 7 i : 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 zeros: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 sf_idx136136136138135135135133131129127127149124123122122116116114112120120118151129125119117116116114113120121119181150148143141137134131128125124121 diff :............................................................................................................ +2................................. 
: 60 60 60 62 57 60 60 58 58 58 58 60 82 35 59 59 60 54 60 58 58 68 60 58 93 38 56 54 58 59 60 58 59 67 61 58122 29 58 55 58 56 57 57 57 57 59 57 || || k|| 8|| 12|| 7|| 0|| 181|| 119|| 60|| 122|| Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
audio file | rate | wmax | imax | w | i | sf_idx[w*16+i] | off | status |
---|---|---|---|---|---|---|---|---|
FFmpeg_anmr_error7 | 208k | 8 | 12 | 7 | 0 | 181 | 119 | reproduced |
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "FFmpeg_anmr_error7.flac" -c:a aac -strict experimental -aac_coder anmr -b:a 208k "kfanmr.mp4" ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers built on Dec 9 2014 23:03:19 with gcc 4.8.1 (GCC) configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extr a-cflags='-march=nocona' --optflags=-O2 libavutil 54. 15.100 / 54. 15.100 libavcodec 56. 14.100 / 56. 14.100 libavformat 56. 15.103 / 56. 15.103 libavdevice 56. 3.100 / 56. 3.100 libavfilter 5. 2.103 / 5. 2.103 libswscale 3. 1.101 / 3. 1.101 libswresample 1. 1.100 / 1. 1.100 libpostproc 53. 3.100 / 53. 3.100 Input #0, flac, from 'FFmpeg_anmr_error7.flac': Metadata: ALBUM : Before These Crowded Streets ARTIST : Dave Matthews Band GENRE : Rock TITLE : Crush track : 8 Duration: 00:00:25.50, start: 0.000000, bitrate: 777 kb/s Stream #0:0: Audio: flac, 44100 Hz, stereo, s16 Output #0, mp4, to 'kfanmr.mp4': Metadata: ALBUM : Before These Crowded Streets ARTIST : Dave Matthews Band GENRE : Rock TITLE : Crush track : 8 encoder : Lavf56.15.103 Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 208 kb/s Metadata: encoder : Lavc56.14.100 aac Stream mapping: Stream #0:0 -> #0:0 (flac (native) -> aac (native)) Press [q] to stop, [?] for help w : 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 4 4 4 4 4 4 4 4 4 4 4 4 7 7 7 7 7 7 7 7 7 7 7 7 i : 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 zeros: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 sf_idx136136137137134134134133131129127127149124123122118116116114112120120118148129125119117115116114113120121119181150148143141137134131128125124121 diff :............................................................................................................ 
+2................................. : 60 60 61 60 57 60 60 59 58 58 58 60 82 35 59 59 56 58 60 58 58 68 60 58 90 41 56 54 58 58 61 58 59 67 61 58122 29 58 55 58 56 57 57 57 57 59 57 || || k|| 8|| 12|| 7|| 0|| 181|| 119|| 60|| 122|| Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
audio file | rate | wmax | imax | w | i | sf_idx[w*16+i] | off | status |
---|---|---|---|---|---|---|---|---|
FFmpeg_anmr_error7 | 212k | 8 | 12 | 7 | 0 | 181 | 119 | reproduced |
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "FFmpeg_anmr_error7.flac" -c:a aac -strict experimental -aac_coder anmr -b:a 212k "kfanmr.mp4" ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers built on Dec 9 2014 23:03:19 with gcc 4.8.1 (GCC) configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extr a-cflags='-march=nocona' --optflags=-O2 libavutil 54. 15.100 / 54. 15.100 libavcodec 56. 14.100 / 56. 14.100 libavformat 56. 15.103 / 56. 15.103 libavdevice 56. 3.100 / 56. 3.100 libavfilter 5. 2.103 / 5. 2.103 libswscale 3. 1.101 / 3. 1.101 libswresample 1. 1.100 / 1. 1.100 libpostproc 53. 3.100 / 53. 3.100 Input #0, flac, from 'FFmpeg_anmr_error7.flac': Metadata: ALBUM : Before These Crowded Streets ARTIST : Dave Matthews Band GENRE : Rock TITLE : Crush track : 8 Duration: 00:00:25.50, start: 0.000000, bitrate: 777 kb/s Stream #0:0: Audio: flac, 44100 Hz, stereo, s16 Output #0, mp4, to 'kfanmr.mp4': Metadata: ALBUM : Before These Crowded Streets ARTIST : Dave Matthews Band GENRE : Rock TITLE : Crush track : 8 encoder : Lavf56.15.103 Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 212 kb/s Metadata: encoder : Lavc56.14.100 aac Stream mapping: Stream #0:0 -> #0:0 (flac (native) -> aac (native)) Press [q] to stop, [?] for help w : 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 4 4 4 4 4 4 4 4 4 4 4 4 7 7 7 7 7 7 7 7 7 7 7 7 i : 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 zeros: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 sf_idx136136137137135135135133131129127127149124123122118116116114112120120118151129125119117116116114113120121119181150148143141137134131128125124121 diff :............................................................................................................ 
+2................................. : 60 60 61 60 58 60 60 58 58 58 58 60 82 35 59 59 56 58 60 58 58 68 60 58 93 38 56 54 58 59 60 58 59 67 61 58122 29 58 55 58 56 57 57 57 57 59 57 || || k|| 8|| 12|| 7|| 0|| 181|| 119|| 60|| 122|| Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
audio file | rate | wmax | imax | w | i | sf_idx[w*16+i] | off | status |
---|---|---|---|---|---|---|---|---|
Koimusume no rondo | 160k | 8 | 12 | 6 | 0 | 183 | 122 | reproduced |
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "Koimusume_no_rondo.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 160k "kfanmr.mp4" ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers built on Dec 9 2014 23:03:19 with gcc 4.8.1 (GCC) configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extr a-cflags='-march=nocona' --optflags=-O2 libavutil 54. 15.100 / 54. 15.100 libavcodec 56. 14.100 / 56. 14.100 libavformat 56. 15.103 / 56. 15.103 libavdevice 56. 3.100 / 56. 3.100 libavfilter 5. 2.103 / 5. 2.103 libswscale 3. 1.101 / 3. 1.101 libswresample 1. 1.100 / 1. 1.100 libpostproc 53. 3.100 / 53. 3.100 Guessed Channel Layout for Input Stream #0.0 : stereo Input #0, wav, from 'Koimusume_no_rondo.wav': Duration: 00:02:51.51, bitrate: 1411 kb/s Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 2 channels, s16, 1411 kb/s Output #0, mp4, to 'kfanmr.mp4': Metadata: encoder : Lavf56.15.103 Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 160 kb/s Metadata: encoder : Lavc56.14.100 aac Stream mapping: Stream #0:0 -> #0:0 (pcm_s16le (native) -> aac (native)) Press [q] to stop, [?] for help w : 0 0 0 0 0 0 0 0 0 0 0 0 3 3 3 3 3 3 3 3 3 3 3 3 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 7 i : 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 zeros: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 sf_idx158158158149151156147147151146123124169166164159154152155154149144122122183177168163161159155150150147123123178173168159164157158158170154144140 diff :........................................................................ +1..................................................................... 
: 60 60 60 51 62 65 51 60 64 55 37 61105 57 58 55 55 58 63 59 55 55 38 60121 54 51 55 58 58 56 55 60 57 36 60115 55 55 51 65 53 61 60 72 44 50 56 || || k|| 8|| 12|| 6|| 0|| 183|| 122|| 60|| 121|| Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
audio file | rate | wmax | imax | w | i | sf_idx[w*16+i] | off | status |
---|---|---|---|---|---|---|---|---|
Koimusume no rondo | 192k | 8 | 12 | 3 | 0 | 183 | 121 | reproduced |
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "Koimusume_no_rondo.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 192k "kfanmr.mp4" ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers built on Dec 9 2014 23:03:19 with gcc 4.8.1 (GCC) configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extr a-cflags='-march=nocona' --optflags=-O2 libavutil 54. 15.100 / 54. 15.100 libavcodec 56. 14.100 / 56. 14.100 libavformat 56. 15.103 / 56. 15.103 libavdevice 56. 3.100 / 56. 3.100 libavfilter 5. 2.103 / 5. 2.103 libswscale 3. 1.101 / 3. 1.101 libswresample 1. 1.100 / 1. 1.100 libpostproc 53. 3.100 / 53. 3.100 Guessed Channel Layout for Input Stream #0.0 : stereo Input #0, wav, from 'Koimusume_no_rondo.wav': Duration: 00:02:51.51, bitrate: 1411 kb/s Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 2 channels, s16, 1411 kb/s Output #0, mp4, to 'kfanmr.mp4': Metadata: encoder : Lavf56.15.103 Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 192 kb/s Metadata: encoder : Lavc56.14.100 aac Stream mapping: Stream #0:0 -> #0:0 (pcm_s16le (native) -> aac (native)) Press [q] to stop, [?] for help w : 0 0 0 0 0 0 0 0 0 0 0 0 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 7 7 7 7 7 7 7 7 7 7 7 7 i : 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 zeros: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 sf_idx160153153153151151148143146144121121183166166163157155153156161156124123162161158157155151151160164151124120170168159159160161157161163154129123 diff :.................................... +2......................................................................................................... 
: 60 53 60 60 58 60 57 55 63 58 37 60122 43 60 57 54 58 58 63 65 55 28 59 99 59 57 59 58 56 60 69 64 47 33 56110 58 51 60 61 61 56 64 62 51 35 54 || || k|| 8|| 12|| 3|| 0|| 183|| 121|| 60|| 122|| Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
audio file | rate | wmax | imax | w | i | sf_idx[w*16+i] | off | status |
---|---|---|---|---|---|---|---|---|
Koimusume no rondo | 224k | 8 | 13 | 5 | 0 | 183 | 122 | reproduced |
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "Koimusume_no_rondo.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 224k "kfanmr.mp4" ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers built on Dec 9 2014 23:03:19 with gcc 4.8.1 (GCC) configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extr a-cflags='-march=nocona' --optflags=-O2 libavutil 54. 15.100 / 54. 15.100 libavcodec 56. 14.100 / 56. 14.100 libavformat 56. 15.103 / 56. 15.103 libavdevice 56. 3.100 / 56. 3.100 libavfilter 5. 2.103 / 5. 2.103 libswscale 3. 1.101 / 3. 1.101 libswresample 1. 1.100 / 1. 1.100 libpostproc 53. 3.100 / 53. 3.100 Guessed Channel Layout for Input Stream #0.0 : stereo Input #0, wav, from 'Koimusume_no_rondo.wav': Duration: 00:02:51.51, bitrate: 1411 kb/s Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 2 channels, s16, 1411 kb/s Output #0, mp4, to 'kfanmr.mp4': Metadata: encoder : Lavf56.15.103 Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 224 kb/s Metadata: encoder : Lavc56.14.100 aac Stream mapping: Stream #0:0 -> #0:0 (pcm_s16le (native) -> aac (native)) Press [q] to stop, [?] for help w : 0 0 0 0 0 0 0 0 0 0 0 0 0 3 3 3 3 3 3 3 3 3 3 3 3 3 5 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6 6 i : 0 1 2 3 4 5 6 7 8 9 10 11 12 0 1 2 3 4 5 6 7 8 9 10 11 12 0 1 2 3 4 5 6 7 8 9 10 11 12 0 1 2 3 4 5 6 7 8 9 10 11 12 zeros: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 sf_idx149149146149146146143142135139134133125157164154149158154151145146133119119122183169169158163152151151145136121118127173162164153153152150149151151121117124 diff :.............................................................................. +1........................................................................... 
: 60 60 57 63 57 60 57 59 53 64 55 59 52 92 67 50 55 69 56 57 54 61 47 46 60 63121 46 60 49 65 49 59 60 54 51 45 57 69106 49 62 49 60 59 58 59 62 60 30 56 67 || || k|| 8|| 13|| 5|| 0|| 183|| 122|| 60|| 121|| Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
audio file | rate | wmax | imax | w | i | sf_idx[w*16+i] | off | status |
---|---|---|---|---|---|---|---|---|
Sphere no hane | 160k | 8 | 12 | 6 | 0 | 187 | 126 | reproduced |
$ "C:/FFmpeg/ffmpeg_v9b/ffmpeg.exe" -y -i "Sphere_no_hane.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 160k "kfanmr.mp4"
ffmpeg version N-68337-g92d47e2 Copyright (c) 2000-2014 the FFmpeg developers
built on Dec 9 2014 23:03:19 with gcc 4.8.1 (GCC)
configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --extra-cflags='-march=nocona' --optflags=-O2
libavutil 54. 15.100 / 54. 15.100
libavcodec 56. 14.100 / 56. 14.100
libavformat 56. 15.103 / 56. 15.103
libavdevice 56. 3.100 / 56. 3.100
libavfilter 5. 2.103 / 5. 2.103
libswscale 3. 1.101 / 3. 1.101
libswresample 1. 1.100 / 1. 1.100
libpostproc 53. 3.100 / 53. 3.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'Sphere_no_hane.wav':
Duration: 00:04:14.48, bitrate: 1411 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 2 channels, s16, 1411 kb/s
Output #0, mp4, to 'kfanmr.mp4':
Metadata:
encoder : Lavf56.15.103
Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 160 kb/s
Metadata:
encoder : Lavc56.14.100 aac
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> aac (native))
Press [q] to stop, [?] for help
w : 0 0 0 0 0 0 0 0 0 0 0 0 3 3 3 3 3 3 3 3 3 3 3 3 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 7
i : 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11
zeros: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
sf_idx 137 149 143 142 139 137 136 130 132 128 132 127 154 157 156 155 148 146 134 133 134 127 126 126 187 164 163 163 159 159 158 155 155 154 148 150 166 166 166 164 159 154 154 153 154 153 153 149
diff :........................................................................ +1.....................................................................
: 60 72 54 59 57 58 59 54 62 56 64 55 87 63 59 59 53 58 48 59 61 53 59 60 121 37 59 60 56 60 59 57 60 59 54 62 76 60 60 58 55 55 60 59 61 59 60 56
|| || k|| 8|| 12|| 6|| 0|| 187|| 126|| 60|| 121||
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:509
This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
comment:362 by , 10 years ago
Other assertion errors. Hope it helps.
$ ffmpeg_v9b -y -i "snippet_tai4.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 40k "kfanmr2.mp4"
w : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
zeros: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
sf_idx 122 122 121 121 121 126 126 124 129 126 129 133 134 136 138 141 139 200 176 146 137 131 123 122 123 123 123 124 124
diff :................................................... +1.................................
: 60 60 59 60 60 65 60 58 65 57 63 64 61 62 62 63 58 121 36 30 51 54 52 59 61 60 60 61 60
$ ffmpeg_v9b -y -i "snippet_tai4.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 64k "kfanmr2.mp4"
w : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
zeros: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
sf_idx 108 108 107 112 107 118 125 125 117 120 130 133 134 136 138 140 137 200 157 145 137 131 123 122 123 123 121 124 120 121 121 122 119
diff :................................................... +3.............................................
: 60 60 59 65 55 71 67 60 52 63 70 63 61 62 62 62 57 123 17 48 52 54 52 59 61 60 58 63 56 61 60 61 57
$ ffmpeg_v9b -y -i "snippet_tai4.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 80k "kfanmr2.mp4"
w : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
zeros: 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#
sf_idx 158 158 120 123 118 118 120 120 120 120 120 120 122 127 130 134 140 178 187 130 132 127 129 131 184 186 125 130 160 188 136 140 188 144 122 178
diff :... ........................................................................ -1...........................
: 60 22 63 55 60 62 60 60 60 60 60 62 65 63 64 66 98 69 3 62 55 62 62 113 62 -1 65 90 88 8 64 108 16 38 116
$ ffmpeg_v9b -y -i "snippet_tai4.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 128k "kfanmr2.mp4"
w : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
zeros: 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#
sf_idx 152 161 109 103 103 103 118 119 111 105 105 107 170 163 133 123 117 128 123 174 174 133 129 174 174 138 177 186 135 163 129 168 129 163 124 166 126 163 129 169
diff :... .............................. +3.................................................................................
: 60 17 54 60 60 75 61 52 54 60 62 123 53 30 50 54 71 55 111 60 19 56 105 60 24 99 69 9 88 26 99 21 94 21 102 20 97 26 100
$ ffmpeg_v9b -y -i "snippet_tai4.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 160k "kfanmr2.mp4"
w : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
zeros: 0 1 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# # # # # # #
sf_idx 159 159 108 108 88 88 88 88 88 88 88 98 102 102 109 111 152 145 122 125 114 109 115 173 156 127 118 159 148 130 139 148 126 130 148 120 131 148 120 135 148
diff :... ......... .................................... +4...................................................
: 60 9 60 40 60 70 64 60 67 62 101 53 37 63 49 55 124 43 31 51 101 49 42 69 69 38 64 78 32 71 77 32 75 73
$ ffmpeg_v9b -y -i "snippet_tai4.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 320k "kfanmr2.mp4"
w : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44
zeros: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
sf_idx 102 102 99 104 102 104 101 103 100 100 99 102 102 104 99 103 134 165 103 102 102 103 103 101 101 101 103 102 103 101 102 100 102 101 102 100 101 99 100 100 101 101 109 110 101
diff :...................................................... -2..............................................................................
: 60 60 57 65 58 62 57 62 57 60 59 63 60 62 55 64 91 91 -2 59 60 61 60 58 60 60 62 59 61 58 61 58 62 59 61 58 61 58 61 60 61 60 68 61 51
$ ffmpeg_v9b -y -i "snippet_tai4.wav" -c:a aac -strict experimental -aac_coder anmr -q:a 0.5 "kfanmr2.mp4"
w : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
zeros: 0 1 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0 0 0 0
# # # # # # # # # # # # # # #
sf_idx 186 186 121 119 119 119 119 119 119 119 119 119 119 119 119 121 124 130 179 178 129 123 123 123 123 175 176 180 123 134 188 129 122 171 182 133
diff :... -5...... ..................... ...... ........................
: 60 -5 58 60 62 63 66 109 59 11 54 112 61 7 71 114 1 53 109 71 11
$ ffmpeg_v9b -y -i "snippet_tai4.wav" -c:a aac -strict experimental -aac_coder anmr -q:a 1 "kfanmr2.mp4"
w : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
zeros: 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#
sf_idx 180 180 119 123 120 120 121 122 124 126 121 163 157 123 119 119 119 140 166 127 127 123 176 132 156 170 136 165 120 165 160 165 171 134 171 128 168 167 166 167 138
diff :... -1.....................................................................................................................
: 60 -1 64 57 60 61 61 62 62 55 102 54 26 56 60 60 81 86 21 60 56 113 16 84 74 26 89 15 105 55 65 66 23 97 17 100 59 59 61 31
$ ffmpeg_v9b -y -i "snippet_tai3.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 192k "kfanmr3.mp4"
w : 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 4 4 4 4 4 4 4 4 4 4 4 4 7 7 7 7 7 7 7 7 7 7 7 7
i : 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11
zeros: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#
sf_idx 91 91 90 90 88 92 93 93 93 100 104 109 85 83 82 78 76 76 84 84 95 100 111 148 86 83 77 73 87 96 102 105 106 110 117 153 90 89 77 77 77 98 98 102 106 108 116 156
diff : ..................................................................... -2................................. -3.................................
: 60 59 60 58 64 61 60 60 67 64 65 36 58 59 56 58 60 68 60 71 65 71 97 -2 57 54 56 74 69 66 63 61 64 67 96 -3 59 48 60 60 81 60 64 64 62 68 100
$ ffmpeg_v9b -y -i "snippet_tai3.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 256k "kfanmr3.mp4"
w : 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 4 4 4 4 4 4 4 4 4 4 4 4 7 7 7 7 7 7 7 7 7 7 7 7
i : 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11
zeros: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#
sf_idx 91 91 90 90 84 92 93 93 93 100 102 105 85 83 81 78 76 76 84 84 95 100 111 148 86 83 77 73 91 91 102 105 106 110 117 153 90 89 77 77 88 91 102 102 106 110 116 156
diff : ..................................................................... -2................................. -3.................................
: 60 59 60 54 68 61 60 60 67 62 63 40 58 58 57 58 60 68 60 71 65 71 97 -2 57 54 56 78 60 71 63 61 64 67 96 -3 59 48 60 71 63 71 60 64 64 66 100
$ ffmpeg_v9b -y -i "snippet_tai3.wav" -c:a aac -strict experimental -aac_coder anmr -b:a 320k "kfanmr3.mp4"
w : 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 4 4 4 4 4 4 4 4 4 4 4 4 7 7 7 7 7 7 7 7 7 7 7 7
i : 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11
zeros: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#
sf_idx 91 91 90 90 84 92 93 93 93 100 102 105 85 83 81 78 76 76 84 84 95 98 111 148 86 83 77 73 91 91 102 105 106 110 117 153 90 82 81 77 88 98 102 102 106 110 116 156
diff : ..................................................................... -2................................. -3.................................
: 60 59 60 54 68 61 60 60 67 62 63 40 58 58 57 58 60 68 60 71 63 73 97 -2 57 54 56 78 60 71 63 61 64 67 96 -3 52 59 56 71 70 64 60 64 64 66 100
by , 10 years ago
Attachment: | aac-improvements-wip-v7-new.2.patch added |
---|
v7 patch altered to reflect the latest changes. This should work for the git head.
by , 10 years ago
Attachment: | aac-improvements-wip-v9b-new.patch added |
---|
v9b patch altered to reflect the latest changes. This should work for the git head.
comment:363 by , 10 years ago
The -aac_coder anmr assertion error still happens after Michael Niedermayer's changes.
comment:364 by , 10 years ago
I tracked it all the way to codebook_trellis_rate and encode_window_bands_info with short windows. When the optimum allocation spans more than SCALE_MAX_DIFF scalefactor steps, anmr is careful not to create allocations whose encoded deltas would exceed SCALE_MAX_DIFF, but codebook_trellis_rate and encode_window_bands_info both undo this. I have tried many ways to patch that, to no avail so far. The issue almost always happens near zero bands and at the switch from one window group to the next (which is usually where the greatest deltas occur and where the limit is hardest to enforce).
But I did just notice that the assertion error happens on git head as well, so it seems to be a pre-existing bug. Feel free to check under exactly which conditions; I just saw the error logs on a run of tests and haven't yet compared when it happens against v9b.
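As a reading aid for the dumps above: the second "diff" row in each dump appears to be each band's sf_idx minus the sf_idx of the previous non-zero band, plus 60, and the assertion fires when that value leaves 0..120. The following is a minimal C sketch of that check under those assumptions. SCALE_DIFF_ZERO and SCALE_MAX_DIFF are redefined locally with values inferred from the assertion text, and first_bad_sf_delta is a hypothetical helper, not a function from the encoder.

```c
#include <stddef.h>

/* Values inferred from the quoted assertion (0..120 around a zero point of 60).
 * They are redefined here so the sketch stands alone. */
#define SCALE_DIFF_ZERO 60
#define SCALE_MAX_DIFF  60

/*
 * Hypothetical helper: returns the index of the first band whose coded
 * scalefactor delta falls outside 0..2*SCALE_MAX_DIFF, or -1 if every delta
 * fits. zeroes[k] != 0 marks a band without a coded scalefactor; such a band
 * is skipped and does not update the running reference, which matches the
 * gaps in the dumps.
 */
static int first_bad_sf_delta(const int *sf_idx, const int *zeroes, size_t n)
{
    int have_prev = 0;
    int prev = 0;

    for (size_t k = 0; k < n; k++) {
        if (zeroes[k])
            continue;
        if (have_prev) {
            int diff = sf_idx[k] - prev + SCALE_DIFF_ZERO;
            if (diff < 0 || diff > 2 * SCALE_MAX_DIFF)
                return (int)k;
        }
        prev = sf_idx[k];
        have_prev = 1;
    }
    return -1;
}
```

Fed the tail of window 3 and the head of window 6 from the Sphere no hane dump (127, 126, 126, 187, 164), the helper flags the band holding 187, since 187 - 126 + 60 = 121. That is the window-group switch described in comment:364.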
by , 10 years ago
Attachment: | FFmpeg_anmr_error8.flac added |
---|
This causes the assertion error with -q:a 1 on v9b; -q:a 0.99 or 1.01 is safe. Suzanne Vega, Tom's Diner. http://www.rarewares.org/test_samples/
comment:365 by , 10 years ago
A lot of good work is going on, such as the commits below, thanks to Claudio Freire and Michael Niedermayer.
The patch no longer applies to the git head.
avcodec/aacpsy: Fix AAC Psy PE reduction calculation when multiple iterations are required
This is a small change, but it does have a big impact on bit allocation. all the regressions marked in the report have no audible difference (I didn't check them all though), but the improvements can be heard. This affects mostly high bit rates. It's related to issue #2686.
In the report, A is the patched version, B is unpatched, all comparisons show deltas in the form (A-B), so a positive pSNR delta means a better quality in the patched version, and negative a regression. Regressions are only considered for pSNR deltas below -1db, they're considered serious below -6db. All measurements were done with tiny_psnr.
The summary of the report inline for quick reading:
Files: 58
Bitrates: 6
Tests: 347
Serious Regressions: 0 (0%)
Regressions: 10 (2%)
Improvements: 54 (15%)
Big improvements: 26 (7%)
Worst regression - sine_tester.flac - 384k - StdDev: 1.68 pSNR: -3.05 maxdiff: -178.00
Best improvement - 07 - Bound.flac - 384k - StdDev: -1700.05 pSNR: 20.64 maxdiff: -29595.00
Average - StdDev: -55.67 pSNR: 1.20 maxdiff: -1593.00
AAC: Fix M/S stereo encoding
This patch fixes a pointer arithmetic bug in adjust_frame_information that resulted in heavily corrupted audio when using M/S encoding. Also, a backup copy of untransformed coefficients has to be kept around or attempts at re-processing the frame (which happens when heavily overspending bits during transients) will result in re-encoding of the coefficients and subsequent corruption of the resulting stream.
A/B testing shows the bug as corrected, but still cannot prove that M/S coding is a win at least in numbers. Limited listening tests do show improvement on M/S encoded samples in lower bitrates, but they're hidden among the other artifacts that remain to be corrected in the encoder. Some of the regressions flagged in the report do show poor stereo image (but not buggy), so M/S encoding is clearly not good enough yet to be defaulted to auto.
In numbers, Patched against Unpatched, stereo_mode auto:
Files: 114
Bitrates: 6
Tests: 683
Serious Regressions: 0 (0%)
Regressions: 0 (0%)
Improvements: 227 (33%)
Big improvements: 92 (13%)
Worst regression - mybloodrusts.wv - 256k - StdDev: 28.61 pSNR: -0.43 maxdiff: 1372.00
Best improvement - 60.wv - 384k - StdDev: -369.57 pSNR: 45.02 maxdiff: -13322.00
Average - StdDev: -80.56 pSNR: 2.49 maxdiff: -8858.00
Patched against Unpatched stereo_mode ms_off shows no difference.
Patched stereo_mode auto vs Unpatched stereo_mode ms_off shows a small average improvement, just not too significant:
Serious Regressions: 0 (0%)
Regressions: 10 (1%)
Improvements: 45 (6%)
Big improvements: 2 (0%)
Worst regression - Illinois.wv - 256k - StdDev: 33.20 pSNR: -2.03 maxdiff: 477.00
Best improvement - song_of_circomstances.flac - 384k - StdDev: -3.97 pSNR: 7.61 maxdiff: -826.00
Average - StdDev: -10.25 pSNR: 0.20 maxdiff: -281.00
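The first commit message above gives two thresholds for reading its report: a pSNR delta (A-B) below -1 dB counts as a regression, and below -6 dB as a serious one. A minimal C sketch of that rule follows. The enum and function names are hypothetical, and the thresholds for "Improvements" and "Big improvements" are not stated in the message, so they are deliberately left out.

```c
/* Hypothetical names; only the two regression thresholds quoted above are modelled. */
enum psnr_verdict {
    PSNR_SERIOUS_REGRESSION, /* delta below -6 dB */
    PSNR_REGRESSION,         /* delta below -1 dB, but not below -6 dB */
    PSNR_NO_REGRESSION       /* everything else, including improvements */
};

/* delta_db is the patched pSNR minus the unpatched pSNR (A-B), in dB. */
static enum psnr_verdict classify_psnr_delta(double delta_db)
{
    if (delta_db < -6.0)
        return PSNR_SERIOUS_REGRESSION;
    if (delta_db < -1.0)
        return PSNR_REGRESSION;
    return PSNR_NO_REGRESSION;
}
```

Under this rule the -3.05 dB result for sine_tester.flac is a plain regression, while the -0.43 dB result for mybloodrusts.wv is not counted at all, which agrees with the "Regressions: 0" line in the second report.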
comment:366 by , 10 years ago
Yes, I'm picking apart v9b step by step. Rebasing v9b would be pointless until that's done (I'm going to push another set of small patches tonight for instance).
comment:367 by , 10 years ago
I've encoded many sounds, and listened to hours of them. Standard ABR and VBR were tested.
ffmpeg_aac320k_collapse4 at 96 kbps sounds quite bad on the current git head (r70520).
The sine wave warbling problem #2706 still appears.
comment:368 by , 10 years ago
You might be aware of this, but the recent push to the git head:
"AAC: Add support for 7350Hz sampling rates, no error on too hight bitrate."
- ERROR_IF(i >= 12,
+ ERROR_IF(i == 16
+ || i >= (sizeof(swb_size_1024) / sizeof(*swb_size_1024))
+ || i >= (sizeof(swb_size_128) / sizeof(*swb_size_128)),
undid Michael Niedermayer's contribution:
"avcodec/aacenc: Fix sample rate check".
Fixes out of array read
Fixes CID1257803, CID1257797, CID1257789, CID1257786
- ERROR_IF(i == 16,
+ ERROR_IF(i >= 12,
"Unsupported sample rate %d\n", avctx->sample_rate);
ERROR_IF(s->channels > AAC_MAX_CHANNELS,
"Unsupported number of channels: %d\n", s->channels);
comment:369 by , 10 years ago
Actually no, since swb_size_1024/128 contain 13 elements each, so any value above 12 will trigger the OR conditions. (13 >= 13)
So what he did was practically allow one further frequency.
I'm not sure what the additional i == 16 check is good for, however. I guess it's the maximum limit the spec allows, and the others are just rates we don't implement?
comment:370 by , 10 years ago
Actually, the 12 was there because swb_size_N were of size 12. The patch added another entry, so it'd now be size 13. But instead of using 13, I replaced it by sizeof which is easier to maintain.
i == 16 is there so that the code doesn't depend on the size of swb_size_N. The above loop that searches for the samplerate_index ends at 16, so 16 means the search didn't find anything, and that's also an error independent of whether i is out of bounds for swb_size_N.
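The two-part check described in comment:370 can be sketched in isolation. This is an illustration under stated assumptions, not the encoder's code: sample_rates is a local copy of the MPEG-4 sampling-frequency table (13 defined rates plus reserved slots), band_tables is a hypothetical 13-entry stand-in for swb_size_1024 and swb_size_128 where only the element count matters, and sample_rate_index is a made-up name.

```c
#include <stddef.h>

/* Local copy of the MPEG-4 sampling-frequency table: 13 defined rates,
 * the remaining 3 slots are reserved and left at 0. */
static const int sample_rates[16] = {
    96000, 88200, 64000, 48000, 44100, 32000,
    24000, 22050, 16000, 12000, 11025, 8000, 7350
};

/* Hypothetical stand-in for swb_size_1024 / swb_size_128: 13 entries,
 * contents irrelevant for this sketch. */
static const unsigned char *const band_tables[13] = { 0 };

/* Returns the table index for sample_rate, or -1 if it is unsupported. */
static int sample_rate_index(int sample_rate)
{
    size_t i;

    /* The search ends at 16, so i == 16 means "not found". */
    for (i = 0; i < 16; i++)
        if (sample_rate == sample_rates[i])
            break;

    /* Reject "not found" first, then any index past the band tables.
     * Using sizeof keeps the bound in step with the table when it grows. */
    if (i == 16 || i >= sizeof(band_tables) / sizeof(*band_tables))
        return -1;
    return (int)i;
}
```

With 13 entries, 7350 Hz maps to index 12 and is accepted, while an index of 13 or more fails the sizeof comparison, which is the (13 >= 13) case mentioned in comment:369.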
comment:372 by , 10 years ago
Sorry for the late reply. I meant to answer a while ago but other duties made me lose track.
Currently, I'm debugging the patch that adds support for 7350 Hz sample rates, which doesn't pass the tests on MIPS, and I'm a bit at a loss. The issue seems to be a floating-point precision issue that is not on the MIPS-specific side of the code, and that surfaces with hardware floating point emulation. So... it doesn't look immediately fixable.
So if I don't find a fix, the TODO follows:
- Push small fixes
- ANMR bugs
- Push improvements (in no particular order)
- M/S search improvements
- Clip avoidance
- VBR support
- R/C improvements, bit allocation improvements, etc. (this last step I'm not sure I can split into smaller steps, as they all interact, and extracting one is not only difficult but may also cause regressions)
The assertion error I will leave for GSoC students to try and fix (or will attack it later if GSoC fails to address it).
comment:373 by , 10 years ago
Personally, I wouldn't worry too much about a MIPS-only issue at 7350 Hz; that's an unusual edge case twice over. It can be scheduled for later.
comment:374 by , 10 years ago
Is Rostislav Pehlivanov going to implement the Perceptual Noise Substitution?
comment:375 by , 10 years ago
Cc: | added |
---|
comment:376 by , 10 years ago
Well, that's really up to GSoC, but he's already done a PoC that's good enough for a starting point, so that's a probably.
comment:377 by , 10 years ago
Cc: | added |
---|
comment:379 by , 9 years ago
klaussfreire is working on v9c of the patch and hopes to get it sent to the mailing list by the end of the month.
comment:380 by , 9 years ago
I am currently conducting a personal listening test of these 5 encoders at 96kbps. The progress is 23% (17 samples / 74 samples).
- lame 3.99.5 --abr 98
- opusenc opus-tools-0.1.9 --bitrate 91
- NeroAACEnc -q 0.333
- ffmpeg_r70351_v7_patch -c:a aac -strict experimental -b:a 96k
- ffmpeg_r70351_v9b_patch -c:a aac -strict experimental -b:a 96k
comment:381 by , 9 years ago
Nice.
I have some improvements on v9b but RL has not been in the mood to let me polish the patches for the ML. I'm close to having one almost submittable, but every DB at work just decided to act up :(
Anyway, looking forward to hearing about the results of those tests.
follow-up: 383 comment:382 by , 9 years ago
BTW... on which revision are you applying v9b? (it doesn't apply on head anymore)
comment:383 by , 9 years ago
Replying to klaussfreire:
BTW... on which revision are you applying v9b? (it doesn't apply on head anymore)
ffmpeg version N-70351-g2b40416 Copyright (c) 2000-2015 the FFmpeg developers
built with gcc 4.8.1 (GCC)
configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libx264 --enable-w32threads --extra-ldflags=-static --optflags=-O2
libavutil 54. 19.100 / 54. 19.100
libavcodec 56. 26.100 / 56. 26.100
libavformat 56. 23.106 / 56. 23.106
libavdevice 56. 4.100 / 56. 4.100
libavfilter 5. 11.102 / 5. 11.102
libswscale 3. 1.101 / 3. 1.101
libswresample 1. 1.100 / 1. 1.100
libpostproc 53. 3.100 / 53. 3.100
By the way, -cutoff 124 or lower at a 44.1kHz sample rate stops the FFmpeg AAC encoder, regardless of the other settings and the bitrate.
comment:384 by , 9 years ago
Yes, I've been thinking about adding a lower limit, but I couldn't decide on a bound. For almost every candidate I could think of, I could also think of a reason why it might be useful.
I guess we have another candidate there. That's probably the cutoff for the first scalefactor window.
comment:385 by , 9 years ago
The progress of the personal listening test is 50% now. (37 samples done / 74 samples).
- lame 3.99.5 --abr 98
- opusenc opus-tools-0.1.9 --bitrate 91
- NeroAACEnc -q 0.333
- ffmpeg_r70351_v7_patch -c:a aac -strict experimental -b:a 96k
- ffmpeg_r70351_v9b_patch -c:a aac -strict experimental -b:a 96k
comment:386 by , 9 years ago
I am wondering if someone may want to use -cutoff 120 to make an LFE channel for surround sound (although it won't work).
Anything lower than -cutoff 3000 makes no sense from a psychoacoustic point of view.
comment:387 by , 9 years ago
I would like to thank Rostislav Pehlivanov and Michael Niedermayer for committing the improvement.
Is the current git head similar to the N-70351-g2b40416+v9b patch I am testing?
comment:388 by , 9 years ago
@Kamedo2:
No, the current git master contains no changes from the previous v9b or the future v9c yet.
Claudio Freire is currently merging his v9c patch with the current git master and should send it off to the mailing list once he's done. Hopefully this won't take long; he only needs to tweak the PNS band marking.
comment:389 by , 9 years ago
@atomnuker
Thank you. Then I will need another listening test to confirm the progress on the MOS scale. The current test is 64% done now.
by , 9 years ago
Attachment: | ffmpeg_aac_error1.flac added |
---|
FFmpeg never finishes encoding when the sample rate is 8kHz and the bitrate is high: -ar 8000 with -b:a 96k, or with -q:a 0.958 or higher. Fear Factory, Digimortal, Linchpin.
follow-up: 402 comment:390 by , 9 years ago
This rare sample, combined with either of these commands, induces an infinite loop on the current git head.
ffmpeg73505 -i ffmpeg_aac_error1.flac -c:a aac -strict experimental -q:a 1 -ar 8000 out.mp4
ffmpeg73505 -i ffmpeg_aac_error1.flac -c:a aac -strict experimental -b:a 96k -ar 8000 out.mp4
comment:391 by , 9 years ago
@Kamedo2
The only time I've had an infinite loop was when I deliberately broke the trellis algorithm, so it's either that or the twoloop function. I'll take a look at it.
comment:393 by , 9 years ago
This also induces an infinite loop on the current git head.
ffmpeg73515 -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 13000 out.mp4
ffmpeg73515 -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 15000 out.mp4
comment:394 by , 9 years ago
The results of the 96kbps listening test
The v9b patch beat the stable v7 patch. The quality of LAME and Nero was higher than that of FFmpeg's native encoder.
Encoders
- lame 3.99.5 --abr 98
- opusenc opus-tools-0.1.9 --bitrate 91
- NeroAACEnc -q 0.333
- ffmpeg_r70351_v7_patch -c:a aac -strict experimental -b:a 96k
- ffmpeg_r70351_v9b_patch -c:a aac -strict experimental -b:a 96k
Used Sound Samples
There were no significant differences between corpora.
- 40 tracks from the 2014 public listening test. http://listening-test.coresv.net/results.htm
- 25 tracks from my corpus. http://zak.s206.xrea.com/bitratetest/main.htm
- 9 tracks from SoundExpert. http://soundexpert.org/sound-samples
comment:395 by , 9 years ago
Further discussions of this listening test at 96kbps may be posted on Hydrogenaudio. http://www.hydrogenaud.io/forums/index.php?showtopic=109716
comment:396 by , 9 years ago
Isn't it a bit apples to oranges comparing vbr vs abr? Did you try v9b vbr?
Btw, I'm slowly pushing v9c (not posted here, but similar to v9b), once all is done and merged with GSoC stuff it should improve tenfold.
comment:397 by , 9 years ago
For state-of-the-art encoders, vbr is more advantageous than abr. But for Nero, vbr may not be noticeably superior to abr. http://d.hatena.ne.jp/kamedo2/20110430/1304181738
I tried Nero abr, but I could not set it to 96kbps.
by , 9 years ago
Attachment: | SinceAlways.flac added |
---|
This is one exceptional case that degrades on v9b.
by , 9 years ago
Attachment: | mybloodrusts.flac added |
---|
This is one exceptional case that degrades on v9b.
by , 9 years ago
Attachment: | castanets.flac added |
---|
This is one exceptional case that degrades on v9b.
comment:398 by , 9 years ago
Other exceptional cases where v7 is better than v9b include "Can't Wait Until Tonight (Dry Wurlitzer Mix).flac", "41_30sec.flac", etc.
Can the infinite loop problem on the git head be solved?
comment:399 by , 9 years ago
About the 41_30sec, I believe v9c fixes that, but I'll double-check just in case.
Re. the infinite loop, I'll take a look too when I get the time.
comment:400 by , 9 years ago
I posted more graphs in the discussion thread of the personal listening test at 96kbps.
http://www.hydrogenaud.io/forums/index.php?showtopic=109716
FFmpeg_anmr_error7.flac still stops FFmpeg with the -aac_coder faac and -aac_coder fast options.
comment:401 by , 9 years ago
-aac_coder faac induces an infinite loop whenever the bitrate is clamped to the maximum. It never does so when the bitrate is below the maximum.
This bug is reproducible with any sample and with any channel count or sampling frequency.
ffmpeg74294 -y -i Whitenoise.flac -c:a aac -strict experimental -b:a 530k -ac 2 -ar 44100 -aac_coder faac whitenoise.mp4
ffmpeg version N-74294-g45d9d16 Copyright (c) 2000-2015 the FFmpeg developers
built with gcc 5.2.0 (GCC)
configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
libavutil 54. 30.100 / 54. 30.100
libavcodec 56. 57.100 / 56. 57.100
libavformat 56. 40.101 / 56. 40.101
libavdevice 56. 4.100 / 56. 4.100
libavfilter 5. 32.100 / 5. 32.100
libswscale 3. 1.101 / 3. 1.101
libswresample 1. 2.101 / 1. 2.101
libpostproc 53. 3.100 / 53. 3.100
Input #0, flac, from 'Whitenoise.flac':
Duration: 00:00:05.00, start: 0.000000, bitrate: 1550 kb/s
Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
[aac @ 01e37760] Too many bits per frame requested, clamping to max
Output #0, mp4, to 'whitenoise.mp4':
Metadata:
encoder : Lavf56.40.101
Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 529 kb/s
Metadata:
encoder : Lavc56.57.100 aac
Stream mapping:
Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
size= 86kB time=00:00:01.48 bitrate= 475.6kbits/s
The message "[aac @ 01e37760] Too many bits per frame requested, clamping to max" is the sign that it fails.
ffmpeg74294 -y -i Whitenoise.flac -c:a aac -strict experimental -b:a 133k -ac 1 -ar 22050 -aac_coder faac whitenoise.mp4
ffmpeg version N-74294-g45d9d16 Copyright (c) 2000-2015 the FFmpeg developers
built with gcc 5.2.0 (GCC)
configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
libavutil 54. 30.100 / 54. 30.100
libavcodec 56. 57.100 / 56. 57.100
libavformat 56. 40.101 / 56. 40.101
libavdevice 56. 4.100 / 56. 4.100
libavfilter 5. 32.100 / 5. 32.100
libswscale 3. 1.101 / 3. 1.101
libswresample 1. 2.101 / 1. 2.101
libpostproc 53. 3.100 / 53. 3.100
Input #0, flac, from 'Whitenoise.flac':
Duration: 00:00:05.00, start: 0.000000, bitrate: 1550 kb/s
Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
[aac @ 01997760] Too many bits per frame requested, clamping to max
Output #0, mp4, to 'whitenoise.mp4':
Metadata:
encoder : Lavf56.40.101
Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 22050 Hz, mono, fltp (16 bit), 132 kb/s
Metadata:
encoder : Lavc56.57.100 aac
Stream mapping:
Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
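The clamped rates in the two logs above (529 kb/s for 44100 Hz stereo, 132 kb/s for 22050 Hz mono) are consistent with a cap of 6144 bits per channel for each 1024-sample frame. A minimal C sketch of that arithmetic follows, assuming that cap is what the encoder applies. The macro and function names are hypothetical and are not taken from the encoder source.

```c
#include <stdint.h>

/* Assumed cap: 6144 bits per channel for every 1024-sample frame. */
#define AAC_MAX_BITS_PER_CHANNEL_FRAME 6144
#define AAC_FRAME_SAMPLES              1024

/* Hypothetical helper: highest average bitrate (bits/s) allowed by the cap. */
static int64_t aac_max_bitrate(int channels, int sample_rate)
{
    return (int64_t)AAC_MAX_BITS_PER_CHANNEL_FRAME * channels * sample_rate
           / AAC_FRAME_SAMPLES;
}

/* Returns the bitrate to use; *clamped is set to 1 when the request was cut. */
static int64_t aac_clamp_bitrate(int64_t requested, int channels,
                                 int sample_rate, int *clamped)
{
    int64_t max = aac_max_bitrate(channels, sample_rate);

    *clamped = requested > max;
    return *clamped ? max : requested;
}
```

Under this assumption, -b:a 530k at 44100 Hz stereo exceeds the 529200 bit/s ceiling and -b:a 133k at 22050 Hz mono exceeds 132300 bit/s, so both commands hit the clamp, while a 320k stereo request does not.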
follow-up: 403 comment:402 by , 9 years ago
Replying to Kamedo2:
This rare sample, combined with either of these commands, induces an infinite loop on the current git head.
ffmpeg73505 -i ffmpeg_aac_error1.flac -c:a aac -strict experimental -q:a 1 -ar 8000 out.mp4
ffmpeg73505 -i ffmpeg_aac_error1.flac -c:a aac -strict experimental -b:a 96k -ar 8000 out.mp4
I cannot replicate this bug anymore so it's probably fixed, could you test with the newest git master to see if it causes problems?
I'll look into what causes the faac coder to get stuck at high bitrates.
comment:403 by , 9 years ago
Replying to atomnuker:
I cannot replicate this bug anymore so it's probably fixed, could you test with the newest git master to see if it causes problems?
OK ffmpeg74678-g6701c92 -y -i ffmpeg_aac_error1.flac -c:a aac -strict experimental -b:a 96k -ar 8000 out.mp4
OK ffmpeg74678-g6701c92 -y -i ffmpeg_aac_error1.flac -c:a aac -strict experimental -q:a 1 -ar 8000 out.mp4
OK ffmpeg74721-g4bd99f7 -y -i ffmpeg_aac_error1.flac -c:a aac -strict experimental -b:a 96k -ar 8000 out.mp4
OK ffmpeg74721-g4bd99f7 -y -i ffmpeg_aac_error1.flac -c:a aac -strict experimental -q:a 1 -ar 8000 out.mp4
Yes, it was fixed.
I'll look into what causes the faac coder to get stuck at high bitrates.
OK ffmpeg74678-g6701c92 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 19000 out.mp4
OK ffmpeg74678-g6701c92 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 17000 out.mp4
NG ffmpeg74678-g6701c92 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 15000 out.mp4
NG ffmpeg74678-g6701c92 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 13000 out.mp4
NG ffmpeg74678-g6701c92 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 11000 out.mp4
OK ffmpeg74678-g6701c92 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 9000 out.mp4
OK ffmpeg74721-g4bd99f7 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 19000 out.mp4
OK ffmpeg74721-g4bd99f7 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 17000 out.mp4
NG ffmpeg74721-g4bd99f7 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 15000 out.mp4
NG ffmpeg74721-g4bd99f7 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 13000 out.mp4
NG ffmpeg74721-g4bd99f7 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 11000 out.mp4
OK ffmpeg74721-g4bd99f7 -y -i ffmpeg_aac_lead_voice.flac -c:a aac -strict experimental -b:a 320k -cutoff 9000 out.mp4
The bug still exists in the git master. It induces an infinite loop when the cutoff is 15, 13, or 11kHz, but not at 9kHz. When the cutoff is 7kHz, the initial setup takes 16 seconds, which is unnaturally slow, but the encoded sound is OK. ffmpeg_aac_lead_voice.flac is 44.1kHz mono.
by , 9 years ago
Attachment: | ffmpeg_96k_error.flac added |
---|
Low Freq. Sine Sweep Stereo with right channel inverted; inaudible on mono.
comment:404 by , 9 years ago
This sample crashes FFmpeg when the sampling rate is 96kHz.
ffmpeg74961-g61009a7 -y -i ffmpeg_96k_error.flac -c:a aac -strict experimental -b:a 96k -ac 2 -ar 96000 out.mp4
ffmpeg74961-g61009a7 -y -i ffmpeg_96k_error.flac -c:a aac -strict experimental -b:a 160k -ac 2 -ar 96000 out.mp4
ffmpeg74961-g61009a7 -y -i ffmpeg_96k_error.flac -c:a aac -strict experimental -q:a 1 -ac 2 -ar 96000 out.mp4
I tried many sine sweeps, but it seems that the bug only happens when one of the channels is inverted.
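The attachment note "right channel inverted; inaudible on mono" describes a signal whose mid component cancels completely, leaving all of its energy in the side channel. A small C sketch of that property, using one common mid/side convention (M = (L+R)/2, S = (L-R)/2), is below. The function names are hypothetical, and the sketch only illustrates the test signal; it does not attempt to reproduce the crash.

```c
#include <stddef.h>

/* Convert left/right samples to mid/side with M = (L+R)/2 and S = (L-R)/2. */
static void lr_to_ms(const float *left, const float *right,
                     float *mid, float *side, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        mid[i]  = 0.5f * (left[i] + right[i]);
        side[i] = 0.5f * (left[i] - right[i]);
    }
}

/* Sum of squared samples, accumulated in double precision. */
static double energy(const float *x, size_t n)
{
    double e = 0.0;

    for (size_t i = 0; i < n; i++)
        e += (double)x[i] * x[i];
    return e;
}
```

When right[i] == -left[i] for every sample, mid is all zeros and side equals left, so a stereo tool that looks at the relation between the two channels sees a fully anti-correlated pair. That may be relevant to why only inverted-channel sweeps trigger the problem, but the thread does not confirm it.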
comment:405 by , 9 years ago
There's something weird happening in the search_for_is for one of the phases.
Will submit a patch in a few hours and reply here for you to test.
follow-up: 407 comment:406 by , 9 years ago
Thank you. With your devotion, the sound is getting great, and I have heard no apparent problem on over 20 hours of music and speech tracks on common settings.
comment:407 by , 9 years ago
Replying to Kamedo2:
Thank you. With your devotion, the sound is getting great, and I have heard no apparent problem on over 20 hours of music and speech tracks on common settings.
Thanks, nice to know someone's using the encoder.
Make sure to reencode them once the encoder's ready :-)
Fixed the bug. Probably fixes quite a lot of IS artifacts too.
comment:408 by , 9 years ago
I think the sound deteriorated on ffmpeg75016-g50d9121, compared to 74961-g61009a7, after fixing the bug. Tested on music tracks on 128k, 192k, 320k, q1, q2. Stereo 192k 32000Hz is especially worsened.
comment:409 by , 9 years ago
Huh, that's odd. The changes which I made to PNS today (at 12:39 UTC, commit b6cc8ec7ec) brought PNS closer to what it used to be before but fixed the warbling artifacts at lower frequencies (it's used a lot more now). The changes to the IS which fixed the bug yesterday (1956cfbaedd36) shouldn't really have done much to the quality at all and I didn't hear a difference.
74961-g61009a7 is before I made my PNS changes from yesterday, so are you sure that the last current git master still sounds worse? The PNS commit I made yesterday did reduce PNS usage too much (before I fixed that today).
Either way, I'll take a listen to what the encoder sounded like before and try to see if it's better in the current master.
by , 9 years ago
Attachment: | mybloodrusts.ff74961_128k.mp4 added |
---|
mybloodrusts.flac encoded at -b:a 128k by ffmpeg74961-g61009a7.
by , 9 years ago
Attachment: | mybloodrusts.ff75043_128k.mp4 added |
---|
mybloodrusts.flac encoded at -b:a 128k by ffmpeg75043-gb31041a.
comment:410 by , 9 years ago
I tried both 74961-g61009a7 and the current git head (75043-gb31041a), and 74961-g61009a7 was noticeably better than the current git head. Its S/N was significantly better on all of L, R, M, and S, and the difference was more pronounced on tonal tracks than on transient blocks.
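The comment does not say which tool produced the S/N figures. As a rough sketch only, one way to get a per-channel signal-to-noise figure for L, R, M, and S from a reference and a decoded stereo signal is shown below. It assumes M = (L+R)/2 and S = (L-R)/2, that both signals are already time-aligned, and it returns a linear power ratio (take 10*log10 of it for dB). The names `snr4` and `stereo_snr` are made up for this illustration.

```c
#include <assert.h>
#include <math.h>    /* INFINITY macro only; no libm functions are called */
#include <stddef.h>

/* Linear signal/noise power ratios for left, right, mid and side. */
struct snr4 { double l, r, m, s; };

static double power_ratio(double signal, double noise)
{
    /* Identical signals have zero noise power: report an infinite ratio. */
    return noise > 0.0 ? signal / noise : INFINITY;
}

/* ref_* is the original audio, dec_* the decoded audio, n samples each. */
static struct snr4 stereo_snr(const float *ref_l, const float *ref_r,
                              const float *dec_l, const float *dec_r,
                              size_t n)
{
    double sig[4] = { 0 }, err[4] = { 0 };
    struct snr4 out;

    assert(ref_l && ref_r && dec_l && dec_r);
    for (size_t i = 0; i < n; i++) {
        const double ref[4] = { ref_l[i], ref_r[i],
                                (ref_l[i] + ref_r[i]) / 2.0,
                                (ref_l[i] - ref_r[i]) / 2.0 };
        const double dec[4] = { dec_l[i], dec_r[i],
                                (dec_l[i] + dec_r[i]) / 2.0,
                                (dec_l[i] - dec_r[i]) / 2.0 };
        for (int c = 0; c < 4; c++) {
            const double d = ref[c] - dec[c];
            sig[c] += ref[c] * ref[c];   /* signal energy */
            err[c] += d * d;             /* coding error energy */
        }
    }
    out.l = power_ratio(sig[0], err[0]);
    out.r = power_ratio(sig[1], err[1]);
    out.m = power_ratio(sig[2], err[2]);
    out.s = power_ratio(sig[3], err[3]);
    return out;
}
```

Comparing the four ratios between two encoder builds on the same input gives numbers comparable to the ones described above, although a plain S/N says little about audible quality on its own.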
follow-up: 412 comment:411 by , 9 years ago
Well, thanks for the feedback.
I'll add back the 4+ quantization factor for the PNS energy tomorrow morning. I've tested other decoders (to make sure it's not the ffmpeg decoder causing issues), so I have no idea why there's such an energy difference, but apparently that 4+ was enough to make PNS sound right.
As for the L, R, M and S signal/noise ratio, did you test that without PNS? That could have interfered. Could you tell me if IS sounded better before or after without PNS?
comment:412 by , 9 years ago
Confirmed that the current git head 75147-g9d742d2 fixed the regression.
Replying to atomnuker:
As for the L, R, M and S signal/noise ratio, did you test that without PNS? That could have interfered. Could you tell me if IS sounded better before or after without PNS?
I have tested -b:a 128k -ar 44100, -b:a 192k -ar 32000, -b:a 320k -ar 48000, -q:a 1 -ar 44100, and -q:a 2 -ar 48000, without any additional -aac_pns enable or -aac_is enable settings.
What optional settings should I test?
comment:413 by , 9 years ago
PNS and IS are enabled by default, so your tests would've included them in any case.
Pass -aac_pns 0 to disable it, and test IS alone.
comment:414 by , 9 years ago
I have tested 75156-gfd8b90f. At 128kbps with IS enabled, the result with PNS is better than without it.
http://wiki.hydrogenaud.io/index.php?title=Joint_stereo#Intensity_Stereo
Intensity stereo is by definition a lossy coding method thus it is primarily useful at low bitrates. For coding at higher bitrates only mid-side stereo should be used.
comment:415 by , 9 years ago
In my tests, it has usually been the case that as you increase the bitrate, IS is used less while MS usage increases, naturally due to the R/D model used. If you see otherwise, it would be useful to know and have a sample to better tweak the model.
comment:416 by , 9 years ago
Klaussfreire, thank you for the explanation! At 240kbps, it was hard to spot the difference between -aac_is 0 and -aac_is enable.
follow-up: 418 comment:417 by , 9 years ago
Hmm, perhaps it's best I add a print for PNS/IS/Prediction/MS/TNS usage when the verbose level has been increased.
Anyway, Kamedo2: I pushed some PNS patches yesterday which should have fixed the drop in quality. Did it improve?
comment:418 by , 9 years ago
Replying to atomnuker:
Anyway, Kamedo2: I pushed some PNS patches yesterday which should have fixed the drop in quality. Did it improve?
Yes, 75156-gfd8b90f is great.
comment:419 by , 9 years ago
I tested 75268-g3f9fa2d and the quality was a bit worse than 74961-g61009a7 and 75147-g9d742d2 on -b:a 128k 44.1kHz stereo, but the encoding speed was very fast.
follow-up: 422 comment:420 by , 9 years ago
libavcodec/aaccoder_twoloop.h line 172: 60 - qstep
166 if (tbits > destbits) {
167     for (i = 0; i < 128; i++)
168         if (sce->sf_idx[i] < 218 - qstep)
169             sce->sf_idx[i] += qstep;
170 } else {
171     for (i = 0; i < 128; i++)
172         if (sce->sf_idx[i] > 60 - qstep)
173             sce->sf_idx[i] -= qstep;
174 }
might have been meant as
166 if (tbits > destbits) {
167     for (i = 0; i < 128; i++)
168         if (sce->sf_idx[i] < 218 - qstep)
169             sce->sf_idx[i] += qstep;
170 } else {
171     for (i = 0; i < 128; i++)
172         if (sce->sf_idx[i] > 60 + qstep)
173             sce->sf_idx[i] -= qstep;
174 }
or
166 if (tbits > destbits) {
167     for (i = 0; i < 128; i++)
168         sce->sf_idx[i] = FFMIN(sce->sf_idx[i]+qstep, 217);
169 } else if (destbits > tbits){
170     for (i = 0; i < 128; i++)
171         sce->sf_idx[i] = FFMAX(sce->sf_idx[i]-qstep, 61);
172 } else{
173     break;
174 }
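To make the difference between the three variants concrete, here is a small standalone sketch of the "lower the scalefactor" branch only. It assumes, as the report implies, that the intent is to keep indices above 60. The helper names are hypothetical and not part of FFmpeg; FFMAX is redefined locally so the snippet compiles on its own.

```c
#include <assert.h>

#define FFMAX(a, b) ((a) > (b) ? (a) : (b))

/* Variant as quoted from line 172: the guard subtracts qstep. */
static int lower_sf_original(int sf, int qstep)
{
    assert(qstep > 0);
    if (sf > 60 - qstep)
        sf -= qstep;
    return sf;
}

/* First suggestion: only step down when the result stays above 60. */
static int lower_sf_guarded(int sf, int qstep)
{
    assert(qstep > 0);
    if (sf > 60 + qstep)
        sf -= qstep;
    return sf;
}

/* Second suggestion: always step down, then clamp to a floor of 61. */
static int lower_sf_clamped(int sf, int qstep)
{
    assert(qstep > 0);
    return FFMAX(sf - qstep, 61);
}
```

With sf = 70 and qstep = 32, the quoted guard lets the index drop to 38, the guarded variant leaves it at 70, and the clamped variant lands on 61. The clamped form also keeps making progress when an index sits close to the floor, which the guarded form does not.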
comment:421 by , 9 years ago
I'm confused about what your post is meant to say. This code was just copied from its old position in aaccoder.c.
comment:422 by , 9 years ago
Replying to Kamedo2:
libavcodec/aaccoder_twoloop.h line 172: 60 - qstep
166 if (tbits > destbits) {
167     for (i = 0; i < 128; i++)
168         if (sce->sf_idx[i] < 218 - qstep)
169             sce->sf_idx[i] += qstep;
170 } else {
171     for (i = 0; i < 128; i++)
172         if (sce->sf_idx[i] > 60 - qstep)
173             sce->sf_idx[i] -= qstep;
174 }
might have been meant as
166 if (tbits > destbits) {
167     for (i = 0; i < 128; i++)
168         if (sce->sf_idx[i] < 218 - qstep)
169             sce->sf_idx[i] += qstep;
170 } else {
171     for (i = 0; i < 128; i++)
172         if (sce->sf_idx[i] > 60 + qstep)
173             sce->sf_idx[i] -= qstep;
174 }
or
166 if (tbits > destbits) {
167     for (i = 0; i < 128; i++)
168         sce->sf_idx[i] = FFMIN(sce->sf_idx[i]+qstep, 217);
169 } else if (destbits > tbits){
170     for (i = 0; i < 128; i++)
171         sce->sf_idx[i] = FFMAX(sce->sf_idx[i]-qstep, 61);
172 } else{
173     break;
174 }
You're right; v9c has it fixed (probably earlier versions too).
comment:423 by , 9 years ago
I have encoded over 200 GB of diverse sounds on diverse settings without apparent problems.
comment:424 by , 9 years ago
I've been bugging Claudio almost daily to push his work to git master so that finally we can move on with testing it out and nailing any last bugs left.
This will hopefully happen in a day or two if there are no setbacks left.
comment:426 by , 9 years ago
-stereo_mode in the FFmpeg Codecs Documentation was abolished, and -aac_ms 1 (Force M/S stereo coding) will be used instead. Am I right?
comment:428 by , 9 years ago
Replying to Kamedo2:
Yes, all encoder options now start with "-aac_":
ffmpeg -help encoder=aac
AAC encoder AVOptions:
  -aac_coder <int> E...A... Coding algorithm (from -1 to 3) (default 2)
     faac E...A... FAAC-inspired method
     anmr E...A... ANMR method
     twoloop E...A... Two loop searching method
     fast E...A... Constant quantizer
  -aac_ms <boolean> E...A... Force M/S stereo coding (default false)
  -aac_is <boolean> E...A... Intensity stereo coding (default auto)
  -aac_pns <boolean> E...A... Perceptual noise substitution (default auto)
  -aac_tns <boolean> E...A... Temporal noise shaping (default auto)
  -aac_pred <boolean> E...A... AAC-Main prediction (default auto)
An option whose default is 'auto' is decided by the profile unless you set it on the command line. An option whose default is not 'auto' uses the default value shown. Also, "-aac_ms" is not strictly boolean as indicated: it can be set to '-1', which means M/S coding will be used automatically whenever there is an encoding gain.
Keep in mind the psychoacoustic system currently doesn't account for the cutoff which the new coder introduced, leading to bits being wasted and the quality being decreased. Claudio will be pushing a patch to fix that. This only affects heavy synth samples but should fix a lot of bugs which might be related currently. This is also what's currently blocking us from removing the 'experimental' flag.
Also, I still have to merge my LTP patches, which will happen later today.
Kamedo2: Not sure how but I got an email invitation from Shion to Slack (Audio Video Encoding Community) which you are apparently a member of. I understand enough Japanese to kinda understand the email and I'd love to join but I'm still learning, let alone understanding technical jargon. Sorry :|
Maybe after I know a little more.
follow-up: 430 comment:429 by , 9 years ago
aac_tns "auto" is a bit misleading though, it's not actually turned on for any of the profiles.
comment:430 by , 9 years ago
Replying to heleppkes:
aac_tns "auto" is a bit misleading though, it's not actually turned on for any of the profiles.
Not yet, no. I'll change it to false when I commit my LTP changes.
comment:431 by , 9 years ago
The combinations of these options below are now extensively tested. Rate, speed, and error codes are monitored.
["-aac_coder faac", "-aac_coder fast", "-aac_coder twoloop"],
["-aac_ms 0","-aac_ms 1"],
["-aac_is 0","-aac_is 1"],
["-aac_tns 0","-aac_tns 1"],
["-profile:a aac_main -aac_pred 1 -aac_pns 0","-profile:a aac_low -aac_pns 1","-profile:a mpeg2_aac_low"],
["-ar 8000", "-ar 44100", "-ar 48000", "-ar 96000"],
["-b:a 16k", "-b:a 96k", "-b:a 128k", "-q:a 1", "-b:a 240k", "-b:a 320k", "-b:a 512k", "-q:a 0.25"]
(2304 combinations total)
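The total is simply the product of the group sizes: 3 × 2 × 2 × 2 × 3 × 4 × 8 = 2304. As an illustration of how such a test matrix can be enumerated (the actual test harness is not shown in this ticket, so this is only a sketch with hypothetical helper names), each combination index can be decoded as a mixed-radix number over the option groups:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Option groups copied from the list in this comment. */
static const char *const opt_coder[] = { "-aac_coder faac", "-aac_coder fast",
                                         "-aac_coder twoloop" };
static const char *const opt_ms[]    = { "-aac_ms 0", "-aac_ms 1" };
static const char *const opt_is[]    = { "-aac_is 0", "-aac_is 1" };
static const char *const opt_tns[]   = { "-aac_tns 0", "-aac_tns 1" };
static const char *const opt_prof[]  = {
    "-profile:a aac_main -aac_pred 1 -aac_pns 0",
    "-profile:a aac_low -aac_pns 1",
    "-profile:a mpeg2_aac_low" };
static const char *const opt_rate[]  = { "-ar 8000", "-ar 44100", "-ar 48000",
                                         "-ar 96000" };
static const char *const opt_br[]    = { "-b:a 16k", "-b:a 96k", "-b:a 128k",
                                         "-q:a 1", "-b:a 240k", "-b:a 320k",
                                         "-b:a 512k", "-q:a 0.25" };

struct option_group { const char *const *values; size_t count; };

static const struct option_group groups[] = {
    { opt_coder, COUNT(opt_coder) }, { opt_ms,   COUNT(opt_ms)   },
    { opt_is,    COUNT(opt_is)    }, { opt_tns,  COUNT(opt_tns)  },
    { opt_prof,  COUNT(opt_prof)  }, { opt_rate, COUNT(opt_rate) },
    { opt_br,    COUNT(opt_br)    },
};

/* Number of distinct combinations: product of all group sizes. */
static size_t total_combinations(void)
{
    size_t n = 1;
    for (size_t g = 0; g < COUNT(groups); g++)
        n *= groups[g].count;
    return n;
}

/* Write combination number idx into buf; returns 0, or -1 if buf is too small. */
static int format_combination(size_t idx, char *buf, size_t size)
{
    size_t used = 0;

    assert(buf && size > 0 && idx < total_combinations());
    buf[0] = '\0';
    for (size_t g = 0; g < COUNT(groups); g++) {
        const char *v = groups[g].values[idx % groups[g].count];
        int n = snprintf(buf + used, size - used, "%s%s", used ? " " : "", v);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
        idx /= groups[g].count;   /* move on to the next "digit" */
    }
    return 0;
}
```

Looping idx from 0 to total_combinations() - 1 and appending each formatted string to an ffmpeg command line visits every combination exactly once.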
-aac_coder anmr seems to be unstable and prone to crashing.
-aac_coder faac and -aac_coder fast often blatantly ignore the requested bitrate.
atomnuker: Yes, I am a member of the Fueru Wakame, the audio video encoding community on Slack. Glad to hear that you understand Japanese to that point.
comment:432 by , 9 years ago
None of anmr, faac, or fast was modified.
It is possible that they need to be updated to avoid the crashing, although I don't see how exactly. You could try confirming whether earlier revisions also exhibit that behavior, and how far back.
faac will probably be scrapped, fast will have to be rewritten, and anmr is a big question mark at present. I've been working on ANMR, and some problems have surfaced that don't seem easy to resolve, or even possible to resolve, with ANMR's approach. Surely it can be made not to crash, but beyond that I'm unsure how far we can push ANMR.
For now, the priority is twoloop.
comment:433 by , 9 years ago
Since this appears to be the aac encoder development thread I have been fuzzing the encoder and get this crash a lot:
comment:434 by , 9 years ago
Yeah, I asked on IRC but you seemed to be away: did you build with assertion_level=2? Can you share a sample that reproduces the crash? (or add the fuzzer as a fate test?)
I don't see many ways in which that crash could happen. The only way I can think of has an assert that should have tripped earlier.
comment:435 by , 9 years ago
Yes, same with assertion-level=2
Sample can be found here: http://obe.tv/Downloads/fuzz/fuzz1.wav
./ffmpeg_g -i "fuzz1.wav" -strict -2 -y out.aac
comment:436 by , 9 years ago
The fuzz1.wav file seems to be corrupted. It appears to be a stereo 48kHz 16-bit linear WAV file, but the header starts with 52 C9 46 46 instead of the usual 52 49 46 46 (RIFF), and the 'fmt ' chunk has a length of 10 02 00 00 (528), where 10 00 00 00 (16) would be normal and is what the context implies.
Kierank, is the wav file playable in your environment?
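The two anomalies described above can be checked mechanically. The sketch below is not taken from FFmpeg; it assumes the canonical layout where the 'fmt ' chunk directly follows the WAVE tag, and it only accepts the 16-byte 'fmt ' size used by plain PCM files (other sizes are legal for other formats). The function and enum names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum wav_check { WAV_OK, WAV_BAD_MAGIC, WAV_BAD_FMT_SIZE };

/* Read a little-endian 32-bit value. */
static uint32_t rl32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * hdr must hold at least the first 20 bytes of the file:
 *   0..3  "RIFF"   4..7  RIFF size   8..11 "WAVE"
 *  12..15 "fmt "  16..19 fmt chunk size (16 for plain PCM)
 */
static enum wav_check check_wav_header(const uint8_t *hdr, size_t len)
{
    assert(hdr && len >= 20);
    if (memcmp(hdr, "RIFF", 4) != 0 || memcmp(hdr + 8, "WAVE", 4) != 0)
        return WAV_BAD_MAGIC;
    if (memcmp(hdr + 12, "fmt ", 4) != 0 || rl32(hdr + 16) != 16)
        return WAV_BAD_FMT_SIZE;
    return WAV_OK;
}
```

A header beginning 52 C9 46 46 fails the magic test, and a size field of 10 02 00 00 decodes to 528 and fails the size test, which matches the description of fuzz1.wav.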
comment:437 by , 9 years ago
Hi,
No, the file is not meant to be playable. It's the output from a tool designed to make crazy inputs in order to crash decoders (or in this case encoders).
Kieran
comment:438 by , 9 years ago
An update: I have a fix (fixes, in fact) for the assertion error. I'll be pushing it as soon as I can confirm it causes no regressions (it did, fixed a few).
comment:441 by , 9 years ago
Kamedo2:
I've improved TNS and have made it the default (-aac_tns 1). Also MS coding is now automatic by default (-aac_ms -1).
I've also added LTP support for voice or piano music encoding, use -aac_ltp 1 (or -profile:a aac_ltp) to test it. All features are currently in git master.
Claudio has 2 small fixes to merge. Hopefully won't take long.
comment:442 by , 9 years ago
I am testing the combination of these options on N-76111-g8c9c8fd.
["-b:a 8k", "-b:a 80k", "-b:a 160k", "-b:a 320k", "-b:a 530k", "-q:a 0.1", "-q:a 2", "-q:a 320k"],
["-profile:a aac_ltp", "-profile:a aac_main -aac_pred 1 -aac_pns 0", "-profile:a aac_low -aac_pns 1", "-profile:a mpeg2_aac_low"],
["-ar 8000", "-ar 11025", "-ar 24000", "-ar 44100", "-ar 48000", "-ar 96000"],
["", "-cutoff 15000", "-cutoff 22050"]
Test tracks are ffmpeg_aacvbr_pulse1.flac(12.12sec), ffmpeg_anmr_error.flac(2.32sec), ffmpeg_96k_error.flac(2.01sec).
-profile:a aac_ltp encoding is currently slower than realtime, about 0.9x at 44.1kHz. Is this the intended behavior?
comment:443 by , 9 years ago
ffmpeg76324 -y -i ffmpeg_anmr_error2.flac -c:a aac -strict experimental -ar 44100 -profile:a aac_ltp -b:a 80k out.mp4
or
ffmpeg76324 -y -i ffmpeg_anmr_error5.flac -c:a aac -strict experimental -ar 44100 -profile:a aac_ltp -b:a 80k out.mp4
crash the encoder.
follow-up: 446 comment:444 by , 9 years ago
ffmpeg76851 -y -i ffmpeg_anmr_error2.flac -c:a aac -strict experimental -ar 44100 -profile:a aac_ltp -b:a 80k out.mp4
still crashes the encoder.
ffmpeg version N-76851-ga330430 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 9.100 / 55. 9.100
  libavcodec 57. 16.100 / 57. 16.100
  libavformat 57. 19.100 / 57. 19.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 15.100 / 6. 15.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'ffmpeg_anmr_error2.flac':
  Duration: 00:00:17.95, start: 0.000000, bitrate: 504 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf57.19.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 80 kb/s
    Metadata:
      encoder : Lavc57.16.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
size= 38kB time=00:00:03.85 bitrate= 81.8kbits/s
comment:445 by , 9 years ago
ANMR isn't getting any love yet.
It will take some time, I discovered some nasty roadblocks in ANMR's approach.
Twoloop keeps giving away lessons that are useful for ANMR too, so my objective is to get twoloop to its full potential before I start massaging ANMR.
comment:446 by , 9 years ago
Replying to Kamedo2:
ffmpeg76851 -y -i ffmpeg_anmr_error2.flac -c:a aac -strict experimental -ar 44100 -profile:a aac_ltp -b:a 80k out.mp4
still crashes the encoder.
Just pushed a commit which improved LTP and fixes the crash. Give it a try. New version is N-76858-g1e5dbb3.
follow-up: 448 comment:447 by , 9 years ago
Thank you, but it still crashes on 80kbps. This N-76863-g8000d48 is after the aac_tablegen speed up.
ffmpeg76863 -i ffmpeg_aacvbr_pulse1.flac -c:a aac -strict experimental -profile:a aac_ltp -b:a 80k out.mp4
ffmpeg76863 -i ffmpeg_aac320k_collapse3.flac -c:a aac -strict experimental -profile:a aac_ltp -b:a 80k out.mp4
comment:448 by , 9 years ago
Replying to Kamedo2:
Thank you, but it still crashes on 80kbps. This N-76863-g8000d48 is after the aac_tablegen speed up.
ffmpeg76863 -i ffmpeg_aacvbr_pulse1.flac -c:a aac -strict experimental -profile:a aac_ltp -b:a 80k out.mp4
ffmpeg76863 -i ffmpeg_aac320k_collapse3.flac -c:a aac -strict experimental -profile:a aac_ltp -b:a 80k out.mp4
Hm, I can't seem to replicate either. Does it crash at any other bitrate for you?
comment:449 by , 9 years ago
ffmpeg76877 -y -i ffmpeg_aac320k_collapse3.flac -c:a aac -strict experimental -ar 44100 -profile:a aac_ltp -b:a 128k out.mp4
ffmpeg version N-76877-g861f2b2 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 9.100 / 55. 9.100
  libavcodec 57. 16.100 / 57. 16.100
  libavformat 57. 19.100 / 57. 19.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 15.100 / 6. 15.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'ffmpeg_aac320k_collapse3.flac':
  Duration: 00:00:12.56, start: 0.000000, bitrate: 684 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf57.19.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 128 kb/s
    Metadata:
      encoder : Lavc57.16.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
av_interleaved_write_frame(): Visual C++ CRT: Not enough memory to complete call to strerror.
size= 1kB time=00:00:02.02 bitrate= 5.0kbits/s
128kbps leads to av_interleaved_write_frame error, 80kbps just crashes.
by , 9 years ago
Attachment: | assertion_diff_shimoseka.m4a added |
---|
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
follow-up: 451 comment:450 by , 9 years ago
$ ffmpeg -i assertion_diff_shimoseka.m4a -strict experimental -c:a aac -f null -
ffmpeg version N-76947-gec494e6 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --disable-doc
  libavutil 55. 9.100 / 55. 9.100
  libavcodec 57. 16.101 / 57. 16.101
  libavformat 57. 19.100 / 57. 19.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 17.100 / 6. 17.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'assertion_diff_shimoseka.m4a':
  Metadata:
    major_brand : M4A
    minor_version : 512
    compatible_brands: isomiso2
    encoder : Lavf57.19.100
  Duration: 00:00:50.13, start: 0.000000, bitrate: 129 kb/s
    Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name : SoundHandler
Output #0, null, to 'pipe:':
  Metadata:
    major_brand : M4A
    minor_version : 512
    compatible_brands: isomiso2
    encoder : Lavf57.19.100
    Stream #0:0(und): Audio: aac, 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name : SoundHandler
      encoder : Lavc57.16.101 aac
Stream mapping:
  Stream #0:0 -> #0:0 (aac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
Aborted (core dumped)
Appears to be a regression, but I did not run a bisect. Attached sample input file from Zeranoe forum user Zaoshi.
comment:451 by , 9 years ago
Replying to llogan:
Appears to be a regression, but I did not run a bisect. Attached sample input file from Zeranoe forum user Zaoshi.
Bug seems to only happen with intensity stereo enabled. The newest patch by klaussfreire fixes the bug. Might look into a quick fix but it's a lower priority than reviewing that patch, considering how many artifacts it fixes.
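The assertion message reads like a range check on the scalefactor delta that is written to the bitstream. As a simplified sketch, assuming deltas are stored with an offset of 60 (which would match the 0..120 range in the message) and ignoring any separate handling of zeroed, PNS, or IS bands, a standalone pre-check over a list of scalefactor indices could look like this. `first_bad_delta` and `SF_DELTA_OFFSET` are illustrative names, not FFmpeg identifiers.

```c
#include <assert.h>

#define SF_DELTA_OFFSET 60   /* assumed offset; gives the 0..120 coded range */

/*
 * Return the index of the first band whose delta to the previous band
 * falls outside the codable range, or -1 if all n deltas are fine.
 */
static int first_bad_delta(const int *sf_idx, int n)
{
    assert(sf_idx && n >= 0);
    for (int i = 1; i < n; i++) {
        int diff = sf_idx[i] - sf_idx[i - 1] + SF_DELTA_OFFSET;
        if (diff < 0 || diff > 120)
            return i;
    }
    return -1;
}
```

Under these assumptions, any coder stage that moves neighbouring bands more than 60 steps apart produces a delta that cannot be encoded, which would be consistent with the failure showing up only for particular inputs and settings.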
comment:452 by , 9 years ago
Two crash bugs at 240kbps, with both -profile:a aac_ltp and the default profile.
ffmpeg77126 -i FFmpeg_anmr_error6.flac -c:a aac -profile:a aac_ltp -b:a 240k out.mp4
ffmpeg version N-77126-g357c626 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 10.100 / 55. 10.100
  libavcodec 57. 17.100 / 57. 17.100
  libavformat 57. 19.100 / 57. 19.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 20.100 / 6. 20.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'FFmpeg_anmr_error6.flac':
  Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
File 'out.mp4' already exists. Overwrite ? [y/N] y
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf57.19.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 240 kb/s
    Metadata:
      encoder : Lavc57.17.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

ffmpeg77126 -y -i FFmpeg_anmr_error6.flac -c:a aac -b:a 240k out.mp4
ffmpeg version N-77126-g357c626 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 10.100 / 55. 10.100
  libavcodec 57. 17.100 / 57. 17.100
  libavformat 57. 19.100 / 57. 19.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 20.100 / 6. 20.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'FFmpeg_anmr_error6.flac':
  Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf57.19.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 240 kb/s
    Metadata:
      encoder : Lavc57.17.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
comment:453 by , 9 years ago
Another crash example, probably another arithmetic overflow like comment:440. I think anything above -q:a 3 should be clipped to -q:a 3, because I can't think of any practical use for higher values, and clipping would simplify the testing procedure.
ffmpeg77126 -y -i FFmpeg_anmr_error6.flac -c:a aac -q:a 1280k out.mp4
ffmpeg version N-77126-g357c626 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 10.100 / 55. 10.100
  libavcodec 57. 17.100 / 57. 17.100
  libavformat 57. 19.100 / 57. 19.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 20.100 / 6. 20.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'FFmpeg_anmr_error6.flac':
  Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
comment:454 by , 9 years ago
Fixed the FFmpeg_anmr_error6.flac crashes in git master, version N-77158-g4c5136a. Give it a test.
comment:455 by , 9 years ago
These three still crash.
ffmpeg77171 -y -i FFmpeg_aacvbr_pulse2.flac -c:a aac -b:a 1 out.mp4
ffmpeg version N-77171-g89bbf01 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 10.100 / 55. 10.100
  libavcodec 57. 17.100 / 57. 17.100
  libavformat 57. 19.100 / 57. 19.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 20.100 / 6. 20.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'FFmpeg_aacvbr_pulse2.flac':
  Duration: 00:00:16.10, start: 0.000000, bitrate: 1167 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
[aac @ 001b5ba0] Bitrate 1 is extremely low, maybe you mean 1k
The bitrate parameter is set too low. It takes bits/s as argument, not kbits/s
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf57.19.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 48000 Hz, stereo, fltp (16 bit), 0 kb/s
    Metadata:
      encoder : Lavc57.17.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

ffmpeg77171 -y -i FFmpeg_aacvbr_pulse2.flac -c:a aac -ar 11025 -cutoff 5000 -profile:a aac_main -b:a 8k out.mp4
ffmpeg version N-77171-g89bbf01 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 10.100 / 55. 10.100
  libavcodec 57. 17.100 / 57. 17.100
  libavformat 57. 19.100 / 57. 19.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 20.100 / 6. 20.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'FFmpeg_aacvbr_pulse2.flac':
  Duration: 00:00:16.10, start: 0.000000, bitrate: 1167 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf57.19.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 11025 Hz, stereo, fltp (16 bit), 8 kb/s
    Metadata:
      encoder : Lavc57.17.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

ffmpeg77171 -y -i FFmpeg_anmr_error6.flac -c:a aac -q:a 1280k out.mp4
ffmpeg version N-77171-g89bbf01 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 10.100 / 55. 10.100
  libavcodec 57. 17.100 / 57. 17.100
  libavformat 57. 19.100 / 57. 19.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 20.100 / 6. 20.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'FFmpeg_anmr_error6.flac':
  Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
comment:456 by , 9 years ago
ffmpeg77171 -y -i ffmpeg_aacvbr_pulse2.flac -c:a aac -ar 11025 -cutoff 5000 -profile:a aac_main -b:a 8k out.mp4
ffmpeg version N-77171-g89bbf01 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 10.100 / 55. 10.100
  libavcodec 57. 17.100 / 57. 17.100
  libavformat 57. 19.100 / 57. 19.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 20.100 / 6. 20.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'ffmpeg_aacvbr_pulse2.flac':
  Duration: 00:00:16.10, start: 0.000000, bitrate: 1167 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf57.19.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 11025 Hz, stereo, fltp (16 bit), 8 kb/s
    Metadata:
      encoder : Lavc57.17.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

ffmpeg77171 -y -i sine_tester.flac -c:a aac -ar 11025 -cutoff 5000 -b:a 8k out.mp4
ffmpeg version N-77171-g89bbf01 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 10.100 / 55. 10.100
  libavcodec 57. 17.100 / 57. 17.100
  libavformat 57. 19.100 / 57. 19.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 20.100 / 6. 20.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'sine_tester.flac':
  Duration: 00:00:28.00, start: 0.000000, bitrate: 294 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s32 (24 bit)
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf57.19.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 11025 Hz, stereo, fltp (24 bit), 8 kb/s
    Metadata:
      encoder : Lavc57.17.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
follow-up: 458 comment:457 by , 9 years ago
Wasn't one of claudio's changes supposed to get rid of this particular assert for good? The commit message suggested as much.
comment:458 by , 9 years ago
Replying to heleppkes:
Wasn't one of claudio's changes supposed to get rid of this particular assert for good? The commit message suggested as much.
It should have. It did in all the cases I tested. I'll have to try and reproduce this particular case later.
follow-up: 466 comment:459 by , 9 years ago
The new AAC output cannot be decoded properly by the faad decoder.
ffmpeg77208 -y -i abc\compilation2.wav -c:a aac -b:a 96k out.mp4
faad -b 1 out.mp4 out.wav
results in collapsed sounds.
ffmpeg77208 -y -i abc\compilation2.wav -c:a aac -b:a 96k out.mp4
faad -q -b 4 out.mp4 out.wav
decoding to 32bit float also results in collapsed sounds.
ffmpeg77208 -y -i abc\compilation2.wav -c:a aac -b:a 96k out.mp4
ffmpeg77208 -y -i out.mp4 -c:a pcm_s16le out.wav
The same AAC output decoded by the new FFmpeg is OK.
ffmpeg77208 -y -i abc\compilation2.wav -c:a aac -b:a 96k out.mp4
ffmpeg72585 -y -i out.mp4 -c:a pcm_s16le out.wav
The same AAC output decoded by older FFmpeg is also OK.
ffmpeg76735 -y -i abc\compilation2.wav -c:a aac -b:a 96k -strict -2 out.mp4
faad -b 1 out.mp4 out.wav
72585 and 76735 were OK, but 76851, 76877, 76976, and 77208 all suffer from the same problem.
by , 9 years ago
Attachment: | ffmpeg_aac_error2.flac added |
---|
This causes an error with -profile:a aac_ltp -b:a 96k. The error messages are "av_interleaved_write_frame(): Not enough space" or "Audio encoding failed (avcodec_encode_audio2)". The sound is 08._Sarah_McLachlan_Ice_ringing.flac
comment:460 by , 9 years ago
ffmpeg77208 -y -i ffmpeg_aac_error2.flac -c:a aac -profile:a aac_ltp -b:a 96k out.mp4
ffmpeg version N-77208-gb4f1636 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 10.100 / 55. 10.100
  libavcodec 57. 17.100 / 57. 17.100
  libavformat 57. 20.100 / 57. 20.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 20.100 / 6. 20.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'ffmpeg_aac_error2.flac':
  Metadata:
    REPLAYGAIN_TRACK_GAIN: -0.57 dB
    REPLAYGAIN_TRACK_PEAK: 0.474701
  Duration: 00:00:15.00, start: 0.000000, bitrate: 658 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
    Side data:
      replaygain: track gain - -0.570000, track peak - 0.000011, album gain - unknown, album peak - unknown,
Output #0, mp4, to 'out.mp4':
  Metadata:
    REPLAYGAIN_TRACK_GAIN: -0.57 dB
    REPLAYGAIN_TRACK_PEAK: 0.474701
    encoder : Lavf57.20.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 96 kb/s
    Metadata:
      encoder : Lavc57.17.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Audio encoding failed (avcodec_encode_audio2)
follow-up: 463 comment:461 by , 9 years ago
I am considering a new listening test of -c:a aac and -c:a libfdk_aac at 64k, 96k, and 128kbps. Is the bug in comment:459 easy to solve?
comment:462 by , 9 years ago
The bug in comment:459 doesn't reproduce with the -profile:a mpeg2_aac_low option.
It does reproduce with the -profile:a aac_main, -profile:a aac_low, and -profile:a aac_ltp options.
comment:463 by , 9 years ago
Replying to Kamedo2:
I am considering a new listening test of -c:a aac and -c:a libfdk_aac at 64k, 96k, and 128kbps. Is the bug in comment:459 easy to solve?
First, someone would need to determine whether it's faad that's broken.
No software is ever perfect.
If it doesn't reproduce with mpeg2_aac_low, it's likely related to PNS, as that's the only feature that gets turned off compared to aac_low.
comment:464 by , 9 years ago
Apparently, lower bitrates trigger the assertion failure.
ffmpeg77223 -y -i FFmpeg_anmr_error5.flac -c:a aac -b:a 16k -cutoff 15000 -ar 48000 out.mp4
ffmpeg version N-77233-g28e9b7e Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 10.100 / 55. 10.100
  libavcodec 57. 17.100 / 57. 17.100
  libavformat 57. 20.100 / 57. 20.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 21.100 / 6. 21.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'FFmpeg_anmr_error5.flac':
  Duration: 00:00:05.00, start: 0.000000, bitrate: 229 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf57.20.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 48000 Hz, stereo, fltp (16 bit), 16 kb/s
    Metadata:
      encoder : Lavc57.17.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
ffmpeg77223 -y -i ffmpeg_96k_error.flac -c:a aac -profile:a aac_main -b:a 16k -cutoff 20000 -ar 44100 out.mp4
ffmpeg version N-77233-g28e9b7e Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 10.100 / 55. 10.100
  libavcodec 57. 17.100 / 57. 17.100
  libavformat 57. 20.100 / 57. 20.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 21.100 / 6. 21.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'ffmpeg_96k_error.flac':
  Duration: 00:00:02.01, start: 0.000000, bitrate: 238 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf57.20.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 16 kb/s
    Metadata:
      encoder : Lavc57.17.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
comment:465 by , 9 years ago
All of the asserts happen because of PNS. Disable it with -aac_pns 0 and you'll see you won't get any more. Claudio and I are working on a fix, it's a hard problem to solve.
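For readers following the logs, here is a rough sketch of what a check like "diff >= 0 && diff <= 120" guards. AAC transmits each scalefactor as a difference from the previous one, and the table for those differences has 121 entries. The sketch assumes an offset of 60 is added to the difference before the table lookup; the function and constant names are hypothetical and are not taken from aacenc.c. It also ignores that noise (PNS) and intensity bands keep their own running values.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed offset that maps a delta of 0 to the middle of a 121-entry table. */
#define SF_DELTA_OFFSET 60
#define SF_DELTA_MAX    120

/* Returns true if every consecutive scalefactor delta fits the table.
 * sf: scalefactor indices in transmission order, n: number of bands. */
static bool sf_deltas_encodable(const int *sf, int n)
{
    assert(sf != 0 || n == 0);
    for (int i = 1; i < n; i++) {
        int diff = sf[i] - sf[i - 1] + SF_DELTA_OFFSET;
        if (diff < 0 || diff > SF_DELTA_MAX)
            return false; /* an encoder-side assertion would fire here */
    }
    return true;
}
```

If a coding tool leaves neighbouring bands with values more than 60 steps apart, the delta has no code in the table, which would be consistent with the aborts shown above.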
comment:466 by , 9 years ago
Replying to Kamedo2:
The new AAC output cannot be decoded properly by the faad decoder.
ffmpeg77208 -y -i abc\compilation2.wav -c:a aac -b:a 96k out.mp4
faad -b 1 out.mp4 out.wav
results in collapsed sounds.
Can you attach or email the compilation2.wav file or something that helps reproduce this?
I've tried a few samples with faad and couldn't yet reproduce it.
comment:467 by , 9 years ago
http://downloads.xiph.org/websites/xiph.org/vorbis/listen/compilation2.wav
The encoded sound collapses on FAAD2 (Ahead Software MPEG-4 AAC Decoder V2.7).
I failed to reproduce it on NeroAACDec 1.5.1.0.
comment:468 by , 9 years ago
I believe the assertion failures in comment:456 and comment:464 have been fixed by the last commit.
I managed to reproduce comment:459, but I'm still investigating it. I suspect it is indeed a bug in faad related to either M/S coding or I/S coding.
comment:469 by , 9 years ago
Sadly, it still crashes with some samples.
ffmpeg77436 -y -i "FFmpeg_aacvbr_pulse2.flac" -c:a aac -ar 11025 -cutoff 5000 -profile:a aac_main -b:a 8k out.mp4
ffmpeg version N-77436-g18cd789 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 11.100 / 55. 11.100
  libavcodec 57. 20.100 / 57. 20.100
  libavformat 57. 20.100 / 57. 20.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 21.100 / 6. 21.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'FFmpeg_aacvbr_pulse2.flac':
  Duration: 00:00:16.10, start: 0.000000, bitrate: 1167 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf57.20.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 11025 Hz, stereo, fltp (16 bit), 8 kb/s
    Metadata:
      encoder : Lavc57.20.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
ffmpeg77436 -y -i "FFmpeg_anmr_error6.flac" -c:a aac -q:a 1280k out.mp4
ffmpeg version N-77436-g18cd789 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 11.100 / 55. 11.100
  libavcodec 57. 20.100 / 57. 20.100
  libavformat 57. 20.100 / 57. 20.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 21.100 / 6. 21.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'FFmpeg_anmr_error6.flac':
  Duration: 00:00:10.00, start: 0.000000, bitrate: 218 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
ffmpeg77436 -y -i "ffmpeg_aacvbr_pulse2.flac" -c:a aac -ar 11025 -cutoff 5000 -b:a 8k -profile:a aac_main out.mp4
ffmpeg version N-77436-g18cd789 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 11.100 / 55. 11.100
  libavcodec 57. 20.100 / 57. 20.100
  libavformat 57. 20.100 / 57. 20.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 21.100 / 6. 21.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'ffmpeg_aacvbr_pulse2.flac':
  Duration: 00:00:16.10, start: 0.000000, bitrate: 1167 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf57.20.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 11025 Hz, stereo, fltp (16 bit), 8 kb/s
    Metadata:
      encoder : Lavc57.20.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
ffmpeg77436 -y -i "sine_tester.flac" -c:a aac -ar 11025 -cutoff 5000 -b:a 8k out.mp4
ffmpeg version N-77436-g18cd789 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 11.100 / 55. 11.100
  libavcodec 57. 20.100 / 57. 20.100
  libavformat 57. 20.100 / 57. 20.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 21.100 / 6. 21.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'sine_tester.flac':
  Duration: 00:00:28.00, start: 0.000000, bitrate: 294 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s32 (24 bit)
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf57.20.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 11025 Hz, stereo, fltp (24 bit), 8 kb/s
    Metadata:
      encoder : Lavc57.20.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
ffmpeg77436 -y -i "ffmpeg_96k_error.flac" -c:a aac -profile:a aac_main -b:a 16k -ar 44100 -cutoff 20000 out.mp4
ffmpeg version N-77436-g18cd789 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 11.100 / 55. 11.100
  libavcodec 57. 20.100 / 57. 20.100
  libavformat 57. 20.100 / 57. 20.100
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 21.100 / 6. 21.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'ffmpeg_96k_error.flac':
  Duration: 00:00:02.01, start: 0.000000, bitrate: 238 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf57.20.100
    Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 16 kb/s
    Metadata:
      encoder : Lavc57.20.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
comment:470 by , 9 years ago
Oh, I hadn't seen the -profile:a aac_main in comment:456. Without it, it doesn't crash anymore, but with it, it does.
AFAIK, the only difference there is main prediction (aac_main enables main prediction).
I'll look into it later.
follow-up: 472 comment:471 by , 9 years ago
I'm going to start a new listening test of -c:a aac and -c:a libfdk_aac at 64k, 96k, and 128kbps on 2016/01/05. I hope this encoder will be stable by then.
comment:472 by , 9 years ago
Replying to Kamedo2:
I'm going to start a new listening test of -c:a aac and -c:a libfdk_aac at 64k, 96k, and 128kbps on 2016/01/05. I hope this encoder will be stable by then.
It's perfectly stable under normal operating bitrates and settings until you start testing it to the extremes with variable bit rate (please don't use it) and aac_main (don't use this either).
Considering it's already used in professional broadcasting (with aac_pns 0 since that's what causes the instability) I say it's stable. It survived a whole week of fuzzing after all.
comment:473 by , 9 years ago
Well, time for an update...
I did some tests, and faad's problem is with correlated PNS bands (PNS + ms_mask). It seems to be applying the M/S transform even though the specs clearly state that when PNS is used in conjunction with ms_mask bits, it should not.
I'd consider that a faad bug, but we are indeed producing "weird" bitstreams (we signal correlated PNS when only one side uses PNS, which makes the ms_mask unnecessary). Avoiding that weirdness works around faad's bug (and avoids possibly triggering similar bugs in other decoders).
I'm working on that patch now (and testing it thoroughly).
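The rule described in this comment can be sketched roughly as follows. This is not the actual patch; the struct and function names are hypothetical, and the sketch only captures the idea that the ms_mask bit should be written for a PNS band only when both channels use PNS there.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-band state for one channel pair. */
typedef struct {
    bool left_is_pns;   /* left channel codes this band as noise */
    bool right_is_pns;  /* right channel codes this band as noise */
    bool wants_ms;      /* encoder's tentative ms_mask decision */
} BandPair;

/* Decide the ms_mask bit that is actually written for one band. */
static bool final_ms_mask(const BandPair *b)
{
    assert(b != 0);
    if (b->left_is_pns && b->right_is_pns)
        return b->wants_ms;   /* here the bit means "correlated noise" */
    if (b->left_is_pns || b->right_is_pns)
        return false;         /* only one side is noise: the bit is unnecessary */
    return b->wants_ms;       /* ordinary M/S coding */
}
```

Clearing the bit in the one-sided case keeps the stream away from the combination that a decoder might misread as a request for the M/S transform.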
follow-up: 475 comment:474 by , 9 years ago
Is the workaround for the faad bug ready?
I have thoroughly tested ffmpeg77652. In LC profile above 22kHz and 32kbps, the native encoder seems to be stable.
comment:475 by , 9 years ago
Replying to Kamedo2:
Is the workaround for the faad bug ready?
I have thoroughly tested ffmpeg77652. In LC profile above 22kHz and 32kbps, the native encoder seems to be stable.
It's in the pipeline.
Just doing some regression ABX testing, since the objective (PSNR) A/B script pointed out some seemingly significant regressions (so far I couldn't confirm any of them with ABX, but I'm not done testing yet).
follow-up: 479 comment:477 by , 9 years ago
https://www.ffmpeg.org/ffmpeg-codecs.html#Options-5
'aac_pred'
Main-type prediction profile, is enabled by and will enable the aac_pred option. Introduced in MPEG2.
I believe it should be 'aac_main'.
comment:478 by , 9 years ago
aac_pred enables the prediction feature, the profile is controlled by "-profile aac_main".
So yes, the docs seem buggy.
comment:479 by , 9 years ago
Replying to Kamedo2:
https://www.ffmpeg.org/ffmpeg-codecs.html#Options-5
'aac_pred'
Main-type prediction profile, is enabled by and will enable the aac_pred option. Introduced in MPEG2.
I believe it should be 'aac_main'.
It's correct as it is. You can enable AAC-Main (and thus prediction) in two ways: set the profile via -profile:a aac_main or set the prediction flag via -aac_pred 1. Setting one will set the other as well, since you can't have prediction without the profile being set and you can't have the profile set without prediction (well you can but it would be a hack as you'd just set all scalefactor bands to disable prediction).
follow-up: 481 comment:480 by , 9 years ago
I think he is referring to the aac_pred bullet point under the "profiles" section, which is not quite correct, since aac_main is the name of the profile.
comment:481 by , 9 years ago
Replying to heleppkes:
I think he is referring to the aac_pred bullet point under the "profiles" section, which is not quite correct, since aac_main is the name of the profile.
It is correct since the option aac_pred will enable AAC-Main prediction, even though it's not the name of the profile. Hence why it's listed there.
-aac_pred 1 enables -profile:a aac_main and -profile:a aac_main enables -aac_pred 1
follow-up: 483 comment:482 by , 9 years ago
ffmpeg77758 -i in.wav -c:a aac -profile:a aac_pred out.mp4
It fails.
The document https://www.ffmpeg.org/ffmpeg-codecs.html should be:
'aac_main'
Main-type prediction profile, is enabled by and will enable the aac_pred option. Introduced in MPEG2.
comment:483 by , 9 years ago
Replying to Kamedo2:
ffmpeg77758 -i in.wav -c:a aac -profile:a aac_pred out.mp4
It fails.
The document https://www.ffmpeg.org/ffmpeg-codecs.html should be:
'aac_main'
Main-type prediction profile, is enabled by and will enable the aac_pred option. Introduced in MPEG2.
I fixed it 3 hours ago. I didn't understand where the typo was.
comment:484 by , 9 years ago
Another assertion failure.
ffmpeg77758 -y -i ffmpeg_aac_error2.flac -c:a aac -profile:a aac_ltp -cutoff 15000 out.mp4
ffmpeg version N-77758-g6e24946 Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 13.100 / 55. 13.100
  libavcodec 57. 22.100 / 57. 22.100
  libavformat 57. 21.101 / 57. 21.101
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 23.100 / 6. 23.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'ffmpeg_aac_error2.flac':
  Metadata:
    REPLAYGAIN_TRACK_GAIN: -0.57 dB
    REPLAYGAIN_TRACK_PEAK: 0.474701
  Duration: 00:00:15.00, start: 0.000000, bitrate: 658 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
    Side data:
      replaygain: track gain - -0.570000, track peak - 0.000011, album gain - unknown, album peak - unknown,
Output #0, mp4, to 'out.mp4':
  Metadata:
    REPLAYGAIN_TRACK_GAIN: -0.57 dB
    REPLAYGAIN_TRACK_PEAK: 0.474701
    encoder : Lavf57.21.101
    Stream #0:0: Audio: aac (LTP) ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 128 kb/s
    Metadata:
      encoder : Lavc57.22.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion afq->remaining_samples == afq->remaining_delay failed at libavcodec/audio_frame_queue.c:106
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
comment:485 by , 9 years ago
I reproduced the aacenc.c assertion failures on ARM, but not the audio_frame_queue.c assertion failure from comment:484.
pi@raspberrypi:~/ffmpeg160112 $ time ./ffmpeg -y -i ffmpeg_96k_error.flac -c:a aac -profile:a aac_main -b:a 16k -ar 44100 -cutoff 20000 out.mp4
ffmpeg version N-77804-gd64d6ed Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 4.9.2 (Raspbian 4.9.2-10)
  configuration:
  libavutil 55. 13.100 / 55. 13.100
  libavcodec 57. 22.100 / 57. 22.100
  libavformat 57. 21.101 / 57. 21.101
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 23.100 / 6. 23.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
Input #0, flac, from 'ffmpeg_96k_error.flac':
  Duration: 00:00:02.01, start: 0.000000, bitrate: 238 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf57.21.101
    Stream #0:0: Audio: aac (Main) ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 16 kb/s
    Metadata:
      encoder : Lavc57.22.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
Aborted
real 0m8.021s
user 0m7.960s
sys 0m0.060s
pi@raspberrypi:~/ffmpeg160112 $ time ./ffmpeg -y -i ffmpeg_aacvbr_pulse2.flac -c:a aac -ar 11025 -cutoff 5000 -b:a 8k -profile:a aac_main out.mp4
ffmpeg version N-77804-gd64d6ed Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 4.9.2 (Raspbian 4.9.2-10)
  configuration:
  libavutil 55. 13.100 / 55. 13.100
  libavcodec 57. 22.100 / 57. 22.100
  libavformat 57. 21.101 / 57. 21.101
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 23.100 / 6. 23.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
Input #0, flac, from 'ffmpeg_aacvbr_pulse2.flac':
  Duration: 00:00:16.10, start: 0.000000, bitrate: 1167 kb/s
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf57.21.101
    Stream #0:0: Audio: aac (Main) ([64][0][0][0] / 0x0040), 11025 Hz, stereo, fltp (16 bit), 8 kb/s
    Metadata:
      encoder : Lavc57.22.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
Aborted
real 0m11.663s
user 0m11.660s
sys 0m0.100s
comment:487 by , 9 years ago
The faad decode and the low-bitrate encodes above now work properly on x86.
But this still fails:
ffmpeg77827 -y -i short_block_test_2.flac -c:a aac -b:a 8k -cutoff 15000 -ar 48000 out.mp4
ffmpeg version N-77827-g9006567 Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration:
  libavutil 55. 13.100 / 55. 13.100
  libavcodec 57. 22.100 / 57. 22.100
  libavformat 57. 21.101 / 57. 21.101
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 23.100 / 6. 23.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
Input #0, flac, from 'short_block_test_2.flac':
  Duration: 00:00:15.00, start: 0.000000, bitrate: 91 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf57.21.101
    Stream #0:0: Audio: aac (LC) ([64][0][0][0] / 0x0040), 48000 Hz, stereo, fltp (16 bit), 8 kb/s
    Metadata:
      encoder : Lavc57.22.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:363
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
by , 9 years ago
Attachment: | short_block_test_2.flac added |
---|
comment:489 by , 9 years ago
I started a blind listening test of -c:a aac and -c:a libfdk_aac at 64k, 96k, and 128kbps. The progress is 33% now.
comment:490 by , 9 years ago
I just pushed a fix for the assertion failure on short_block_test_2, and a few other artifacts that were exposed by that sample. There are some artifacts remaining still, but I'm having a hard time pinpointing where they come from, so I thought I should push before you're done with your listening test ;)
comment:491 by , 9 years ago
This stops with an error if the bitrate is 8k, 16k, 24k, 32k, or 48k.
Bitrates of 40k, 64k, 72k, 80k, 88k, 96k, 104k, 112k, 120k, and 128k are encodable.
ffmpeg77914 -y -i ffmpeg_aac_error2.flac -c:a aac -ac 1 -profile:a aac_ltp -b:a 8k -cutoff 15000 out.mp4
ffmpeg version N-77914-g03d83ba Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 13.100 / 55. 13.100
  libavcodec 57. 22.100 / 57. 22.100
  libavformat 57. 21.101 / 57. 21.101
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 25.100 / 6. 25.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'ffmpeg_aac_error2.flac':
  Metadata:
    REPLAYGAIN_TRACK_GAIN: -0.57 dB
    REPLAYGAIN_TRACK_PEAK: 0.474701
  Duration: 00:00:15.00, start: 0.000000, bitrate: 658 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
    Side data:
      replaygain: track gain - -0.570000, track peak - 0.000011, album gain - unknown, album peak - unknown,
Output #0, mp4, to 'out.mp4':
  Metadata:
    REPLAYGAIN_TRACK_GAIN: -0.57 dB
    REPLAYGAIN_TRACK_PEAK: 0.474701
    encoder : Lavf57.21.101
    Stream #0:0: Audio: aac (LTP) ([64][0][0][0] / 0x0040), 44100 Hz, mono, fltp (16 bit), 8 kb/s
    Metadata:
      encoder : Lavc57.22.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
[mp4 @ 005c48e0] Application provided duration: 4539201763687527334 / timestamp: 4539201763687792550 is out of range for mov/mp4 format
[mp4 @ 005c48e0] pts has no value
[aac @ 005c5940] Queue input is backward in time
[mp4 @ 005c48e0] Non-monotonous DTS in output stream 0:0; previous: 4539201763687792550, current: 4535156856773993315; changing to 4539201763687792551. This may result in incorrect timestamps in the output file.
[mp4 @ 005c48e0] Application provided duration: 4539201763687527334 / timestamp: 4539201763687792551 is out of range for mov/mp4 format
[mp4 @ 005c48e0] pts has no value
ffmpeg_aacvbr_pulse1.flac also triggers this error.
comment:492 by , 9 years ago
Sadly, this will also fail.
ffmpeg77914 -y -i ffmpeg_aac_error2.flac -c:a aac -profile:a aac_ltp out.mp4
ffmpeg version N-77914-g03d83ba Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libfdk-aac --enable-w32threads --extra-ldflags=-static --extra-cflags='-mtune=nocona' --optflags=-O2
  libavutil 55. 13.100 / 55. 13.100
  libavcodec 57. 22.100 / 57. 22.100
  libavformat 57. 21.101 / 57. 21.101
  libavdevice 57. 0.100 / 57. 0.100
  libavfilter 6. 25.100 / 6. 25.100
  libswscale 4. 0.100 / 4. 0.100
  libswresample 2. 0.101 / 2. 0.101
  libpostproc 54. 0.100 / 54. 0.100
Input #0, flac, from 'ffmpeg_aac_error2.flac':
  Metadata:
    REPLAYGAIN_TRACK_GAIN: -0.57 dB
    REPLAYGAIN_TRACK_PEAK: 0.474701
  Duration: 00:00:15.00, start: 0.000000, bitrate: 658 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
    Side data:
      replaygain: track gain - -0.570000, track peak - 0.000011, album gain - unknown, album peak - unknown,
Output #0, mp4, to 'out.mp4':
  Metadata:
    REPLAYGAIN_TRACK_GAIN: -0.57 dB
    REPLAYGAIN_TRACK_PEAK: 0.474701
    encoder : Lavf57.21.101
    Stream #0:0: Audio: aac (LTP) ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp (16 bit), 128 kb/s
    Metadata:
      encoder : Lavc57.22.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> aac (native))
Press [q] to stop, [?] for help
Assertion afq->remaining_samples == afq->remaining_delay failed at libavcodec/audio_frame_queue.c:106
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
comment:493 by , 9 years ago
Clearly LTP has problems; I haven't gotten around to solving them yet.
I did find the last of the TNS issues. I'll push after some further testing, but it looks good.
comment:494 by , 9 years ago
Flagged the LTP profile as experimental. We have enough bug reports to work on to fix LTP. Crashes on the aac_low profile are the main priority right now.
Also removed the FAAC-like coder, since it has been marked for removal for over a month.
comment:495 by , 9 years ago
I thought the timescale of those things (removal) was measured in releases.
comment:496 by , 9 years ago
Generally yes, however aacenc was experimental in the last release, which kind of exempts it from stability rules.
comment:497 by , 9 years ago
comment:498 by , 9 years ago
Which version?
The TNS bugs that were fixed recently make a big difference in perceived quality (not so much in PSNR).
comment:500 by , 9 years ago
I suggest then you revalidate the results. No need to do the whole test again, start with a canary (a sample that was particularly troublesome), to see if it now compares better.
follow-up: 502 comment:501 by , 9 years ago
Update: I found the main reason why fdk is so far ahead. Basically, I/S is way too conservative, to the point where it barely gets used. I'm toying with making it far more aggressive.
comment:502 by , 9 years ago
Replying to klaussfreire:
Update: I found the main reason why fdk is so far ahead. Basically, I/S is way too conservative, to the point where it barely gets used. I'm toying with making it far more aggressive.
OK, I will retest the new one.
comment:503 by , 9 years ago
Hold your horses, I haven't pushed anything for I/S yet, and it will take a while (it's a big change that I want to test properly first).
comment:504 by , 9 years ago
In this commit http://git.videolan.org/?p=ffmpeg.git;a=commit;h=66edd8656b851a0c85ba25ec293cc66192c363ae
I guess libavcodec/lpc.c line 179 is meant to use i < len / 2 as the loop condition.
170 double ff_lpc_calc_ref_coefs_f(LPCContext *s, const float *samples, int len,
171                                int order, double *ref)
172 {
173     int i;
174     double signal = 0.0f, avg_err = 0.0f;
175     double autoc[MAX_LPC_ORDER+1] = {0}, error[MAX_LPC_ORDER+1] = {0};
176     const double a = 0.5f, b = 1.0f - a;
177
178     /* Apply windowing */
179     for (i = 0; i <= len / 2; i++) {
180         double weight = a - b*cos((2*M_PI*i)/(len - 1));
181         s->windowed_samples[i] = weight*samples[i];
182         s->windowed_samples[len-1-i] = weight*samples[len-1-i];
183     }
184
185     s->lpc_compute_autocorr(s->windowed_samples, len, order, autoc);
186     signal = autoc[0];
187     compute_ref_coefs(autoc, order, ref, error);
188     for (i = 0; i < order; i++)
189         avg_err = (avg_err + error[i])/2.0f;
190     return signal/avg_err;
191 }
And we can get a 1.2% speedup if we move the cos call out of the loop.
/* Apply windowing */
double cos_onestep = cos((2*M_PI)/(len - 1));
double sin_onestep = sin((2*M_PI)/(len - 1));
double cos_isteps = b;
double sin_isteps = 0;
for (i = 0; i < len / 2; i++) {
    double sin_newsteps;
    double weight = a - cos_isteps;
    s->windowed_samples[i] = weight*samples[i];
    s->windowed_samples[len-1-i] = weight*samples[len-1-i];
    sin_newsteps = sin_isteps*cos_onestep + cos_isteps*sin_onestep;
    cos_isteps = cos_isteps*cos_onestep - sin_isteps*sin_onestep;
    sin_isteps = sin_newsteps;
}
comment:505 by , 9 years ago
This command crashes FFmpeg after the commit "AAC encoder: fix undefined behavior".
cores\ffmpeg79177 -i "ffmpeg_aac320k_collapse3.flac" -c:a aac -strict experimental -b:a 4k out.mp4
Versions before that commit, such as N-79171-ga35a4a5, were safe.
follow-up: 507 comment:506 by , 9 years ago
I could never understand those commit numbers. Which repo are they referencing? My checkout has no such commit hash.
There are two commits about undefined behavior. I'm guessing you're referring to the second one. I have actually checked with an automated script that same sample, albeit not with 4kbps. It's a bit low to be included in standard A/B tests, but I'll add it and re-run.
comment:507 by , 9 years ago
Replying to klaussfreire:
I could never understand those commit numbers. Which repo are they referencing? My checkout has no such commit hash.
a677121cc568db7c101ebf3a797a779a983fc668: N-79177-ga677121
a35a4a5774a196f8eefc8ef2994979a6c563e0c2: N-79171-ga35a4a5
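The two pairs above suggest that the build string has the shape N-<commit count>-g<abbreviated hash>, where the abbreviated hash is the start of the full commit hash. Assuming that shape, a small helper can pull the abbreviated hash out of a version string so it can be looked up in a checkout. The function name is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Split "N-<count>-g<abbrev>" into its commit count and abbreviated hash.
 * Returns 1 on success, 0 if the string does not match that shape. */
static int parse_build_version(const char *ver, long *count,
                               char *abbrev, size_t abbrev_size)
{
    char hash[41];
    assert(ver && count && abbrev && abbrev_size > 0);
    if (sscanf(ver, "N-%ld-g%40[0-9a-f]", count, hash) != 2)
        return 0;
    if (strlen(hash) >= abbrev_size)
        return 0;                 /* caller's buffer is too small */
    strcpy(abbrev, hash);
    return 1;
}
```

For N-79177-ga677121 this would yield the count 79177 and the prefix a677121, which matches the start of a677121cc568db7c101ebf3a797a779a983fc668.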
comment:509 by , 8 years ago
Analyzed by developer: | set |
---|---|
Resolution: | → fixed |
Status: | open → closed |
Thanks everyone, I think it's time to close this ticket.
Hopefully soon there'll be an Opus encoder to keep Kamedo2 busy :)
The sound file that cripples a native AAC encoder. True My Heart [DVTS-2121][07.09.03] Track05 2m50s~58s