Opened 3 years ago

Closed 10 months ago

#2706 closed defect (fixed)

Native AAC encoder produces warbling with pure aevalsrc sine wave

Reported by: MarkZV
Owned by:
Priority: normal
Component: avcodec
Version: git-master
Keywords: aac
Cc: Kamedo2
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

When I generate a pure sine wave with aevalsrc using the example expression from the documentation, sin(440*2*PI*t), encode it with the native AAC encoder, and play it with ffplay, the output warbles instead of being the expected pure sine wave.

How to reproduce:

$ ffmpeg -v 9 -loglevel 99 -filter_complex "aevalsrc=sin(440*2*PI*t)" -c:a aac -strict experimental -t 3 out.aac
ffmpeg version 1.1.git-bbe26ef Copyright (c) 2000-2013 the FFmpeg developers
  built on Jun 24 2013 14:49:49 with gcc 4.2.1 (GCC) (Apple Inc. build 5666) (dot 3)
  configuration: --prefix=/opt/local --enable-swscale --enable-avfilter --enable-libmp3lame --enable-libvorbis --enable-libopus --enable-libtheora --enable-libschroedinger --enable-libopenjpeg --enable-libmodplug --enable-libvpx --enable-libspeex --enable-libass --enable-libbluray --enable-gnutls --enable-libfreetype --mandir=/opt/local/share/man --enable-shared --enable-pthreads --cc=/usr/bin/gcc-4.2 --arch=x86_64 --enable-yasm --enable-gpl --enable-postproc --enable-libx264 --enable-libxvid --enable-version3 --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-nonfree --enable-libfdk-aac --enable-libfaac
  libavutil      52. 37.101 / 52. 37.101
  libavcodec     55. 17.100 / 55. 17.100
  libavformat    55.  9.100 / 55.  9.100
  libavdevice    55.  2.100 / 55.  2.100
  libavfilter     3. 77.101 /  3. 77.101
  libswscale      2.  3.100 /  2.  3.100
  libswresample   0. 17.102 /  0. 17.102
  libpostproc    52.  3.100 / 52.  3.100
Splitting the commandline.
Reading option '-v' ... matched as option 'v' (set logging level) with argument '9'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument '99'.
Reading option '-filter_complex' ... matched as option 'filter_complex' (create a complex filtergraph) with argument 'aevalsrc=sin(440*2*PI*t)'.
Reading option '-c:a' ... matched as option 'c' (codec name) with argument 'aac'.
Reading option '-strict' ... matched as AVOption 'strict' with argument 'experimental'.
Reading option '-t' ... matched as option 't' (record or transcode "duration" seconds of audio/video) with argument '3'.
Reading option 'out.aac' ... matched as output file.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 9.
Applying option filter_complex (create a complex filtergraph) with argument aevalsrc=sin(440*2*PI*t).
Successfully parsed a group of options.
Parsing a group of options: output file out.aac.
Applying option c:a (codec name) with argument aac.
Applying option t (record or transcode "duration" seconds of audio/video) with argument 3.
Successfully parsed a group of options.
Opening an output file: out.aac.
detected 4 logical cores
[Parsed_aevalsrc_0 @ 0x103100000] compat: called with args=[sin(440*2*PI*t)]
[Parsed_aevalsrc_0 @ 0x103100000] Setting 'exprs' to value 'sin(440*2*PI*t)'
[audio format for output stream 0:0 @ 0x1031010c0] Setting 'sample_fmts' to value 'fltp'
[audio format for output stream 0:0 @ 0x1031010c0] Setting 'sample_rates' to value '96000|88200|64000|48000|44100|32000|24000|22050|16000|12000|11025|8000|7350'
Successfully opened the file.
[audio format for output stream 0:0 @ 0x1031010c0] auto-inserting filter 'auto-inserted resampler 0' between the filter 'Parsed_aevalsrc_0' and the filter 'audio format for output stream 0:0'
[AVFilterGraph @ 0x102421880] query_formats: 3 queried, 6 merged, 3 already done, 0 delayed
[Parsed_aevalsrc_0 @ 0x103100000] sample_rate:44100 chlayout:mono duration:-1.000000
[auto-inserted resampler 0 @ 0x103101800] [SWR @ 0x10380a600] Using double precision mode
[auto-inserted resampler 0 @ 0x103101800] ch:1 chl:mono fmt:dblp r:44100Hz -> ch:1 chl:mono fmt:fltp r:44100Hz
Output #0, adts, to 'out.aac':
  Metadata:
    encoder         : Lavf55.9.100
    Stream #0:0, 0, 1/90000: Audio: aac, 44100 Hz, mono, fltp, 128 kb/s
Stream mapping:
  aevalsrc -> Stream #0:0 (aac)
Press [q] to stop, [?] for help
No more output streams to write to, finishing.
size=      23kB time=00:00:03.01 bitrate=  63.0kbits/s    
video:0kB audio:22kB subtitle:0 global headers:0kB muxing overhead 3.990025%
0 frames successfully decoded, 0 decoding errors
[AVIOContext @ 0x103101700] Statistics: 0 seeks, 131 writeouts
$ ffplay out.aac
ffplay version 1.1.git-bbe26ef Copyright (c) 2003-2013 the FFmpeg developers
  built on Jun 24 2013 14:49:49 with gcc 4.2.1 (GCC) (Apple Inc. build 5666) (dot 3)
  configuration: --prefix=/opt/local --enable-swscale --enable-avfilter --enable-libmp3lame --enable-libvorbis --enable-libopus --enable-libtheora --enable-libschroedinger --enable-libopenjpeg --enable-libmodplug --enable-libvpx --enable-libspeex --enable-libass --enable-libbluray --enable-gnutls --enable-libfreetype --mandir=/opt/local/share/man --enable-shared --enable-pthreads --cc=/usr/bin/gcc-4.2 --arch=x86_64 --enable-yasm --enable-gpl --enable-postproc --enable-libx264 --enable-libxvid --enable-version3 --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-nonfree --enable-libfdk-aac --enable-libfaac
  libavutil      52. 37.101 / 52. 37.101
  libavcodec     55. 17.100 / 55. 17.100
  libavformat    55.  9.100 / 55.  9.100
  libavdevice    55.  2.100 / 55.  2.100
  libavfilter     3. 77.101 /  3. 77.101
  libswscale      2.  3.100 /  2.  3.100
  libswresample   0. 17.102 /  0. 17.102
  libpostproc    52.  3.100 / 52.  3.100
Estimating duration from bitrate, this may be inaccurate 0B f=0/0   
Input #0, aac, from 'out.aac':
  Duration: 00:00:00.84, bitrate: 226 kb/s
    Stream #0:0: Audio: aac, 44100 Hz, mono, fltp, 226 kb/s
   4.29 M-A:  0.000 fd=   0 aq=    0KB vq=    0KB sq=    0B f=0/0   

Seems to be overshooting the range. It works as expected if the FDK AAC encoder is used (-c:a libfdk_aac). Also the "sine" source works (although it is quieter) if a sine wave is all that is needed, but of course it is not as flexible. It would be nice to start with a working sine wave and then be able to make modifications to the expression.
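
As a rough illustration of why overshoot is plausible with this source, the sketch below evaluates the same expression, sin(440*2*PI*t), at the 44100 Hz sample rate shown in the log and reports the peak level. It is only a model of the input signal, not of the encoder; the helper name `sine_samples` is made up for this example.

```python
import math

SAMPLE_RATE = 44100  # sample_rate reported for Parsed_aevalsrc_0 in the log
FREQ_HZ = 440.0      # frequency used in the aevalsrc expression
DURATION_S = 3       # matches -t 3


def sine_samples(freq_hz, sample_rate, duration_s):
    """Evaluate sin(freq*2*PI*t) at t = n / sample_rate (hypothetical helper)."""
    n_samples = int(sample_rate * duration_s)
    return [
        math.sin(freq_hz * 2 * math.pi * n / sample_rate)
        for n in range(n_samples)
    ]


samples = sine_samples(FREQ_HZ, SAMPLE_RATE, DURATION_S)
peak = max(abs(s) for s in samples)
# Headroom below full scale (1.0) in dB; a value near 0 means no margin.
headroom_db = -20 * math.log10(peak)
print(f"samples={len(samples)} peak={peak:.6f} headroom={headroom_db:.6f} dB")
```

The expression produces a tone at essentially full scale, so any error the lossy encoder adds at the peaks has no margin before it leaves the [-1, 1] range. Whether that is what actually causes the warbling is not established by this sketch.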

Attachments (1)

sine440Hz_24bit.flac (915.8 KB) - added by Kamedo2 3 years ago.
sine wave, full scale, 440Hz, mono, 24bit 20sec. created by SoundEngine Free ver.4.59


Change History (13)

comment:1 Changed 3 years ago by cehoyos

Does the result sound better if you force a lower bitrate?

comment:2 Changed 3 years ago by MarkZV

With the default options it produced 63 kb/s. 32 kb/s is still bad, although 16 kb/s sounds ok.

comment:3 Changed 3 years ago by cehoyos

Could this be a duplicate of ticket #2686?

Changed 3 years ago by Kamedo2

sine wave, full scale, 440Hz, mono, 24bit 20sec. created by SoundEngine Free ver.4.59

comment:4 Changed 3 years ago by Kamedo2

I reproduced the warbling effect with both a recent ffmpeg from git (N-54096-ge41bf19) and ffmpeg version 1.2.1, using the flac file above.

ffmpeg -i sine440Hz_24bit.flac -vn -c:a aac -strict experimental -b:a 128k sin440Hz_24bit.mp4

Both versions were tested at bitrates of 0k, 16k, 32k, ......, 240k and 256k.
I listened to the 34 mp4 files, and every file above 64k had serious warbling.
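
A quick consistency check on the file count, assuming the elided bitrates continue in 16k steps (an assumption; the list in the comment is abbreviated):

```python
# Assumed bitrate ladder: 0k, 16k, 32k, ..., 240k, 256k in 16k steps.
bitrates_kbps = list(range(0, 257, 16))
# The two builds named in the comment.
versions = ["N-54096-ge41bf19", "1.2.1"]

# One encoded file per (version, bitrate) pair.
files = [(version, kbps) for version in versions for kbps in bitrates_kbps]
print(f"{len(bitrates_kbps)} bitrates x {len(versions)} versions = {len(files)} files")
```

Under that assumption the ladder has 17 entries, which matches the 34 files mentioned for the two versions.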

comment:5 Changed 3 years ago by cehoyos

Does that mean it is not a duplicate of #2686 or did you confirm that it is a duplicate?
Or is it too soon to answer this question?

comment:6 Changed 3 years ago by Kamedo2

It could be a duplicate. The two bugs, #2686 and #2706, use completely different clips but have one thing in common: both occur at higher bitrates.

comment:7 follow-up: Changed 3 years ago by Kamedo2

I listened to these AACs. Sampling rate: 48000Hz

Freq      16  32  48  64  80  96 112 128 144 160 176 192 208 224 240 256 kbps
750Hz      0   1   0   2   1   1   1   1   1   1   1   1   1   1   1   1
1500Hz     0   0   1   0   1   2   1   1   1   1   1   1   1   1   1   1
3000Hz     0   0   0   0   0   1   2   1   1   1   1   1   1   1   1   1
6000Hz     -   0   1   2   1   2   0   1   1   1   1   1   1   1   1   1
9000Hz     -   -   0   0   2   1   1   1   1   1   1   1   1   1   1   1
12000Hz    -   -   -   -   0   0   0   1   1   1   1   1   1   1   1   1

0: no warbling
1: warbling
2: terrible warbling
-: within cutoff range

ffmpeg -v 9 -loglevel 99 -filter_complex "aevalsrc=sin(750*2*PI*t)" -c:a aac -strict experimental -ar 48000 -b:a 64k -t 5 aac_750Hz_64k_sample.mp4
ffmpeg version N-54245-g7eb5288 Copyright (c) 2000-2013 the FFmpeg developers
  built on Jun 28 2013 20:41:30 with gcc 4.8.1 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-libfdk-aac --extra-ldflags=-static --extra-cflags='-march=native -mfpmath=sse' --optflags=-O2
  libavutil      52. 37.101 / 52. 37.101
  libavcodec     55. 17.100 / 55. 17.100
  libavformat    55. 10.100 / 55. 10.100
  libavdevice    55.  2.100 / 55.  2.100
  libavfilter     3. 77.101 /  3. 77.101
  libswscale      2.  3.100 /  2.  3.100
  libswresample   0. 17.102 /  0. 17.102
  libpostproc    52.  3.100 / 52.  3.100
Splitting the commandline.
Reading option '-v' ... matched as option 'v' (set logging level) with argument '9'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument '99'.
Reading option '-filter_complex' ... matched as option 'filter_complex' (create a complex filtergraph) with argument 'aevalsrc=sin(750*2*PI*t)'.
Reading option '-c:a' ... matched as option 'c' (codec name) with argument 'aac'.
Reading option '-strict' ... matched as AVOption 'strict' with argument 'experimental'.
Reading option '-ar' ... matched as option 'ar' (set audio sampling rate (in Hz)) with argument '48000'.
Reading option '-b:a' ... matched as option 'b' (video bitrate (please use -b:v)) with argument '64k'.
Reading option '-t' ... matched as option 't' (record or transcode "duration" seconds of audio/video) with argument '5'.
Reading option 'aac_750Hz_64k_sample.mp4' ... matched as output file.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 9.
Applying option filter_complex (create a complex filtergraph) with argument aevalsrc=sin(750*2*PI*t).
Successfully parsed a group of options.
Parsing a group of options: output file aac_750Hz_64k_sample.mp4.
Applying option c:a (codec name) with argument aac.
Applying option ar (set audio sampling rate (in Hz)) with argument 48000.
Applying option b:a (video bitrate (please use -b:v)) with argument 64k.
Applying option t (record or transcode "duration" seconds of audio/video) with argument 5.
Successfully parsed a group of options.
Opening an output file: aac_750Hz_64k_sample.mp4.
detected 8 logical cores
[Parsed_aevalsrc_0 @ 0149fe40] compat: called with args=[sin(750*2*PI*t)]
[Parsed_aevalsrc_0 @ 0149fe40] Setting 'exprs' to value 'sin(750*2*PI*t)'
[audio format for output stream 0:0 @ 03c13440] Setting 'sample_fmts' to value 'fltp'
[audio format for output stream 0:0 @ 03c13440] Setting 'sample_rates' to value '48000'
Successfully opened the file.
[audio format for output stream 0:0 @ 03c13440] auto-inserting filter 'auto-inserted resampler 0' between the filter 'Parsed_aevalsrc_0' and the filter 'audio format for output stream 0:0'
[AVFilterGraph @ 0149f160] query_formats: 3 queried, 6 merged, 3 already done, 0 delayed
[Parsed_aevalsrc_0 @ 0149fe40] sample_rate:44100 chlayout:mono duration:-1.000000
[auto-inserted resampler 0 @ 0149f0c0] [SWR @ 01492f60] Using double precision mode
[auto-inserted resampler 0 @ 0149f0c0] ch:1 chl:mono fmt:dblp r:44100Hz -> ch:1 chl:mono fmt:fltp r:48000Hz
Output #0, mp4, to 'aac_750Hz_64k_sample.mp4':
  Metadata:
    encoder         : Lavf55.10.100
    Stream #0:0, 0, 1/48000: Audio: aac ([64][0][0][0] / 0x0040), 48000 Hz, mono, fltp, 64 kb/s
Stream mapping:
  aevalsrc -> Stream #0:0 (aac)
Press [q] to stop, [?] for help
No more output streams to write to, finishing.
size=      41kB time=00:00:05.01 bitrate=  67.1kbits/s
video:0kB audio:39kB subtitle:0 global headers:0kB muxing overhead 4.174310%
0 frames successfully decoded, 0 decoding errors
[AVIOContext @ 03c57b80] Statistics: 30 seeks, 259 writeouts
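
For anyone who wants to repeat the listening test, the matrix in the table above can be sketched as a list of command lines. This assumes every cell was encoded with the same flags as the 750Hz / 64k example and that the output names follow the same pattern; only that one command is actually shown here, and the logging options (-v 9 -loglevel 99) are left out. The script only builds and prints the commands, it does not run ffmpeg.

```python
import shlex

# Frequencies and bitrates taken from the rows and columns of the table.
FREQS_HZ = [750, 1500, 3000, 6000, 9000, 12000]
BITRATES_KBPS = list(range(16, 257, 16))  # 16, 32, ..., 256


def build_command(freq_hz, bitrate_kbps):
    """Build one ffmpeg invocation modelled on the 750Hz / 64k example."""
    # Output naming is assumed to follow aac_750Hz_64k_sample.mp4.
    out_name = f"aac_{freq_hz}Hz_{bitrate_kbps}k_sample.mp4"
    return [
        "ffmpeg",
        "-filter_complex", f"aevalsrc=sin({freq_hz}*2*PI*t)",
        "-c:a", "aac", "-strict", "experimental",
        "-ar", "48000",
        "-b:a", f"{bitrate_kbps}k",
        "-t", "5",
        out_name,
    ]


commands = [build_command(f, b) for f in FREQS_HZ for b in BITRATES_KBPS]

# Print a few shell-quoted examples rather than executing anything.
for cmd in commands[:3]:
    print(shlex.join(cmd))
print(f"{len(commands)} commands in total")
```

The commands can be pasted into a shell or passed to `subprocess.run` one at a time. Cells marked "-" in the table would still be encoded by this list, since the cutoff behaviour belongs to the encoder and is not modelled here.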

This bug likely comes from the bit allocation algorithm.
Claudio Freire kindly wrote a patch in #2686, but I could not get the patch to work, so it has not been tested here.

comment:8 in reply to: ↑ 7 Changed 3 years ago by Kamedo2

After reading this thread, http://ffmpeg.org/pipermail/ffmpeg-devel/2013-May/143208.html
I was able to apply the patch for #2686 from klaussfreire's repository. Here are the results with the WIP patch enabled.

I listened to these AACs, Sampling rate: 48000Hz

Freq      16  32  48  64  80  96 112 128 144 160 176 192 208 224 240 256 kbps
750Hz      0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
1500Hz     0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
3000Hz     0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
6000Hz     -   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
9000Hz     -   -   0   0   0   0   0   0   0   0   0   0   0   0   0   0
12000Hz    -   -   -   -   0   0   0   0   0   0   0   0   0   0   0   0

0: no warbling
1: warbling
2: terrible warbling
-: within cutoff range

Yes, at least perceptually, there was no warbling at all. However, the patch comes at another cost. That is off topic here, so I'll continue posting in the #2686 thread.
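
The results above are from listening only. As a possible objective cross-check, the sketch below measures how much the short-term level of a tone fluctuates. It assumes that warbling shows up as level variation of the decoded tone, which may not capture everything a listener hears, and it uses synthetic signals rather than decoded AAC output; decoding to PCM is not shown. The function names and the 20% / 5 Hz wobble are made up for the example.

```python
import math


def frame_rms(samples, frame_len):
    """RMS level of consecutive non-overlapping frames."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return levels


def modulation_depth(samples, frame_len=1024):
    """(max - min) / (max + min) of the frame RMS envelope; 0 means steady."""
    levels = frame_rms(samples, frame_len)
    hi, lo = max(levels), min(levels)
    return (hi - lo) / (hi + lo)


RATE = 48000
N = RATE * 2  # two seconds of audio

# Steady 750 Hz tone, as in the first row of the table.
steady = [math.sin(2 * math.pi * 750 * n / RATE) for n in range(N)]
# Synthetic stand-in for a warbling tone: 20% level wobble at 5 Hz (hypothetical).
wobbly = [
    (1 + 0.2 * math.sin(2 * math.pi * 5 * n / RATE)) * s
    for n, s in enumerate(steady)
]

print(f"steady: {modulation_depth(steady):.3f}")
print(f"wobbly: {modulation_depth(wobbly):.3f}")
```

A steady tone gives a depth close to zero, while the wobbling one gives a clearly larger value, so comparing this number before and after a patch could complement the 0/1/2 scores.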

comment:9 follow-up: Changed 14 months ago by cehoyos

  • Cc Kamedo2 added

Is this still reproducible with current FFmpeg git head?

comment:10 in reply to: ↑ 9 Changed 14 months ago by Kamedo2

Replying to cehoyos:

Is this still reproducible with current FFmpeg git head?

No. I tested sine440Hz_24bit.flac and sine_tester.flac (see #2686: "Sine waves for a warbling test. 50 440 1000 3000 7000 10000 20000Hz. 24bit 48kHz PCM.") with -q:a 1, -q:a 2, -b:a 128k and -b:a 320k, otherwise at default settings, on N-75950-ge652f69. The warble remains within a perceptually acceptable range.

comment:11 Changed 14 months ago by cehoyos

So is it ok to close this ticket as fixed?

comment:12 Changed 10 months ago by richardpl

  • Resolution set to fixed
  • Status changed from new to closed