Opened 5 months ago

Last modified 5 months ago

#8467 open defect

Audio artefacts encoding AAC

Reported by: kmamal
Owned by:
Priority: normal
Component: avcodec
Version: git-master
Keywords: aac
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

I'm trying to create hls streams for various media. I have encountered an mp4 file which sounds terrible when passed through ffmpeg.

The original file is here: https://s3-eu-west-1.amazonaws.com/konstantin.test/ffmpeg-aac-problem/in.mp4

Here's the report from ffmpeg: https://s3-eu-west-1.amazonaws.com/konstantin.test/ffmpeg-aac-problem/ffmpeg-20200113-155239.log

Here's the output file it produces: https://s3-eu-west-1.amazonaws.com/konstantin.test/ffmpeg-aac-problem/out.ts

I notice that the original audio is also AAC, so passing -c:a copy works and sounds normal. Sadly, this is not a viable workaround for me, as I have to change the bitrate and sample rate.

I have also noticed that changing the bitrate and sample rate changes the "pattern" of the artefacts, but I couldn't find a combination that sounded ok.

Attachments (1)

in.mp4 (232.4 KB) - added by kmamal 5 months ago.
input file


Change History (26)

comment:1 Changed 5 months ago by cehoyos

Who wants to hear the sample file you provided? (And why did you not attach it?)

Please post the command line you tested together with the complete, uncut console output here to make this a valid ticket.

comment:2 Changed 5 months ago by kmamal

The sample file is used by our installations team to test audio levels across speakers in a studio. It should be a flat tone. Do you prefer attachments? I'll attach it now.

Here's also the contents of ffmpeg-20200113-155239.log for the command line and console output:

ffmpeg started on 2020-01-13 at 15:52:39
Report written to "ffmpeg-20200113-155239.log"
Log level: 48
Command line:
/home/kostis/ffmpeg_build/bin/ffmpeg -report -i in.mp4 -f hls -hls_flags single_file out.m3u8 -y
ffmpeg version N-96334-g1a7f4a1 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.4.0-1ubuntu1~18.04.1)
  configuration: --prefix=/home/kostis/ffmpeg_build/out --pkg-config-flags=--static --extra-cflags=-I/home/kostis/ffmpeg_build/out/include --extra-ldflags=-L/home/kostis/ffmpeg_build/out/lib --extra-libs='-lpthread -lm' --bindir=/home/kostis/ffmpeg_build/bin --enable-gpl --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-nonfree
  libavutil      56. 38.100 / 56. 38.100
  libavcodec     58. 65.103 / 58. 65.103
  libavformat    58. 35.102 / 58. 35.102
  libavdevice    58.  9.103 / 58.  9.103
  libavfilter     7. 71.100 /  7. 71.100
  libswscale      5.  6.100 /  5.  6.100
  libswresample   3.  6.100 /  3.  6.100
  libpostproc    55.  6.100 / 55.  6.100
Splitting the commandline.
Reading option '-report' ... matched as option 'report' (generate a report) with argument '1'.
Reading option '-i' ... matched as input url with argument 'in.mp4'.
Reading option '-f' ... matched as option 'f' (force format) with argument 'hls'.
Reading option '-hls_flags' ... matched as AVOption 'hls_flags' with argument 'single_file'.
Reading option 'out.m3u8' ... matched as output url.
Reading option '-y' ... matched as option 'y' (overwrite output files) with argument '1'.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option report (generate a report) with argument 1.
Applying option y (overwrite output files) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input url in.mp4.
Successfully parsed a group of options.
Opening an input file: in.mp4.
[NULL @ 0x56209c726140] Opening 'in.mp4' for reading
[file @ 0x56209c728340] Setting default whitelist 'file,crypto'
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x56209c726140] Format mov,mp4,m4a,3gp,3g2,mj2 probed with size=2048 and score=100
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x56209c726140] ISO: File Type Major Brand: isom
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x56209c726140] Unknown dref type 0x206c7275 size 12
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x56209c726140] Processing st: 0, edit list 0 - media time: 0, duration: 441353
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x56209c726140] Before avformat_find_stream_info() pos: 238009 bytes read:35244 seeks:1 nb_streams:1
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x56209c726140] All info found
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x56209c726140] After avformat_find_stream_info() pos: 50 bytes read:68012 seeks:2 frames:1
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'in.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2mp41
    encoder         : Lavf58.29.100
  Duration: 00:00:10.01, start: 0.000000, bitrate: 190 kb/s
    Stream #0:0(und), 1, 1/44100: Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 188 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Successfully opened the file.
Parsing a group of options: output url out.m3u8.
Applying option f (force format) with argument hls.
Successfully parsed a group of options.
Opening an output file: out.m3u8.
Successfully opened the file.
Stream mapping:
  Stream #0:0 -> #0:0 (aac (native) -> aac (native))
Press [q] to stop, [?] for help
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
detected 8 logical cores
[graph_0_in_0_0 @ 0x56209c7e5e40] Setting 'time_base' to value '1/44100'
[graph_0_in_0_0 @ 0x56209c7e5e40] Setting 'sample_rate' to value '44100'
[graph_0_in_0_0 @ 0x56209c7e5e40] Setting 'sample_fmt' to value 'fltp'
[graph_0_in_0_0 @ 0x56209c7e5e40] Setting 'channel_layout' to value '0x3'
[graph_0_in_0_0 @ 0x56209c7e5e40] tb:1/44100 samplefmt:fltp samplerate:44100 chlayout:0x3
[format_out_0_0 @ 0x56209c7e64c0] Setting 'sample_fmts' to value 'fltp'
[format_out_0_0 @ 0x56209c7e64c0] Setting 'sample_rates' to value '96000|88200|64000|48000|44100|32000|24000|22050|16000|12000|11025|8000|7350'
[AVFilterGraph @ 0x56209c755ac0] query_formats: 4 queried, 9 merged, 0 already done, 0 delayed
[hls @ 0x56209c741180] Opening 'out.ts' for writing
[file @ 0x56209c885b40] Setting default whitelist 'file,crypto'
[mpegts @ 0x56209c7e73c0] frame size not set
[mpegts @ 0x56209c7e73c0] service 1 using PCR in pid=256, pcr_period=93ms
[mpegts @ 0x56209c7e73c0] muxrate VBR, sdt every 1073741822000 ms, pat/pmt every 1073741822000 ms
Output #0, hls, to 'out.m3u8':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2mp41
    encoder         : Lavf58.35.102
    Stream #0:0(und), 0, 1/90000: Audio: aac (LC), 44100 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      encoder         : Lavc58.65.103 aac
cur_dts is invalid st:0 (0) [init:1 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
[hls @ 0x56209c741180] Opening 'out.m3u8.tmp' for writing
[file @ 0x56209c8d4880] Setting default whitelist 'file,crypto'
EXT-X-MEDIA-SEQUENCE:0
[AVIOContext @ 0x56209c8d2780] Statistics: 0 seeks, 1 writeouts
[hls @ 0x56209c741180] Opening 'out.m3u8.tmp' for writing
[file @ 0x56209c8da540] Setting default whitelist 'file,crypto'
EXT-X-MEDIA-SEQUENCE:0
[AVIOContext @ 0x56209c8d4880] Statistics: 0 seeks, 1 writeouts
[hls @ 0x56209c741180] Opening 'out.m3u8.tmp' for writing
[file @ 0x56209c8da540] Setting default whitelist 'file,crypto'
EXT-X-MEDIA-SEQUENCE:0
[AVIOContext @ 0x56209c8d4880] Statistics: 0 seeks, 1 writeouts
[hls @ 0x56209c741180] Opening 'out.m3u8.tmp' for writing
[file @ 0x56209c8da540] Setting default whitelist 'file,crypto'
EXT-X-MEDIA-SEQUENCE:0
[AVIOContext @ 0x56209c8e5680] Statistics: 0 seeks, 1 writeouts
[out_0_0 @ 0x56209c7e6e40] EOF on sink link out_0_0:default.
No more output streams to write to, finishing.
[hls @ 0x56209c741180] Opening 'out.m3u8.tmp' for writing
[file @ 0x56209c8ede40] Setting default whitelist 'file,crypto'
EXT-X-MEDIA-SEQUENCE:0
[AVIOContext @ 0x56209c8e5680] Statistics: 0 seeks, 1 writeouts
[AVIOContext @ 0x56209c8c5c80] Statistics: 0 seeks, 6 writeouts
[hls @ 0x56209c741180] Opening 'out.m3u8.tmp' for writing
[file @ 0x56209c8da540] Setting default whitelist 'file,crypto'
EXT-X-MEDIA-SEQUENCE:0
[AVIOContext @ 0x56209c8e5680] Statistics: 0 seeks, 1 writeouts
size=N/A time=00:00:10.00 bitrate=N/A speed=61.8x    
video:0kB audio:159kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Input file #0 (in.mp4):
  Input stream #0:0 (audio): 431 packets read (235489 bytes); 431 frames decoded (441344 samples); 
  Total: 431 packets (235489 bytes) demuxed
Output file #0 (out.m3u8):
  Output stream #0:0 (audio): 431 frames encoded (441344 samples); 432 packets muxed (162595 bytes); 
  Total: 432 packets (162595 bytes) muxed
431 frames successfully decoded, 0 decoding errors
[aac @ 0x56209c72f180] Qavg: 631.204
[AVIOContext @ 0x56209c730700] Statistics: 273209 bytes read, 2 seeks

Changed 5 months ago by kmamal

input file

comment:3 Changed 5 months ago by gdgsdg123

Convert the original audio to WAV (and process based on it), then re-encode to AAC.

Does it work?..

comment:4 Changed 5 months ago by cehoyos

Please test one of the following two command lines to allow better understanding of the issue:

$ ffmpeg -i in.mp4 -acodec aac -map 0:a:0 out.mp4
$ ffmpeg -i in.mp4 -acodec aac out.aac

Do the output files show the same issue as described in your original report?

comment:5 follow-up: Changed 5 months ago by kmamal

Converting to wav and then to aac also sounds bad.

Both these command lines show the exact same issue as in the original report.

Do you want me to attach files and/or reports for any of the above?

comment:6 follow-up: Changed 5 months ago by cehoyos

Please also test the following:

$ ffmpeg -i in.mp4 out.mp4
$ ffmpeg -i in.mp4 out.aac

These should produce different files.

comment:7 in reply to: ↑ 5 Changed 5 months ago by gdgsdg123

Replying to kmamal:

Converting to wav and then to aac also sounds bad.

Then the problem is likely in the post-processing.

comment:8 in reply to: ↑ 6 ; follow-up: Changed 5 months ago by kmamal

Replying to cehoyos:

Please also test the following:

$ ffmpeg -i in.mp4 out.mp4
$ ffmpeg -i in.mp4 out.aac

These should produce different files.

The produced files still sound bad and are in fact byte-for-byte identical with the two files produced by the previous commands.

Converting to wav and then to aac also sounds bad.

Note that the wav sounds fine.

comment:9 in reply to: ↑ 8 Changed 5 months ago by gdgsdg123

Replying to kmamal:

Note that the wav sounds fine.

So you mean the problem can be reproduced using a certain WAV file as the input?..

comment:10 Changed 5 months ago by cehoyos

Sorry:

$ ffmpeg -i in.mp4 -acodec libfdk_aac out.mp4

comment:11 Changed 5 months ago by kmamal

So you mean the problem can be reproduced using a certain WAV file as the input?

Yes, I will attach it here. I produced it using the command

ffmpeg -i in.mp4 in.wav

This wav file sounds ok.

Then I convert it to aac using

ffmpeg -i in.wav out.aac

and it sounds stuttery again.

ffmpeg -i in.mp4 -acodec libfdk_aac out.mp4

This one sounds ok! Can I use libfdk_aac as a drop-in replacement wherever I would have used aac?
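
The three steps in this comment can be collected into one small script, which may make the report easier to re-run. This is only a sketch: the function names `repro_commands` and `run_all` are made up for illustration, and the `-y` flag (overwrite outputs) was added for convenience and is not part of the commands quoted above.

```python
import subprocess


def repro_commands(src="in.mp4"):
    """Return the three ffmpeg invocations described in this comment."""
    return [
        # 1. Decode the AAC input to PCM; the resulting WAV sounds ok.
        ["ffmpeg", "-y", "-i", src, "in.wav"],
        # 2. Re-encode the WAV with the default (native) AAC encoder;
        #    this is the output that sounds stuttery.
        ["ffmpeg", "-y", "-i", "in.wav", "out.aac"],
        # 3. Encode with libfdk_aac instead; this output sounds ok.
        #    Requires a build configured with --enable-libfdk-aac.
        ["ffmpeg", "-y", "-i", src, "-acodec", "libfdk_aac", "out.mp4"],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only print the commands; call run_all(repro_commands()) to execute them.
    for cmd in repro_commands():
        print(" ".join(cmd))
```

Calling `run_all(repro_commands())` in a directory containing in.mp4 should leave in.wav, out.aac and out.mp4 side by side for listening.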

comment:12 Changed 5 months ago by cehoyos

In general, please do not attach files that you created with FFmpeg: All developers should be able to use our application; more attachments only lead to more confusion.

comment:13 Changed 5 months ago by cehoyos

libfdk is the best software aac encoder available for FFmpeg.

comment:14 Changed 5 months ago by cehoyos

Who provides your FFmpeg binary?

comment:15 Changed 5 months ago by kmamal

The ffmpeg binary I have been using throughout this ticket I built myself following the instructions at https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu

We also build the binary we use in production ourselves, but not from the master branch. The behavior is the same, though.

Thanks for all the help :)

comment:16 Changed 5 months ago by cehoyos

  • Component changed from undetermined to avcodec
  • Keywords aac added
  • Status changed from new to open

Related to ticket #7550

comment:17 Changed 5 months ago by gdgsdg123

Can the problem be reproduced with the below command?

ffmpeg -i "in.wav" -c:a aac -b:a 320k "out.aac"

(increased target bit rate with the native AAC encoder)

comment:18 Changed 5 months ago by kmamal

Yes, the problem happens across a number of bitrates (I tested up to 1024) and sample rates. At lower sample rates it actually sounds better.
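
For anyone who wants to repeat this kind of sweep, here is a minimal sketch that generates one native-encoder command per bitrate and sample-rate combination. The specific bitrates, sample rates, output file names and the helper name `sweep_commands` are assumptions for illustration; they are not the exact values tested in this ticket.

```python
from itertools import product


def sweep_commands(src="in.wav",
                   bitrates=("64k", "128k", "320k"),
                   sample_rates=(22050, 44100, 48000)):
    """Build one ffmpeg command per (bitrate, sample rate) pair."""
    commands = []
    for bitrate, rate in product(bitrates, sample_rates):
        out = f"out_{bitrate}_{rate}.aac"  # hypothetical naming scheme
        commands.append([
            "ffmpeg", "-y", "-i", src,
            "-c:a", "aac",      # native AAC encoder
            "-b:a", bitrate,    # target audio bitrate
            "-ar", str(rate),   # output sample rate
            out,
        ])
    return commands


if __name__ == "__main__":
    for cmd in sweep_commands():
        print(" ".join(cmd))
```

Each printed line can be pasted into a shell, and the resulting files compared by ear.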

comment:19 Changed 5 months ago by gdgsdg123

What's the format of the "in.wav"? (output of:

ffprobe -show_streams "in.wav"

)

comment:20 Changed 5 months ago by kmamal

Output of

ffprobe -show_streams "in.wav"
ffprobe version N-96334-g1a7f4a1 Copyright (c) 2007-2020 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.4.0-1ubuntu1~18.04.1)
  configuration: --prefix=/home/kostis/ffmpeg_build/out --pkg-config-flags=--static --extra-cflags=-I/home/kostis/ffmpeg_build/out/include --extra-ldflags=-L/home/kostis/ffmpeg_build/out/lib --extra-libs='-lpthread -lm' --bindir=/home/kostis/ffmpeg_build/bin --enable-gpl --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-nonfree
  libavutil      56. 38.100 / 56. 38.100
  libavcodec     58. 65.103 / 58. 65.103
  libavformat    58. 35.102 / 58. 35.102
  libavdevice    58.  9.103 / 58.  9.103
  libavfilter     7. 71.100 /  7. 71.100
  libswscale      5.  6.100 /  5.  6.100
  libswresample   3.  6.100 /  3.  6.100
  libpostproc    55.  6.100 / 55.  6.100
Input #0, wav, from 'in.wav':
  Metadata:
    encoder         : Lavf58.35.102
  Duration: 00:00:10.01, bitrate: 1411 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 2 channels, s16, 1411 kb/s
[STREAM]
index=0
codec_name=pcm_s16le
codec_long_name=PCM signed 16-bit little-endian
profile=unknown
codec_type=audio
codec_time_base=1/44100
codec_tag_string=[1][0][0][0]
codec_tag=0x0001
sample_fmt=s16
sample_rate=44100
channels=2
channel_layout=unknown
bits_per_sample=16
id=N/A
r_frame_rate=0/0
avg_frame_rate=0/0
time_base=1/44100
start_pts=N/A
start_time=N/A
duration_ts=441344
duration=10.007800
bit_rate=1411200
max_bit_rate=N/A
bits_per_raw_sample=N/A
nb_frames=N/A
nb_read_frames=N/A
nb_read_packets=N/A
DISPOSITION:default=0
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
[/STREAM]
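
As a sanity check, the numbers in this ffprobe output are consistent with each other and with the earlier report (431 frames, 441344 samples). The sketch below only redoes the arithmetic; the figure of 1024 samples per frame is the usual AAC-LC frame size and is an assumption here, not something ffprobe printed.

```python
# Values copied from the ffprobe output above.
sample_rate = 44100       # sample_rate
channels = 2              # channels
bits_per_sample = 16      # bits_per_sample
duration_ts = 441344      # duration_ts, in samples (time_base=1/44100)

# Assumption: AAC-LC carries 1024 samples per frame.
samples_per_aac_frame = 1024

pcm_bit_rate = sample_rate * channels * bits_per_sample
duration_s = duration_ts / sample_rate
aac_frames = duration_ts / samples_per_aac_frame

print(pcm_bit_rate)          # 1411200, matches bit_rate
print(round(duration_s, 4))  # 10.0078, matches duration
print(aac_frames)            # 431.0, matches "431 frames decoded"
```

So the WAV is a plain, complete decode of the original 431 AAC frames, with nothing lost or added along the way.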

comment:21 Changed 5 months ago by gdgsdg123

You can attach the "in.wav" to this ticket for better analysis. (Preferably put it in a RAR4 archive with compression to save bandwidth and storage.)

Not strictly necessary unless the comment:23 was wrong...


comment:22 follow-up: Changed 5 months ago by cehoyos

Can we please stop this? Any interested developer can use the attached file as input.
And please try very hard to avoid compressing attachments here.

comment:23 in reply to: ↑ 22 Changed 5 months ago by gdgsdg123

Replying to cehoyos:

Any interested developer can use the attached file as input.

...That's hardly straightforward, as the "in.wav" is post-processed and the procedure is unknown. (Finally got it... check comment:11.)


Replying to cehoyos:

And please try very hard to avoid compressing attachments here.

You prefer things get uncompressed?.. I don't see how this could be beneficial.


comment:24 Changed 5 months ago by kmamal

The wav file is produced using only ffmpeg, as I have written in https://trac.ffmpeg.org/ticket/8467#comment:11. No post-processing. Just run the command

ffmpeg -i in.mp4 in.wav

Sorry for the confusion.

comment:25 Changed 5 months ago by gdgsdg123

Ran into what is likely the same problem...


The attached file * was basically produced in the manner below:

ffmpeg -i "in.wav" -c:a aac -b:a 320k "out.aac"

(used -c copy to cut the output file to fit the size limit)



Compare it with this video (starting from 2:55); the problem is particularly obvious within the first 25 s. (Listen for hisses.)





* Removed to conserve storage, as the problem can be reproduced with the Opus stream on YouTube:

youtube-dl -f 251 -o "in.webm" http://www.youtube.com/watch?v=6Mr_Epq1uRc


Command to reproduce:

ffmpeg -i "in.webm" -c:a aac -b:a 320k "out.m4a"
ffmpeg version git-2020-01-10-3d894db Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 9.2.1 (GCC) 20191125
  configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf
  libavutil      56. 38.100 / 56. 38.100
  libavcodec     58. 65.103 / 58. 65.103
  libavformat    58. 35.101 / 58. 35.101
  libavdevice    58.  9.103 / 58.  9.103
  libavfilter     7. 70.101 /  7. 70.101
  libswscale      5.  6.100 /  5.  6.100
  libswresample   3.  6.100 /  3.  6.100
  libpostproc    55.  6.100 / 55.  6.100
Input #0, matroska,webm, from 'in.webm':
  Metadata:
    encoder         : google/video-file
  Duration: 00:04:47.26, start: -0.007000, bitrate: 130 kb/s
    Stream #0:0(eng): Audio: opus, 48000 Hz, stereo, fltp (default)
Stream mapping:
  Stream #0:0 -> #0:0 (opus (native) -> aac (native))
Press [q] to stop, [?] for help
Output #0, ipod, to 'out.m4a':
  Metadata:
    encoder         : Lavf58.35.101
    Stream #0:0(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 320 kb/s (default)
    Metadata:
      encoder         : Lavc58.65.103 aac
size=   10370kB time=00:04:47.26 bitrate= 295.7kbits/s speed=23.5x
video:0kB audio:10317kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.518689%
[aac @ 000000000051c980] Qavg: 24696.709

The "out.m4a" appears to be VBR and according to the wiki:

This VBR is experimental and likely to get even worse results than the CBR.


This might be the cause of the problem. (And the command I used, according to the wiki, is supposed to generate CBR output?..)
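
A rough way to see how far the result is from the requested 320k is to recompute the average bitrate from the size and time in the console output above. This sketch assumes that ffmpeg's "kB" means 1024 bytes and that "kbits/s" means 1000 bits per second. An average below the target does not by itself prove the stream is VBR; it only shows that the target was not reached on average.

```python
# Values copied from the console output above.
size_kb = 10370                      # size=   10370kB (whole file)
audio_kb = 10317                     # audio:10317kB (audio payload only)
duration_s = 4 * 60 + 47.26          # time=00:04:47.26
requested_kbps = 320                 # -b:a 320k


def average_kbps(kilobytes, seconds):
    """Average bitrate in kbit/s, assuming 1 kB = 1024 bytes."""
    return kilobytes * 1024 * 8 / seconds / 1000


file_kbps = average_kbps(size_kb, duration_s)
audio_kbps = average_kbps(audio_kb, duration_s)

print(round(file_kbps, 1))           # 295.7, matches the bitrate ffmpeg printed
print(round(audio_kbps, 1))          # audio payload only, slightly lower
print(round(audio_kbps / requested_kbps, 2))  # fraction of the 320k target
```

The first printed value reproduces ffmpeg's own "bitrate= 295.7kbits/s", which suggests the unit assumptions are right.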
