Opened 18 months ago

Last modified 18 months ago

#10523 new defect

Artifacts converting aac to wav

Reported by: drive4code
Owned by:
Priority: important
Component: undetermined
Version: 4.3.6
Keywords: ffmpeg conversion
Cc: drive4code
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description (last modified by drive4code)

Summary of the bug

Any time you convert AAC to WAV (pcm_s24le; the 16 bit version...) you get artifacts at random spots throughout the video, and they can get better or worse each time you run the program again.

What you were trying to accomplish

Convert the AAC audio of a 1080p 60 fps .mp4 video (32 bit, 48000 Hz) into 24 bit, 44100 Hz WAV audio (.wav).

I should mention that I tried converting it to several other formats, and because I found the same artifacts in all of them, I assume the problem affects all conversions of this type.

The problem you encountered

There are artifacts at random points throughout the video. Here is the best way I can describe them after visualizing the audio:

They seem to fall into the distortion category. Some waves (seemingly about half the time, and only for a short period) extend over the speaker's voice, causing the artifacts. It sounds like a robotic sound layered over the speaker.
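If the over-extended waves reach full scale, they could be located automatically instead of by scrolling through the waveform. The following is a minimal sketch under that assumption, which this ticket does not confirm: it scans a 24 bit PCM WAV file for runs of samples pinned at the 24 bit limits and reports where each run starts. The function name `find_full_scale_runs` and the `min_run` threshold are hypothetical. Note that Python's `wave` module reads WAVE_FORMAT_EXTENSIBLE headers only from Python 3.12 onward, and some tools write that header for 24 bit output.

```python
import wave


def find_full_scale_runs(path, min_run=3):
    """Return a list of (start_seconds, run_length_in_frames).

    A frame counts as "full scale" when any channel sits at the signed
    24 bit minimum or maximum. This is an assumed signature of the
    artifacts, not something the ticket establishes.
    """
    limit_hi = 2**23 - 1
    limit_lo = -(2**23)
    runs = []
    run_start, run_len = 0, 0
    frame_index = 0
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 3:
            raise ValueError("expected 24 bit PCM (3 bytes per sample)")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        frame_size = 3 * channels
        while True:
            # Read in chunks so long recordings are not loaded at once.
            chunk = wav.readframes(65536)
            if not chunk:
                break
            for base in range(0, len(chunk), frame_size):
                pinned = False
                for ch in range(channels):
                    off = base + 3 * ch
                    value = int.from_bytes(chunk[off:off + 3], "little", signed=True)
                    if value == limit_hi or value == limit_lo:
                        pinned = True
                        break
                if pinned:
                    if run_len == 0:
                        run_start = frame_index
                    run_len += 1
                else:
                    if run_len >= min_run:
                        runs.append((run_start / rate, run_len))
                    run_len = 0
                frame_index += 1
    # Close a run that extends to the end of the file.
    if run_len >= min_run:
        runs.append((run_start / rate, run_len))
    return runs
```

Calling `find_full_scale_runs("output.wav")` (a placeholder file name) returns the start time in seconds and the length in frames of each run. The pure-Python loop is slow on files longer than an hour, and an empty result would only mean the artifacts do not clip, not that they are absent.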

How to reproduce:

Run the command below on a file matching the spec above; make it long enough (10-20 minutes) that you can scroll through it and find the artifacts mentioned.

See the attached example for the final product. I had to trim it with Windows Photo Viewer, so the container is of course different.

% ffmpeg -y -i "$vid" -vn -c:a pcm_s24le -ar 44100 -ac 2 $audio
ffmpeg version n4.3.1 Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
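One way to narrow down where the artifacts come from (a suggestion, not something tested in this ticket) is to run the same command without `-ar 44100`, so that the 48000 Hz to 44100 Hz resampling step is left out, and compare the two outputs. The sketch below only builds and prints the two argument lists; it uses the flags from the command above and nothing else. The function name and the file names are placeholders.

```python
import shlex


def build_commands(video, out_resampled, out_native):
    """Return (reported, native) ffmpeg argument lists.

    reported: the command from this ticket.
    native:   the same command without "-ar 44100", so the output keeps
              the sample rate of the source audio.
    """
    reported = [
        "ffmpeg", "-y", "-i", video, "-vn",
        "-c:a", "pcm_s24le", "-ar", "44100", "-ac", "2", out_resampled,
    ]
    native = [
        "ffmpeg", "-y", "-i", video, "-vn",
        "-c:a", "pcm_s24le", "-ac", "2", out_native,
    ]
    return reported, native


if __name__ == "__main__":
    # Placeholder file names; shlex.join quotes names containing spaces.
    for cmd in build_commands("input video.mp4", "resampled.wav", "native.wav"):
        print(shlex.join(cmd))
```

If the artifacts show up only in the resampled file, the resampler is the more likely source; if they show up in both, the decoder or the source file would be the next thing to check. Passing the lists to `subprocess.run` avoids the unquoted `$audio` expansion in the shell version.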

The output provided by ffmpeg -v 9 -loglevel 99 -i (excerpt; the full log would be too big to fit)

[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] AVIndex stream 1, sample 231133, offset e148f8da, dts 236680192, size 326, distance 0, keyframe 1
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] AVIndex stream 1, sample 231134, offset e1492afb, dts 236681216, size 359, distance 0, keyframe 1
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] AVIndex stream 1, sample 231135, offset e149a267, dts 236682240, size 326, distance 0, keyframe 1
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] AVIndex stream 1, sample 231136, offset e149aac4, dts 236683264, size 346, distance 0, keyframe 1
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] AVIndex stream 1, sample 231137, offset e14a0c19, dts 236684288, size 328, distance 0, keyframe 1
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] AVIndex stream 1, sample 231138, offset e14a248e, dts 236685312, size 333, distance 0, keyframe 1
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] AVIndex stream 1, sample 231139, offset e14a56b6, dts 236686336, size 360, distance 0, keyframe 1
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] AVIndex stream 1, sample 231140, offset e14a9f3e, dts 236687360, size 331, distance 0, keyframe 1
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] AVIndex stream 1, sample 231141, offset e14aebf9, dts 236688384, size 309, distance 0, keyframe 1
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] Processing st: 1, edit list 0 - media time: 1024, duration: 236688000
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] drop a frame at curr_cts: 0 @ 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] type:'udta' parent:'moov' sz: 333 8697198 8697523
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] type:'meta' parent:'udta' sz: 325 8 325
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] type:'hdlr' parent:'meta' sz: 33 8 313
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] ctype=[0][0][0][0]
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] stype=mdir
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] type:'ilst' parent:'meta' sz: 280 41 313
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] type:'[169]nam' parent:'ilst' sz: 80 8 272
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] type:'[169]ART' parent:'ilst' sz: 33 88 272
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] type:'[169]day' parent:'ilst' sz: 28 121 272
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] type:'[169]too' parent:'ilst' sz: 36 149 272
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] type:'[169]cmt' parent:'ilst' sz: 95 185 272
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] on_parse_exit_offset=3788499620
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] rfps: 60.000000 0.000265
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] rfps: 120.000000 0.001061
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] rfps: 240.000000 0.004244
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] rfps: 59.940060 0.001057
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] Before avformat_find_stream_info() pos: 3788499620 bytes read:8730299 seeks:1 nb_streams:2
[h264 @ 0x557ed98411c0] nal_unit_type: 7(SPS), nal_ref_idc: 3
[h264 @ 0x557ed98411c0] nal_unit_type: 8(PPS), nal_ref_idc: 3
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] stream 0, sample 0, dts -18000
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] stream 1, sample 0, dts -21333
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] demuxer injecting skip 1024 / discard 0
[aac @ 0x557ed9842800] skip 1024 / discard 0 samples due to side data
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] stream 0, sample 0, dts -18000
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] stream 1, sample 1, dts 0
[h264 @ 0x557ed98411c0] nal_unit_type: 9(AUD), nal_ref_idc: 0
[h264 @ 0x557ed98411c0] nal_unit_type: 7(SPS), nal_ref_idc: 3
[h264 @ 0x557ed98411c0] nal_unit_type: 8(PPS), nal_ref_idc: 3
[h264 @ 0x557ed98411c0] nal_unit_type: 6(SEI), nal_ref_idc: 0

Last message repeated 1 times

[h264 @ 0x557ed98411c0] nal_unit_type: 5(IDR), nal_ref_idc: 3
[h264 @ 0x557ed98411c0] Format yuv420p chosen by get_format().
[h264 @ 0x557ed98411c0] Reinit context to 1920x1088, pix_fmt: yuv420p
[h264 @ 0x557ed98411c0] ct_type:0 pic_struct:0
[h264 @ 0x557ed98411c0] no picture
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] All info found
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] stream 0: start_time: 0.015 duration: 4931.03
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] stream 1: start_time: 0 duration: 4931
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] format: start_time: 0 duration: 4931.08 (estimate from stream) bitrate=6146 kb/s
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x557ed983f440] After avformat_find_stream_info() pos: 118453 bytes read:8848704 seeks:2 frames:2
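As a sanity check, the timestamps in the log excerpt above are consistent with 1024-sample AAC frames at 48000 Hz. The sketch below redoes that arithmetic. It assumes the audio timescale is 48000 ticks per second (taken from the sample rate stated in the description; the excerpt does not print the timescale) and that the `dts -21333` value for stream 1 is in microseconds.

```python
from fractions import Fraction

SAMPLE_RATE = 48000            # Hz, from the ticket description (assumed timescale)
AAC_FRAME = 1024               # dts step between consecutive AVIndex entries
LAST_INDEX = 231141            # last AVIndex sample number in the excerpt
LAST_DTS = 236688384           # dts logged for that sample
EDIT_DURATION = 236688000      # "duration" on the edit list line

# Each index entry advances dts by exactly one AAC frame.
assert LAST_INDEX * AAC_FRAME == LAST_DTS

# One frame of priming audio, in microseconds (truncated).
frame_microseconds = int(Fraction(AAC_FRAME, SAMPLE_RATE) * 1_000_000)
print(frame_microseconds)      # 21333

# Edit list duration converted to seconds.
duration_seconds = Fraction(EDIT_DURATION, SAMPLE_RATE)
print(duration_seconds)        # 4931
```

Under those assumptions, the 1024 skipped samples correspond to the initial `dts -21333`, and the edit list duration matches the reported stream 1 duration of 4931 seconds, so nothing in the excerpt points to a timestamp discontinuity in the audio track.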

Attachments (1)

artifact.mp4 (204.4 KB ) - added by drive4code 18 months ago.
Here's an example of artifacting


Change History (3)

by drive4code, 18 months ago

Attachment: artifact.mp4 added


comment:1 by drive4code, 18 months ago

Description: modified

comment:2 by drive4code, 18 months ago

I've tried using pcm_s32le and the artifacts are still there, in the same amount. However, this command reduces the number of artifacts:
vlc input_video.mp4 --sout "#transcode{acodec=s32l,ab=1411}:std{access=file,mux=wav,dst=output.wav}" vlc://quit

I understand it's from VLC, but I believe it uses the same libraries as ffmpeg, and hopefully it can help diagnose the problem.
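For reference, the bit rate of uncompressed PCM is sample rate × bits per sample × channels. The `ab=1411` value in the VLC command matches 16 bit stereo at 44100 Hz rather than 32 bit audio, as the minimal sketch below shows. Whether VLC takes `ab` into account for PCM output, and which sample rate it writes, is not established in this ticket; the function name is hypothetical.

```python
def pcm_bit_rate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed PCM in kb/s (1 kb = 1000 bits)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


print(pcm_bit_rate_kbps(44100, 16, 2))  # 1411.2
print(pcm_bit_rate_kbps(44100, 32, 2))  # 2822.4
print(pcm_bit_rate_kbps(48000, 32, 2))  # 3072.0
```

Since the VLC command sets no sample rate, it would be worth checking the sample rate and bit depth of its output.wav before comparing it with the ffmpeg result.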
