Opened 9 months ago

Last modified 9 months ago

#11696 new defect

WAV ≥ 4 GiB abnormal decoding due to length field overflow

Reported by: wavybaby
Owned by:
Priority: minor
Component: avformat
Version: 7.1
Keywords: wavdec
Cc: MasterQuestionable
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: yes

Description

Goal: I am trying to load an audio recording made with my phone. I want to load the complete audio file in Audacity and/or use ffmpeg to compress the full raw .WAV file to .FLAC or another format for further processing.

Problem: Audacity, and by extension ffmpeg (which I believe Audacity uses under the hood), parses only the first 37 to 38 minutes of the full 2h11m of content. I am not sure whether this is a bug, but I do not know where else to ask about this kind of low-level file access.

Command:

ffmpeg -i "2025-07-22 21.39.41.wav" 2025-07-22_21.39.41.flac

Verbose initial output:

$ ffmpeg -v 9 -loglevel 99 -i "2025-07-22 21.39.41.wav"
ffmpeg version n7.1.1 Copyright (c) 2000-2025 the FFmpeg developers
  built with gcc 15.1.1 (GCC) 20250425
  configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-amf --enable-avisynth --enable-cuda-llvm --enable-lto --enable-fontconfig --enable-frei0r --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libdav1d --enable-libdrm --enable-libdvdnav --enable-libdvdread --enable-libfreetype --enable-libfribidi --enable-libglslang --enable-libgsm --enable-libharfbuzz --enable-libiec61883 --enable-libjack --enable-libjxl --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libplacebo --enable-libpulse --enable-librav1e --enable-librsvg --enable-librubberband --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpl --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-nvdec --enable-nvenc --enable-opencl --enable-opengl --enable-shared --enable-vapoursynth --enable-version3 --enable-vulkan
  libavutil      59. 39.100 / 59. 39.100
  libavcodec     61. 19.101 / 61. 19.101
  libavformat    61.  7.100 / 61.  7.100
  libavdevice    61.  3.100 / 61.  3.100
  libavfilter    10.  4.100 / 10.  4.100
  libswscale      8.  3.100 /  8.  3.100
  libswresample   5.  3.100 /  5.  3.100
  libpostproc    58.  3.100 / 58.  3.100
Splitting the commandline.
Reading option '-v' ... matched as option 'v' (set logging level) with argument '9'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument '99'.
Reading option '-i' ... matched as input url with argument '2025-07-22 21.39.41.wav'.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 9.
Successfully parsed a group of options.
Parsing a group of options: input url 2025-07-22 21.39.41.wav.
Successfully parsed a group of options.
Opening an input file: 2025-07-22 21.39.41.wav.
[AVFormatContext @ 0x57a35d09bc00] Opening '2025-07-22 21.39.41.wav' for reading
[file @ 0x57a35d09c2c0] Setting default whitelist 'file,crypto,data'
Probing wav score:99 size:2048
[wav @ 0x57a35d09bc00] Format wav probed with size=2048 and score=99
[wav @ 0x57a35d09bc00] Before avformat_find_stream_info() pos: 44 bytes read:196608 seeks:6 nb_streams:1
[wav @ 0x57a35d09bc00] parser not found for codec pcm_f32le, packets or times may be invalid.
    Last message repeated 1 times
[wav @ 0x57a35d09bc00] All info found
[wav @ 0x57a35d09bc00] stream 0: start_time: NOPTS duration: 2271.044667
[wav @ 0x57a35d09bc00] format: start_time: NOPTS duration: 2271.044667 (estimate from stream) bitrate=21273 kb/s
[wav @ 0x57a35d09bc00] After avformat_find_stream_info() pos: 3276844 bytes read:3473408 seeks:6 frames:50
[aist#0:0/pcm_f32le @ 0x57a35d0a30c0] Guessed Channel Layout: stereo
Input #0, wav, from '2025-07-22 21.39.41.wav':
  Duration: 00:37:51.04, bitrate: 21273 kb/s
  Stream #0:0, 50, 1/96000: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 96000 Hz, stereo, flt, 6144 kb/s
Successfully opened the file.
At least one output file must be specified
[AVIOContext @ 0x57a35d0a44c0] Statistics: 3473408 bytes read, 6 seeks

Full processing log attached/to-be-attached. ffmpeg-20250730-211000.log.zip

(It might be difficult to offer an audio sample, however perhaps something closer to the "reported" end would be better?)

Extra info: I noticed that mediainfo can parse the file in full, and reports the complete duration of the file... not sure what to make of this.

MediaInfoLib v25.04
General
Complete name                            : 2025-07-22 21.39.41.wav
Format                                   : Wave
Format settings                          : PcmWaveformat
File size                                : 5.62 GiB
Duration                                 : 2 h 11 min
Overall bit rate mode                    : Constant
Overall bit rate                         : 6 144 kb/s

Audio
Format                                   : PCM
Format profile                           : Float
Codec ID                                 : 3
Codec ID/Hint                            : IEEE 
Duration                                 : 2 h 11 min
Bit rate mode                            : Constant
Bit rate                                 : 6 144 kb/s
Channel(s)                               : 2 channels
Sampling rate                            : 96.0 kHz
Bit depth                                : 32 bits
Stream size                              : 5.62 GiB (100%)

Attachments (2)

ffmpeg-20250730-211000.log.zip (166.5 KB ) - added by wavybaby 9 months ago.
Full test command output log file - for odd wav file
long-wav.log (9.9 KB ) - added by MasterQuestionable 9 months ago.
͏    Filtered from "ffmpeg-20250730-211000.log". ͏    Cue: https://trac.ffmpeg.org/attachment/ticket/11696/long-wav.log#L132


Change History (9)

by wavybaby, 9 months ago

Full test command output log file - for odd wav file

comment:1 by MasterQuestionable, 9 months ago

Analyzed by developer: set
Cc: MasterQuestionable added
Component: undetermined → avcodec
Keywords: pcm_f32le added; wav removed
Summary: ffmpeg unable to parse full runtime of wav file → Long PCM WAV incomplete decoding?

͏    Perhaps some peculiarity within the input file:
[aist#0:0/pcm_f32le] [dec:pcm_f32le] Decoder thread received EOF packet
[aist#0:0/pcm_f32le] [dec:pcm_f32le] Decoder returned EOF, finishing
[aist#0:0/pcm_f32le] [dec:pcm_f32le] Terminating thread with return code 0 (success)
͏    .
͏    32/8 * 96,000 * 2
͏    * (2 * 60 + 11) * 60
͏    / 1024^3 ≈ 5.62


by MasterQuestionable, 9 months ago

Attachment: long-wav.log added

͏    Filtered from "ffmpeg-20250730-211000.log".
͏    Cue: https://trac.ffmpeg.org/attachment/ticket/11696/long-wav.log#L132

comment:2 by Marton Balint, 9 months ago

You might want to try using the -ignore_length 1 option. In wav files the data chunk contains the wave data and has a 32-bit length field, so if your phone writes some random data there instead of UINT32_MAX, that can limit how much audio is read.

Alternatively you might want to upload here the first 64kb of some of your samples which are > 4GiB so we can check if there is a pattern which can be used to autodetect this...
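For anyone who wants to inspect the header themselves before uploading a sample, here is a minimal Python sketch that walks the RIFF chunks and reports the 32-bit length stored in the data chunk. The function name `find_data_chunk` is made up for illustration; this is not how wavdec.c is written, only a reading of the plain RIFF/WAVE layout (4-byte chunk id, little-endian 32-bit size, payload padded to an even length).

```python
import io
import struct


def find_data_chunk(stream):
    """Return (payload_offset, declared_size) for the 'data' chunk, or None.

    Hypothetical helper: assumes a plain RIFF/WAVE layout, not RF64/BW64.
    """
    header = stream.read(12)
    if len(header) < 12 or header[:4] != b"RIFF" or header[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    while True:
        chunk_header = stream.read(8)
        if len(chunk_header) < 8:
            return None  # ran out of chunks without seeing 'data'
        chunk_id, size = struct.unpack("<4sI", chunk_header)
        if chunk_id == b"data":
            # 'size' is the unsigned 32-bit length field discussed in this ticket
            return stream.tell(), size
        # Skip this chunk's payload; chunks are padded to an even byte count
        stream.seek(size + (size & 1), io.SEEK_CUR)
```

Called as `find_data_chunk(open(path, "rb"))`, it returns the payload offset and the declared length. Comparing that length with the file size shows whether the field can possibly cover the whole recording.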

in reply to:  2 comment:3 by cgbug, 9 months ago

Replying to Marton Balint:

You might want to try using the -ignore_length 1 option. In wav files the data chunk contains the wave data and has a 32-bit length field, so if your phone writes some random data there instead of UINT32_MAX, that can limit how much audio is read.

Based on the numbers above, I would guess the length value has overflowed.

131 min / 5.62 GB * (5.62 - 4) = 37.7 min
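The same estimate can be checked more precisely from the logs in this ticket. The sketch below is Python and assumes the length field wrapped exactly once (modulo 2^32) and that the stream is 96 kHz, stereo, 32-bit float as reported. The variable names are illustrative only.

```python
# Stream parameters taken from the ffmpeg/mediainfo output in this ticket
SAMPLE_RATE = 96_000
CHANNELS = 2
BYTES_PER_SAMPLE = 4  # pcm_f32le
BYTE_RATE = SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE  # bytes per second

# Duration ffmpeg derived from the (suspected wrapped) 32-bit length field
reported_seconds = 2271.044667
wrapped_bytes = round(reported_seconds * BYTE_RATE)

# Assumption: the real payload is exactly one 2**32 wrap larger
true_bytes = wrapped_bytes + 2**32
true_seconds = true_bytes / BYTE_RATE

minutes, seconds = divmod(true_seconds, 60)
hours, minutes = divmod(int(minutes), 60)
print(f"{hours:02d}:{minutes:02d}:{seconds:05.2f}")
```

Under that assumption the reconstructed duration comes out at about 2 h 11 min 3.45 s, which agrees with the 2 h 11 min mediainfo reports.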

comment:4 by MasterQuestionable, 9 months ago

Component: avcodec → avformat
Keywords: wavdec added; pcm_f32le removed
Summary: Long PCM WAV incomplete decoding? → WAV ≥ 4 GiB abnormal decoding due to length field overflow

͏    Would it be sensible to automatically apply "-ignore_length":
͏    https://ffmpeg.org/ffmpeg-formats.html#wav (documentation yet incomplete)
͏    https://github.com/FFmpeg/FFmpeg/blob/ce01c7fb58597f525e130f47a13ff77f1db62bf4/libavformat/wavdec.c#L76
͏    ; for all WAV ≥ 4 GiB?

͏    Or why should this length field even be used..?


in reply to:  2 comment:5 by wavybaby, 9 months ago

Replying to Marton Balint:

You might want to try using the -ignore_length 1 option. In wav files the data chunk contains the wave data and has a 32-bit length field, so if your phone writes some random data there instead of UINT32_MAX, that can limit how much audio is read.

Alternatively you might want to upload here the first 64kb of some of your samples which are > 4GiB so we can check if there is a pattern which can be used to autodetect this...

Good morning all, indeed -ignore_length 1 resolves the issue:

$ ffmpeg -ignore_length 1 -i "2025-07-22 21.39.41.wav" 2025-07-22_21.39.41.flac
ffmpeg version n7.1.1 Copyright (c) 2000-2025 the FFmpeg developers
  built with gcc 15.1.1 (GCC) 20250425
  configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-amf --enable-avisynth --enable-cuda-llvm --enable-lto --enable-fontconfig --enable-frei0r --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libdav1d --enable-libdrm --enable-libdvdnav --enable-libdvdread --enable-libfreetype --enable-libfribidi --enable-libglslang --enable-libgsm --enable-libharfbuzz --enable-libiec61883 --enable-libjack --enable-libjxl --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libplacebo --enable-libpulse --enable-librav1e --enable-librsvg --enable-librubberband --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpl --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-nvdec --enable-nvenc --enable-opencl --enable-opengl --enable-shared --enable-vapoursynth --enable-version3 --enable-vulkan
  libavutil      59. 39.100 / 59. 39.100
  libavcodec     61. 19.101 / 61. 19.101
  libavformat    61.  7.100 / 61.  7.100
  libavdevice    61.  3.100 / 61.  3.100
  libavfilter    10.  4.100 / 10.  4.100
  libswscale      8.  3.100 /  8.  3.100
  libswresample   5.  3.100 /  5.  3.100
  libpostproc    58.  3.100 / 58.  3.100
[aist#0:0/pcm_f32le @ 0x62ca6c2e0040] Guessed Channel Layout: stereo
Input #0, wav, from '2025-07-22 21.39.41.wav':
  Duration: 00:37:51.04, bitrate: 21273 kb/s
  Stream #0:0: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 96000 Hz, stereo, flt, 6144 kb/s
File '2025-07-22_21.39.41.flac' already exists. Overwrite? [y/N] y
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_f32le (native) -> flac (native))
Press [q] to stop, [?] for help
[flac @ 0x62ca6c2d9900] encoding as 24 bits-per-sample, more is considered experimental. Add -strict experimental if you want to encode more than 24 bits-per-sample
Output #0, flac, to '2025-07-22_21.39.41.flac':
  Metadata:
    encoder         : Lavf61.7.100
  Stream #0:0: Audio: flac, 96000 Hz, stereo, s32 (24 bit), 128 kb/s
      Metadata:
        encoder         : Lavc61.19.101 flac
[wav @ 0x62ca6c2d8c00] Packet corrupt (stream = 0, dts = NOPTS).30x    
[in#0/wav @ 0x62ca6c2d8940] corrupt input packet in stream 0
[out#0/flac @ 0x62ca6c2e0200] video:0KiB audio:1292022KiB subtitle:0KiB other streams:0KiB global headers:0KiB muxing overhead: 0.000626%
size= 1292030KiB time=02:11:03.45 bitrate=1346.0kbits/s speed= 297x


While my issue is resolved now, I imagine this might be a problem for others in the future, so please feel free to close this ticket if and when any flags or changes are added :)

Please let me know if a further sample or similar such thing is required, I can make another very very long recording on my phone :)

Thank you again for the assistance to all!

in reply to:  4 comment:6 by Marton Balint, 9 months ago

Replying to MasterQuestionable:

͏    Would it be sensible to automatically apply "-ignore_length":
͏    https://ffmpeg.org/ffmpeg-formats.html#wav (documentation yet incomplete)
͏    https://github.com/FFmpeg/FFmpeg/blob/ce01c7fb58597f525e130f47a13ff77f1db62bf4/libavformat/wavdec.c#L76
͏    ; for all WAV ≥ 4 GiB?

͏    Or why should this length field even be used..?

Because other chunks might follow the data chunk, and that should not be returned as audio data.
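To illustrate the trade-off, here is a hypothetical autodetection heuristic in Python. It is not what FFmpeg does and `guess_true_data_size` is an invented name. It assumes the writer stored the true size modulo 2^32, and it keeps whatever remains after the wrapped-size estimate as trailing chunks instead of treating everything up to end of file as audio.

```python
WRAP = 2**32  # range of the unsigned 32-bit length field


def guess_true_data_size(declared_size, data_offset, file_size):
    """Guess the real 'data' payload size when the 32-bit field may have wrapped.

    Hypothetical heuristic: add as many whole 2**32 wraps as fit in the file.
    Trailing chunks smaller than 4 GiB are left outside the returned size.
    """
    available = file_size - data_offset
    if available < 0:
        raise ValueError("data offset lies beyond end of file")
    if declared_size > available:
        return available  # truncated file: cannot read more than exists
    wraps = (available - declared_size) // WRAP
    return declared_size + wraps * WRAP
```

With a declared size of 1744162304 bytes at offset 44 and an assumed file size of 6039129644 bytes (assumed, since the exact byte count is not in the ticket), this returns 6039129600. A small file with a few hundred trailing metadata bytes is left unchanged. The heuristic would still be fooled by trailing chunks of 4 GiB or more, or by a writer that stores genuinely random data in the field.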

comment:7 by MasterQuestionable, 9 months ago

͏    So it would be fundamentally incurable..?
͏    ("other chunks" other than audio data in WAV??)

͏    Seems like inherent format design issue. (problematic handling ≥ 4 GiB)
