Opened 4 months ago
#10163 new defect
FFprobe / libavformat gives wrong duration for MP3 file
Reported by: | Jonathan | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | avformat |
Version: | unspecified | Keywords: | MP3 |
Cc: | Jonathan | Blocked By: | |
Blocking: | Reproduced by developer: | no | |
Analyzed by developer: | no |
Description
Summary of the bug:
libavformat seems to give a duration that's higher than the duration of the decoded samples. I've seen this with both files "from the wild" as well as files generated with FFmpeg lavfi.
All the files have contained "xing" tags that state the number of frames upfront. If I understand correctly, this frame count is then used to estimate the duration in mp3dec.c as st->duration = av_rescale_q(mp3->frames, (AVRational){spf, c.sample_rate}, st->time_base);
. I.e. the resulting duration value is based on the assumption that the media is made up of frames * spf
samples. However, this doesn't seem to account for the fact that some of those samples are meant to be skipped or discarded at decoding. If one subtracts skipped and discarded samples, the result seems to be a more accurate duration value.
How to reproduce:
# Generate 1s long test file: ffmpeg -f lavfi -i sine -t 1 test1.mp3 # Probe it to get its duration ffprobe -loglevel trace test1.mp3 ffprobe version 5.1.2 Copyright (c) 2007-2022 the FFmpeg developers built with Apple clang version 14.0.0 (clang-1400.0.29.202) configuration: --prefix=/usr/local/Cellar/ffmpeg/5.1.2_1 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-videotoolbox libavutil 57. 28.100 / 57. 28.100 libavcodec 59. 37.100 / 59. 37.100 libavformat 59. 27.100 / 59. 27.100 libavdevice 59. 7.100 / 59. 7.100 libavfilter 8. 44.100 / 8. 44.100 libswscale 6. 7.100 / 6. 7.100 libswresample 4. 7.100 / 4. 7.100 libpostproc 56. 6.100 / 56. 6.100 Input #0, mp3, from 'test1.mp3': Metadata: encoder : Lavf59.27.100 Duration: 00:00:01.04, start: 0.025057, bitrate: 65 kb/s Stream #0:0: Audio: mp3, 44100 Hz, mono, fltp, 64 kb/s
Here, FFprobe gives a duration of 1.04s, whereas I would have expected it to give 1s.
We can derive the resulting value 1.04 from the calculation frames*spf/sample_rate = 40*1152/44100 = 1.0449.
Running FFprobe with -loglevel trace
reveals:
[mp3 @ 0x7fbb70f05040] demuxer injecting skip 1105 / discard 0 [mp3float @ 0x7fbb72004680] skip 1105 / discard 0 samples due to side data [mp3float @ 0x7fbb72004680] skip 1105/1152 samples [mp3 @ 0x7fbb70f05040] demuxer injecting skip 0 / discard 875
We can modify the calculation to account for skip=1105 and discard=875: (frames*spf-skip-discard)/sample_rate = (40*1152-1105-875)/44100 = 1. This modified calculation gives the value I would have expected.
Proposed patch:
Below is a patch that I think may achieve the above (assuming I understood the fields of the mp3 and sti structs correctly). I should note that this change seems to break "seek-extra-mp3" in the fate tests.
diff --git a/libavformat/mp3dec.c b/libavformat/mp3dec.c index 90d75626b4..4b9527a691 100644 --- a/libavformat/mp3dec.c +++ b/libavformat/mp3dec.c @@ -322,6 +322,7 @@ static int mp3_parse_vbr_tags(AVFormatContext *s, AVStream *st, int64_t base) int vbrtag_size = 0; MP3DecContext *mp3 = s->priv_data; int ret; + FFStream *sti; ffio_init_checksum(s->pb, ff_crcA001_update, 0); @@ -349,9 +350,13 @@ static int mp3_parse_vbr_tags(AVFormatContext *s, AVStream *st, int64_t base) /* Skip the vbr tag frame */ avio_seek(s->pb, base + vbrtag_size, SEEK_SET); - if (mp3->frames) - st->duration = av_rescale_q(mp3->frames, (AVRational){spf, c.sample_rate}, + if (mp3->frames) { + sti = ffstream(st); + st->duration = av_rescale_q(mp3->frames * spf - sti->start_skip_samples - mp3->end_pad, + (AVRational){1, c.sample_rate}, st->time_base); + } + if (mp3->header_filesize && mp3->frames && !mp3->is_cbr) st->codecpar->bit_rate = av_rescale(mp3->header_filesize, 8 * c.sample_rate, mp3->frames * (int64_t)spf);