Opened 14 months ago
Last modified 9 months ago
#9590 new defect
FFmpeg-written MP4 fails to be read-back without distortion of PTS/DTS
Reported by: Ulrik Mikaelsson
Owned by:
Version: unspecified
Keywords: mov pts DTS mp4
Cc: Ulrik Mikaelsson
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no
Summary of the bug: After writing an MP4 using libavformat and reading it back using ffprobe, the timestamps are distorted.
How to reproduce:
% ffprobe -show_packets non-zero-start.mp4
ffmpeg version 4.4.1 built on ubuntu (probably not relevant)
After writing an MP4 using PyAV (using ffmpeg under the hood), and reading it back using ffprobe,
the timestamps have been altered from those provided.
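For anyone wanting to repeat the read-back step programmatically, here is a minimal sketch. It assumes `ffprobe` is on the PATH and uses its JSON writer (`-of json`); the helper names `parse_packets` and `probe_packets` are made up for illustration and are not part of PyAV or FFmpeg.

```python
import json
import subprocess


def parse_packets(ffprobe_json: str):
    """Extract (pts, dts) pairs from `ffprobe -show_packets -of json` output.

    Packets without a timestamp yield None for that field.
    """
    packets = json.loads(ffprobe_json).get("packets", [])
    return [(p.get("pts"), p.get("dts")) for p in packets]


def probe_packets(path: str):
    """Run ffprobe on `path` and return its packets as (pts, dts) pairs."""
    cmd = ["ffprobe", "-v", "error", "-show_packets", "-of", "json", path]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return parse_packets(result.stdout)
```

Calling `probe_packets("non-zero-start.mp4")` would then give the list of timestamps in stream time-base units, in the order ffprobe reports the packets.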
The timestamps passed to
PTS=-1358 and DTS=-4243
PTS=3599 and DTS=-2367
PTS=1527 and DTS=-1358
PTS=518 and DTS=518
PTS=2563 and DTS=1527
PTS=5612 and DTS=2563
PTS=4584 and DTS=3599
The timestamps read back using ffprobe -show_packets are:
PTS=-1876 and DTS=-4761
PTS=3081 and DTS=-2885
PTS=1009 and DTS=-1876
PTS=0 and DTS=0
PTS=2045 and DTS=1009
PTS=5094 and DTS=2045
PTS=4066 and DTS=3081
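To make the distortion concrete, the two lists can be compared value by value. The sketch below only uses the numbers quoted in this ticket; the function name `timestamp_shifts` is invented for illustration.

```python
# (PTS, DTS) pairs as passed to the muxer, copied from this ticket.
written = [(-1358, -4243), (3599, -2367), (1527, -1358), (518, 518),
           (2563, 1527), (5612, 2563), (4584, 3599)]

# (PTS, DTS) pairs as reported by ffprobe, copied from this ticket.
read_back = [(-1876, -4761), (3081, -2885), (1009, -1876), (0, 0),
             (2045, 1009), (5094, 2045), (4066, 3081)]


def timestamp_shifts(before, after):
    """Return the set of distinct (written - read back) differences."""
    shifts = set()
    for (pts_in, dts_in), (pts_out, dts_out) in zip(before, after):
        shifts.add(pts_in - pts_out)
        shifts.add(dts_in - dts_out)
    return shifts


shifts = timestamp_shifts(written, read_back)
print(shifts)  # {518}
```

Every PTS and DTS is shifted by the same amount, 518, which is exactly the PTS of the fourth packet as written. So the relative timing inside the track is preserved; it is the absolute offset that is lost.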
timebase passed to both the AVStream, and to the mvhd header, (through option
I've tried dumping the mp4 using mp4box (and MP4Box.js; https://gpac.github.io/mp4box.js/test/filereader.html). After manually subtracting the edit-list offset, the samples match the timestamps provided to the muxer, leading to the conclusion that something in the demuxing of this mp4 goes wrong.
After some bisecting in the demuxer, I've narrowed it down to this if-clause: https://github.com/FFmpeg/FFmpeg/blob/f37e66b3937a914e16d89a9050f042ad89567245/libavformat/mov.c#L3792. The intention is to "make the minimum PTS (first non-discarded frame?) zero", but it's unclear to me _why_ it should be made zero. It's perfectly reasonable for a track to start at a non-zero PTS. This is even intentionally supported for tracks where the edit list starts with a few empty edits. I do not see why a track with some frames before 0 should change that. AFAICT, this might even break A/V sync, since the corresponding adjustment does not seem to be applied to other tracks.
- Should I not expect the inputs I pass to the libavformat-muxer here to be reflected back out, unchanged?
- What is the intention behind "making the first pts zero"? Why can't this clause simply be removed?
Change History (3)
by , 14 months ago
comment:1 by , 14 months ago
|Component:||undetermined → avformat|
comment:2 by , 9 months ago
So let me get this straight. The first 3 frames, with their PTS and DTS, are discarded, which makes the 4th frame the first non-discarded frame (why do you need to discard video frames again?). Its PTS gets corrected to 0, and with that correction its DTS also happens to become 0.
I have questions. Why discard video? Does it even matter what the first PTS is if all frames before it are discarded? Also, do you understand what DTS and PTS are? DTS must always increase from one packet to the next, because that is the order in which you parse and decode the HEVC/AVC stream. PTS is the order in which you then (after decoding) present those frames. So after decoding and presenting the first frame, you decode the 2nd, 3rd and 4th, and only the 4th is presented; then the 3rd is presented, then the 5th is decoded and presented, and only then is the 2nd presented. Okay?
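The decode-versus-presentation ordering described above can be checked against the timestamps from the ticket. This is a small sketch using only those numbers; packets are numbered from 1 in the order they were written.

```python
# (PTS, DTS) pairs as passed to the muxer, copied from this ticket.
packets = [(-1358, -4243), (3599, -2367), (1527, -1358), (518, 518),
           (2563, 1527), (5612, 2563), (4584, 3599)]

# Decode order: sort the 1-based packet numbers by DTS.
decode_order = sorted(range(1, len(packets) + 1),
                      key=lambda n: packets[n - 1][1])

# Presentation order: sort the same packet numbers by PTS.
presentation_order = sorted(range(1, len(packets) + 1),
                            key=lambda n: packets[n - 1][0])

print(decode_order)        # [1, 2, 3, 4, 5, 6, 7]
print(presentation_order)  # [1, 4, 3, 5, 2, 7, 6]
```

The DTS values are already strictly increasing in file order, and the presentation sequence 1, 4, 3, 5, 2 matches the walk-through in this comment.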
It is a very common idea that discarded frames do not affect anything. And it is wrong. E.g., AAC has 1024 priming samples that are discarded. But they are still decoded: the decoder state from frame 1024 is then used to decode frame 1025, even though frame 1024 itself is still discarded. You need roll metadata to signal that. See this fix: e7e1fbc49bf64e1a1d19e2a469dd1962d4bdb770
MP4 generated as per description