Opened 3 years ago

Last modified 2 years ago

#4768 new enhancement

FFmpeg preserving CFR during TS to MP4 conversion

Reported by: zer0z Owned by:
Priority: wish Component: undetermined
Version: unspecified Keywords: dts
Cc: nfxjfg@googlemail.com, jerome@mediaarea.net Blocked By:
Blocking: Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug: Because of a technical limitation of the TS format, frame durations are rounded, and this rounding is not detected during TS to MP4 conversion (not a bug, only a missing feature). FFmpeg should instead detect the CFR during TS to MP4 conversion and create a CFR stts atom, rather than the current stts (time-to-sample table) atom, which takes a lot of space and is not, strictly speaking, CFR.

I convert MP4 to TS, which has a hard-coded (by the spec) clock frequency of 90 000 Hz. ffmpeg converts it as well as it can, with some rounding (in my timeline: sometimes 3754, sometimes 3753, whereas the real duration should be 3753.75, and TS does not accept fractional values).

Then I convert it back to MP4, and ffmpeg does not recognize that it is CFR, so it carries over the same rounded durations. The resulting file ends up as VFR.
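The rounding described above can be reproduced with a few lines of Python. This is only a sketch: the names are made up for illustration, and truncating each timestamp to an integer is an assumption (FFmpeg may round differently, which would move the position of the short duration but not the mix of 3753 and 3754).

```python
TS_CLOCK = 90000              # MPEG-TS timestamp clock, in Hz
RATE_NUM, RATE_DEN = 24000, 1001  # nominal 23.976 fps as a rational

def truncated_pts(n):
    """Timestamp of frame n in 90 kHz ticks, truncated to an integer.

    Truncation is an assumption made for this illustration only.
    """
    return n * TS_CLOCK * RATE_DEN // RATE_NUM

# Differences between consecutive integer timestamps: the exact value
# 3753.75 cannot be stored, so the durations alternate between 3753 and 3754.
durations = [truncated_pts(n + 1) - truncated_pts(n) for n in range(8)]
print(durations)
print(sum(durations[:4]))  # four frames always add up to exactly 15015 ticks
```

Every group of four frames is exact (4 x 3753.75 = 15015), which is why the stream is CFR on average even though no single duration is.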

How to reproduce:

From ffmpeg wiki:
If you have MP4 files, these could be losslessly concatenated by first transcoding them to mpeg transport streams. With h.264 video and AAC audio, the following can be used:
ffmpeg -i input1.mp4 -c copy -bsf:v h264_mp4toannexb -f mpegts intermediate1.ts
ffmpeg -i input2.mp4 -c copy -bsf:v h264_mp4toannexb -f mpegts intermediate2.ts
ffmpeg -i "concat:intermediate1.ts|intermediate2.ts" -c copy -bsf:a aac_adtstoasc output.mp4

I convert 1.mp4, 2.mp4, 3.mp4 (all encoded at CFR) with above method to .ts. Then issue:
ffmpeg -i "concat:1.ts|2.ts|3.ts" -c copy -bsf:a aac_adtstoasc merged.mp4

Due to the above issue the result from concat will be VFR:

Frame rate mode : Variable
Frame rate : 23.976 fps
Minimum frame rate : 23.968 fps
Maximum frame rate : 23.981 fps

If this can be looked into and hopefully resolved to preserve the Constant frame rate I would really appreciate it.

Change History (18)

comment:1 follow-up: Changed 3 years ago by heleppkes

There is no way to know that the original TS file actually is CFR (other than parsing the entire file and reading every timestamp), so any file converted from TS should be assumed to be VFR.

In fact, preserving the exact timestamps as present in the TS file is the goal, and not re-interpretation to set some arbitrary CFR flag in some other container. I believe ffmpeg has features to override the timestamp behavior if that is so desired.

Last edited 3 years ago by heleppkes

comment:2 Changed 3 years ago by cehoyos

  • Component changed from ffmpeg to undetermined
  • Keywords dts added
  • Priority changed from normal to wish
  • Version changed from 1.0.10 to unspecified

comment:3 in reply to: ↑ 1 Changed 3 years ago by ffmpegreport

Replying to heleppkes:

There is no way to know that the original TS file actually is CFR (other than parsing the entire file and reading every timestamp), so any file converted from TS should be assumed to be VFR.

In fact, preserving the exact timestamps as present in the TS file is the goal, and not re-interpretation to set some arbitrary CFR flag in some other container. I believe ffmpeg has features to override the timestamp behavior if that is so desired.

If the source is CFR, the outfile should be CFR. Assuming a file to be VFR is unnecessary and produces false positives. It seems obvious that the proper approach would be to parse the file, even if that slows the process down. Accuracy and correct outfiles should trump assumptions.

comment:4 Changed 3 years ago by gjdfgh

  • Cc nfxjfg@googlemail.com added

Accuracy and correct outfiles should trump assumptions.

So how can we accurately detect that the source file is _actually_ CFR?

comment:5 follow-up: Changed 3 years ago by cehoyos

What happens if the input file has a frame rate of 25fps? Are the various output files CFR in that case?

comment:6 in reply to: ↑ 5 ; follow-up: Changed 3 years ago by Zenitram

Replying to ffmpegreport:

If the source is CFR, the outfile should be CFR. Assuming a file to be VFR is unnecessary and produces false positives. It seems obvious that the proper approach would be to parse the file, even if that slows the process down. Accuracy and correct outfiles should trump assumptions.

The outfile is not CFR even if the source is CFR, due to a rounding issue:
The source file is at 96000/4004 fps (user defined).
MPEG-TS has a precision of 90000 Hz (no choice).
So the conversion to TS does some rounding (the outfile is at 90000/3753.75 fps, and it is impossible to write .75 in an MPEG-TS file).

And when it is back to MP4, FFmpeg creates that:

0C1B9B08      Time to Sample (130664/130656 bytes)
0C1B9B08       Header (8 bytes)
0C1B9B08        Size:                               130664 (0x0001FE68)
0C1B9B0C        Name:                               stts
0C1B9B10       Version:                             0 (0x00)
0C1B9B11       Flags:                               0 (0x000000)
0C1B9B14       Number of entries:                   16331 (0x00003FCB)
0C1B9B18       Sample Count:                        4 (0x00000004)
0C1B9B1C       Sample Duration:                     3754 (0x00000EAA)
0C1B9B20       Sample Count:                        1 (0x00000001)
0C1B9B24       Sample Duration:                     3753 (0x00000EA9)
0C1B9B28       Sample Count:                        3 (0x00000003)
0C1B9B2C       Sample Duration:                     3754 (0x00000EAA)
0C1B9B30       Sample Count:                        1 (0x00000001)
0C1B9B34       Sample Duration:                     3753 (0x00000EA9)
...

This is exact with respect to the PTS values read from the MPEG-TS.
Issue: the PTS was rounded, so the double remux MP4 --> MPEG-TS --> MP4 does not reproduce the exact source time stamps.
Note: this is the case for any MPEG-TS (including BDAV) at 23.976 fps, no need to convert from MP4 to MPEG-TS first. Just use any TS file with a 23.976 fps stream if you want to reproduce the issue.

Replying to gjdfgh:

Accuracy and correct outfiles should trump assumptions.

So how can we accurately detect that the source file is _actually_ CFR?

Not 100% "sure", but the AVC VUI has the frame rate info (time_scale 48000 / num_units_in_tick 1001) and there is a "template" in the PTS (3x 3754, then 1x 3753, then 3x 3754, then 1x 3753, then...), so it may be enough to consider it CFR.
So this depends on the level of accuracy you want. When I see the AVC VUI plus a PTS pattern like the one I described, I don't see any reason for it except that the stream is actually CFR with rounding due to technical limitations; there is a 99.999% (or 99.99%, or 99.9%, or 99%, not easy to say) chance that the goal is a CFR stream.
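To make the idea concrete, here is a rough Python sketch of such a heuristic. It is not anything FFmpeg implements: the function name, the tolerance of one tick and the way the candidate rate is supplied (for example 24000/1001, taken from the VUI) are all assumptions for illustration.

```python
from fractions import Fraction

def looks_like_rounded_cfr(durations, rate, clock=90000, tolerance=1):
    """Return True if the integer durations stay within `tolerance` ticks
    of an ideal constant-frame-rate timeline at `rate` frames per second.

    Hypothetical helper: `rate` is a candidate frame rate (a Fraction)
    that the caller obtained elsewhere, e.g. from the video bitstream.
    """
    ideal = Fraction(clock) / rate   # exact ticks per frame, may be fractional
    pts = 0
    for n, duration in enumerate(durations, start=1):
        pts += duration
        # Rounded timestamps never drift away from the ideal timeline;
        # truly variable timestamps accumulate an error larger than one tick.
        if abs(pts - n * ideal) > tolerance:
            return False
    return True

ntsc_film = Fraction(24000, 1001)
rounded = [3754] * 4 + [3753] + [3754] * 3 + [3753]   # start of the stts dump above
print(looks_like_rounded_cfr(rounded, ntsc_film))
print(looks_like_rounded_cfr([3754, 3003, 3754], ntsc_film))
```

The check compares accumulated timestamps rather than individual durations, so an occasional 3753 or 3754 is accepted while a genuine change of frame rate is not.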

Replying to cehoyos:

What happens if the input file has a frame rate of 25fps? Are the various output files CFR in that case?

90000 / 25 = 3600.000000000 (exactly 3600) so there is no issue with 25 fps streams, nor 30.000 fps (always 3000 units in tick), nor 24.000 fps (always 3750 units in tick), nor 29.970 fps (always 3003 units in tick).
Issue is only with 23.976 fps (3x 3754, then 1x 3753 units in tick).
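The arithmetic above can be checked with exact fractions. This is a small Python sketch; the table of rates and the function name are only illustrative.

```python
from fractions import Fraction

TS_CLOCK = 90000  # MPEG-TS timestamp clock, in Hz

# Common frame rates as exact rationals.
RATES = {
    "25": Fraction(25),
    "30": Fraction(30),
    "24": Fraction(24),
    "29.970": Fraction(30000, 1001),
    "23.976": Fraction(24000, 1001),
}

def ticks_per_frame(rate):
    """Exact frame duration in 90 kHz ticks (a Fraction)."""
    return Fraction(TS_CLOCK) / rate

for name, rate in RATES.items():
    ticks = ticks_per_frame(rate)
    # A denominator of 1 means the duration fits in an integer tick count.
    status = "exact" if ticks.denominator == 1 else "needs rounding"
    print(f"{name} fps: {float(ticks)} ticks per frame ({status})")
```

Only 23.976 fps yields a non-integer value (15015/4 = 3753.75), so it is the only rate in the list whose durations have to be rounded.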

Replying to heleppkes:

In fact, preserving the exact timestamps as present in the TS file is the goal, and not re-interpretation to set some arbitrary CFR flag in some other container. I believe ffmpeg has features to override the timestamp behavior if that is so desired.

Re-interpretation is definitely the issue: should or shouldn't we re-interpret and consider it a rounding issue?
In all cases, this is a difficulty for analysers: a TS analyser is aware that there is a fixed 90 kHz frequency, so it can consider that rounding is possible and test the frame rate with this rounding, but an MP4 analyser can assume that there is no rounding because the format is more versatile and accepts any time scale (and there is no metadata about the precision of the time stamps in MP4). There is no good solution.

Maybe an option saying we would like to apply subjective CFR detection?

Last edited 3 years ago by Zenitram

comment:7 Changed 3 years ago by Zenitram

  • Cc jerome@mediaarea.net added

comment:8 in reply to: ↑ 6 ; follow-up: Changed 3 years ago by cehoyos

Replying to Zenitram:

Replying to cehoyos:

What happens if the input file has a frame rate of 25fps? Are the various output files CFR in that case?

90000 / 25 = 3600.000000000 (exactly 3600) so there is no issue with 25 fps streams, nor 30.000 fps (always 3000 units in tick), nor 24.000 fps (always 3750 units in tick), nor 29.970 fps (always 3003 units in tick).

Sorry if this sounds stupid:
Did you test this and it does indeed work as expected for 30fps and 24fps or are you assuming it would work?

comment:9 in reply to: ↑ 8 Changed 3 years ago by Zenitram

Replying to cehoyos:

90000 / 25 = 3600.000000000 (exactly 3600) so there is no issue with 25 fps streams, nor 30.000 fps (always 3000 units in tick), nor 24.000 fps (always 3750 units in tick), nor 29.970 fps (always 3003 units in tick).

Sorry if this sounds stupid:
Did you test this and it does indeed work as expected for 30fps and 24fps or are you assuming it would work?

I am not the OP, I just helped him with the debugging.
I am just saying that there is no rounding issue with any other common frame rate (those can be converted to 90 kHz units without rounding, using only integers), and that the origin of the OP's issue is that PTS values in TS files are rounded in the case of 23.976 fps CFR streams, and that FFmpeg is fooled by the rounding (this is not a criticism, it is only factual; FFmpeg's behavior is correct if we don't want to care about the rounding issues of TS).

Anyway, I just tested with a 29.970 fps TS file:
ffmpeg -i test.ts -c copy test.mp4

Dump of stts:

034EAB33      Time to Sample (24 bytes)
034EAB33       Header (8 bytes)
034EAB33        Size:                               24 (0x00000018)
034EAB37        Name:                               stts
034EAB3B       Version:                             0 (0x00)
034EAB3C       Flags:                               0 (0x000000)
034EAB3F       Number of entries:                   1 (0x00000001)
034EAB43       Sample Count:                        945 (0x000003B1)
034EAB47       Sample Duration:                     3003 (0x00000BBB)

--> It is perfect (because the source PTS is perfect). Time scale is 90000 (TS time scale).

Any file with 23.976 fps streams (including official Blu-rays) has the rounding issue (no Blu-ray at 23.976 is, strictly speaking, CFR), so we never get the expected CFR remux because the CFR is not detected (so we get a huge stts atom instead of Sample Count = total sample count, Sample Duration = 1001 and time scale = 24000, which is the theoretical output).
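The size difference follows directly from how stts stores runs of equal durations. Below is a Python sketch, with hypothetical function names, that run-length encodes a list of durations and computes the atom size from the layout visible in the dumps in this ticket (8-byte header, 4 bytes of version and flags, 4-byte entry count, 8 bytes per entry).

```python
def stts_entries(durations):
    """Collapse consecutive equal durations into (sample_count, duration) runs."""
    entries = []
    for duration in durations:
        if entries and entries[-1][1] == duration:
            entries[-1][0] += 1          # extend the current run
        else:
            entries.append([1, duration])  # start a new run
    return [tuple(entry) for entry in entries]

def stts_size(entry_count):
    """Atom size in bytes: 16 bytes of fixed fields plus 8 bytes per entry."""
    return 16 + 8 * entry_count

rounded = [3754] * 4 + [3753] + [3754] * 3 + [3753]
print(stts_entries(rounded))        # one entry per run, so the table keeps growing
print(stts_entries([3003] * 945))   # a single entry for a truly constant duration
print(stts_size(16331), stts_size(1))
```

With 16331 entries the formula gives 130664 bytes and with one entry 24 bytes, matching the sizes shown in the two dumps.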

comment:10 follow-up: Changed 3 years ago by cehoyos

Does the input option -r 24000/1001 help?

comment:11 in reply to: ↑ 10 Changed 3 years ago by Zenitram

Replying to cehoyos:

Does the input option -r 24000/1001 help?

It does.
Test with a 23.976 TS file:

ffmpeg -i test.ts -c copy test.mp4

dump of stts:

009B8EC6      Time to Sample (656 bytes)
009B8EC6       Header (8 bytes)
009B8EC6        Size:                               656 (0x00000290)
009B8ECA        Name:                               stts
009B8ECE       Version:                             0 (0x00)
009B8ECF       Flags:                               0 (0x000000)
009B8ED2       Number of entries:                   80 (0x00000050)
009B8ED6       Sample Count:                        2 (0x00000002)
009B8EDA       Sample Duration:                     3754 (0x00000EAA)
009B8EDE       Sample Count:                        1 (0x00000001)
009B8EE2       Sample Duration:                     3753 (0x00000EA9)
009B8EE6       Sample Count:                        3 (0x00000003)
009B8EEA       Sample Duration:                     3754 (0x00000EAA)
009B8EEE       Sample Count:                        1 (0x00000001)
...

ffmpeg -i test.ts -c copy -r 24000/1001 test.mp4

009B8EC6      Time to Sample (24 bytes)
009B8EC6       Header (8 bytes)
009B8EC6        Size:                               24 (0x00000018)
009B8ECA        Name:                               stts
009B8ECE       Version:                             0 (0x00)
009B8ECF       Flags:                               0 (0x000000)
009B8ED2       Number of entries:                   1 (0x00000001)
009B8ED6       Sample Count:                        159 (0x0000009F)
009B8EDA       Sample Duration:                     1001 (0x000003E9)

I guess the OP has a workaround for what he wants to do (it is not detected and set automatically by FFmpeg, but he can force FFmpeg to do it).

comment:12 Changed 3 years ago by cehoyos

Note that you tested the output option, I don't know if it may drop (and duplicate) frames for this use case.

comment:13 Changed 3 years ago by kierank

@Zenitram

Just for the record, the H264 spec (Annex C) and the MPEG-TS spec do cover the case of rounded timestamps. Blu-ray files are still CFR irrespective of the precision of the timestamps.

A little bit beside the point for FFmpeg because of the way it's designed.

comment:14 follow-ups: Changed 3 years ago by zer0z

Thanks to cehoyos and Zenitram for the workaround! cehoyos (or anyone) is it best to put -r with both the input and the output? I am trying to avoid dropped/duplicated frames. I did a test (below) and specifying -r for both input and output was slightly faster than specifying -r for just the input. The resultant byte size was identical with either method.

time ffmpeg -i test.ts -r 24000/1001 -c copy -r 24000/1001 test.mp4

Stream #0:0: Video: h264 ([33][0][0][0] / 0x0021), yuv420p, 720x404, q=2-31, 23.98 fps, 24k tbn, 23.98 tbc
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
[mp4 @ 0x9abcc00] pts has no value
    Last message repeated 100 times
pts has no value0.0 q=-1.0 size=   47628kB time=00:05:19.48 bitrate=1221.2kbits/s    
[mp4 @ 0x9abcc00] pts has no value
    Last message repeated 126 times
pts has no value15602 q=-1.0 size=  106538kB time=00:10:51.73 bitrate=1339.1kbits/s    
[mp4 @ 0x9abcc00] pts has no value
    Last message repeated 135 times
pts has no value18071 q=-1.0 size=  165451kB time=00:18:52.29 bitrate=1197.0kbits/s    
[mp4 @ 0x9abcc00] pts has no value
    Last message repeated 54 times
frame=32069 fps=17822 q=-1.0 Lsize=  198750kB time=00:22:17.46 bitrate=1217.4kbits/s    
video:198374kB audio:0kB subtitle:0 global headers:0kB muxing overhead 0.189591%

real 0m1.891s




time ffmpeg -i test.ts -r 24000/1001 -c copy test.mp4

Stream #0:0: Video: h264 ([33][0][0][0] / 0x0021), yuv420p, 720x404, q=2-31, 23.98 fps, 24k tbn, 23.98 tbc
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
[mp4 @ 0x9486c00] pts has no value
    Last message repeated 71 times
pts has no value0.0 q=-1.0 size=   37083kB time=00:04:01.82 bitrate=1256.2kbits/s    
[mp4 @ 0x9486c00] pts has no value
    Last message repeated 90 times
pts has no value11599 q=-1.0 size=   72934kB time=00:08:04.10 bitrate=1234.2kbits/s    
[mp4 @ 0x9486c00] pts has no value
    Last message repeated 80 times
pts has no value11669 q=-1.0 size=  115578kB time=00:12:10.68 bitrate=1295.8kbits/s    
[mp4 @ 0x9486c00] pts has no value
    Last message repeated 118 times
pts has no value13592 q=-1.0 size=  165954kB time=00:18:54.96 bitrate=1197.8kbits/s    
[mp4 @ 0x9486c00] pts has no value
    Last message repeated 54 times
frame=32069 fps=13699 q=-1.0 Lsize=  198750kB time=00:22:17.46 bitrate=1217.4kbits/s    
video:198374kB audio:0kB subtitle:0 global headers:0kB muxing overhead 0.189591%

real 0m2.463s

comment:15 in reply to: ↑ 14 Changed 3 years ago by cehoyos

Replying to zer0z:

is it best to put -r with both the input and the output?

No, this is not a good idea imo. If it works as an input option then this is currently the option that solves your issue (and it will not drop or duplicate frames).
I believe it will not work for all containers though.

comment:16 in reply to: ↑ 14 Changed 3 years ago by cehoyos

Replying to zer0z:

time ffmpeg -i test.ts -r 24000/1001 -c copy -r 24000/1001 test.mp4

Sorry, I originally misread this post.
You are only testing the output option -r here, the question is if the input option works as expected:

$ ffmpeg -r 24000/1001 -i test.ts -codec copy out.mp4

comment:17 Changed 2 years ago by Misaki

This bug report had a solution, but what about the concat input format? It requires an extra file but I guess it might be possible to do it with pipes or something on Linux.

ffmpeg -f concat -i <(echo file "'input1.mp4'" ; echo file "'input2.mp4'") [...]

This would avoid the reported reason for variable frame-rate. However, I couldn't get the "duration" option to work so it's automatically inserted, and for files with audio+video, the offset can vary slightly (and seemingly unpredictably between different files). The 'stts' atom would be nearly the same size, since it would just have two main entries instead of one, but it still wouldn't be completely constant frame rate.

Last edited 2 years ago by Misaki

comment:18 Changed 2 years ago by Misaki

Note: if using process substitution (described in the bash manual) as above, you must provide the full path for the files, not the path relative to the current directory. Why, I do not know. I looked up my comment just to see how I did it, since I couldn't get it to work again.

In fact, this is not a bug in bash, as I found from an error I got while testing it and forgetting to add everything.

ffmpeg -f concat -i <(echo 'tempfile.part')

[concat @ 0x1db0aa0] Line 1: unknown keyword 'tempfile.part'
/dev/fd/63: Invalid data found when processing input

ffmpeg -f concat -i <(echo file 'tempfile.part')

[concat @ 0xdd8aa0] Impossible to open '/dev/fd/tempfile.part'
/dev/fd/63: No such file or directory

So, as with some other cases, it's just a confusing error message. ffmpeg is accessing the input, as it can discriminate between incorrect keywords. It's just using the directory of the pipe (oh) instead of the current directory.
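The path in the error message is what you get when a relative entry is joined to the directory of the list file. The Python sketch below only mirrors the behaviour implied by that message; it is not FFmpeg's code, and the function name and the example absolute path are made up.

```python
import posixpath

def resolve_like_concat(list_path, entry):
    """Resolve a list entry against the directory that holds the list file.

    Hypothetical helper that reproduces the path seen in the error above.
    An absolute entry is returned unchanged by posixpath.join.
    """
    return posixpath.join(posixpath.dirname(list_path), entry)

# With process substitution the list "file" is a descriptor under /dev/fd,
# so a relative entry ends up being looked for in /dev/fd.
print(resolve_like_concat("/dev/fd/63", "tempfile.part"))
print(resolve_like_concat("/dev/fd/63", "/home/user/tempfile.part"))
```

This is consistent with the observation that full paths work with process substitution while relative ones do not.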
