#3393 closed defect (invalid)
Interlaced H.264 packets are split causing MP4 STTS
| Reported by: | wim_arbor | Owned by: | |
|---|---|---|---|
| Priority: | normal | Component: | undetermined |
| Version: | git-master | Keywords: | h264 mov |
| Cc: | | Blocked By: | |
| Blocking: | | Reproduced by developer: | yes |
| Analyzed by developer: | no | | |
Description
Summary of the bug:
When remuxing an MPEG-TS containing interlaced H.264 into MP4, both fields of each video frame are split into separate packets. Software such as MediaInfo uses the STTS to determine the frame rate, so it reports 50 fps instead of 25 fps.
How to reproduce:
% ffmpeg.exe -i h264_aac_576i_tff.ts -c:a copy -c:v copy -bsf:a aac_adtstoasc -async 1 h264_aac_576i_tff.mp4
ffmpeg version N-60700-g07b4b0c Copyright (c) 2000-2014 the FFmpeg developers
built on Feb 17 2014 15:45:12 with gcc 4.8.2 (GCC)
configuration: --pkg-config=pkg-config --prefix=/home/arbor/software/packages/win32 --enable-memalign-hack --arch=x86 --target-os=mingw32 --cross-prefix=i686-w64-mingw32- --enable-libfaac --enable-libx264 --enable-gpl --enable-nonfree --disable-w32threads
libavutil 52. 64.100 / 52. 64.100
libavcodec 55. 52.102 / 55. 52.102
libavformat 55. 33.100 / 55. 33.100
libavdevice 55. 10.100 / 55. 10.100
libavfilter 4. 1.102 / 4. 1.102
libswscale 2. 5.101 / 2. 5.101
libswresample 0. 17.104 / 0. 17.104
libpostproc 52. 3.100 / 52. 3.100
Input #0, mpegts, from 'h264_aac_576i_tff.ts':
Duration: 00:00:02.80, start: 600.000000, bitrate: 9671 kb/s
Program 1
Stream #0:0[0x1011]: Video: h264 (High) ([27][0][0][0] / 0x001B), yuv420p(tv, bt470bg), 720x576 [SAR 16:15 DAR 4:3], 25 fps, 25 tbr, 90k tbn, 50 tbc
Stream #0:1[0x1100]: Audio: aac ([15][0][0][0] / 0x000F), 48000 Hz, stereo, fltp, 127 kb/s
File 'h264_aac_576i_tff.mp4' already exists. Overwrite ? [y/N] y
Output #0, mp4, to 'h264_aac_576i_tff.mp4':
Metadata:
encoder : Lavf55.33.100
Stream #0:0: Video: h264 ([33][0][0][0] / 0x0021), yuv420p, 720x576 [SAR 16:15 DAR 4:3], q=2-31, 25 fps, 90k tbn, 90k tbc
Stream #0:1: Audio: aac ([64][0][0][0] / 0x0040), 48000 Hz, stereo, 127 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Stream #0:1 -> #0:1 (copy)
Press [q] to stop, [?] for help
[mp4 @ 03820060] pts has no value
Last message repeated 68 times
[mp4 @ 03820060] pts has no value
Last message repeated 1 times
frame= 142 fps=0.0 q=-1.0 Lsize= 3205kB time=00:00:02.78 bitrate=9444.8kbits/s
video:3159kB audio:43kB subtitle:0 data:0 global headers:0kB muxing overhead 0.096098%
h264_aac_576i_tff.ts has been uploaded to /incoming on upload.ffmpeg.org
Attachments (1)
Change History (11)
comment:1 by , 12 years ago
comment:2 by , 12 years ago
The output of mp4dump below shows:
moov/trak/mdia/mdhd
timescale = 90000
duration = 255600
duration(ms) = 2840
moov/trak/mdia/minf/stbl/stts
entry_count = 1
entry 0 = sample_count=142, sample_duration=1800
Framerate calculation: 90000/1800 = 50 fps
or: 2840/142 = 20 ms, 1000/20 = 50 fps
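The calculation above can be reproduced with a short sketch. It assumes the stts entries are available as (sample_count, sample_duration) pairs expressed in mdhd timescale ticks; the helper name `stts_sample_rate` is hypothetical and does not belong to mp4dump, MediaInfo, or FFmpeg.

```python
from fractions import Fraction

def stts_sample_rate(timescale, entries):
    """Average samples per second implied by an stts table.

    entries: list of (sample_count, sample_duration) pairs, with
    durations in timescale ticks. Hypothetical helper for illustration.
    """
    total_samples = sum(count for count, _ in entries)
    total_ticks = sum(count * duration for count, duration in entries)
    # samples / (ticks / timescale), kept exact with Fraction
    return Fraction(total_samples * timescale, total_ticks)

# Values taken from the mp4dump output in this comment.
rate = stts_sample_rate(90000, [(142, 1800)])
print(float(rate))  # 50.0
```

A tool that treats this sample rate as the frame rate will therefore report 50 fps for this file.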
Output of mp4dump --verbosity 1 h264_aac_576i_tff.mp4
Note: irrelevant parts are replaced with (...)
[ftyp] size=8+24
major_brand = isom
minor_version = 200
compatible_brand = isom
compatible_brand = iso2
compatible_brand = avc1
compatible_brand = mp41
[free] size=8+0
[mdat] size=8+3278041
[moov] size=8+3991
[mvhd] size=12+96
timescale = 1000
duration = 2840
duration(ms) = 2840
[trak] size=8+2261
[tkhd] size=12+80, flags=3
enabled = 1
id = 1
duration = 2840
width = 768.000000
height = 576.000000
[edts] size=8+28
[elst] size=12+16
entry count = 1
entry/segment duration = 2840
entry/media time = 3600
entry/media rate = 1
[mdia] size=8+2125
[mdhd] size=12+20
timescale = 90000
duration = 255600
duration(ms) = 2840
language = und
[hdlr] size=12+33
handler_type = vide
handler_name = VideoHandler
[minf] size=8+2040
[vmhd] size=12+8, flags=1
graphics_mode = 0
op_color = 0000,0000,0000
[dinf] size=8+28
[dref] size=12+16
[url ] size=12+0, flags=1
location = [local to file]
[stbl] size=8+1976
[stsd] size=12+176
entry-count = 1
[avc1] size=8+164
data_reference_index = 1
width = 720
height = 576
compressor =
[avcC] size=8+62
Configuration Version = 1
Profile = High
Profile Compatibility = 0
Level = 41
NALU Length Size = 4
Sequence Parameter = [67 64 00 29 ac 2c a4 02 d0 91 7f e0 02 00 01 e9 41 41 41 50 00 00 03 00 10 00 00 03 03 2e 4a 00 02 71 00 00 07 a1 27 f1 8e 0e d0 a1 48 90]
Picture Parameter = [68 e9 8d 35 25]
[pasp] size=8+8
[stts] size=12+12
entry_count = 1
entry 0 = sample_count=142, sample_duration=1800
[stss] size=12+20
entry_count = 4
[ctts] size=12+356
entry_count = 44
[stsc] size=12+232
entry_count = 19
entry 0 = first_chunk=1, first_sample*=1, chunk_count*=1, samples_per_chunk=3, sample_desc_index=1
(...)
entry 18 = first_chunk=129, first_sample*=139, chunk_count*=0, samples_per_chunk=4, sample_desc_index=1
[stsz] size=12+576
sample_size = 0
sample_count = 142
[stco] size=12+520
entry_count = 129
entry 0 = 48
(...)
entry 128 = 3175684
[trak] size=8+1508
(...)
[udta] size=8+90
[meta] size=12+78
[hdlr] size=12+21
handler_type = mdir
handler_name =
[ilst] size=8+37
[.too] size=8+29
[data] size=8+21
type = 1
lang = 0
value = Lavf55.33.100
comment:3 by , 12 years ago
Note that after this patch to h264_parse() in libavcodec/h264_parser.c:
diff --git a/libavcodec/h264_parser.c b/libavcodec/h264_parser.c
index 4432871..564ae14 100644
--- a/libavcodec/h264_parser.c
+++ b/libavcodec/h264_parser.c
@@ -471,6 +471,7 @@ static int h264_parse(AVCodecParserContext *s,
}
}
+ s->flags |= PARSER_FLAG_COMPLETE_FRAMES;
if (s->flags & PARSER_FLAG_COMPLETE_FRAMES) {
next = buf_size;
} else {
an MP4 is generated which is almost correct:
moov/trak/mdia/mdhd
timescale = 90000
duration = 253800
duration(ms) = 2820
moov/trak/mdia/minf/stbl/stts
entry_count = 2
entry 0 = sample_count=70, sample_duration=3600
entry 1 = sample_count=1, sample_duration=1800
So the first 70 samples are marked with the correct duration of 3600; only the last one still has a wrong duration.
But this change is not a valid patch, as it causes problems elsewhere. The correct way is probably to fix this in h264_find_frame_end(), also in libavcodec/h264_parser.c, but that is far more complex.
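As a cross-check, the mdhd durations quoted in this ticket follow directly from the stts entries. The sketch below sums the entries for the unpatched and the patched output; the function names are hypothetical and only serve this illustration.

```python
def stts_duration_ticks(entries):
    """Total media duration in timescale ticks from (count, duration) pairs."""
    return sum(count * duration for count, duration in entries)

def ticks_to_ms(ticks, timescale):
    """Convert timescale ticks to whole milliseconds."""
    return ticks * 1000 // timescale

TIMESCALE = 90000  # mdhd timescale reported by mp4dump

unpatched = stts_duration_ticks([(142, 1800)])         # one sample per field
patched = stts_duration_ticks([(70, 3600), (1, 1800)]) # fields merged by the hack

print(unpatched, ticks_to_ms(unpatched, TIMESCALE))  # 255600 2840
print(patched, ticks_to_ms(patched, TIMESCALE))      # 253800 2820
```

The two totals differ by 1800 ticks (20 ms), which is exactly the half-length duration assigned to the last sample in the patched file.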
comment:4 by , 12 years ago
| Keywords: | h264 added |
|---|---|
| Reproduced by developer: | set |
| Status: | new → open |
Input sample contains 71 video frames:
$ ffmpeg -i h264_aac_576i_tff.ts -f null -
ffmpeg version N-60700-g07b4b0c Copyright (c) 2000-2014 the FFmpeg developers
built on Feb 18 2014 09:15:18 with gcc 4.7 (SUSE Linux)
configuration: --enable-gpl
libavutil 52. 64.100 / 52. 64.100
libavcodec 55. 52.102 / 55. 52.102
libavformat 55. 33.100 / 55. 33.100
libavdevice 55. 10.100 / 55. 10.100
libavfilter 4. 1.102 / 4. 1.102
libswscale 2. 5.101 / 2. 5.101
libswresample 0. 17.104 / 0. 17.104
libpostproc 52. 3.100 / 52. 3.100
Input #0, mpegts, from 'h264_aac_576i_tff.ts':
Duration: 00:00:02.80, start: 600.000000, bitrate: 9671 kb/s
Program 1
Stream #0:0[0x1011]: Video: h264 (High) ([27][0][0][0] / 0x001B), yuv420p(tv, bt470bg), 720x576 [SAR 16:15 DAR 4:3], 25 fps, 25 tbr, 90k tbn, 50 tbc
Stream #0:1[0x1100]: Audio: aac ([15][0][0][0] / 0x000F), 48000 Hz, stereo, fltp, 127 kb/s
Output #0, null, to 'pipe:':
Metadata:
encoder : Lavf55.33.100
Stream #0:0: Video: rawvideo (I420 / 0x30323449), yuv420p, 720x576 [SAR 16:15 DAR 4:3], q=2-31, 200 kb/s, 90k tbn, 25 tbc
Stream #0:1: Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (h264 -> rawvideo)
Stream #0:1 -> #0:1 (aac -> pcm_s16le)
Press [q] to stop, [?] for help
[null @ 0x1980400] Encoder did not produce proper pts, making some up.
frame= 71 fps=0.0 q=0.0 Lsize=N/A time=00:00:02.84 bitrate=N/A
video:7kB audio:512kB subtitle:0 data:0 global headers:0kB muxing overhead -100.004142%
142 frames are remuxed:
$ ffmpeg -i h264_aac_576i_tff.ts -vcodec copy -strict -2 out.mov
ffmpeg version N-60700-g07b4b0c Copyright (c) 2000-2014 the FFmpeg developers
built on Feb 18 2014 09:15:18 with gcc 4.7 (SUSE Linux)
configuration: --enable-gpl
libavutil 52. 64.100 / 52. 64.100
libavcodec 55. 52.102 / 55. 52.102
libavformat 55. 33.100 / 55. 33.100
libavdevice 55. 10.100 / 55. 10.100
libavfilter 4. 1.102 / 4. 1.102
libswscale 2. 5.101 / 2. 5.101
libswresample 0. 17.104 / 0. 17.104
libpostproc 52. 3.100 / 52. 3.100
Input #0, mpegts, from 'h264_aac_576i_tff.ts':
Duration: 00:00:02.80, start: 600.000000, bitrate: 9671 kb/s
Program 1
Stream #0:0[0x1011]: Video: h264 (High) ([27][0][0][0] / 0x001B), yuv420p(tv, bt470bg), 720x576 [SAR 16:15 DAR 4:3], 25 fps, 25 tbr, 90k tbn, 50 tbc
Stream #0:1[0x1100]: Audio: aac ([15][0][0][0] / 0x000F), 48000 Hz, stereo, fltp, 127 kb/s
Output #0, mov, to 'out.mov':
Metadata:
encoder : Lavf55.33.100
Stream #0:0: Video: h264 (avc1 / 0x31637661), yuv420p, 720x576 [SAR 16:15 DAR 4:3], q=2-31, 25 fps, 90k tbn, 90k tbc
Stream #0:1: Audio: aac (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Stream #0:1 -> #0:1 (aac -> aac)
Press [q] to stop, [?] for help
[mov @ 0x29384c0] pts has no value
Last message repeated 68 times
[mov @ 0x29384c0] pts has no value
Last message repeated 1 times
frame= 142 fps=0.0 q=-1.0 Lsize= 3171kB time=00:00:02.78 bitrate=9345.6kbits/s
video:3159kB audio:8kB subtitle:0 data:0 global headers:0kB muxing overhead 0.129618%
The framerate is incorrect for out.mov:
$ ffmpeg -i out.mov -strict -2 out2.mov
ffmpeg version N-60700-g07b4b0c Copyright (c) 2000-2014 the FFmpeg developers
built on Feb 18 2014 09:15:18 with gcc 4.7 (SUSE Linux)
configuration: --enable-gpl
libavutil 52. 64.100 / 52. 64.100
libavcodec 55. 52.102 / 55. 52.102
libavformat 55. 33.100 / 55. 33.100
libavdevice 55. 10.100 / 55. 10.100
libavfilter 4. 1.102 / 4. 1.102
libswscale 2. 5.101 / 2. 5.101
libswresample 0. 17.104 / 0. 17.104
libpostproc 52. 3.100 / 52. 3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'out.mov':
Metadata:
major_brand : qt
minor_version : 512
compatible_brands: qt
encoder : Lavf55.33.100
Duration: 00:00:02.84, start: 0.014667, bitrate: 9148 kb/s
Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt470bg), 720x576 [SAR 16:15 DAR 4:3], 9113 kb/s, 50 fps, 50 tbr, 90k tbn, 50 tbc (default)
Metadata:
handler_name : DataHandler
Stream #0:1(eng): Audio: aac (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 23 kb/s (default)
Metadata:
handler_name : DataHandler
Output #0, mov, to 'out2.mov':
Metadata:
major_brand : qt
minor_version : 512
compatible_brands: qt
encoder : Lavf55.33.100
Stream #0:0(eng): Video: mpeg4 (mp4v / 0x7634706D), yuv420p, 720x576 [SAR 16:15 DAR 4:3], q=2-31, 200 kb/s, 12800 tbn, 50 tbc (default)
Metadata:
handler_name : DataHandler
Stream #0:1(eng): Audio: aac (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
Metadata:
handler_name : DataHandler
Stream mapping:
Stream #0:0 -> #0:0 (h264 -> mpeg4)
Stream #0:1 -> #0:1 (aac -> aac)
Press [q] to stop, [?] for help
frame= 142 fps=0.0 q=2.0 Lsize= 142kB time=00:00:02.84 bitrate= 408.4kbits/s dup=71 drop=0
video:129kB audio:8kB subtitle:0 data:0 global headers:0kB muxing overhead 2.731602%
comment:5 by , 12 years ago
| Keywords: | mov added |
|---|---|
comment:6 by , 12 years ago
A patch can be found on the mailing list:
http://thread.gmane.org/gmane.comp.video.ffmpeg.devel/175525
by , 12 years ago
| Attachment: | 0003-avcodec-h264-merge-fields-fix-duration.patch added |
|---|---|
fixes duration of output file after previous patches have been applied
follow-up: 9 comment:7 by , 12 years ago
What I understand from the discussion on the mailing list is that merging the fields into field pairs violates the ISO specification.
AVC sample: An AVC sample is an access unit as defined in ISO/IEC 14496‐10
access unit: A set of NAL units that are consecutive in decoding order and contain exactly one primary coded picture. (...) The decoding of an access unit always results in a decoded picture.
Each (PAFF) field is encoded as a separate picture, so a sample in an MP4 file may only contain a single field.
So software which uses the sample count in the MP4 file to determine the frame rate is simply wrong. This includes MediaInfo, VLC, QuickTime and GSpot. The same applies to other encoders which generate such files. I tested Sorenson Squeeze with the Intel and MainConcept encoders. Both merged fields.
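To illustrate the consequence for readers of such files: when each sample holds a single field, the frame rate is half the sample rate. The sketch below assumes the caller already knows whether the samples are individual fields (for example from parsing the slice headers, which is not shown); `frame_rate_from_samples` and `samples_are_fields` are hypothetical names.

```python
from fractions import Fraction

def frame_rate_from_samples(sample_rate, samples_are_fields):
    """Derive the frame rate from the per-second sample rate.

    samples_are_fields: True when every sample carries one field, so two
    samples make up one frame. This flag is an assumption of the sketch;
    detecting it requires inspecting the H.264 bitstream.
    """
    rate = Fraction(sample_rate)
    return rate / 2 if samples_are_fields else rate

print(frame_rate_from_samples(50, True))   # 25
print(frame_rate_from_samples(25, False))  # 25
```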
comment:8 by , 12 years ago
| Resolution: | → invalid |
|---|---|
| Status: | open → closed |
Closing as invalid because merging the fields would violate the ISO spec.
I have seen other encoders write files this way and thus violate the spec, but I can't prove a de-facto standard which does not agree with the ISO spec.
follow-up: 10 comment:9 by , 12 years ago
Replying to wim_arbor:
What I understand from the discussion on the mailing list is that merging the fields into field pairs violates the ISO specification.
AVC sample: An AVC sample is an access unit as defined in ISO/IEC 14496‐10
access unit: A set of NAL units that are consecutive in decoding order and contain exactly one primary coded picture. (...) The decoding of an access unit always results in a decoded picture.
Each (PAFF) field is encoded as a separate picture, so a sample in an MP4 file may only contain a single field.
I don't know much about H.264 but I would have expected that it needs two PAFF fields to get a decoded picture.
So software which uses the sample count in the MP4 file to determine the frame rate is simply wrong. This includes MediaInfo, VLC, QuickTime and GSpot. The same applies to other encoders which generate such files.
I tested Sorenson Squeeze with the Intel and MainConcept encoders. Both merged fields.
This sounds to me as if we should do the same, particularly if there is no playback application that fails for such output files.
comment:10 by , 12 years ago
Replying to cehoyos:
Replying to wim_arbor:
What I understand from the discussion on the mailing list is that merging the fields into field pairs violates the ISO specification.
AVC sample: An AVC sample is an access unit as defined in ISO/IEC 14496‐10
access unit: A set of NAL units that are consecutive in decoding order and contain exactly one primary coded picture. (...) The decoding of an access unit always results in a decoded picture.
Each (PAFF) field is encoded as a separate picture, so a sample in an MP4 file may only contain a single field.
I don't know much about H.264 but I would have expected that it needs two PAFF fields to get a decoded picture.
You should not read the English word "picture" here, but "picture" as defined in the same spec:
picture: A collective term for a field or a frame.
And related definitions:
decoded picture: A decoded picture is derived by decoding a coded picture. A decoded picture is either a decoded frame, or a decoded field. A decoded field is either a decoded top field or a decoded bottom field.
coded picture: A coded representation of a picture. A coded picture may be either a coded field or a coded frame. Coded picture is a collective term referring to a primary coded picture or a redundant coded picture, but not to both together.
So software which uses the sample count in the MP4 file to determine the frame rate is simply wrong. This includes MediaInfo, VLC, QuickTime and GSpot. The same applies to other encoders which generate such files.
I tested Sorenson Squeeze with the Intel and MainConcept encoders. Both merged fields.
This sounds to me as if we should do the same, particularly if there is no playback application that fails for such output files.
AFAIK there is no application which fails with the current output either. Only some software reports a wrong frame rate for such files. The reaction on the mailing list was very clear that merging the fields violates the spec, and after reading it I now agree.
I will report a bug to MainConcept after I have checked their newest codec version. If they don't agree, I will report back here. That will take some time, and I did not want to keep this ticket open in the meantime.
But of course feel free to implement this if you want; currently I have to merge the patches myself to get a custom version.



Output of
ffmpeg.exe -v 9 -loglevel 99 -i h264_aac_576i_tff.ts
ffmpeg version N-60700-g07b4b0c Copyright (c) 2000-2014 the FFmpeg developers
built on Feb 17 2014 15:45:12 with gcc 4.8.2 (GCC)
configuration: --pkg-config=pkg-config --prefix=/home/arbor/software/packages/win32 --enable-memalign-hack --arch=x86 --target-os=mingw32 --cross-prefix=i686-w64-mingw32- --enable-libfaac --enable-libx264 --enable-gpl --enable-nonfree --disable-w32threads
libavutil 52. 64.100 / 52. 64.100
libavcodec 55. 52.102 / 55. 52.102
libavformat 55. 33.100 / 55. 33.100
libavdevice 55. 10.100 / 55. 10.100
libavfilter 4. 1.102 / 4. 1.102
libswscale 2. 5.101 / 2. 5.101
libswresample 0. 17.104 / 0. 17.104
libpostproc 52. 3.100 / 52. 3.100
Splitting the commandline.
Reading option '-v' ... matched as option 'v' (set logging level) with argument '9'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument '99'.
Reading option '-i' ... matched as input file with argument 'h264_aac_576i_tff.ts'.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 9.
Successfully parsed a group of options.
Parsing a group of options: input file h264_aac_576i_tff.ts.
Successfully parsed a group of options.
Opening an input file: h264_aac_576i_tff.ts.
[mpegts @ 035e5f40] Format mpegts probed with size=2048 and score=100
[mpegts @ 035e5f40] stream=0 stream_type=1b pid=1011 prog_reg_desc=HDMV
[mpegts @ 035e5f40] stream=1 stream_type=f pid=1100 prog_reg_desc=HDMV
[mpegts @ 035e5f40] Before avformat_find_stream_info() pos: 0 bytes read:32768 seeks:0
[mpegts @ 035e5f40] All programs have pmt, headers found
[h264 @ 003da520] no picture
[mpegts @ 035e5f40] All info found
[mpegts @ 035e5f40] After avformat_find_stream_info() pos: 0 bytes read:1036432 seeks:2 frames:81
Input #0, mpegts, from 'h264_aac_576i_tff.ts':
Duration: 00:00:02.80, start: 600.000000, bitrate: 9671 kb/s
Program 1
Stream #0:0[0x1011], 43, 1/90000: Video: h264 (High) ([27][0][0][0] / 0x001B), yuv420p(tv, bt470bg), 720x576 [SAR 16:15 DAR 4:3], 1/50, 25 fps, 25 tbr, 90k tbn, 50 tbc
Stream #0:1[0x1100], 38, 1/90000: Audio: aac ([15][0][0][0] / 0x000F), 48000 Hz, stereo, fltp, 127 kb/s
Successfully opened the file.
At least one output file must be specified
[AVIOContext @ 035ee5a0] Statistics: 1036432 bytes read, 2 seeks