Opened 5 weeks ago

Last modified 3 weeks ago

#11263 new defect

[Regression] Closed Captions in MPEG-TS no more recognized

Reported by: Selcuk Ozturk
Owned by:
Priority: normal
Component: avcodec
Version: git-master
Keywords: mpegts
Cc: MasterQuestionable
Blocked By:
Blocking:
Reproduced by developer: yes
Analyzed by developer: no

Description

Summary of the bug:
With some files, the ffmpeg SRT extraction produces an empty file even though VLC shows closed caption data when playing the same video. This happens with the git build 20240629 and with release 5.0.

However, release 4.3.2 works fine with the same input and produces correct output.

How to reproduce:

% ffmpeg -f lavfi -i movie=fox5-202410230440.ts[out+subcc] z.srt
ffmpeg version: 5.0 and git build 20240629 by John van Sickle
built on ...


Attachments (2)

file.ts (2.2 MB) - added by Selcuk Ozturk 5 weeks ago.
diff.git.vs.4.3.2.txt.gz (1.2 MB) - added by Selcuk Ozturk 4 weeks ago.
Output of "diff fox5-202410230440-git-master.ts.xml fox5-202410230440-r4.3.2.ts.xml"

Change History (12)

by Selcuk Ozturk, 5 weeks ago

Attachment: file.ts added

comment:1 by MasterQuestionable, 5 weeks ago

Cc: MasterQuestionable added
Component: ffmpeg → avfilter
Keywords: movie added
Summary: Closed Caption extraction produces empty output → [Regression] "movie" filter Closed Caption extraction gave empty output

Is it really sensible..?
https://ffmpeg.org/ffmpeg-filters.html#movie

Why not use ordinary "-i" and "-map"?
Also, the semantics of "+" in stream label names appear uncertain.
(Is it really a string literal?)

comment:2 by Selcuk Ozturk, 5 weeks ago

As far as I can tell, eia_608 closed captions in mpegts streams don't have their own stream. The only way I know to extract them is "movie=filename.ts[out+subcc]", which exposes them as a separate stream. All Google searches point to this syntax. If you can suggest an alternative syntax, I can try it.
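
For reference, the command from the description can also be assembled as an argument list, which avoids shell quoting of the square brackets. This is a minimal sketch, assuming the file name contains no characters that need filtergraph escaping; the helper names `build_cc_extract_cmd` and `extract_cc` are made up for illustration. As I read the lavfi input device documentation, the "+subcc" suffix on an output label requests an extra stream carrying the closed caption packets attached to that output.

```python
import subprocess


def build_cc_extract_cmd(ts_path, srt_path, ffmpeg="ffmpeg"):
    """Return the argv for the extraction command quoted in the description."""
    # Assumption: ts_path has no characters that need filtergraph escaping
    # (for example ':' or ','); escaping is not handled in this sketch.
    graph = f"movie={ts_path}[out+subcc]"
    return [ffmpeg, "-f", "lavfi", "-i", graph, srt_path]


def extract_cc(ts_path, srt_path, ffmpeg="ffmpeg"):
    """Run the extraction and return ffmpeg's exit code (hypothetical helper)."""
    cmd = build_cc_extract_cmd(ts_path, srt_path, ffmpeg)
    # Passing a list (no shell) keeps the '[' and ']' characters literal.
    return subprocess.run(cmd, check=False).returncode
```

Pointing the `ffmpeg` argument at each static build in turn (for example the 4.3.2 and git binaries under /opt mentioned in this ticket) allows a side-by-side comparison on the same input.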

comment:3 by MasterQuestionable, 4 weeks ago

What does it report?
ffprobe -hide_banner -threads 0 -show_entries "stream=time_base:format=size" -of "flat=h=0" "fox5-202410230440.ts"

comment:4 by Selcuk Ozturk, 4 weeks ago

It reports the following:

# /opt/ffmpeg-git-20240629-amd64-static/ffprobe -hide_banner -threads 0 -show_entries "stream=time_base:format=size" -of "flat=h=0" "fox5-202410230440.ts"
Input #0, mpegts, from 'fox5-202410230440.ts':
  Duration: 00:05:09.64, start: 1.767033, bitrate: 152 kb/s
  Program 1
    Metadata:
      service_name    : Service01
      service_provider: FFmpeg
    Stream #0:0[0x100]: Video: h264 (Constrained Baseline) ([27][0][0][0] / 0x001B), yuv420p(progressive), 128x72 [SAR 1:1 DAR 16:9], 59.94 fps, 59.94 tbr, 90k tbn
program.0.stream.0.time_base="1/90000"
stream.0.time_base="1/90000"
format.size="5897560"

comment:5 by Selcuk Ozturk, 4 weeks ago

When I use release 4.3.2 ffprobe, I get:

# /opt/ffmpeg-4.3.2-amd64-static/ffprobe -hide_banner -threads 0 -show_entries "stream=time_base:format=size" -of "flat=h=0" "fox5-202410230440.ts"
Input #0, mpegts, from 'fox5-202410230440.ts':
  Duration: 00:05:09.64, start: 1.767033, bitrate: 152 kb/s
  Program 1
    Metadata:
      service_name    : Service01
      service_provider: FFmpeg
    Stream #0:0[0x100]: Video: h264 (Constrained Baseline) ([27][0][0][0] / 0x001B), yuv420p(progressive), 128x72 [SAR 1:1 DAR 16:9], Closed Captions, 59.94 fps, 59.94 tbr, 90k tbn, 119.88 tbc
program.0.stream.0.time_base="1/90000"
stream.0.time_base="1/90000"
format.size="5897560"

Please note the difference: the 4.3.2 output lists "Closed Captions" in the video stream description, while the git build output does not.
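
The difference can be checked mechanically by scanning the human-readable stream lines for the "Closed Captions" marker. This is a rough sketch using the two stream lines quoted above; `streams_with_cc_flag` is a hypothetical helper, and the text layout of ffprobe's banner output should not be treated as a stable interface.

```python
import re

# Stream lines copied from the two ffprobe runs in this ticket.
GIT_LINE = (
    "Stream #0:0[0x100]: Video: h264 (Constrained Baseline) "
    "([27][0][0][0] / 0x001B), yuv420p(progressive), 128x72 "
    "[SAR 1:1 DAR 16:9], 59.94 fps, 59.94 tbr, 90k tbn"
)
OLD_LINE = (
    "Stream #0:0[0x100]: Video: h264 (Constrained Baseline) "
    "([27][0][0][0] / 0x001B), yuv420p(progressive), 128x72 "
    "[SAR 1:1 DAR 16:9], Closed Captions, 59.94 fps, 59.94 tbr, "
    "90k tbn, 119.88 tbc"
)


def streams_with_cc_flag(ffprobe_text):
    """Return specifiers such as '#0:0' for streams flagged 'Closed Captions'."""
    flagged = []
    for line in ffprobe_text.splitlines():
        match = re.match(r"\s*Stream (#\d+:\d+)", line)
        if match and "Closed Captions" in line:
            flagged.append(match.group(1))
    return flagged


if __name__ == "__main__":
    print("git build:", streams_with_cc_flag(GIT_LINE))
    print("4.3.2    :", streams_with_cc_flag(OLD_LINE))
```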

comment:6 by MasterQuestionable, 4 weeks ago

Component: avfilter → avcodec
Keywords: mpegts added; movie removed
Summary: [Regression] "movie" filter Closed Caption extraction gave empty output → [Regression] Closed Captions in MPEG-TS no more recognized

Diff the output XML:
ffprobe -v warning -hide_banner -threads 0 -show_entries "frame" -select_streams v:0 -of "xml" "fox5-202410230440.ts" -o "fox5-202410230440.ts.xml"
(Use "fox5-202410230440.ts.4.3.2.xml" as the output name for the 4.3.2 run.)

Also try other "-show_entries" options if the previous output did not contain the information of interest.
Refer: https://ffmpeg.org/ffprobe.html#Main-options

Last edited 4 weeks ago by MasterQuestionable

by Selcuk Ozturk, 4 weeks ago

Attachment: diff.git.vs.4.3.2.txt.gz added

Output of "diff fox5-202410230440-git-master.ts.xml fox5-202410230440-r4.3.2.ts.xml"

comment:7 by Selcuk Ozturk, 4 weeks ago

I have attached the output of the diff. To me there doesn't seem to be any substantial difference between the outputs beyond some renamed fields. Since I don't know what information is relevant to this inquiry, I don't know what other options to try.

To me the only interesting difference is:

< <side_data type="ATSC A53 Part 4 Closed Captions">
< <side_datum key="side_data_type" value="ATSC A53 Part 4 Closed Captions"/>
< </side_data>
---
> <side_data side_data_type="ATSC A53 Part 4 Closed Captions"/>

But this seems to be just a difference in report formatting, not a substantial difference. Maybe I'm wrong. Btw, the original attachment "file.ts" is a truncated version of the "fox5-202410230440.ts" file and shows exactly the same closed caption extraction issue with git-master versus 4.3.2.
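
One way to confirm that the two XML shapes carry the same information is to normalize both to the side data type string. This is a minimal sketch with Python's standard xml.etree.ElementTree; `side_data_type` is an illustrative helper name, and the two snippets are the ones from the diff above. Which build produced which shape is only inferred from the order of the files in the diff command.

```python
import xml.etree.ElementTree as ET

# Nested shape (the "<" side of the diff).
NESTED = (
    '<side_data type="ATSC A53 Part 4 Closed Captions">'
    '<side_datum key="side_data_type" '
    'value="ATSC A53 Part 4 Closed Captions"/>'
    "</side_data>"
)
# Flat shape (the other side of the diff).
FLAT = '<side_data side_data_type="ATSC A53 Part 4 Closed Captions"/>'


def side_data_type(elem):
    """Return the side data type of a <side_data> element in either shape."""
    flat = elem.get("side_data_type")
    if flat is not None:
        return flat
    for datum in elem.findall("side_datum"):
        if datum.get("key") == "side_data_type":
            return datum.get("value")
    # Fall back to the 'type' attribute of the nested shape, if present.
    return elem.get("type")


if __name__ == "__main__":
    same = side_data_type(ET.fromstring(NESTED)) == side_data_type(
        ET.fromstring(FLAT)
    )
    print("same side data type:", same)
```

If both builds report this side data on the same frames, the closed caption bytes are reaching the decoder output in both cases, and the regression would have to sit somewhere after that point.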

comment:8 by MasterQuestionable, 4 weeks ago

It appears both decodes include the "ATSC A53 Part 4 Closed Captions" side data,
so potentially this is just the "movie" filter's problem?

Probably some `ffprobe` options could be used to extract the data?

Side note:
Such a diff is less useful than the 2 original XML files outright. (But not needed for now.)

comment:9 by Selcuk Ozturk, 4 weeks ago

BTW, this additional information might help narrow down the issue:

We record broadcast TV content as 5-minute segments and then use the ffmpeg command with the 'movie' filter to extract the closed caption data from each recorded 5-minute TS file. Git-master and release 5.0 DO NOT fail every time. The failures appear random: they extract the closed caption data correctly from some 5-minute segments and create empty output for others. Version 4.3.2 always extracts the CC data correctly.
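
Since the failures are intermittent, a small check over the extracted files can list which segments came out empty. This is a sketch, assuming the .srt outputs of one ffmpeg build sit together in a single directory; the function name and the directory layout are hypothetical.

```python
from pathlib import Path


def find_empty_srt(directory):
    """Return the names of .srt files that are empty or whitespace-only."""
    empty = []
    for srt in sorted(Path(directory).glob("*.srt")):
        text = srt.read_text(encoding="utf-8", errors="replace")
        if not text.strip():
            empty.append(srt.name)
    return empty


if __name__ == "__main__":
    import sys

    # Usage: python find_empty_srt.py <directory with .srt outputs>
    for name in find_empty_srt(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(name)
```

Running it once per build over the same set of segments would show which inputs fail only on the newer versions.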

comment:10 by Marth64, 3 weeks ago

Reproduced by developer: set

I have tested with the sample on 4.3.2 vs. a newer version and can confirm this to be true. The sample material has Closed Captions but they aren't being extracted as expected.
