#11263 closed defect (wontfix)
[Regression] Closed Captions in MPEG-TS no more recognized
Reported by: | Selcuk Ozturk | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | avcodec |
Version: | git-master | Keywords: | ccaption_dec |
Cc: | MasterQuestionable | Blocked By: | |
Blocking: | | Reproduced by developer: | yes |
Analyzed by developer: | yes | | |
Description
Summary of the bug:
With some files, ffmpeg's SRT extraction produces an empty file even though the video contains closed caption data (the captions show up when the file is played in VLC). This happens with the git build 20240629 and with release 5.0.
However, release 4.3.2 works fine with the same input and produces correct output.
How to reproduce:
% ffmpeg -f lavfi -i movie=fox5-202410230440.ts[out+subcc] z.srt
ffmpeg version: 5.0 and git build 20240629 by John van Sickle built on ...
Change History (17)
comment:1 by , 3 months ago
Cc: | added |
---|---|
Component: | ffmpeg → avfilter |
Keywords: | movie added |
Summary: | Closed Caption extraction produces empty output → [Regression] "movie" filter Closed Caption extraction gave empty output |
comment:2 by , 3 months ago
As far as I can tell, eia_608 closed captions in MPEG-TS files do not have their own stream. The only way I know to extract them is "movie=filename.ts[out+subcc]", which exposes them as a separate stream. Every search result I found points to this syntax. If you can suggest a different syntax, I can try it.
comment:3 by , 3 months ago
What does it report?
ffprobe -hide_banner -threads 0 -show_entries "stream=time_base:format=size" -of "flat=h=0" "fox5-202410230440.ts"
comment:4 by , 3 months ago
It reports the following:
# /opt/ffmpeg-git-20240629-amd64-static/ffprobe -hide_banner -threads 0 -show_entries "stream=time_base:format=size" -of "flat=h=0" "fox5-202410230440.ts"
Input #0, mpegts, from 'fox5-202410230440.ts':
Duration: 00:05:09.64, start: 1.767033, bitrate: 152 kb/s
Program 1
Metadata:
service_name : Service01
service_provider: FFmpeg
Stream #0:0[0x100]: Video: h264 (Constrained Baseline) ([27][0][0][0] / 0x001B), yuv420p(progressive), 128x72 [SAR 1:1 DAR 16:9], 59.94 fps, 59.94 tbr, 90k tbn
program.0.stream.0.time_base="1/90000"
stream.0.time_base="1/90000"
format.size="5897560"
comment:5 by , 3 months ago
When I use release 4.3.2 ffprobe, I get:
# /opt/ffmpeg-4.3.2-amd64-static/ffprobe -hide_banner -threads 0 -show_entries "stream=time_base:format=size" -of "flat=h=0" "fox5-202410230440.ts"
Input #0, mpegts, from 'fox5-202410230440.ts':
Duration: 00:05:09.64, start: 1.767033, bitrate: 152 kb/s
Program 1
Metadata:
service_name : Service01
service_provider: FFmpeg
Stream #0:0[0x100]: Video: h264 (Constrained Baseline) ([27][0][0][0] / 0x001B), yuv420p(progressive), 128x72 [SAR 1:1 DAR 16:9], Closed Captions, 59.94 fps, 59.94 tbr, 90k tbn, 119.88 tbc
program.0.stream.0.time_base="1/90000"
stream.0.time_base="1/90000"
format.size="5897560"
Please note the difference: the stream line from 4.3.2 includes "Closed Captions", while the one from the git build does not.
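The comparison above can be scripted. This is only a sketch: `has_cc_flag` is a hypothetical helper name, and it simply greps the stream summary (which ffprobe prints on stderr) for the "Closed Captions" text seen in the 4.3.2 output.

```shell
# has_cc_flag FFPROBE FILE
# Succeeds if the stream summary printed by FFPROBE mentions "Closed Captions".
# (Hypothetical helper; it only looks for the text shown in the outputs above.)
has_cc_flag() {
    "$1" -hide_banner "$2" 2>&1 | grep -q 'Closed Captions'
}

# report_cc_flag FILE FFPROBE...
# Prints one line per ffprobe build saying whether the flag was reported.
report_cc_flag() {
    file="$1"
    shift
    for probe in "$@"; do
        if has_cc_flag "$probe" "$file"; then
            echo "$probe: Closed Captions reported"
        else
            echo "$probe: Closed Captions NOT reported"
        fi
    done
}
```

For example, with the two builds used in this ticket: `report_cc_flag fox5-202410230440.ts /opt/ffmpeg-4.3.2-amd64-static/ffprobe /opt/ffmpeg-git-20240629-amd64-static/ffprobe`.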
comment:6 by , 3 months ago
Component: | avfilter → avcodec |
---|---|
Keywords: | mpegts added; movie removed |
Summary: | [Regression] "movie" filter Closed Caption extraction gave empty output → [Regression] Closed Captions in MPEG-TS no more recognized |
Diff the output XML of both versions:
ffprobe -v warning -hide_banner -threads 0 -show_entries "frame" -select_streams v:0 -of "xml" "fox5-202410230440.ts" -o "fox5-202410230440.ts.xml"
("fox5-202410230440.ts.4.3.2.xml")
Also try other "-show_entries" options if the output above does not contain the relevant information.
Refer: https://ffmpeg.org/ffprobe.html#Main-options
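A minimal sketch of that comparison, with some assumptions: `dump_frames` and `diff_builds` are hypothetical helper names, and the XML is captured by redirecting stdout rather than with "-o", on the assumption that an older build such as 4.3.2 may not accept that option.

```shell
# dump_frames FFPROBE FILE
# Prints per-frame XML for the first video stream on stdout.
dump_frames() {
    "$1" -v warning -hide_banner -threads 0 -show_entries frame \
        -select_streams v:0 -of xml "$2"
}

# diff_builds OLD_FFPROBE NEW_FFPROBE FILE
# Dumps the frames with both builds and diffs the results.
# Returns the exit status of diff (0 means identical output).
diff_builds() {
    old_xml="$(mktemp)" || return 1
    new_xml="$(mktemp)" || return 1
    dump_frames "$1" "$3" > "$old_xml"
    dump_frames "$2" "$3" > "$new_xml"
    diff "$old_xml" "$new_xml"
    status=$?
    rm -f "$old_xml" "$new_xml"
    return "$status"
}
```

Usage would look like `diff_builds /opt/ffmpeg-4.3.2-amd64-static/ffprobe /opt/ffmpeg-git-20240629-amd64-static/ffprobe fox5-202410230440.ts > diff.txt`. Field renames between versions will show up as differences, so the raw line count of the diff says little by itself.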
by , 3 months ago
Attachment: | diff.git.vs.4.3.2.txt.gz added |
---|
Output of "diff fox5-202410230440-git-master.ts.xml fox5-202410230440-r4.3.2.ts.xml"
comment:7 by , 3 months ago
I have attached the output of the diff. To me there doesn't seem to be any substantial difference between the outputs beyond some renamed fields. Since I don't know what information is of interest for this inquiry, I don't know what other options to try.
To me the only interesting difference is:
< <side_data type="ATSC A53 Part 4 Closed Captions">
< <side_datum key="side_data_type" value="ATSC A53 Part 4 Closed Captions"/>
< </side_data>
---
> <side_data side_data_type="ATSC A53 Part 4 Closed Captions"/>
But this seems to be just a difference in report formatting, not a substantial difference. Maybe I'm wrong. By the way, the original attachment "file.ts" is a truncated version of the "fox5-202410230440.ts" file and shows exactly the same closed caption extraction issue in git-master vs. 4.3.2.
comment:8 by , 3 months ago
It appears that the output of both versions includes the "ATSC A53 Part 4 Closed Captions" side data.
So potentially just the "movie" filter's problem?
Perhaps some `ffprobe` options could be used to extract the data?
Side Note:
Such a diff is less useful than the two original XML files themselves. (Not needed for now, though.)
comment:9 by , 3 months ago
BTW, maybe this bit of additional information might help to determine the issue:
We record broadcast TV content as 5-minute segments and then use the ffmpeg command with the 'movie' filter to extract the closed caption data from each recorded 5-minute TS file. Git-master and release 5.0 DO NOT fail every time; the failures look random. They extract the closed caption data correctly from some 5-minute segments and create empty output for others. Version 4.3.2 always extracts the CC data correctly.
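To find out which segments are affected, the extraction can be run over a whole directory. The following is a sketch under assumptions: the helper names `check_segment` and `scan_segments` and the `FFMPEG` variable are made up for illustration, and file names are assumed to contain no characters that need escaping inside a filter description.

```shell
# Path to the ffmpeg build under test (assumption: override via environment).
FFMPEG="${FFMPEG:-ffmpeg}"

# check_segment FILE
# Runs the extraction from this report into a temporary SRT file and
# prints "ok" if the result is non-empty, "empty" otherwise.
check_segment() {
    ts="$1"
    out="$(mktemp)" || return 1
    # The temp file has no .srt extension, so the format is set explicitly.
    "$FFMPEG" -nostdin -loglevel error -y -f lavfi \
        -i "movie=${ts}[out+subcc]" -f srt "$out" </dev/null
    if [ -s "$out" ]; then echo "ok"; else echo "empty"; fi
    rm -f "$out"
}

# scan_segments DIR
# Prints one "STATUS FILE" line for every .ts file in DIR.
scan_segments() {
    for seg in "$1"/*.ts; do
        [ -e "$seg" ] || continue
        printf '%s %s\n' "$(check_segment "$seg")" "$seg"
    done
}
```

Running `scan_segments /path/to/recordings | grep '^empty'` once per build (for example with `FFMPEG=/opt/ffmpeg-4.3.2-amd64-static/ffmpeg`) would list the segments for which that build produces no captions.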
comment:10 by , 3 months ago
Reproduced by developer: | set |
---|
I have tested with the sample on 4.3.2 vs. a newer version and can confirm this to be true. The sample material has Closed Captions but they aren't being extracted as expected.
comment:11 by , 3 weeks ago
Investigating/bisecting. Causing commit is within 25 commits ahead of 04172d233de58cbb5a2dab6839696628a97c7b52
comment:12 by , 3 weeks ago
Analyzed by developer: | set |
---|---|
Resolution: | → fixed |
Status: | new → closed |
In "bbd0be04d0 avcodec/ccaption_dec: allow selection of second field captions", the ability to decode field 1 and field 2 separately was added.
Your sample likely has a field 2 with padding or junk bytes, and as such the auto selection is choosing field 2. I haven't dug into the hex yet but this is likely to be the cause.
Anyways, this works:
ffmpeg -data_field:1 first -f lavfi -i "movie=11263.ts[out+subcc]" -map 0:s -c:s ass -f ass -
Not a bug IMO; it's working as designed within FFmpeg's Closed Captions limits. Will close the ticket, please let me know if we need to re-open.
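Adapted to the SRT output from the original report, the workaround might be wrapped as below. This is a sketch: `extract_cc_first_field` and the `FFMPEG` variable are hypothetical names, while the "-data_field:1 first" input option and "-map 0:s" are taken unchanged from the command above.

```shell
# extract_cc_first_field INPUT_TS OUTPUT_SRT
# Extracts closed captions from field 1 only, instead of relying on the
# decoder's automatic field selection.
extract_cc_first_field() {
    "${FFMPEG:-ffmpeg}" -nostdin -y -data_field:1 first -f lavfi \
        -i "movie=${1}[out+subcc]" -map 0:s "$2"
}
```

For example: `extract_cc_first_field fox5-202410230440.ts z.srt`. If the captions in a given recording are carried in field 2, "first" would presumably have to be replaced by "second".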
comment:13 by , 3 weeks ago
Maybe it's not a bug, but it is a significant change from how it worked before. Also, all the examples I could find on the internet for CC extraction are of the form "ffmpeg -f lavfi -i movie=<filename>[out+subcc]". If we now need to select the field explicitly, I think this should be made clear in the documentation.
comment:15 by , 3 weeks ago
Keywords: | ccaption_dec added; mpegts removed |
---|---|
Resolution: | fixed → wontfix |
Is it really sensible..?
https://ffmpeg.org/ffmpeg-filters.html#movie
Why not use ordinary "-i" and "-map"?
Also, the semantics of "+" in stream label names appear to be undefined.
(Is it really a string literal?)