Opened 10 years ago
Last modified 5 years ago
#5851 open defect
Option to remove tags from Closed Captions
| Reported by: | edumj | Owned by: | |
|---|---|---|---|
| Priority: | minor | Component: | avcodec |
| Version: | git-master | Keywords: | cc |
| Cc: | | Blocked By: | |
| Blocking: | | Reproduced by developer: | no |
| Analyzed by developer: | no | | |
Description
I can extract Closed Captions from this NTSC DVD sample Starship_Troopers.vob with this:
"ffmpeg" -f lavfi -i "movie=Starship_Troopers.vob[out0+subcc]" -map s "output_map-s.srt"
output:
ffmpeg version N-81452-g01aee81 Copyright (c) 2000-2016 the FFmpeg developers
built with gcc 5.4.0 (GCC)
configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-dxva2 --enable-libmfx --enable-nvenc --enable-avisynth --enable-bzlib --enable-libebur128 --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libschroedinger --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-libzimg --enable-lzma --enable-decklink --enable-zlib
libavutil 55. 29.100 / 55. 29.100
libavcodec 57. 54.100 / 57. 54.100
libavformat 57. 48.100 / 57. 48.100
libavdevice 57. 0.102 / 57. 0.102
libavfilter 6. 54.100 / 6. 54.100
libswscale 4. 1.100 / 4. 1.100
libswresample 2. 1.100 / 2. 1.100
libpostproc 54. 0.100 / 54. 0.100
Input #0, lavfi, from 'movie=Starship_Troopers.vob[out0+subcc]':
Duration: N/A, start: 1986.626100, bitrate: N/A
Stream #0:0: Video: rawvideo (I420 / 0x30323449), yuv420p, 720x480 [SAR 1:1 DAR 3:2], 59.94 tbr, 90k tbn, 90k tbc
Stream #0:1: Subtitle: eia_608
[srt @ 0612b2c0] Using AVStream.codec to pass codec parameters to muxers is deprecated, use AVStream.codecpar instead.
[null @ 0608cfa0] Using AVStream.codec to pass codec parameters to muxers is deprecated, use AVStream.codecpar instead.
Output #0, srt, to 'output_map-s.srt':
Metadata:
encoder : Lavf57.48.100
Stream #0:0: Subtitle: subrip (srt)
Metadata:
encoder : Lavc57.54.100 srt
Output #1, null, to 'nul':
Metadata:
encoder : Lavf57.48.100
Stream #1:0: Video: wrapped_avframe, yuv420p, 720x480 [SAR 1:1 DAR 3:2], q=2-31, 200 kb/s, 59.94 fps, 59.94 tbn, 59.94 tbc
Metadata:
encoder : Lavc57.54.100 wrapped_avframe
Stream mapping:
Stream #0:1 -> #0:0 (eia_608 (cc_dec) -> subrip (srt))
Stream #0:0 -> #1:0 (rawvideo (native) -> wrapped_avframe (native))
Press [q] to stop, [?] for help
frame= 467 fps=0.0 q=-0.0 size= 0kB time=00:00:19.43 bitrate= 0.1kbits/s speed=38.9x
frame= 973 fps=973 q=-0.0 size= 1kB time=00:00:40.54 bitrate= 0.1kbits/s speed=40.5x
[mpeg2video @ 060527a0] ac-tex damaged at 3 27
[mpeg2video @ 060527a0] Warning MVs not available
[mpeg2video @ 060527a0] concealing 135 DC, 135 AC, 135 MV errors in I frame
frame= 1229 fps=980 q=-0.0 Lsize= 1kB time=00:00:51.30 bitrate= 0.2kbits/s speed=40.9x
video:461kB audio:0kB subtitle:1kB other streams:0kB global headers:0kB muxing overhead: unknown
But the SRT has font tags and some strange position tags:
1
00:00:11,745 --> 00:00:15,249
<font face="Monospace">{\an7}PILOT TRAINEE IBANEZ
REPORTING FOR DUTY, MA’AM.</font>
2
00:00:15,249 --> 00:00:18,252
<font face="Monospace">{\an7}- TAKE THE NUMBER TWO CHAIR,
\h\hIBANEZ.
- YES, MA’AM.</font>
3
00:00:22,756 --> 00:00:27,761
<font face="Monospace">{\an7}\h\h\h\h\h\h\h\h\h\h\h\h\h\h\h\h\h\h\h\hIDENTIFY.
IBANEZ, "T"-THREE-TWO-FIVE-"A,"
CLEAR.</font>
4
00:00:30,764 --> 00:00:34,768
<font face="Monospace">{\an7}[ Laughs ]
WHAT ARE YOU DOING HERE ?</font>
5
00:00:36,270 --> 00:00:39,273
<font face="Monospace">{\an7}I’M THE GUY WHO’S GONNA
TEACH YOU TO FLY THIS CRATE.</font>
6
00:00:39,273 --> 00:00:41,776
<font face="Monospace">{\an7}<i>AH.
ASSISTANT INSTRUCTOR.</i></font>
7
00:00:41,775 --> 00:00:44,778
<font face="Monospace">{\an7}SHOULD I CALL YOU
"SIR" ?</font>
8
00:00:44,778 --> 00:00:47,281
<font face="Monospace">{\an7}ONLY WHEN I GIVE YOU
AN ORDER.</font>
9
00:00:47,281 --> 00:00:49,283
<font face="Monospace">{\an7}PREPARE FOR DEPARTURE.</font>
TXT2VobSub does not accept these tags (it complains that the subtitles are too long), and if I hardsub them with this:
"ffmpeg" -i "Starship_Troopers.vob" -vf "subtitles=output_map-s.srt:force_style='FontName=Microsoft Sans Serif,Fontsize=18,Outline=1,PrimaryColour=&HFFFFFF'" -f avi -c:v libxvid -b:v 1500k -vtag XVID -c:a libmp3lame -b:a 128k "Starship_Troopers-ffmpeg.avi"
output:
ffmpeg version N-81452-g01aee81 Copyright (c) 2000-2016 the FFmpeg developers
built with gcc 5.4.0 (GCC)
configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-dxva2 --enable-libmfx --enable-nvenc --enable-avisynth --enable-bzlib --enable-libebur128 --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libschroedinger --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-libzimg --enable-lzma --enable-decklink --enable-zlib
libavutil 55. 29.100 / 55. 29.100
libavcodec 57. 54.100 / 57. 54.100
libavformat 57. 48.100 / 57. 48.100
libavdevice 57. 0.102 / 57. 0.102
libavfilter 6. 54.100 / 6. 54.100
libswscale 4. 1.100 / 4. 1.100
libswresample 2. 1.100 / 2. 1.100
libpostproc 54. 0.100 / 54. 0.100
Input #0, mpeg, from 'Starship_Troopers.vob':
Duration: 00:00:51.30, start: 1986.626100, bitrate: 4618 kb/s
Stream #0:0[0x1e0]: Video: mpeg2video (Main), yuv420p(tv), 720x480 [SAR 32:27 DAR 16:9], Closed Captions, 29.97 fps, 59.94 tbr, 90k tbn, 59.94 tbc
Stream #0:1[0x83]: Audio: ac3, 48000 Hz, stereo, fltp, 192 kb/s
Stream #0:2[0x82]: Audio: ac3, 48000 Hz, stereo, fltp, 192 kb/s
Stream #0:3[0x80]: Audio: ac3, 48000 Hz, stereo, fltp, 192 kb/s
Stream #0:4[0x81]: Audio: ac3, 48000 Hz, 5.1(side), fltp, 384 kb/s
Stream #0:5[0x20]: Subtitle: dvd_subtitle
Stream #0:6[0x22]: Subtitle: dvd_subtitle
[Parsed_subtitles_0 @ 049ef6e0] Shaper: FriBidi 0.19.6 (SIMPLE)
[Parsed_subtitles_0 @ 049ef6e0] Using font provider directwrite
[avi @ 04942f60] Using AVStream.codec to pass codec parameters to muxers is deprecated, use AVStream.codecpar instead.
Last message repeated 1 times
[null @ 04942120] Using AVStream.codec to pass codec parameters to muxers is deprecated, use AVStream.codecpar instead.
Last message repeated 1 times
Output #0, avi, to 'Starship_Troopers-ffmpeg.avi':
Metadata:
ISFT : Lavf57.48.100
Stream #0:0: Video: mpeg4 (libxvid) (XVID / 0x44495658), yuv420p, 720x480 [SAR 32:27 DAR 16:9], q=2-31, 1500 kb/s, 29.97 fps, 29.97 tbn, 29.97 tbc
Metadata:
encoder : Lavc57.54.100 libxvid
Stream #0:1: Audio: mp3 (libmp3lame) (U[0][0][0] / 0x0055), 48000 Hz, stereo, fltp, delay 1105, padding 0, 128 kb/s
Metadata:
encoder : Lavc57.54.100 libmp3lame
Output #1, null, to 'nul':
Metadata:
encoder : Lavf57.48.100
Stream #1:0: Video: wrapped_avframe, yuv420p, 720x480 [SAR 32:27 DAR 16:9], q=2-31, 200 kb/s, 29.97 fps, 29.97 tbn, 29.97 tbc
Metadata:
encoder : Lavc57.54.100 wrapped_avframe
Stream #1:1: Audio: pcm_s16le, 48000 Hz, 5.1(side), s16, 4608 kb/s
Metadata:
encoder : Lavc57.54.100 pcm_s16le
Stream mapping:
Stream #0:0 -> #0:0 (mpeg2video (native) -> mpeg4 (libxvid))
Stream #0:4 -> #0:1 (ac3 (native) -> mp3 (libmp3lame))
Stream #0:0 -> #1:0 (mpeg2video (native) -> wrapped_avframe (native))
Stream #0:4 -> #1:1 (ac3 (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
[ac3 @ 04de9c80] frame sync error
Error while decoding stream #0:4: Invalid data found when processing input
[null @ 04942120] Application provided invalid, non monotonically increasing dts to muxer in stream 1: 1891 >= 1891
[libmp3lame @ 04debec0] Queue input is backward in time
frame= 95 fps=0.0 q=6.0 q=-0.0 size= 671kB time=00:00:03.94 bitrate=1394.7kbits/s speed=7.83x
frame= 185 fps=184 q=6.0 q=-0.0 size= 1326kB time=00:00:07.71 bitrate=1407.7kbits/s speed=7.67x
frame= 276 fps=183 q=9.0 q=-0.0 size= 2029kB time=00:00:11.49 bitrate=1446.2kbits/s speed=7.62x
[Parsed_subtitles_0 @ 049ef6e0] fontselect: (Microsoft Sans Serif, 400, 0) -> MicrosoftSansSerif, 0, MicrosoftSansSerif
[Parsed_subtitles_0 @ 049ef6e0] fontselect: (Monospace, 400, 0) -> CourierNewPSMT, 0, CourierNewPSMT
[mpeg @ 002eb780] New subtitle stream 0:7 at pos:8497166 and DTS:1999.51s
frame= 372 fps=185 q=5.0 q=-0.0 size= 2752kB time=00:00:15.52 bitrate=1451.8kbits/s speed=7.73x
frame= 459 fps=183 q=9.0 q=-0.0 size= 3439kB time=00:00:19.14 bitrate=1471.6kbits/s speed=7.63x
frame= 557 fps=185 q=7.0 q=-0.0 size= 4135kB time=00:00:23.18 bitrate=1460.6kbits/s speed= 7.7x
frame= 645 fps=184 q=9.0 q=-0.0 size= 4824kB time=00:00:26.88 bitrate=1469.7kbits/s speed=7.65x
frame= 733 fps=181 q=6.0 q=-0.0 size= 5313kB time=00:00:30.53 bitrate=1425.2kbits/s speed=7.53x
frame= 837 fps=184 q=4.0 q=-0.0 size= 5933kB time=00:00:34.88 bitrate=1393.0kbits/s speed=7.66x
frame= 935 fps=185 q=5.0 q=-0.0 size= 6631kB time=00:00:38.98 bitrate=1393.4kbits/s speed=7.71x
[Parsed_subtitles_0 @ 049ef6e0] fontselect: (Monospace, 400, 100) -> CourierNewPS-ItalicMT, 0, CourierNewPS-ItalicMT
frame= 1035 fps=186 q=5.0 q=-0.0 size= 7311kB time=00:00:43.17 bitrate=1387.1kbits/s speed=7.77x
frame= 1139 fps=188 q=6.0 q=-0.0 size= 8053kB time=00:00:47.48 bitrate=1389.5kbits/s speed=7.84x
[mpeg2video @ 049477c0] ac-tex damaged at 3 27
[mpeg2video @ 049477c0] Warning MVs not available
[mpeg2video @ 049477c0] concealing 135 DC, 135 AC, 135 MV errors in I frame
[ac3 @ 04de9c80] incomplete frame
frame= 1229 fps=189 q=6.0 Lq=-0.0 size= 8736kB time=00:00:51.31 bitrate=1394.6kbits/s speed= 7.9x
video:8300kB audio:29601kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Those font tags override the FontName set in the subtitles filter, and the position tags put the subtitles at the top of the frame, left-aligned.
CCExtractor removes those tags from its output.
Is there an option to remove those tags, like "-txt_format text" does with other embedded text subs? That way, we could also do soft subs (XSUBs) and not only hard subs from CC.
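Until such an option exists, the tags can be removed in a post-processing step. The following is only a minimal sketch, not an ffmpeg feature: the function names `strip_cc_tags` and `strip_srt` are hypothetical, and it assumes the only markup present is the kind shown in the SRT above (`<font>` tags, `{\...}` override blocks, `\h` hard spaces and `<i>` italics).

```python
import re

# Markup seen in the SRT above; assumed to be the only kinds present.
FONT_TAG = re.compile(r"</?font[^>]*>", re.IGNORECASE)
ASS_OVERRIDE = re.compile(r"\{\\[^}]*\}")  # e.g. {\an7}
ITALIC_TAG = re.compile(r"</?i>", re.IGNORECASE)


def strip_cc_tags(line: str, keep_italics: bool = True) -> str:
    """Remove font tags, ASS override blocks and \\h padding from one line."""
    line = FONT_TAG.sub("", line)
    line = ASS_OVERRIDE.sub("", line)
    line = line.replace("\\h", " ")  # hard space -> ordinary space
    if not keep_italics:
        line = ITALIC_TAG.sub("", line)
    return line.strip()  # drops the indentation left over from \h padding


def strip_srt(text: str, keep_italics: bool = True) -> str:
    """Apply strip_cc_tags to every line; index and timestamp lines pass through."""
    return "\n".join(strip_cc_tags(l, keep_italics) for l in text.splitlines())


if __name__ == "__main__":
    sample = (
        "2\n"
        "00:00:15,249 --> 00:00:18,252\n"
        '<font face="Monospace">{\\an7}- TAKE THE NUMBER TWO CHAIR,\n'
        "\\h\\hIBANEZ.\n"
        "- YES, MA'AM.</font>\n"
    )
    print(strip_srt(sample))
```

Note that a line consisting only of tags would become empty, and an empty line inside a cue ends that cue in SRT, so such lines would need to be dropped rather than blanked. The cleaned file can then be passed to the subtitles filter or to a tool such as TXT2VobSub.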
Change History (7)
comment:1 by , 10 years ago
| Component: | undetermined → avcodec |
|---|---|
| Keywords: | cc added |
| Priority: | normal → minor |
| Reproduced by developer: | set |
| Status: | new → open |
| Type: | enhancement → defect |
| Version: | unspecified → git-master |
comment:2 by , 10 years ago
| Reproduced by developer: | unset |
|---|
Otoh, the output looks very similar to what vlc produces, so the tabs may simply be correct.
comment:3 by , 10 years ago
comment:4 by , 5 years ago
\h is a non-breaking space. See: https://github.com/libass/libass/issues/2
And {\an7} means top-left alignment; it is an ASS tag, not an SSA one, so it is a fairly modern feature.
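To make the `{\anN}` numbering concrete: ASS alignment values follow the numeric keypad layout (1-3 bottom row, 4-6 middle, 7-9 top; left to right within each row). A small illustrative sketch, with a hypothetical helper name `describe_alignment`:

```python
import re

# ASS \an values follow the numeric keypad: 1-3 bottom, 4-6 middle, 7-9 top.
VERTICAL = ("bottom", "middle", "top")
HORIZONTAL = ("left", "center", "right")


def describe_alignment(tag: str) -> str:
    """Translate an override block such as {\\an7} into a readable position."""
    m = re.fullmatch(r"\{\\an([1-9])\}", tag)
    if m is None:
        raise ValueError(f"not an \\an alignment tag: {tag!r}")
    n = int(m.group(1)) - 1
    return f"{VERTICAL[n // 3]}-{HORIZONTAL[n % 3]}"


if __name__ == "__main__":
    print(describe_alignment("{\\an7}"))  # top-left
    print(describe_alignment("{\\an2}"))  # bottom-center
```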
comment:5 by , 5 years ago
What IS VERY FUNNY is that apparently
[Parsed_movie_0 @ 00000182e3099400] EOF timestamp not reliable
is true, since now we have this too:
10
00:00:49,283 --> 00:00:51,963
<font face="Monospace">{\an7}IT’S AMAZING, US RUNNING
INTO EACH OTHER LIKE THIS.
MAYBE IT’S FATE.</font>
comment:6 by , 5 years ago
Okay, so that is what it is. Positioning in 608 is of course different from the positioning used in ASS, but it is supported when converting (VLC does not support it, BTW, what a joke, but \h is supported there). You can certainly use the raw binary CC format instead, see #4767.
Understood it from here: https://github.com/CCExtractor/ccextractor/issues/1108
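If the horizontal position should be kept rather than discarded, the leading `\h` run can be read back as an indent. This is a rough sketch under a stated assumption: that each leading `\h` stands for one padded caption column, which the ticket does not confirm. The helper names `split_indent` and `hard_spaces_to_nbsp` are hypothetical.

```python
import re

# A run of \h at the start of a line, followed by the visible text.
LEADING_HARD_SPACES = re.compile(r"^((?:\\h)*)(.*)$", re.DOTALL)


def split_indent(line: str) -> tuple[int, str]:
    """Return (number of leading \\h, remaining text).

    Assumption: one \\h corresponds to one padded caption column.
    """
    m = LEADING_HARD_SPACES.match(line)
    return len(m.group(1)) // 2, m.group(2)  # each \h is two characters


def hard_spaces_to_nbsp(line: str) -> str:
    """Replace every \\h with U+00A0 for renderers that do not understand \\h."""
    return line.replace("\\h", "\u00a0")


if __name__ == "__main__":
    print(split_indent("\\h\\hIBANEZ."))  # (2, 'IBANEZ.')
```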






I suspect there is a bug but I am unable to analyze it further: What is \h and why is it put there by capture_screen()?