Opened 13 months ago

Last modified 7 months ago

#7362 new enhancement

Newline in subtitles: sub.ass - CRLF and sub.srt - LF

Reported by: KnightDanila
Owned by:
Priority: wish
Component: avcodec
Version: git-master
Keywords: ass srt
Cc: beroal
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:
Picture 1 - sub.ass has CRLF newline style
Picture 2 - sub.srt has LF newline style
Can you add an option -eol to choose the newline style?

% ffmpeg.exe -i TEST_Input.mkv -map 0:m:language:eng -map -0:a -map -0:v -eol "\r\n" "TEST_output.srt"
For example, MKVToolNix extracts .srt subtitles with CRLF (at least in the Windows release).

How to reproduce:

% ffmpeg.exe -i TEST_Input.mkv -map 0:m:language:eng -map -0:a -map -0:v "TEST_output.srt"
% ffmpeg.exe -i TEST_Input.mkv -map 0:m:language:eng -map -0:a -map -0:v "TEST_output.ass"
ffmpeg version 4.0.2 Copyright (c) 2000-2018 the FFmpeg developers
built with gcc 7.3.1 (GCC) 20180722

Attachments (6)

FFMPEG_Bug_sub_1.png (227.8 KB) - added by KnightDanila 13 months ago.
FFMPEG_Bug_sub_2.png (517.9 KB) - added by KnightDanila 13 months ago.
FFMPEG_Bug_sub_3.png (227.9 KB) - added by KnightDanila 13 months ago.
FFMPEG_Bug_sub_4.png (524.9 KB) - added by KnightDanila 13 months ago.
FFMPEG_Bug_sub_5.png (90.3 KB) - added by KnightDanila 13 months ago.
Forced Sub Sample.ass (1.4 KB) - added by cehoyos 7 months ago.

Change History (13)

Changed 13 months ago by KnightDanila

Changed 13 months ago by KnightDanila

comment:1 in reply to: ↑ description ; follow-ups: Changed 13 months ago by cehoyos

  • Component changed from ffmpeg to avcodec
  • Keywords ass added; Newline character encoding subtitles CR+LF LF removed
  • Version changed from unspecified to git-master

Replying to KnightDanila:

Picture 1 - sub.ass has CRLF newline style
Picture 2 - sub.srt has LF newline style

Why do you believe that one of them is wrong?

Can you add an option -eol to choose the newline style?

Wouldn’t such an option allow writing invalid files?

Which programs fail for a subtitle file produced by FFmpeg?

Changed 13 months ago by KnightDanila

Changed 13 months ago by KnightDanila

comment:2 in reply to: ↑ 1 Changed 13 months ago by KnightDanila

Replying to cehoyos:

Replying to KnightDanila:

Picture 1 - sub.ass has CRLF newline style
Picture 2 - sub.srt has LF newline style

Why do you believe that one of them is wrong?

Hm... I do not think it is wrong, but:
1) MKVToolNix extracts .srt subtitles with CRLF (at least in the Windows release). Example: https://trac.ffmpeg.org/attachment/ticket/7362/FFMPEG_Bug_sub_5.png
2) Windows uses the CRLF newline style, see https://en.wikipedia.org/wiki/Newline#Representation (as do Atari TOS, DOS (MS-DOS, PC DOS, etc.), DEC TOPS-10, RT-11, CP/M, MP/M, OS/2, Symbian OS, Palm OS, Amstrad CPC, and most other early non-Unix and non-IBM operating systems).
3) Aegisub creates .srt subtitles with CRLF (at least in the Windows release).

Can you add an option -eol to choose the newline style?

Wouldn’t such an option allow writing invalid files?

Maybe, but it could accept only two values: -eol CRLF for Windows or -eol LF for Unix :)
It is a difficult question :)

Which programs fail for a subtitle file produced by FFmpeg?

Hm... notepad.exe :D It reads the file, but shows it without line breaks: https://trac.ffmpeg.org/attachment/ticket/7362/FFMPEG_Bug_sub_2.png
Maybe some DOS and Symbian DVD players fail too :D

Also, the .srt does contain CRLF, but only in strange places (why not throughout the whole .srt file?).
I marked them in green:
.srt https://trac.ffmpeg.org/attachment/ticket/7362/FFMPEG_Bug_sub_4.png
.ass https://trac.ffmpeg.org/attachment/ticket/7362/FFMPEG_Bug_sub_3.png

Last edited 13 months ago by KnightDanila

Changed 13 months ago by KnightDanila

comment:3 in reply to: ↑ 1 ; follow-up: Changed 7 months ago by beroal

Replying to cehoyos:

Replying to KnightDanila:

Picture 1 - sub.ass has CRLF newline style
Picture 2 - sub.srt has LF newline style

Why do you believe that one of them is wrong?

Can you add an option -eol to choose the newline style?

Wouldn’t such an option allow writing invalid files?

I recently extracted text subtitles from an "mkv" file on Linux, and the result has a mix of "\r\n" and "\n" line endings, which is clearly incorrect. So "ffmpeg" *already* produces an incorrect file, see below.

0000000    1  \n   0   0   :   0   0   :   0   1   ,   4   4   3       -
0000010    -   >       0   0   :   0   0   :   0   4   ,   5   3   6  \n
0000020                                  342 231 252       M   e   e   t
0000030        R   e   b   e   c   c   a     342 231 252  \n  \n   2  \n
0000040    0   0   :   0   0   :   0   4   ,   5   4   7       -   -   >
0000050        0   0   :   0   0   :   0   7   ,   3   3   0  \n        
0000060          342 231 252       S   h   e   '   s       t   h   e    
0000070    c   o   o   l   e   s   t       g   i   r   l  \r  \n        
0000080                    i   n       t   h   e       w   o   r   l   d
0000090    ,       w   a   i   t     342 231 252  \n  \n   3  \n   0   0
00000a0    :   0   0   :   0   7   ,   4   0   7       -   -   >       0
00000b0    0   :   0   0   :   1   0   ,   5   4   2  \n                

IMHO, there is no need for an option because, according to http://www.textfiles.com/uploads/kds-srt.txt , lines must end with "\r\n". So "ffmpeg" must produce "\r\n" for all line endings on all operating systems.
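
Until that happens, the extracted file can be post-processed outside of "ffmpeg". A minimal Python sketch, assuming the file fits in memory; to_crlf and convert_file are names made up for this example, not part of FFmpeg:

```python
import re

# Matches any of the three common line endings; "\r\n" is listed first
# so that a CRLF pair is treated as one ending, not two.
_ANY_EOL = re.compile(rb"\r\n|\r|\n")


def to_crlf(data):
    # Rewrite every line ending in the byte string as CRLF.
    return _ANY_EOL.sub(b"\r\n", data)


def convert_file(src_path, dst_path):
    # Read and write in binary mode so Python does no newline translation.
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(to_crlf(data))
```

Working on bytes keeps the text encoding untouched, and running the function twice gives the same result as running it once.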

On the other hand, an option for adding a UTF-8 byte order mark would be useful, as some rare programs ("dsrt", for example) require it in UTF-8 text files.
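
Without such an option, the mark can be prepended afterwards. A minimal Python sketch, assuming the subtitle data is already UTF-8 encoded; add_utf8_bom is a hypothetical helper, not an existing FFmpeg feature:

```python
import codecs


def add_utf8_bom(data):
    # Leave the data alone if it already starts with the UTF-8 BOM
    # (the three bytes EF BB BF); otherwise prepend it.
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data
```

The check at the start avoids writing the mark twice when the input already has one.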

I see that "\r\n" is used between lines of a single subtitle record. I do not know how text subtitles are stored in "mkv", but I guess that the text of a subtitle record in the input file uses "\r\n" to separate lines, and that "ffmpeg" inserts "\n" after the timing lines and subtitle record numbers because my operating system is Linux.
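
One way to check this guess is to list the ending of every line in the extracted file. A minimal Python sketch; the helper names line_endings and summarize are made up for illustration:

```python
import re
from collections import Counter

# "\r\n" comes first so a CRLF pair counts as a single ending.
_EOL = re.compile(rb"\r\n|\r|\n")
_NAMES = {b"\r\n": "CRLF", b"\n": "LF", b"\r": "CR"}


def line_endings(data):
    """Return the ending of each line in order, as 'CRLF', 'LF' or 'CR'."""
    return [_NAMES[m.group()] for m in _EOL.finditer(data)]


def summarize(data):
    """Count how often each line-ending style occurs."""
    return Counter(line_endings(data))
```

If the guess is right, the "CRLF" entries should only appear at positions that fall inside the text of a record, never after a record number or a timing line.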

comment:4 in reply to: ↑ 3 ; follow-up: Changed 7 months ago by cehoyos

  • Cc beroal added

Replying to beroal:

I recently extracted text subtitles from an "mkv" file on Linux, and the result has a mix of "\r\n" and "\n" line endings, which is clearly incorrect.

How can I reproduce this?

comment:5 in reply to: ↑ 4 Changed 7 months ago by beroal

Replying to cehoyos:

Replying to beroal:

I recently extracted text subtitles from an "mkv" file on Linux, and the result has a mix of "\r\n" and "\n" line endings, which is clearly incorrect.

How can I reproduce this?

Run

ffmpeg -i "$INPUT_FILE" -map 0:2 -codec:subtitles subrip "$OUTPUT_FILE"

with INPUT_FILE containing the path to 2D Forced Subtitles Sample #1 (SRT) (from Kodi Samples). Here is the start of OUTPUT_FILE:

0000000    1  \n   0   0   :   0   0   :   0   2   ,   2   5   3       -
0000010    -   >       0   0   :   0   0   :   0   3   ,   4   2   0  \n
0000020    T   h   e   y   '   r   e       h   e   r   e   !  \n  \n   2
0000030   \n   0   0   :   0   0   :   0   4   ,   7   9   7       -   -
0000040    >       0   0   :   0   0   :   0   6   ,   2   1   4  \n   H
0000050    u   r   r   y   ,       M   u   s   e   !  \n  \n   3  \n   0
0000060    0   :   0   0   :   2   5   ,   1   5   1       -   -   >    
0000070    0   0   :   0   0   :   2   8   ,   3   2   0  \n   W   h   a
0000080    t       t   h   e       h   e   l   l       a   r   e       y
0000090    o   u       d   o   i   n   g   ?  \r  \n   W   h   y       a
00000a0    r   e   n   '   t       y   o   u       o   u   t       o   n
00000b0        t   h   e       w   a   t   e   r   ?  \n  \n   4  \n   0

comment:6 Changed 7 months ago by cehoyos

There may be a difference between a linebreak in the subtitle file and a linebreak that is meant to be shown when displaying the subtitles.

The main question is still whether any application meant to read subtitles (read: other than Notepad) has issues reading subtitles written with (current) FFmpeg.

Changed 7 months ago by cehoyos

comment:7 Changed 7 months ago by cehoyos

Apart from a BOM, mkvextract produces the same output as FFmpeg.

When transcoding the attached ass subtitle file to srt, FFmpeg writes \n for the line breaks that structure the srt file and \r\n for the CRLFs that are part of the subtitle text. I don't know if this is expected.
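
The observed output is consistent with a writer that copies the subtitle text verbatim and only adds its own \n around it. The following Python toy model illustrates that assumption; it is not FFmpeg's muxer code, and write_srt with its eol and normalize parameters is hypothetical:

```python
import re


def write_srt(events, eol="\n", normalize=False):
    # events: iterable of (start, end, text) with preformatted timestamps.
    # Toy model only: this is NOT FFmpeg's muxer code.
    parts = []
    for number, (start, end, text) in enumerate(events, start=1):
        if normalize:
            # Rewrite line breaks inside the subtitle text to the file's style.
            text = eol.join(re.split(r"\r\n|\r|\n", text))
        parts.append(f"{number}{eol}{start} --> {end}{eol}{text}{eol}{eol}")
    return "".join(parts)
```

With normalize=False, a \r\n inside the text survives next to the \n written around it, which gives the mix seen in the dumps above. With normalize=True the whole file uses a single style.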
