Opened 5 years ago

Closed 5 years ago

#7661 closed defect (fixed)

SubViewer .sub files with UTF8 encoding are decoded incorrectly

Reported by: lukasf
Owned by:
Priority: normal
Component: avformat
Version: git-master
Keywords: SubViewer
Cc:
Blocked By:
Blocking:
Reproduced by developer: yes
Analyzed by developer: no

Description

Summary of the bug:

When SubViewer subtitles are loaded from an external UTF-8 encoded .sub file (not embedded in a movie), the first subtitle entry is decoded incorrectly.

Looking at the code, the likely cause is that subviewer_read_header() does not skip the UTF-8 byte order mark (BOM) at the start of the file. Skipping the BOM as microdvddec.c does would probably fix this.
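To illustrate the idea, here is a minimal standalone sketch of BOM detection. It is not FFmpeg code: the helper name utf8_bom_length() is hypothetical, and it assumes the start of the file is already in a memory buffer. The only fact it relies on is that the UTF-8 BOM is the byte sequence EF BB BF.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of FFmpeg): returns the number of bytes
 * to skip at the start of buf if it begins with the UTF-8 BOM
 * (EF BB BF), or 0 if no BOM is present.
 */
static size_t utf8_bom_length(const unsigned char *buf, size_t len)
{
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };

    if (len >= sizeof(bom) && memcmp(buf, bom, sizeof(bom)) == 0)
        return sizeof(bom);
    return 0;
}
```

A caller would advance its read position by the returned count before parsing the first line, so that the first line starts with "[INFORMATION]" or a timestamp rather than with the BOM bytes.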

We are using ffmpeg as a library for video playback (with embedded and external subtitles), but the same bug can easily be reproduced from the command line.

I attached two UTF-8 SubViewer files, one with a header and one without. Both produce a broken first entry.

How to reproduce:

% ffmpeg -i SubViewer_Header_UTF8.sub -map 0:s:0 output1.srt
ffmpeg version 4.1 Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 8.2.1 (GCC) 20181017...

Second file:

% ffmpeg -i SubViewer_NoHeader_UTF8.sub -map 0:s:0 output2.srt

Expected output from both files:

1
00:04:35,030 --> 00:04:38,820
Hello guys... please sit down...

2
00:05:00,190 --> 00:05:03,470
M. Franklin,
are you crazy?

Output from SubViewer_Header_UTF8.sub:

1
00:00:00,000 --> 00:00:00,000
[INFORMATION]

2
00:04:35,030 --> 00:04:38,820
Hello guys... please sit down...

3
00:05:00,190 --> 00:05:03,470
M. Franklin,
are you crazy?

Output from SubViewer_NoHeader_UTF8.sub:

1
00:00:00,000 --> 00:00:00,000
00:04:35.03,00:04:38.82
Hello guys... please sit down...

2
00:05:00,190 --> 00:05:03,470
M. Franklin,
are you crazy?
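One plausible reading of the second broken output is that the first timestamp line is not recognized because the BOM bytes precede the digits, so the line is treated as subtitle text with zero timing. The sketch below shows that effect with plain sscanf(). It is an assumption-laden illustration, not the actual FFmpeg parser: parse_subviewer_times() is a hypothetical function, and the format string is only inferred from the "hh:mm:ss.cc,hh:mm:ss.cc" lines shown above.

```c
#include <stdio.h>

/*
 * Hypothetical parser (not the FFmpeg implementation): reads a
 * SubViewer timing line of the form "hh:mm:ss.cc,hh:mm:ss.cc" and
 * stores start and end times in milliseconds.
 * Returns 1 on success, 0 if the line does not match.
 */
static int parse_subviewer_times(const char *line, int *start_ms, int *end_ms)
{
    int h1, m1, s1, c1, h2, m2, s2, c2;

    /* %d fails immediately if the line starts with non-digit bytes
     * such as a UTF-8 BOM, so sscanf() then returns 0. */
    if (sscanf(line, "%d:%d:%d.%d,%d:%d:%d.%d",
               &h1, &m1, &s1, &c1, &h2, &m2, &s2, &c2) != 8)
        return 0;

    /* The fractional field is in hundredths of a second. */
    *start_ms = ((h1 * 60 + m1) * 60 + s1) * 1000 + c1 * 10;
    *end_ms   = ((h2 * 60 + m2) * 60 + s2) * 1000 + c2 * 10;
    return 1;
}
```

With this sketch, "00:04:35.03,00:04:38.82" parses to 275030 ms and 278820 ms, matching the expected 00:04:35,030 --> 00:04:38,820, while the same line prefixed with EF BB BF is rejected.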

Attachments (2)

SubViewer_Header_UTF8.sub (345 bytes) - added by lukasf 5 years ago.
SubViewer_NoHeader_UTF8.sub (119 bytes) - added by lukasf 5 years ago.


Change History (3)

by lukasf, 5 years ago

Attachment: SubViewer_Header_UTF8.sub added

by lukasf, 5 years ago

Attachment: SubViewer_NoHeader_UTF8.sub added

comment:1 by Carl Eugen Hoyos, 5 years ago

Reproduced by developer: set
Resolution: fixed
Status: new → closed