Opened 9 days ago

#7661 new defect

SubViewer .sub files with UTF8 encoding are decoded incorrectly

Reported by: lukasf
Owned by:
Priority: normal
Component: avformat
Version: git-master
Keywords: SubViewer
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:

When SubViewer subtitles are loaded from an external UTF-8 encoded .sub file (not embedded in a movie), they are decoded incorrectly.

Looking at the code, the reason seems to be that the BOM is not skipped in the subviewer_read_header() function. Skipping the BOM like in microdvddec.c would probably fix this.
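A minimal sketch of the kind of BOM skip suggested above. The helper name skip_utf8_bom and the utf8_bom constant are hypothetical and are not taken from the FFmpeg sources; the only fact assumed is that the UTF-8 BOM is the byte sequence EF BB BF.

```c
#include <stddef.h>
#include <string.h>

/* UTF-8 byte order mark: the three bytes EF BB BF. */
static const char utf8_bom[] = "\xEF\xBB\xBF";

/*
 * Hypothetical helper (not an FFmpeg function): returns a pointer just
 * past a leading UTF-8 BOM, or the original pointer if there is none.
 * strncmp stops at a NUL terminator, so lines shorter than three bytes
 * are handled safely.
 */
static const char *skip_utf8_bom(const char *line)
{
    if (strncmp(line, utf8_bom, 3) == 0)
        return line + 3;
    return line;
}
```

Applied to the first line read from the file, a check like this would hand the rest of the parser the same text it sees for a file saved without a BOM.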

We are using ffmpeg as a library for video playback (with embedded and external subtitles), but the same bug can easily be reproduced from the command line.

I attached two UTF-8 SubViewer files, one with a header and one without. Both produce a broken first entry.

How to reproduce:

% ffmpeg -i SubViewer_Header_UTF8.sub -map 0:s:0 output1.srt
ffmpeg version 4.1 Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 8.2.1 (GCC) 20181017...

Second file:

% ffmpeg -i SubViewer_NoHeader_UTF8.sub -map 0:s:0 output2.srt

Expected output from both files:

1
00:04:35,030 --> 00:04:38,820
Hello guys... please sit down...

2
00:05:00,190 --> 00:05:03,470
M. Franklin,
are you crazy?

Output from SubViewer_Header_UTF8.sub:

1
00:00:00,000 --> 00:00:00,000
[INFORMATION]

2
00:04:35,030 --> 00:04:38,820
Hello guys... please sit down...

3
00:05:00,190 --> 00:05:03,470
M. Franklin,
are you crazy?

Output from SubViewer_NoHeader_UTF8.sub:

1
00:00:00,000 --> 00:00:00,000
00:04:35.03,00:04:38.82
Hello guys... please sit down...

2
00:05:00,190 --> 00:05:03,470
M. Franklin,
are you crazy?
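One plausible reading of the output above is that the BOM bytes in front of the first timing line make the timestamp pattern fail to match, so the line falls through and is emitted as subtitle text. The sketch below illustrates that effect with a hypothetical parse_timing function; it is not the parser used in FFmpeg, and the assumption that the fractional field is in hundredths of a second is inferred only from the sample input and expected output in this ticket.

```c
#include <stdio.h>

/*
 * Hypothetical timing-line parser (not FFmpeg code).
 * Expects "HH:MM:SS.cc,HH:MM:SS.cc" and writes start/end in milliseconds.
 * Returns 1 on success, 0 if the line does not match the pattern.
 */
static int parse_timing(const char *line, int *start_ms, int *end_ms)
{
    int h1, m1, s1, c1, h2, m2, s2, c2;

    if (sscanf(line, "%d:%d:%d.%d,%d:%d:%d.%d",
               &h1, &m1, &s1, &c1, &h2, &m2, &s2, &c2) != 8)
        return 0;

    /* Assumption: the fractional field is hundredths of a second. */
    *start_ms = ((h1 * 60 + m1) * 60 + s1) * 1000 + c1 * 10;
    *end_ms   = ((h2 * 60 + m2) * 60 + s2) * 1000 + c2 * 10;
    return 1;
}
```

With this sketch, "00:04:35.03,00:04:38.82" parses to 275030 ms and 278820 ms, while the same line preceded by the bytes EF BB BF is rejected because the first %d conversion cannot match.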

Attachments (2)

SubViewer_Header_UTF8.sub (345 bytes) - added by lukasf 9 days ago.
SubViewer_NoHeader_UTF8.sub (119 bytes) - added by lukasf 9 days ago.


Change History (2)

Changed 9 days ago by lukasf

Changed 9 days ago by lukasf
