Opened 6 years ago

Closed 5 years ago

Last modified 5 years ago

#3496 closed enhancement (fixed)

Support UTF-16 subtitles

Reported by: klpu
Owned by:
Priority: normal
Component: avformat
Version: git-master
Keywords: sub utf16
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:
How to reproduce:

% ffmpeg -i *.ass

  ffmpeg version N-61585-ga1ce776 Copyright (c) 2000-2014 the FFmpeg developers
  built on Mar 20 2014 11:54:54 with gcc 4.8 (Ubuntu/Linaro 4.8.1-10ubuntu9)
  configuration: --enable-libfdk-aac --enable-libx264 --enable-openssl --enable-gpl --enable-nonfree --enable-librtmp --enable-x11grab
  libavutil      52. 67.100 / 52. 67.100
  libavcodec     55. 52.102 / 55. 52.102
  libavformat    55. 34.101 / 55. 34.101
  libavdevice    55. 11.100 / 55. 11.100
  libavfilter     4.  3.100 /  4.  3.100
  libswscale      2.  5.101 /  2.  5.101
  libswresample   0. 18.100 /  0. 18.100
  libpostproc    52.  3.100 / 52.  3.100
[mp3 @ 0x2ef99c0] Format mp3 detected only with low score of 1, misdetection possible!
[mp3 @ 0x2ef99c0] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from 'The.Wolf.of.Wall.Street.2013.720p.BluRay.X264-AMIABLE .·±Ìå.ass':
  Duration: 00:00:26.35, start: 0.000000, bitrate: 160 kb/s
    Stream #0:0: Audio: mp3, 32000 Hz, stereo, s16p, 160 kb/s
At least one output file must be specified

Download the subtitle from https://bbs.vitamio.org/files/5326ae02421aa98fcb000752?locale=en&version=origin

Attachments (1)

The.Wolf.of.Wall.Street.2013.720p.BluRay.X264-AMIABLE .·±Ìå.ass (514.7 KB) - added by klpu 6 years ago.


Change History (8)

comment:1 Changed 6 years ago by klpu

FFmpeg cannot detect little-endian UTF-16 Unicode text subtitles.

comment:2 Changed 6 years ago by ubitux

  • Status changed from new to open
  • Summary changed from "FFmpeg dont detect correct subtitle" to "Support UTF-16 subtitles"
  • Type changed from defect to enhancement
00000000  ff fe 5b 00 53 00 63 00  72 00 69 00 70 00 74 00  |..[.S.c.r.i.p.t.|
00000010  20 00 49 00 6e 00 66 00  6f 00 5d 00 0d 00 0a 00  | .I.n.f.o.].....|
00000020  3b 00 20 00 53 00 63 00  72 00 69 00 70 00 74 00  |;. .S.c.r.i.p.t.|

Note: the link in your description points to a .srt file.
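
The hex dump in this comment begins with the bytes ff fe, which is the UTF-16 little-endian byte order mark (BOM). As an illustration only, here is a minimal Python sketch of BOM-based sniffing. It is not FFmpeg's code; the function name `sniff_bom` is hypothetical, and the sketch assumes the input is not UTF-32 (whose little-endian BOM also starts with ff fe).

```python
import codecs
from typing import Optional


def sniff_bom(head: bytes) -> Optional[str]:
    """Return a Python codec name based on a leading BOM, or None if absent.

    Hypothetical helper for illustration; assumes the data is not UTF-32.
    """
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if head.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if head.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return None


# The first 16 bytes of the hex dump in this comment.
head = bytes.fromhex("fffe5b00530063007200690070007400")
codec = sniff_bom(head)

# Skip the 2-byte BOM before decoding the remaining bytes.
text = head[2:].decode(codec)
```

A file without a BOM yields `None` here, so BOM sniffing alone cannot cover every UTF-16 file; a probe still has to try the candidate encodings.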


comment:3 Changed 6 years ago by gjdfgh

This should be straightforward to implement:

  1. Add a readline function that can convert UTF-16 to UTF-8 on the fly, but can also read UTF-8 as-is, depending on a function parameter.
  2. In the probe function, try all three fundamental encodings: 8-bit (code page/multibyte, ASCII-compatible), UTF-16BE, and UTF-16LE.

This is the approach used by mplayer, and it would not need any complicated charset detection or conversion code.

If nobody objects, I could write a patch.
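
The two steps proposed in this comment can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the patch that was eventually written and not FFmpeg's API: the names `read_lines_utf8` and `probe_ass`, the `"8bit"` label, and the choice of `[Script Info]` as the ASS signature line are all hypothetical. For brevity it decodes the whole buffer at once rather than truly on the fly.

```python
import codecs
from typing import List, Optional


def read_lines_utf8(data: bytes, encoding: str) -> List[bytes]:
    """Split data into lines, returned as UTF-8 (or unchanged 8-bit) bytes.

    encoding is "8bit", "utf-16-le" or "utf-16-be" (hypothetical labels).
    """
    if encoding == "8bit":
        # ASCII-compatible input (UTF-8 or a legacy code page): pass the
        # bytes through unchanged, dropping only a leading UTF-8 BOM.
        if data.startswith(codecs.BOM_UTF8):
            data = data[len(codecs.BOM_UTF8):]
    else:
        # UTF-16 input: decode, drop a leading BOM, re-encode as UTF-8.
        # Raises UnicodeDecodeError if the bytes are not valid UTF-16.
        text = data.decode(encoding)
        if text.startswith("\ufeff"):
            text = text[1:]
        data = text.encode("utf-8")
    # bytes.splitlines() splits on \n, \r and \r\n.
    return data.splitlines()


def probe_ass(data: bytes) -> Optional[str]:
    """Try the three fundamental encodings; return the first that matches."""
    for encoding in ("8bit", "utf-16-le", "utf-16-be"):
        try:
            lines = read_lines_utf8(data, encoding)
        except UnicodeDecodeError:
            continue  # not valid in this encoding, try the next one
        if lines and lines[0].strip() == b"[Script Info]":
            return encoding
    return None


# Example input shaped like the attached file: BOM, then UTF-16LE text.
sample_le = "\ufeff[Script Info]\r\n; Script\r\n".encode("utf-16-le")
detected = probe_ass(sample_le)
```

Because the reader takes the encoding as a parameter, the same parsing code can run unchanged once the probe has picked an encoding, which is the point of the proposal.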

comment:4 Changed 6 years ago by ubitux

  • Keywords subtitles added; subtitle removed

comment:5 Changed 6 years ago by cehoyos

  • Keywords sub added; subtitles removed

comment:6 Changed 5 years ago by ubitux

  • Resolution set to fixed
  • Status changed from open to closed

comment:7 Changed 5 years ago by cehoyos

  • Keywords utf16 added