Opened 11 years ago

Last modified 9 years ago

#3796 new enhancement

ffmpeg should automatically determine subtitle text encoding

Reported by: julian
Owned by:
Priority: wish
Component: undetermined
Version: 2.3
Keywords:
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

ffmpeg currently cannot auto-detect the text encoding of subtitle files. If the correct "sub_charenc" option is not specified, it fails to convert some of the subtitles. VLC and other players detect the subtitle encoding automatically, without manual help.

The converted file is missing 11 of 18 subtitles. VLC plays the original file with the external SRT file using the right encoding.

% ffmpeg -i Film.mkv -i Film.srt -acodec copy -vcodec copy -scodec mov_text test.mp4
ffmpeg version 2.3-tessus Copyright (c) 2000-2014 the FFmpeg developers
  built on Jul 17 2014 22:19:03 with clang version 3.3 (tags/RELEASE_33/final)
  configuration: --cc=/opt/local/bin/clang-mp-3.3 --prefix=/Users/tessus/data/ext/ffmpeg/sw --as=yasm --extra-version=tessus --disable-shared --enable-static --disable-ffplay --enable-gpl --enable-pthreads --enable-postproc --enable-libmp3lame --enable-libtheora --enable-libvorbis --enable-libx264 --enable-libx265 --enable-libxvid --enable-libspeex --enable-bzlib --enable-zlib --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libxavs --enable-version3 --enable-libvo-aacenc --enable-libvo-amrwbenc --enable-libvpx --enable-libgsm --enable-libopus --enable-libmodplug --enable-fontconfig --enable-libfreetype --enable-libass --enable-libbluray --enable-filters --disable-indev=qtkit --enable-runtime-cpudetect
  libavutil      52. 92.100 / 52. 92.100
  libavcodec     55. 69.100 / 55. 69.100
  libavformat    55. 48.100 / 55. 48.100
  libavdevice    55. 13.102 / 55. 13.102
  libavfilter     4. 11.100 /  4. 11.100
  libswscale      2.  6.100 /  2.  6.100
  libswresample   0. 19.100 /  0. 19.100
  libpostproc    52.  3.100 / 52.  3.100
Input #0, matroska,webm, from 'Film.mkv':
  Metadata:
    COMPATIBLE_BRANDS: mp42avc1
    MAJOR_BRAND     : mp42
    MINOR_VERSION   : 1
    ENCODER         : Lavf55.33.100
  Duration: 00:01:00.03, start: 0.033000, bitrate: 636 kb/s
    Stream #0:0(eng): Video: h264 (Main), yuv420p(tv, smpte170m), 640x480, SAR 1:1 DAR 4:3, 30 fps, 30 tbr, 1k tbn, 2k tbc (default)
    Metadata:
      CREATION_TIME   : 2014-07-23 08:48:44
      LANGUAGE        : eng
      HANDLER_NAME    : Apple Video Mediensteuerung
    Stream #0:1(eng): Audio: aac, 44100 Hz, stereo, fltp (default)
    Metadata:
      CREATION_TIME   : 2014-07-23 08:48:44
      LANGUAGE        : eng
      HANDLER_NAME    : Apple Ton Mediensteuerung
Input #1, srt, from 'Film.srt':
  Duration: N/A, bitrate: N/A
    Stream #1:0: Subtitle: subrip
Output #0, mp4, to 'test.mp4':
  Metadata:
    COMPATIBLE_BRANDS: mp42avc1
    MAJOR_BRAND     : mp42
    MINOR_VERSION   : 1
    encoder         : Lavf55.48.100
    Stream #0:0(eng): Video: h264 ([33][0][0][0] / 0x0021), yuv420p, 640x480 [SAR 1:1 DAR 4:3], q=2-31, 30 fps, 16k tbn, 1k tbc (default)
    Metadata:
      CREATION_TIME   : 2014-07-23 08:48:44
      LANGUAGE        : eng
      HANDLER_NAME    : Apple Video Mediensteuerung
    Stream #0:1(eng): Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo (default)
    Metadata:
      CREATION_TIME   : 2014-07-23 08:48:44
      LANGUAGE        : eng
      HANDLER_NAME    : Apple Ton Mediensteuerung
    Stream #0:2: Subtitle: mov_text ([8][0][0][0] / 0x0008)
    Metadata:
      encoder         : Lavc55.69.100 mov_text
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
  Stream #0:1 -> #0:1 (copy)
  Stream #1:0 -> #0:2 (subrip (native) -> mov_text (native))
Press [q] to stop, [?] for help
[mp4 @ 0x10283a000] Non-monotonous DTS in output stream 0:0; previous: 3728, current: 3728; changing to 3729. This may result in incorrect timestamps in the output file.
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
frame= 1800 fps=0.0 q=-1.0 Lsize=    4703kB time=00:00:59.97 bitrate= 642.4kbits/s    
video:3905kB audio:723kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.625584%
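
To see which lines of the subtitle file trigger the "Invalid UTF-8" message in the log above, each line can be checked on its own. This is only a sketch: `invalid_utf8_lines` is a hypothetical helper, not part of ffmpeg, and it assumes `iconv` is installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of ffmpeg): print the number of every line
# in the given file that is not valid UTF-8.
# Assumption: iconv is available and rejects invalid UTF-8 input.
invalid_utf8_lines() {
    n=0
    # "|| [ -n "$line" ]" also handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        if ! printf '%s\n' "$line" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
            echo "$n"
        fi
    done < "$1"
}

# Example (needs the attached file): invalid_utf8_lines Film.srt
```

The line numbers it prints refer to lines of the .srt file, not to subtitle indices.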

Attachments (1)

Film.srt (1.4 KB) - added by Carl Eugen Hoyos 11 years ago.


Change History (8)

comment:2 by julian, 11 years ago

I know the conversion works fine when specifying '-sub_charenc CP1252', but if other players can auto-detect the encoding, ffmpeg should be able to do the same.
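
Until ffmpeg detects the encoding itself, the manual workaround can be scripted. The sketch below is not how VLC or any other player detects encodings. `guess_charenc` is a hypothetical helper that assumes `iconv` is installed and that any file which is not valid UTF-8 is CP1252. That assumption holds for the attached Film.srt but is only a guess for other files.

```shell
#!/bin/sh
# Hypothetical helper (not part of ffmpeg): print a value for -sub_charenc.
# Assumption: a file that fails UTF-8 validation is encoded as CP1252.
guess_charenc() {
    if iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1; then
        echo UTF-8
    else
        echo CP1252
    fi
}

# Example (needs ffmpeg and the files from this ticket). -sub_charenc is an
# input option, so it goes before the -i of the subtitle file:
# ffmpeg -i Film.mkv -sub_charenc "$(guess_charenc Film.srt)" -i Film.srt \
#     -acodec copy -vcodec copy -scodec mov_text test.mp4
```

A missing or unreadable file also falls through to CP1252, so check that the file exists before relying on the result.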

comment:3 by Carl Eugen Hoyos, 11 years ago

Priority: normal → wish

Is this problem reproducible with current FFmpeg git head?

Last edited 11 years ago by Carl Eugen Hoyos

by Carl Eugen Hoyos, 11 years ago

Attachment: Film.srt added

comment:4 by julian, 11 years ago

If you think anything has changed in this regard, I could take the time to compile git head and try it there...

comment:5 by julian, 9 years ago

I had to change the location of the uploaded files needed to reproduce the issue:

http://www.filehosting.org/file/details/501405/ffmpeg_3796_subtitle.zip

comment:6 by Carl Eugen Hoyos, 9 years ago

Is this issue reproducible with current FFmpeg git head?

comment:7 by julian, 9 years ago

Yes, I just reproduced it with ffmpeg version N-74430-ge322b70-tessus.
