Opened 4 years ago

Last modified 3 years ago

#3796 new enhancement

ffmpeg should automatically determine subtitle text encoding

Reported by: julian
Owned by:
Priority: wish
Component: undetermined
Version: 2.3
Keywords:
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

ffmpeg currently cannot auto-detect the encoding of subtitle files. If the correct "sub_charenc" option is not specified, it fails to convert some of the subtitles. VLC and other players detect the subtitle encoding automatically, without manual help.

The converted file is missing 11 of 18 subtitles. VLC plays the original file with the external SRT file using the right encoding.

% ffmpeg -i Film.mkv -i Film.srt -acodec copy -vcodec copy -scodec mov_text test.mp4
ffmpeg version 2.3-tessus Copyright (c) 2000-2014 the FFmpeg developers
  built on Jul 17 2014 22:19:03 with clang version 3.3 (tags/RELEASE_33/final)
  configuration: --cc=/opt/local/bin/clang-mp-3.3 --prefix=/Users/tessus/data/ext/ffmpeg/sw --as=yasm --extra-version=tessus --disable-shared --enable-static --disable-ffplay --enable-gpl --enable-pthreads --enable-postproc --enable-libmp3lame --enable-libtheora --enable-libvorbis --enable-libx264 --enable-libx265 --enable-libxvid --enable-libspeex --enable-bzlib --enable-zlib --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libxavs --enable-version3 --enable-libvo-aacenc --enable-libvo-amrwbenc --enable-libvpx --enable-libgsm --enable-libopus --enable-libmodplug --enable-fontconfig --enable-libfreetype --enable-libass --enable-libbluray --enable-filters --disable-indev=qtkit --enable-runtime-cpudetect
  libavutil      52. 92.100 / 52. 92.100
  libavcodec     55. 69.100 / 55. 69.100
  libavformat    55. 48.100 / 55. 48.100
  libavdevice    55. 13.102 / 55. 13.102
  libavfilter     4. 11.100 /  4. 11.100
  libswscale      2.  6.100 /  2.  6.100
  libswresample   0. 19.100 /  0. 19.100
  libpostproc    52.  3.100 / 52.  3.100
Input #0, matroska,webm, from 'Film.mkv':
  Metadata:
    COMPATIBLE_BRANDS: mp42avc1
    MAJOR_BRAND     : mp42
    MINOR_VERSION   : 1
    ENCODER         : Lavf55.33.100
  Duration: 00:01:00.03, start: 0.033000, bitrate: 636 kb/s
    Stream #0:0(eng): Video: h264 (Main), yuv420p(tv, smpte170m), 640x480, SAR 1:1 DAR 4:3, 30 fps, 30 tbr, 1k tbn, 2k tbc (default)
    Metadata:
      CREATION_TIME   : 2014-07-23 08:48:44
      LANGUAGE        : eng
      HANDLER_NAME    : Apple Video Mediensteuerung
    Stream #0:1(eng): Audio: aac, 44100 Hz, stereo, fltp (default)
    Metadata:
      CREATION_TIME   : 2014-07-23 08:48:44
      LANGUAGE        : eng
      HANDLER_NAME    : Apple Ton Mediensteuerung
Input #1, srt, from 'Film.srt':
  Duration: N/A, bitrate: N/A
    Stream #1:0: Subtitle: subrip
Output #0, mp4, to 'test.mp4':
  Metadata:
    COMPATIBLE_BRANDS: mp42avc1
    MAJOR_BRAND     : mp42
    MINOR_VERSION   : 1
    encoder         : Lavf55.48.100
    Stream #0:0(eng): Video: h264 ([33][0][0][0] / 0x0021), yuv420p, 640x480 [SAR 1:1 DAR 4:3], q=2-31, 30 fps, 16k tbn, 1k tbc (default)
    Metadata:
      CREATION_TIME   : 2014-07-23 08:48:44
      LANGUAGE        : eng
      HANDLER_NAME    : Apple Video Mediensteuerung
    Stream #0:1(eng): Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo (default)
    Metadata:
      CREATION_TIME   : 2014-07-23 08:48:44
      LANGUAGE        : eng
      HANDLER_NAME    : Apple Ton Mediensteuerung
    Stream #0:2: Subtitle: mov_text ([8][0][0][0] / 0x0008)
    Metadata:
      encoder         : Lavc55.69.100 mov_text
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
  Stream #0:1 -> #0:1 (copy)
  Stream #1:0 -> #0:2 (subrip (native) -> mov_text (native))
Press [q] to stop, [?] for help
[mp4 @ 0x10283a000] Non-monotonous DTS in output stream 0:0; previous: 3728, current: 3728; changing to 3729. This may result in incorrect timestamps in the output file.
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
[subrip @ 0x102839a00] Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option
Error while decoding stream #1:0: Invalid data found when processing input
frame= 1800 fps=0.0 q=-1.0 Lsize=    4703kB time=00:00:59.97 bitrate= 642.4kbits/s    
video:3905kB audio:723kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.625584%
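
The "Invalid UTF-8" errors in the log above mean that some subtitle bytes do not decode as UTF-8. As a rough illustration of the kind of detection requested here, the following is a minimal sketch, not FFmpeg's or VLC's actual logic: it tries a strict UTF-8 decode first and only then falls back to a list of candidate encodings. The function name `guess_subtitle_encoding` and the default fallback list are assumptions made for this example.

```python
def guess_subtitle_encoding(data: bytes, fallbacks=("CP1252",)) -> str:
    """Guess the text encoding of raw subtitle bytes.

    Hypothetical helper for illustration only: returns "UTF-8" when the
    bytes decode strictly as UTF-8, otherwise the first fallback encoding
    that decodes without error.
    """
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        data.decode("utf-8")
        return "UTF-8"
    except UnicodeDecodeError:
        pass

    for name in fallbacks:
        try:
            data.decode(name)
            return name
        except UnicodeDecodeError:
            continue

    raise ValueError("could not determine subtitle encoding")


if __name__ == "__main__":
    # "Grüße" encoded as CP1252 is not valid UTF-8, so the fallback is used.
    print(guess_subtitle_encoding(b"Gr\xfc\xdfe"))
    # The same word encoded as UTF-8 is accepted directly.
    print(guess_subtitle_encoding(b"Gr\xc3\xbc\xc3\x9fe"))
```

A single-byte fallback such as CP1252 accepts almost any byte sequence, so this heuristic can only say "not UTF-8"; choosing between several legacy encodings would need statistical detection, which is outside this sketch.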

Attachments (1)

Film.srt (1.4 KB) - added by cehoyos 4 years ago.


Change History (8)

comment:2 Changed 4 years ago by julian

I know the conversion works fine when specifying '-sub_charenc CP1252', but if other players can autodetect the encoding, ffmpeg should be able to do the same.
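
For reference, the manual workaround can be sketched as a small helper that builds the command line from the report with '-sub_charenc' added. This is only an illustration: the helper name `build_ffmpeg_args` is hypothetical, and placing '-sub_charenc' directly before the '-i' of the subtitle file assumes that it is treated as an input option applying to the next input.

```python
def build_ffmpeg_args(video, subs, output, charenc=None):
    """Build the ffmpeg argument list used in this ticket.

    Hypothetical helper for illustration: when charenc is given,
    -sub_charenc is inserted just before the subtitle input, on the
    assumption that it applies to the input file that follows it.
    """
    args = ["ffmpeg", "-i", video]
    if charenc:
        args += ["-sub_charenc", charenc]
    args += [
        "-i", subs,
        "-acodec", "copy",
        "-vcodec", "copy",
        "-scodec", "mov_text",
        output,
    ]
    return args


if __name__ == "__main__":
    # Print the command instead of running it, so no ffmpeg binary is needed.
    print(" ".join(build_ffmpeg_args("Film.mkv", "Film.srt", "test.mp4", "CP1252")))
```

The list form can be passed to `subprocess.run` as is, which avoids shell quoting problems with file names containing spaces.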

comment:3 Changed 4 years ago by cehoyos

  • Priority changed from normal to wish

Is this problem reproducible with current FFmpeg git head?



comment:4 Changed 4 years ago by julian

If you think anything has changed in this regard, I could take the time to compile git head and try there...

comment:5 Changed 3 years ago by julian

I had to move the uploaded files for reproducing the issue to a new location:

http://www.filehosting.org/file/details/501405/ffmpeg_3796_subtitle.zip

comment:6 Changed 3 years ago by cehoyos

Is this issue reproducible with current FFmpeg git head?

comment:7 Changed 3 years ago by julian

Yes, I just reproduced it with ffmpeg version N-74430-ge322b70-tessus.
