Opened 5 months ago

Closed 5 months ago

Last modified 5 months ago

#7203 closed defect (invalid)

Problem with encoding type "Cyrillic (DOS)" with metadata delivery.

Reported by: max79
Owned by:
Priority: normal
Component: undetermined
Version: unspecified
Keywords: id3v2
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:

The TITLE metadata tag should look like this: "Евангелие о следовании за Христом" for my example below.

To reproduce:

ffmpeg version N-91024-g293a6e8332 Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 7.3.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-bzlib --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth
  libavutil      56. 18.100 / 56. 18.100
  libavcodec     58. 19.101 / 58. 19.101
  libavformat    58. 13.102 / 58. 13.102
  libavdevice    58.  4.100 / 58.  4.100
  libavfilter     7. 21.100 /  7. 21.100
  libswscale      5.  2.100 /  5.  2.100
  libswresample   3.  2.100 /  3.  2.100
  libpostproc    55.  2.100 / 55.  2.100
[mp3 @ 0000000000498380] Skipping 313 bytes of junk at 4096.
[mp3 @ 0000000000498380] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from 'https://audio3.azbyka.ru/Svjashhennoe_pisanie/nikola_serbski_tolk/01_Evangelie_o_sledovanii_za_Xristom.mp3':
  Metadata:
    title           : ├Е├в├а├н├г├е├л├и├е ├о ├▒├л├е├д├о├в├а├н├и├и ├з├а ├Х├░├и├▒├▓├о├м
    genre           : Other
    album           : ├Б├е├▒├е├д├╗ ├н├а ├Е├в├а├н├г├е├л├и├┐ 1├╖.
    track           : 1
    artist          : ├С├в├┐├▓├и├▓├е├л├╝ ├Н├и├к├о├л├а├й ├С├е├░├б├▒├к├и├й
    id3v2_priv.WM/MediaClassSecondaryID: \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
    id3v2_priv.WM/MediaClassPrimaryID: \xbc}`\xd1#\xe3\xe2K\x86\xa1H\xa4*(D\x1e
    id3v2_priv.AverageLevel: {\x04\x00\x00
    id3v2_priv.PeakValue: !\x00\x00\x00
  Duration: 00:39:40.34, start: 0.000000, bitrate: 96 kb/s
    Stream #0:0: Audio: mp3, 44100 Hz, mono, fltp, 96 kb/s
At least one output file must be specified

Attachments (1)

01-vstuplenie.mp3 (1.0 MB) - added by max79 5 months ago.
Problem with encoding type "Cyrillic (DOS)" with metadata delivery.


Change History (8)

comment:1 Changed 5 months ago by cehoyos

  • Component changed from ffmpeg to undetermined
  • Priority changed from important to normal

Please provide an input sample.

Changed 5 months ago by max79

Problem with encoding type "Cyrillic (DOS)" with metadata delivery.

comment:2 Changed 5 months ago by max79

Please see the attachment. The TITLE metadata tag from this example should look like this:

"Дорожка 1"

but now it looks like:

title : ├Д├о├░├о├ж├к├а 1

You can also download it from here: https://azbyka.ru/audio/audio1/Zhitija-i-tvorenija-svjatykh/Nikolaj-Serbskij/molytvy-na-ozere/01-vstuplenie.mp3

comment:3 Changed 5 months ago by mkver

This file uses ID3v2.3 tags. The TIT2 frame (the frame containing the title) is as follows in hex: 0x54 49 54 32 00 00 00 0A 00 00 00 C4 EE F0 EE E6 EA E0 20 31. According to the standard, the 0x00 encoding byte that follows the length field and the two flag bytes indicates that the text is encoded as ISO-8859-1, an encoding that does not contain Cyrillic characters. For such purposes Unicode could (and should) be used, but isn't. This is a bug in the tool that created the file, not in FFmpeg.
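To make the byte layout concrete, here is a minimal sketch that splits the quoted frame into its fields. It assumes the ID3v2.3 frame layout (4-byte ID, 4-byte big-endian size, 2 flag bytes, then the body whose first byte selects the text encoding); the variable names and the small `ENCODINGS` table are illustrative and not taken from FFmpeg.

```python
import struct

# The TIT2 frame exactly as quoted in the comment above.
frame = bytes.fromhex(
    "54 49 54 32 00 00 00 0A 00 00 00 C4 EE F0 EE E6 EA E0 20 31"
)

# ID3v2.3 frame header: 4-byte ID, 4-byte big-endian size, 2 flag bytes.
frame_id, size, flags = struct.unpack(">4sIH", frame[:10])

# The body is `size` bytes long; its first byte is the text-encoding marker.
body = frame[10:10 + size]
encoding_byte, text = body[0], body[1:]

# ID3v2.3 defines two text encodings (illustrative lookup table).
ENCODINGS = {0x00: "latin-1", 0x01: "utf-16"}
declared = ENCODINGS[encoding_byte]

print(frame_id, size, flags, encoding_byte, declared, len(text))
```

The size of 10 covers the encoding byte plus the nine title bytes, and the encoding byte of 0x00 is what makes a reader treat those nine bytes as ISO-8859-1.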
Btw: The last nine bytes are the actual title; in Windows-1251 they read as "Дорожка 1", in the Cyrillic DOS code page 866 that you are referring to they read as "─юЁюцър 1", and in ISO-8859-1 they read as "Äîðîæêà 1". FFmpeg interprets the bytes as ISO-8859-1, as the tag declares, and encodes its console output as UTF-8, but cmd.exe (which you seem to use) expects applications to use the native legacy code page of the system (for Russian Windows versions, this is usually code page 866; cmd.exe is, by the way, Unicode compatible). The UTF-8 that FFmpeg writes to the console is 0xC3 84 C3 AE C3 B0 C3 AE C3 A6 C3 AA C3 A0 20 31. In CP 866, 0xC3 is "├" whereas 0x84 is "Д", which is why the output starts with "├Д". That six of the seven letters of the word (seem to) have been preserved has no deeper meaning; it is accidental.
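The chain of reinterpretations described above can be reproduced with a short sketch using Python's standard codecs (`cp1251`, `latin-1`, `cp866`). The `repair` helper is hypothetical and only inverts the two misreadings for text copied from such a console; it is not something FFmpeg provides.

```python
# The nine title bytes from the TIT2 frame quoted above.
raw = bytes.fromhex("C4 EE F0 EE E6 EA E0 20 31")

# Three readings of the same bytes.
intended = raw.decode("cp1251")   # "Дорожка 1": what the tagging tool meant
declared = raw.decode("latin-1")  # "Äîðîæêà 1": what the encoding byte claims
as_cp866 = raw.decode("cp866")    # "─юЁюцър 1": the "Cyrillic (DOS)" reading

# The ISO-8859-1 reading is written out as UTF-8 ...
utf8_out = declared.encode("utf-8")
# ... and a console using CP 866 shows each UTF-8 byte as one character.
console = utf8_out.decode("cp866")  # "├Д├о├░├о├ж├к├а 1"


def repair(console_text: str) -> str:
    """Hypothetical helper: undo both misreadings in reverse order."""
    utf8_bytes = console_text.encode("cp866")             # back to FFmpeg's output bytes
    tag_bytes = utf8_bytes.decode("utf-8").encode("latin-1")  # back to the raw tag bytes
    return tag_bytes.decode("cp1251")                     # read them as intended


assert repair(console) == intended
```

Applied to the title from the original report, `repair("├Е├в├а├н├г├е├л├и├е ├о ├▒├л├е├д├о├в├а├н├и├и ├з├а ├Х├░├и├▒├▓├о├м")` yields the expected "Евангелие о следовании за Христом". This only works because the tag bytes happen to be Windows-1251; that is an assumption about the file, not something the tag itself states.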

Last edited 5 months ago by mkver

comment:4 Changed 5 months ago by max79

Thank you for the clarification. But if it's not an FFmpeg bug, why didn't you close this ticket?

comment:5 follow-up: Changed 5 months ago by mkver

I am not an FFmpeg developer, but just an interested visitor to this board. I don't know whether I am allowed to close tickets and therefore I leave it to the real guys in charge.

comment:6 Changed 5 months ago by cehoyos

  • Keywords id3v2 added
  • Resolution set to invalid
  • Status changed from new to closed

comment:7 in reply to: ↑ 5 Changed 5 months ago by llogan

Replying to mkver:

I am not an FFmpeg developer, but just an interested visitor to this board. I don't know whether I am allowed to close tickets and therefore I leave it to the real guys in charge.

Please go ahead and close tickets if you think they should be closed, and explain why they should be closed (like you did in comment:3). If someone disagrees, they can always add a comment and/or re-open the ticket.
