Opened 10 years ago
Last modified 7 years ago
#3718 open enhancement
ffmpeg does not correctly read input text file.
Reported by: | Maxwell175 | Owned by: | |
---|---|---|---|
Priority: | wish | Component: | avformat |
Version: | git-master | Keywords: | concat |
Cc: | | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no | | |
Description
Summary of the bug:
How to reproduce:
    > ffmpeg -f concat -i tmp.txt -c copy output.wav
    ffmpeg version N-60592-gfd982f2 Copyright (c) 2000-2014 the FFmpeg developers
      built on Feb 13 2014 22:01:02 with gcc 4.8.2 (GCC)
      configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libcaca --enable-libfreetype --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libschroedinger --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-aacenc --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libx264 --enable-libxavs --enable-libxvid --enable-zlib
      libavutil      52. 63.101 / 52. 63.101
      libavcodec     55. 52.101 / 55. 52.101
      libavformat    55. 32.101 / 55. 32.101
      libavdevice    55.  9.100 / 55.  9.100
      libavfilter     4.  1.102 /  4.  1.102
      libswscale      2.  5.101 /  2.  5.101
      libswresample   0. 17.104 /  0. 17.104
      libpostproc    52.  3.100 / 52.  3.100
    [concat @ 003b36e0] Line 1: unknown keyword 'file'
    tmp.txt: Invalid data found when processing input
This is the Windows Zeranoe Build downloaded from here: http://ffmpeg.zeranoe.com/builds/win32/static/ffmpeg-20140612-git-3a1c895-win32-static.7z
The file is written from a self-made Visual Basic program using the method described here: http://msdn.microsoft.com/en-us/library/ms128035(v=vs.110).aspx. As you can see under the Remarks section, it uses the UTF-8 encoding.
It turns out that this method also writes three extra bytes at the start of the file: ef bb bf. This seems to throw off FFmpeg, which then gives the error above.
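As a workaround on the file-producing side, the leading bytes can be checked for and removed before the list is handed to ffmpeg. The following is a minimal Python sketch, not part of FFmpeg; the function names `has_utf8_bom` and `strip_utf8_bom` and the sample file contents are hypothetical.

```python
import codecs
import os
import tempfile


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 BOM (EF BB BF)."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


def strip_utf8_bom(path):
    """Rewrite the file in place without a leading UTF-8 BOM, if one is present."""
    with open(path, "rb") as f:
        data = f.read()
    if data.startswith(codecs.BOM_UTF8):
        with open(path, "wb") as f:
            f.write(data[len(codecs.BOM_UTF8):])


if __name__ == "__main__":
    # Hypothetical list file, created here only for the demonstration.
    path = os.path.join(tempfile.mkdtemp(), "tmp.txt")
    with open(path, "wb") as f:
        f.write(codecs.BOM_UTF8 + b"file 'a.wav'\nfile 'b.wav'\n")
    print(has_utf8_bom(path))
    strip_utf8_bom(path)
    print(has_utf8_bom(path))
```

Working on raw bytes keeps the rest of the file untouched, including its line endings.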
Attachments (1)
Change History (10)
by , 10 years ago
comment:1 by , 10 years ago
Component: | undetermined → avformat |
---|---|
Keywords: | concat added |
Resolution: | → worksforme |
Status: | new → closed |
Version: | unspecified → git-master |
Sounds as if FFmpeg behaves as expected.
Or is there anything in the documentation that implies that you may put random bytes in front of the file keyword?
comment:2 by , 10 years ago
I am NOT putting in random bytes!
http://www.pcreview.co.uk/forums/extra-characters-beginning-file-ef-bb-bf-t3902307.html
The 2nd post there clearly states that is a "byte-order mark (BOM)" and I think that this SHOULD be supported.
Also see this page: http://www.unicode.org/faq/utf_bom.html#bom1. As you can see there, it is an official spec.
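To make the byte sequences concrete, here is a small Python sketch that recognises the common BOMs at the start of a byte string. It relies on the constants in Python's standard `codecs` module; the function name `detect_bom` and the returned encoding labels are choices made for this illustration only.

```python
import codecs

# Checked in this order because the UTF-32 LE BOM (FF FE 00 00)
# begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data):
    """Return (encoding, bom_length) for a leading BOM, or (None, 0)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0
```

For the file in this ticket, `detect_bom` would report a three-byte UTF-8 BOM. Input with no BOM yields `(None, 0)`, which says nothing about its actual encoding.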
comment:3 by , 10 years ago
Resolution: | worksforme |
---|---|
Status: | closed → reopened |
follow-up: 5 comment:4 by , 10 years ago
Priority: | normal → wish |
---|---|
Type: | defect → enhancement |
A byte order mark is an invisible neutral character as human-readable text goes, but it is nonetheless a character, and therefore included in a computer-readable text. Supporting it would be possible, but very low on my priority list, and only if it can be done in a generic way that does not require changing all parts of the code that read text files.
As a side note, you will get the same problem from a lot of other programs, so you should definitely try to learn how to produce files with just what you want in them and not what any random API decides to add.
comment:5 by , 10 years ago
Replying to Cigaes:
> A byte order mark is an invisible neutral character as human-readable text goes, but it is nonetheless a character, and therefore included in a computer-readable text. Supporting it would be possible, but very low on my priority list, and only if it can be done in a generic way that does not require changing all parts of the code that read text files.

Since I am also a programmer, though not a C programmer, I looked around a bit and found some code samples: https://workspaces.codeproject.com/user-8645021/reading-utf-8-with-c-streams. Also, shouldn't there be a separate function that gets called from all the places that read text files, so you would not have the same code repeated many times?

> As a side note, you will get the same problem from a lot of other programs, so you should definitely try to learn how to produce files with just what you want in them and not what any random API decides to add.

This is NOT some "random" API. Many people use this method, especially beginners. Since I know other methods, I can use them, but still...
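The idea of one shared text-reading function can be illustrated as follows. This is a Python sketch of the design only, not FFmpeg's C code and not a proposed patch; `read_text_lines` and `parse_concat_list` are hypothetical names, and the toy parser just extracts the first word of each line.

```python
def read_text_lines(data):
    """Decode UTF-8 bytes, dropping one leading BOM if present, and split into lines."""
    # The "utf-8-sig" codec skips a leading EF BB BF and otherwise
    # behaves like plain "utf-8".
    return data.decode("utf-8-sig").splitlines()


def parse_concat_list(data):
    """Toy parser: return the first word (the keyword) of each non-empty line."""
    return [line.split()[0] for line in read_text_lines(data) if line.strip()]
```

Because every parser goes through the same helper, the BOM handling lives in one place. With plain `"utf-8"` decoding instead, the first keyword would come back as `"\ufefffile"` rather than `"file"`, which matches the kind of failure reported in this ticket.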
comment:6 by , 10 years ago
Status: | reopened → open |
---|
follow-up: 8 comment:7 by , 10 years ago
If ffmpeg refuses to deal with broken Microsoft bullshit (because that's what the BOM is), so be it.
Though it would be only 1 line of code or so to skip the Microsoft bullshit.
comment:8 by , 10 years ago
Replying to gjdfgh:
> Though it would be only 1 line of code or so to skip the Microsoft bullshit.

Send the patch.
comment:9 by , 7 years ago
For me it shows this error:
Line 1: unknown keyword ' ■f'
input.txt: Invalid data found when processing input
Use Notepad++ to change the mp.txt encoding from UCS-2 Little Endian to ANSI or UTF-8.
File used in the command
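The conversion suggested in comment 9 can also be scripted. The sketch below assumes that what Notepad++ calls "UCS-2 Little Endian" is a UTF-16 file that starts with a BOM; the function name `utf16_to_utf8` is hypothetical. The final line shows how the UTF-16 LE BOM followed by `f` looks when interpreted as code page 437, which would explain the `' ■f'` in the error message if the console used that code page (an assumption; the ticket does not say).

```python
import codecs


def utf16_to_utf8(src, dst):
    """Re-encode a UTF-16 file (with BOM) as UTF-8 without a BOM."""
    with open(src, "rb") as f:
        raw = f.read()
    # The "utf-16" codec reads the BOM to pick the byte order and drops it.
    text = raw.decode("utf-16")
    with open(dst, "wb") as f:
        f.write(text.encode("utf-8"))


# How the bytes FF FE 66 look when interpreted as code page 437.
print(repr((codecs.BOM_UTF16_LE + b"f").decode("cp437")))
```

Reading and writing in binary mode leaves line endings exactly as they were in the source file.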