Opened 4 months ago
Last modified 4 months ago
#11241 new enhancement
CC_IDENT macro generation doesn't consider non-UTF-8 encodings, causing warnings and a scrambled log when compiling on Windows
Reported by: | violet | Owned by: | |
---|---|---|---|
Priority: | minor | Component: | build system |
Version: | git-master | Keywords: | CC_IDENT |
Cc: | violet, MasterQuestionable | Blocked By: | |
Blocking: | | Reproduced by developer: | yes |
Analyzed by developer: | yes | | |
Description
Summary of the bug:
How to reproduce:
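ENV: Win10 with Simplified Chinese encoding "gb2312", msys2, ffmpeg 7.1+ master branch, VS2022
% ./configure
\src\ffmpeg\ffmpeg-7.1\config.h(1): warning C4828: The file contains a character starting at offset 0x66e that is illegal in the current source character set (codepage 65001)
(The warning text is translated from the compiler's Chinese output.)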
In config.h, because the "/utf-8" cflag is forced, the generated CC_IDENT looks like:
#define CC_IDENT "���� x64 �� Microsoft (R) C/C++ �Ż������� 19.41.34123 ��"
It is actually a "gb2312" Chinese string:
#define CC_IDENT "用于 x64 的 Microsoft (R) C/C++ 优化编译器 19.41.34123 版"
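The garbling can be reproduced outside the build. The following is a minimal Python sketch, not part of FFmpeg, and it assumes that cl.exe wrote its banner in the "gb2312" system encoding and that config.h is then read as UTF-8. The variable names are illustrative only.

```python
# Sketch: reproduce the scrambled CC_IDENT from the ticket.
# Assumption: cl.exe emitted its banner in the system encoding ("gb2312").
banner = "用于 x64 的 Microsoft (R) C/C++ 优化编译器 19.41.34123 版"

# Bytes as they would end up in config.h.
raw = banner.encode("gb2312")

# Reading those bytes as UTF-8 (code page 65001) fails for the Chinese
# characters, so the undecodable bytes become U+FFFD replacement characters.
garbled = raw.decode("utf-8", errors="replace")

# Decoding with the encoding that produced the bytes restores the text.
restored = raw.decode("gb2312")

print(garbled)
print(restored == banner)
```

The ASCII part of the banner survives because "gb2312" and UTF-8 agree on ASCII; only the multi-byte Chinese characters are damaged.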
(slow fix)
I think it might be better to support non-UTF-8 encodings in the CC_IDENT macro generation code.
(fast fix)
Or let ./configure inherit CC_IDENT from the environment or from parameters: just like ./configure --extra-cflags, add another option, ./configure --cc-ident. This way, users can provide CC_IDENT in the correct encoding manually.
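The two proposals can be sketched together. The actual configure is a shell script and no --cc-ident option exists today, so the Python below is only an illustration of the logic under those assumptions: `normalize_cc_ident`, `override`, and `fallback_encoding` are hypothetical names, with `override` standing in for the proposed --cc-ident value.

```python
import locale
from typing import Optional


def normalize_cc_ident(raw: bytes,
                       override: Optional[str] = None,
                       fallback_encoding: Optional[str] = None) -> str:
    """Return a CC_IDENT string that is safe to write into a UTF-8 config.h.

    raw               -- compiler banner bytes as captured by the build script
    override          -- hypothetical user-supplied value (the "fast fix")
    fallback_encoding -- encoding to try when raw is not valid UTF-8
    """
    # Fast fix: a user-supplied identifier always wins.
    if override is not None:
        return override
    # Slow fix: accept UTF-8 as-is, otherwise decode with the system encoding.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        encoding = fallback_encoding or locale.getpreferredencoding(False)
        return raw.decode(encoding, errors="replace")


# Usage with the banner from this ticket, encoded as cl.exe would emit it.
banner = "用于 x64 的 Microsoft (R) C/C++ 优化编译器 19.41.34123 版"
raw = banner.encode("gb2312")
ident = normalize_cc_ident(raw, fallback_encoding="gb2312")
print('#define CC_IDENT "%s"' % ident)
```

Trying UTF-8 first keeps the behavior unchanged for compilers that already report in UTF-8 or plain ASCII. A legacy byte sequence that happens to be valid UTF-8 would slip through undetected, which is one reason an explicit override is still useful.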
Change History (6)
comment:1 by , 4 months ago
Description: | modified (diff) |
---|
comment:2 by , 4 months ago
Priority: | normal → critical |
---|
comment:3 by , 4 months ago
Analyzed by developer: | set |
---|---|
Reproduced by developer: | set |
comment:4 by , 4 months ago
Cc: | added |
---|---|
Priority: | critical → minor |
Type: | defect → enhancement |
comment:5 by , 4 months ago
I'm afraid UTF-8 is the de facto standard only on Linux and the Internet, not on the Windows desktop. There is plenty of non-UTF-8 Windows desktop software all over the world.
If you reproduce this problem on a non-English Windows, you will find that CC_IDENT is generated automatically by truncating the C compiler's description. Unfortunately, the C compiler in VS, i.e. cl.exe, describes itself in the current system encoding, not UTF-8. Changing the system encoding just to satisfy ffmpeg is unacceptable, because other desktop software would crash.
Before reporting this issue, I had already tried numerous times to make ffmpeg use a UTF-8 MY_CC_IDENT string that I converted myself. But ./configure just overrides my correct CC_IDENT again and again. That's why I suggested adding a --cc-ident parameter to let the user decide CC_IDENT if necessary.
In short, the ffmpeg/configure script needs to provide a way for non-UTF-8 users to override CC_IDENT. Adding a parameter like --cc-ident that overrides the original CC_IDENT is the simplest way.
comment:6 by , 4 months ago
So this is essentially a compiler issue.
Compilation is in many cases unjustifiably complicated.
And it goes well beyond the scope of FFmpeg: it concerns programming languages and hardware in general.
UTF-8 has become the de facto standard for all plain text.
If you use some legacy, arbitrarily defined charset: normalize your input first.