Opened 6 years ago

Closed 5 years ago

#3048 closed defect (fixed)

ICL and HAVE_INLINE_ASM

Reported by: nikolaynnov Owned by:
Priority: normal Component: undetermined
Version: git-master Keywords: icl
Cc: Blocked By:
Blocking: Reproduced by developer: no
Analyzed by developer: no

Description (last modified by Timothy_Gu)

I have old version of ffmpeg (0.8) with was compiled with ICL compiler manually. Manually means what only required c-files (for 2-3 required for me codecs) was included in my icproj. Now I see that ffmpeg has native support for icl (through --toolchain=icl). So I have compiled last ffmpeg (2.0.1) with standard instructions. Then I have notice that encoding to MPEG4/AV_CODEC_ID_MSMPEG4V2 with new ffmpeg take too much time. I have some tests: old ffmpeg encode test video for a 0:55 and new ffmpeg encode the same video for a 1:19 - it is slowly on 43%!!! It is unacceptable in real applications.

I make some investigation and found that problem is that some optimizations are disabled for icl by default. The main degradation is that some high optimized asm code is not included in icl. HAVE_INLINE_ASM is 0 by default. I have defined HAVE_INLINE_ASM to 1, make code compilable and linkable with some minor changes and now the test video is encoded for 0:55 min as expected.

I believe that icl build of ffmpeg have to be fully optimized by default. I have attached some example of how to change code to make it compilable with ICL and HAVE_INLINE_ASM. Please note that I have disabled some asm code as I don't need it for my set of codecs.
Typically, there are 2 kind of changes:
1) icl doesn't know non ia32 - cltd instruction. It has to be replaced with cdq instruction instead.
example:

#ifdef __ICL
#define MASK_ABS(mask, level)                   \
    __asm__ ("cdq                   \n\t"      \
             "xorl %1, %0            \n\t"      \
             "subl %1, %0            \n\t"      \
             : "+a"(level), "=&d"(mask))
#else
#define MASK_ABS(mask, level)                   \
    __asm__ ("cltd                   \n\t"      \
             "xorl %1, %0            \n\t"      \
             "subl %1, %0            \n\t"      \
             : "+a"(level), "=&d"(mask))
#endif

2)

#ifdef __ICL
    __asm__ volatile(
        ...
        "movq %N, %%mm5     \n\t"
        ...
        : "m"(var)
    );
#else
    __asm__ volatile(
        ...
        "movq "MANGLE(var)", %%mm5     \n\t"
        ...
    );
#endif

I believe that icl syntax will work with gcc as well. So may be the best solution will be to rewrite code to don't use MANGLE in movq.

Attachments (3)

ICL_HAVE_INLINE.git.patch (17.4 KB) - added by nikolaynnov 6 years ago.
small path to compile ffmpeg with ICL and HAVE_INLINE_ASM
fate.log (11.1 KB) - added by nikolaynnov 6 years ago.
fate output
fate1.log (2.3 KB) - added by nikolaynnov 6 years ago.

Download all attachments as: .zip

Change History (17)

Changed 6 years ago by nikolaynnov

small path to compile ffmpeg with ICL and HAVE_INLINE_ASM

comment:1 Changed 6 years ago by cehoyos

  • Priority changed from normal to important
  • Version changed from unspecified to git-master

Sounds important to me because FFmpeg apparently does not only not warn that binaries are slow but - on the contrary - claims to "support" certain compilers.

If you have a patch against git master fixing this issue, please send it to the ffmpeg-devel mailing list where it can be discussed. (Is no solution with a macro possible?)
If there is anything we can do to make people more aware that patches on the bug tracker are typically ignored, please let us know!

comment:2 Changed 6 years ago by nikolaynnov

Unfortunately I don't have complete patch for git master. Note: My goal was to make encoding with MPEG4/AV_CODEC_ID_MSMPEG4V2 faster. When I have defined HAVE_INLINE_ASM macro all my required optimized functions was compiled without any additional changes. So in most cases where code was not compiled (code for the codecs which are not interesting for me) I just turn this optimization off (#if HAVE_INLINE_ASM && !defined(ICL)). Or even I have disabled some decoders like dirac/vc1 because I didn't taste time to fix everything. My patch provided may be useful as some start point only to see how to fix some kinds of compilation errors.

comment:3 Changed 6 years ago by michael

look in configure, theres a if enbled icl ... "disable inline_asm" with comments explaning why
can you try to remove that disable and check if build and very important fate tests (make fate) pass with with the result ?

comment:4 Changed 6 years ago by michael

  • Priority changed from important to normal

warning added

comment:5 follow-up: Changed 6 years ago by nikolaynnov

1) "look in configure, theres a if enbled icl ... "disable inline_asm" with comments explaning why"

# icl will pass the inline asm tests but inline asm is currently
# not supported (build will fail)

It is true, ICL can't compile some asm code which causes build failure. This ticket not for changing configure script. This request is for making code in ffmpeg to compilable with ICL. Changing configure script will be on second stage.

2) Unfortunately, I am Windows programmer and I never ran FATE before so I cannot test my patch with FATE. I suggest to complete this ticket (make code compilable with ICL), then try to change configure script and run FATE tests.

comment:6 in reply to: ↑ 5 Changed 6 years ago by cehoyos

Replying to nikolaynnov:

ICL can't compile some asm code which causes build failure. This ticket not for changing configure script. This request is for making code in ffmpeg to compilable with ICL.

How will we know which code has to be made compatible with icl if you don't change configure locally and tell us about the results?

2) Unfortunately, I am Windows programmer and I never ran FATE before so I cannot test my patch with FATE.

To the best of my knowledge, fate runs fine on Windows, if it fails for you, please ask either here or on ffmpeg-user for help. http://ffmpeg.org/platform.html#Microsoft-Visual-C_002b_002b-or-Intel-C_002b_002b-Compiler-for-Windows mentions one binary you have to download (bc).

Please understand that while I (also) consider this an important issue, nearly all developers do not use Windows at all and I expect most of the few who do not to have an icl license. So if you are interested in this ticket, please help fixing it!

Changed 6 years ago by nikolaynnov

fate output

comment:7 Changed 6 years ago by nikolaynnov

Ok, I have installed bc and did "make fate". Console output is attached.

comment:8 Changed 6 years ago by cehoyos

warning: only a subset of the fate tests will be run because SAMPLES is not spec

Several steps are necessary if (and only if) you want to help fixing this ticket (not necessarily in this order, just identifying the actual problems will already help):

  • Download fate samples after installing rsync: $ make SAMPLES=fate-suite fate-rsync
  • Confirm that fate passes for you with unpatched FFmpeg: $ make SAMPLES=fate-suite fate
  • Patch configure locally to allow inline asm with icl
  • Post the problems here: Run make, run make again (you see twice the same problem), then run make V=1 and post the output here (different files will probably fail, especially for mathops.h and motion_est.c different outputs will be needed as in $ make V=1 libavcodec/x86/motion_est.o)
  • Fix the problem in libavcodec/x86/mathops.h (I suspect a macro will suffice) and test if fate still passes (I assume you will need to revert the configure change and force the usage of MASK_ABS or verify that the assembly output is correct / identical)

Generally: Please use current git head if you don't already do so, patches against release branches are only accepted in some circumstances (typically regressions).

Changed 6 years ago by nikolaynnov

comment:9 Changed 6 years ago by nikolaynnov

I have an issue on the second step: it seems that "make SAMPLES=fate-suite fate" don't use samples, only standard tests.

comment:10 Changed 6 years ago by nikolaynnov

I can confirm (tested with my test video samples) that patched ffmpeg works:

  • decoding: MPEG4, H.264, MJPEG
  • encoding: MPEG4, MJPEG, MP3(via lame)

comment:11 Changed 6 years ago by michael

from #ffmpeg-devel:
<Daemon404> michaelni, just fyi for trac #3048, you should refer to TheRyuu?
<Daemon404> since he wrote the icl support
<Daemon404> he's working on inline asm too (supposedly) in ffmpeg
...
<michaelni> TheRyuu?, Daemon404 says you wrote ICL support
<michaelni> first thanks for doing that :)
<michaelni> and maybe you want to comment on https://trac.ffmpeg.org/ticket/3048 ?
<Daemon404> he said the patch wasnt complete when i asked in here yesterday
<Daemon404> but didnt elaborate
...
<michaelni> Daemon404, hmm, ok, thx, i guess ill copy and paste a few lines from here to trak then so the others who look at trak are aware
...
<michaelni> Daemon404, the patch in the ticket is incomplete and ugly yes (i assume that patch was what was meant)
<Daemon404> TheRyuu?, has a WIP
<Daemon404> i think
<Daemon404> on HEAD
<Daemon404> and some ffmpeg-specific changes
<michaelni> ok, that would be great :)
<Daemon404> but he's lazy, so feel free to stab him

Version 0, edited 6 years ago by michael (next)

comment:12 Changed 5 years ago by Timothy_Gu

  • Description modified (diff)
  • Resolution set to fixed
  • Status changed from new to closed

Should be fixed with 0f2588d and related commits by Matt Oliver.

comment:13 Changed 5 years ago by cehoyos

  • Resolution fixed deleted
  • Status changed from closed to reopened

To the best of my knowledge, this is not fixed (yet).

comment:14 Changed 5 years ago by cehoyos

  • Resolution set to fixed
  • Status changed from reopened to closed

Fixed by Matt Oliver in a6b0c0e1

Note: See TracTickets for help on using tickets.