Opened 3 years ago

Closed 5 months ago

#5569 closed enhancement (fixed)

POWER8 VSX vectorization libswscale/output.c

Reported by: edelsohn
Owned by:
Priority: wish
Component: swscale
Version: git-master
Keywords: bounty vsx
Cc: cand@gmx.com
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Optimize approximately 30 functions in libswscale/output.c for POWER8 VSX SIMD instructions on PPC64 Linux.

Change History (9)

comment:1 Changed 3 years ago by richardpl

Why?

comment:2 Changed 3 years ago by edelsohn

To improve performance on IBM POWER architecture. The swscale methods frequently appear high in profiles. The functions have optimizations for x86 SSE and could benefit from similar optimization on POWER VSX.
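
For illustration only, here is a minimal sketch of the kind of per-pixel loop involved, loosely modeled on an 8-bit plane output routine (add dither, shift, clamp). The names `clip_u8`, `plane_out_scalar`, `plane_out_blocked` and `BLOCK` are hypothetical and not taken from output.c. The blocked variant is plain portable C, not VSX code: it only shows the loop shape a SIMD version would typically take, with a fixed-width body that maps onto vector registers and a scalar tail for the remaining pixels.

```c
#include <stdint.h>

/* Hypothetical helper: clamp an int to the 0..255 range. */
static inline uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Scalar reference: one pixel per iteration.
 * Assumes >> on a negative int is an arithmetic shift, as on common compilers. */
static void plane_out_scalar(const int16_t *src, uint8_t *dest, int w,
                             const uint8_t dither[8], int offset)
{
    for (int i = 0; i < w; i++)
        dest[i] = clip_u8((src[i] + dither[(i + offset) & 7]) >> 7);
}

/* Number of pixels handled per block; 16 bytes matches one 128-bit vector of output. */
#define BLOCK 16

/* Blocked form: the inner fixed-width loop is what a SIMD implementation
 * would replace with vector loads, adds, shifts and a saturating pack. */
static void plane_out_blocked(const int16_t *src, uint8_t *dest, int w,
                              const uint8_t dither[8], int offset)
{
    int i = 0;

    for (; i + BLOCK <= w; i += BLOCK) {
        for (int j = 0; j < BLOCK; j++) {
            int val = (src[i + j] + dither[(i + j + offset) & 7]) >> 7;
            dest[i + j] = clip_u8(val);
        }
    }

    /* Scalar tail for widths that are not a multiple of BLOCK. */
    for (; i < w; i++)
        dest[i] = clip_u8((src[i] + dither[(i + offset) & 7]) >> 7);
}
```

Both functions compute the same result for any width, so the scalar version can serve as the reference when checking a vectorized replacement.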

comment:3 Changed 3 years ago by edelsohn

  • Keywords bounty added
  • Version changed from unspecified to git-master

comment:4 Changed 3 years ago by cehoyos

  • Priority changed from normal to wish
  • Status changed from new to open

comment:5 Changed 3 years ago by cehoyos

  • Keywords vsx added

comment:6 Changed 5 months ago by cand

PPC output.c coverage now exceeds x86. Speedups are 4-16x, usually in line with x86, sometimes faster, sometimes a bit slower. Two functions only got 2x, but so did their x86 equivalents; in those cases the C version uses a LUT, so the comparison is less direct.

Some rarely used pixfmts were not vectorized (things like 2/3/3 bit packed), but no other platform accelerates them either. Are those important for IBM, or is matching x86 the target?

comment:7 Changed 5 months ago by edelsohn

Enablement / vectorization equivalent to x86 is all that is necessary. It seems that Power VSX now matches x86 SSE.

Does the community close the issue or should I close the issue?

comment:8 Changed 5 months ago by cand

  • Cc cand@gmx.com added

As the reporter, you should close it.

comment:9 Changed 5 months ago by edelsohn

  • Resolution set to fixed
  • Status changed from open to closed

The VSX optimization patches resolve this issue.
