Opened 6 years ago

Closed 3 years ago

#5569 closed enhancement (fixed)

POWER8 VSX vectorization libswscale/output.c

Reported by: David Edelsohn
Owned by:
Priority: wish
Component: swscale
Version: git-master
Keywords: bounty vsx
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no


Optimize approximately 30 functions in libswscale/output.c for POWER8 VSX SIMD instructions on PPC64 Linux.

Change History (9)

comment:1 by Elon Musk, 6 years ago


comment:2 by David Edelsohn, 6 years ago

To improve performance on IBM POWER architecture. The swscale methods frequently appear high in profiles. The functions have optimizations for x86 SSE and could benefit from similar optimization on POWER VSX.

comment:3 by David Edelsohn, 6 years ago

Keywords: bounty added
Version: unspecified → git-master

comment:5 by Carl Eugen Hoyos, 6 years ago

Keywords: vsx added

comment:6 by cand, 3 years ago

PPC output.c coverage now exceeds x86. Speedups 4-16x, usually in line with x86, sometimes faster, sometimes a bit slower. Two functions only got 2x, but so did their x86 equivalents, and in those cases the C version uses a LUT, making it not as directly comparable.

Some rarely used pixfmts were not vectorized (things like 2/3/3 bit packed), but no other platform accelerates them either. Are those important for IBM, or is matching x86 the target?
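
One plausible reading of the LUT remark above, sketched in plain C rather than VSX intrinsics: a scalar routine that clamps through a precomputed table already skips the per-pixel compare-and-select work, so a SIMD version has less to win against it, and a per-element table lookup generally does not map onto vector lanes as directly as a shift and a saturating pack do. All names here (`clip_u8`, `store_arith`, `store_lut`, `init_lut`, `ROUND_BIAS`) are hypothetical and are not taken from libswscale/output.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rounding constant, standing in for a dither value. */
#define ROUND_BIAS 64
/* (int16 + 64) >> 7 lies in -256..256, so 513 table entries cover it. */
#define LUT_OFFSET 256
#define LUT_SIZE   513

static uint8_t clip_lut[LUT_SIZE];

/* Clamp an int to the 0..255 range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Fill the table once; entry i holds the clamped value of (i - LUT_OFFSET). */
static void init_lut(void)
{
    for (int i = 0; i < LUT_SIZE; i++)
        clip_lut[i] = clip_u8(i - LUT_OFFSET);
}

/* Arithmetic variant: add, shift, clamp. This shape is the one that
 * translates naturally into vector add, vector shift and saturating pack.
 * Note: >> on a negative int is assumed to be an arithmetic shift here,
 * which holds on mainstream compilers but is implementation-defined in C. */
static void store_arith(const int16_t *src, uint8_t *dst, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        dst[i] = clip_u8((src[i] + ROUND_BIAS) >> 7);
}

/* LUT variant: the clamp becomes one indexed load per element. */
static void store_lut(const int16_t *src, uint8_t *dst, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        dst[i] = clip_lut[((src[i] + ROUND_BIAS) >> 7) + LUT_OFFSET];
}
```

Both variants produce identical bytes once `init_lut()` has run; only the cost profile of the scalar baseline differs, which is why a 2x figure against a LUT-based C version and a 4-16x figure against an arithmetic one are not measuring the same thing.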

comment:7 by David Edelsohn, 3 years ago

Enablement / vectorization equivalent to x86 is all that is necessary. It seems that Power VSX now matches x86 SSE.

Does the community close the issue or should I close the issue?

comment:8 by cand, 3 years ago

Cc: added

As the reporter, you should close it.

comment:9 by David Edelsohn, 3 years ago

Resolution: fixed
Status: open → closed

The VSX optimization patches resolve this issue.
