Opened 9 years ago
Closed 6 years ago
#5569 closed enhancement (fixed)
POWER8 VSX vectorization libswscale/output.c
Reported by: | David Edelsohn | Owned by: | |
---|---|---|---|
Priority: | wish | Component: | swscale |
Version: | git-master | Keywords: | bounty vsx |
Cc: | cand@gmx.com | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no | | |
Description
Optimize approximately 30 functions in libswscale/output.c for POWER8 VSX
SIMD instructions on PPC64 Linux.
Change History (9)
comment:1 by , 9 years ago
comment:2 by , 9 years ago
To improve performance on IBM POWER architecture. The swscale methods frequently appear high in profiles. The functions have optimizations for x86 SSE and could benefit from similar optimization on POWER VSX.
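As a rough illustration of the kind of per-pixel loop this request targets, here is a minimal sketch in portable C. It is not the code from libswscale/output.c: the function names `plane1_8_scalar` and `plane1_8_blocked`, the helper `clip_u8`, and the exact arithmetic (add a dither value, shift right by 7, clamp to 8 bits) are assumptions chosen for illustration. The blocked variant processes 8 pixels per outer iteration with no dependency between lanes, which is the shape a compiler or hand-written SSE/VSX code can map onto vector registers.

```c
#include <stdint.h>

/* Clamp an int to the 0..255 range of an 8-bit sample. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Scalar reference (hypothetical): one pixel per iteration. */
static void plane1_8_scalar(const int16_t *src, uint8_t *dest, int w,
                            const uint8_t dither[8], int offset)
{
    for (int i = 0; i < w; i++)
        dest[i] = clip_u8((src[i] + dither[(i + offset) & 7]) >> 7);
}

/* Blocked variant (hypothetical): 8 independent lanes per outer iteration.
 * The dither pattern repeats every 8 pixels, so it is rotated once up front
 * and reused for every block, much as SIMD code would keep it in a register. */
static void plane1_8_blocked(const int16_t *src, uint8_t *dest, int w,
                             const uint8_t dither[8], int offset)
{
    int rot[8];
    for (int k = 0; k < 8; k++)
        rot[k] = dither[(k + offset) & 7];

    int i = 0;
    for (; i + 8 <= w; i += 8)          /* full blocks of 8 pixels */
        for (int k = 0; k < 8; k++)
            dest[i + k] = clip_u8((src[i + k] + rot[k]) >> 7);

    for (; i < w; i++)                  /* scalar tail for leftover pixels */
        dest[i] = clip_u8((src[i] + rot[i & 7]) >> 7);
}
```

Both functions are meant to produce identical output for the same input. An actual VSX port would replace the inner 8-lane loop with vector loads, a vector add, a shift, and a saturating pack from `<altivec.h>`, keeping a scalar tail like the one above.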
comment:3 by , 9 years ago
Keywords: | bounty added |
---|---|
Version: | unspecified → git-master |
comment:4 by , 9 years ago
Priority: | normal → wish |
---|---|
Status: | new → open |
comment:5 by , 8 years ago
Keywords: | vsx added |
---|---|
comment:6 by , 6 years ago
PPC output.c coverage now exceeds x86. Speedups range from 4x to 16x, usually in line with x86, sometimes faster, sometimes a bit slower. Two functions only got 2x, but so did their x86 equivalents, and in those cases the C version uses a LUT, so the comparison is less direct.
Some rarely used pixfmts were not vectorized (things like 2/3/3 bit packed), but no other platform accelerates them either. Are those important for IBM, or is matching x86 the target?
comment:7 by , 6 years ago
Enablement / vectorization equivalent to x86 is all that is necessary. It seems that Power VSX now matches x86 SSE.
Does the community close the issue, or should I close it?
comment:9 by , 6 years ago
Resolution: | → fixed |
---|---|
Status: | open → closed |
The VSX optimization patches resolve this issue.
Why?