Author | SHA1 | Message | Date |
---|---|---|---|
|
2fb11f873b |
POWER10: Improve copy performance
This patch aligns the stores to a 32-byte boundary for scopy and dcopy before entering the vector-pair loop. For ccopy, the store instructions are changed to stxv to improve performance in unaligned cases. |
4 years ago |
|
ad745c0bae |
Optimize scopy/ccopy for POWER10
This patch uses the new POWER10 vector-pair instructions for loads and stores. It also reorganizes all variants of the copy functions to use the same kernel. |
5 years ago |
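
The store-alignment idea described in commit 2fb11f873b can be illustrated with a minimal scalar sketch in C. This is not the kernel from the patch: the names `head_count` and `scopy_sketch` are hypothetical, unit stride is assumed, and a 32-byte `memcpy` stands in for one vector-pair store. It only shows the shape of the technique: copy single elements until the destination reaches a 32-byte boundary, run the main loop in 32-byte chunks, then finish the tail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: how many leading floats to copy one at a time so
 * that the next store lands on a 32-byte boundary. Assumes dst has the
 * natural 4-byte alignment of float. The result is capped at n. */
static size_t head_count(const float *dst, size_t n)
{
    assert(((uintptr_t)dst % sizeof(float)) == 0);

    size_t misalign = (size_t)((uintptr_t)dst & 31u); /* bytes past boundary */
    size_t head = misalign ? (32u - misalign) / sizeof(float) : 0;
    return head < n ? head : n;
}

/* Hypothetical scalar model of a unit-stride single-precision copy. */
static void scopy_sketch(size_t n, const float *src, float *dst)
{
    size_t i = 0;
    size_t head = head_count(dst, n);

    /* Peel: element-wise copy until dst + i is 32-byte aligned. */
    for (; i < head; i++)
        dst[i] = src[i];

    /* Main loop: 8 floats (32 bytes) per iteration. Each memcpy is a
     * stand-in for one aligned vector-pair store; the loads from src
     * may still be unaligned. */
    for (; i + 8 <= n; i += 8)
        memcpy(dst + i, src + i, 32);

    /* Tail: fewer than 8 elements remain. */
    for (; i < n; i++)
        dst[i] = src[i];
}
```

Only the destination is aligned in this sketch, matching the commit message's focus on stores; a double-precision variant would follow the same pattern with 4 elements per 32-byte chunk.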