Iha, Taisei | f1e628b889 | Further performance improvements to [SD]GEMV. | 5 months ago
Iha, Taisei | 4918beecbe | Loop-unrolled transposed [SD]GEMV kernels for A64FX and Neoverse V1 | 10 months ago
iha fujitsu | 0985fdc82b | A64FX: Add support for SVE to SGEMV/DGEMV kernels. | 1 year ago
Chris Sidebottom | ecae1389df | Reduce duplication in kernel definitions. These files are exactly the same, so I believe we can reduce these files down. Other files require a slightly more complex unpicking. | 1 year ago
Martin Kroeker | e7d05402e0 | Fix up S/D GEMM copy function definitions after #4009 | 2 years ago
Bine Brank | f1315288a8 | add sve ztrsm | 3 years ago
Bine Brank | f33543d029 | combine zchemm into single file | 3 years ago
Bine Brank | d30157d891 | update configuration of kernels for A64FX and ARMV8SVE | 3 years ago
Bine Brank | ce329ab686 | add sve zhemm copy routines | 3 years ago
Bine Brank | 0140373802 | add sve ztrmm | 3 years ago
Bine Brank | e3c9947c0f | prepare kernel for sve zgemm | 3 years ago
Bine Brank | a8f62a347b | fix UNROLL_MN and add to targets for SVE | 3 years ago
Bine Brank | 86ae89bf33 | add sgemm kernel and copy functions for sgemm and ssymm | 3 years ago
Bine Brank | 9b9cb90bb1 | modify Makefile for SVE copy | 3 years ago
Bine Brank | ab7917910d | add v2x8 kernel + fix sve dtrmm | 3 years ago
Martin Kroeker | 22bf5c27ba | Add basic support for the Fujitsu A64FX (#3415). Add initial support for Fujitsu A64FX as generic ARMV8 | 4 years ago