gxw
9c39e969f5
mips64: Fixed MSA optimization bugs for zgemv and cgemv
1 year ago
Martin Kroeker
4c03ed437f
Fix SICORTEX ASUM/ZASUM and SUM/ZSUM for INCX <=0 ( #4640 )
* Exit early if INCX <= 0
1 year ago
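The early exit named in the commit above can be illustrated with a minimal sketch. This is not the SICORTEX kernel itself: the function name asum_sketch and its signature are hypothetical, and it only shows the guard that returns zero when the stride is not positive, before any element is read.

```c
#include <stddef.h>

/* Hypothetical reference-style ASUM, for illustration only.
 * Returns the sum of absolute values of n elements of x taken with
 * stride inc_x, and exits early with 0 when n <= 0 or inc_x <= 0. */
static double asum_sketch(long n, const double *x, long inc_x)
{
    double sum = 0.0;
    long i;

    /* Guard first, so a zero or negative stride never indexes x. */
    if (n <= 0 || inc_x <= 0)
        return 0.0;

    for (i = 0; i < n; i++) {
        double v = x[i * inc_x];
        sum += (v < 0.0) ? -v : v;
    }
    return sum;
}
```

With the guard in place, a call such as asum_sketch(4, x, -1) returns 0 without touching x.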
Martin Kroeker
7cfd433d0c
revert the C/Z NRM2 kernels to the base NEON kernel as well
1 year ago
Martin Kroeker
93d975d8fd
Merge pull request #4593 from XiWeiGu/loongarch_add_buffer_offset
loongarch: Optimizing the performance of the GEMM on servers
1 year ago
gxw
d8c4ea8793
loongarch: Optimizing the performance of the GEMM on servers
1 year ago
Chen Yu
8e39c05efd
Get the l2 cache size via environment variable on confidential VM
The CPUID(leaf:2 or leaf:0x80000006) is not supported on some confidential
VMs. As a result the get_l2_size() returns the default 512M which brings
performance issues.
Introduce the environment variable OPENBLAS_L2_SIZE provided by the user
to get the l2 cache size.
Suggested-by: "Keshavamurthy, Anil S" <anil.s.keshavamurthy@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
1 year ago
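The override described in the commit above could look roughly like the following sketch. Only the variable name OPENBLAS_L2_SIZE comes from the commit message; the helper l2_size_from_env, its fallback parameter, and the plain-integer format are assumptions made for illustration and may differ from what OpenBLAS actually accepts.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper, not the OpenBLAS implementation.
 * Returns the value of OPENBLAS_L2_SIZE when it parses as a positive
 * decimal integer, otherwise returns the caller's fallback. The unit of
 * the value is an assumption left to the caller. */
static int l2_size_from_env(int fallback)
{
    const char *text = getenv("OPENBLAS_L2_SIZE");
    char *end = NULL;
    long value;

    if (text == NULL || *text == '\0')
        return fallback;              /* variable unset or empty */

    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0')
        return fallback;              /* not a clean decimal number */
    if (value <= 0 || value > INT_MAX)
        return fallback;              /* out of the usable range */

    return (int)value;
}
```

A caller would consult this helper first and fall back to CPUID detection only when the variable is absent or invalid.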
Martin Kroeker
441c81026e
Add support for Cortex-A76
1 year ago
Martin Kroeker
9ead81bd39
Revert S/DNRM2 to the base NEON kernel to fix precision loss
1 year ago
gxw
96607cbb98
loongarch: Fixed dzamax
Initialize the registers to prevent sporadic errors.
1 year ago
gxw
50869f6ca8
loongarch: Fixed zrot LSX opt
1 year ago
gxw
b5eb9d6bac
loongarch: Fixed {sc/dz}amax LSX opt
1 year ago
gxw
ad13e04669
loongarch: Fixed {s/d/sc/dz}amin LSX opt
1 year ago
gxw
bbf82cb624
loongarch: Fixed {s/d}axpby LSX opt
1 year ago
gxw
ac460eb42a
loongarch: Fixed i{c/z}amin LSX opt
1 year ago
gxw
60e251a1f8
loongarch: Fixed {sc/dz}amax LASX opt
1 year ago
gxw
a10dde5554
loongarch: Fixed {s/d/sc/dz}amin LASX opt
1 year ago
gxw
6534d378b7
loongarch: Fixed {s/d/c/z}sum LASX opt
1 year ago
gxw
6159cffc58
loongarch: Fixed i{s/c/z}amin LASX opt
1 year ago
gxw
7d755912b9
loongarch: Fixed {s/d/c/z}axpby LASX opt
1 year ago
Martin Kroeker
cf80bd8500
Update nrm2_rvv.c
1 year ago
Martin Kroeker
9baa757905
Update nrm2_vector.c
1 year ago
Martin Kroeker
18a6db6862
Update nrm2_vector.c
1 year ago
Martin Kroeker
3752e73919
handle incx < 0
1 year ago
Martin Kroeker
db70c7f7fb
handle incx < 0
1 year ago
Martin Kroeker
dee8557d58
handle incx < 0
1 year ago
Martin Kroeker
d9dff17aec
handle incx < 0
1 year ago
Martin Kroeker
552c521353
remove another early exit for incx < 0
1 year ago
Martin Kroeker
ed532dc75b
remove another early exit for incx < 0
1 year ago
Martin Kroeker
6b89e1f1d7
fix loop condition for incx < 0
1 year ago
Martin Kroeker
20016a0096
fix loop condition for incx < 0
1 year ago
Martin Kroeker
09e84bd29a
fix loop condition for incx < 0
1 year ago
Martin Kroeker
f747aedb52
fix loop condition for incx < 0
1 year ago
Martin Kroeker
23796f8d31
fix loop condition for incx < 0
1 year ago
Martin Kroeker
bf93459746
fix loop condition for incx < 0
1 year ago
Martin Kroeker
e41d01bad9
remove early exit on negative inc_x
1 year ago
Martin Kroeker
02a025f9c1
remove early exit on negative inc_x
1 year ago
pengxu
680a77fafc
Optimized ssymv and dsymv kernel LSX for LoongArch
1 year ago
pengxu
6546600342
Optimized ssymv and dsymv kernel LASX for LoongArch
1 year ago
Chip-Kerchner
99384933ff
Revert "Merge pull request #4532 from austinpagan/cgemm_zgemm_c_code"
This reverts commit accea1555159d0928a6aa2db740c042c7e8f0dd3, reversing
changes made to b925353006.
1 year ago
Martin Kroeker
577d480c62
Merge pull request #4529 from ErnstPeng/feature-branch
Optimized sgemv and dgemv kernel LSX for LoongArch
1 year ago
pengxu
b2db064285
Optimized sgemv and dgemv kernel LSX for LoongArch
1 year ago
Martin Kroeker
cfbb701497
Merge pull request #4536 from XiWeiGu/loongarch64-cgemv-zgemv-opt
Loongarch64 cgemv zgemv opt
1 year ago
gxw
8e05c053be
LoongArch64: Fixed the failed test cases test_{c/z}gemv_n in test_extensions
1 year ago
gxw
3f22fc2233
LoongArch64: Add zgemv LSX opt
1 year ago
gxw
c508a10cf2
LoongArch64: Add cgemv LSX opt
1 year ago
Martin Kroeker
accea15551
Merge pull request #4532 from austinpagan/cgemm_zgemm_c_code
Cgemm zgemm c code
1 year ago
Martin Kroeker
8e872a91a9
Fix erroneous mapping of SUM kernels to ASUM
1 year ago
Martin Kroeker
6699227d45
Merge pull request #4525 from XiWeiGu/loongarch64_fixed_kernel_regress_skx_avx
LoongArch64: Fixed utest kernel_regress:skx_avx
1 year ago
gxw
8dea25ffff
LoongArch64: Fixed utest kernel_regress:skx_avx
1 year ago
Martin Kroeker
7d506984fa
fix assignment of default CSUM kernel
1 year ago