Martin Kroeker
8da6f7e5f2
Merge pull request #4686 from XiWeiGu/loongarch64_dgemm_kernel_16x6
Loongarch64: Improving the Performance and Stability of dgemm
1 year ago
gxw
f9a26240a7
loongarch64: Fixed icamax_lsx
1 year ago
gxw
cb0f707409
loongarch64: Fixed utest fork:safety
1 year ago
Martin Kroeker
b45d8e1ab2
remove stray comma
1 year ago
gxw
6017ad7146
loongarch64: Update dgemm_kernel_16x4 to dgemm_kernel_16x6
1 year ago
Martin Kroeker
992b71fea2
remove stray comma
1 year ago
Martin Kroeker
d421dec278
Merge pull request #4656 from zboszor/fix-x86-64-build-v2
Add forgotten conditional uses of PREFETCH
1 year ago
Martin Kroeker
ae695d4ca0
Merge pull request #4642 from XiWeiGu/loongarch64_clang
CI: Add clang test for loongarch64
1 year ago
gxw
7cd438a5ac
loongarch64: Fixed clang compilation issues
1 year ago
Zoltán Böszörményi
ca64861ce8
Add forgotten conditional uses of PREFETCH
This fixes a (cross-)compilation/linker error for PRESCOTT
on Yocto.
Signed-off-by: Zoltán Böszörményi <zoltan.boszormenyi@xenial.com>
1 year ago
gxw
9c39e969f5
mips64: Fixed MSA optimization bugs for zgemv and cgemv
1 year ago
Martin Kroeker
4c03ed437f
Fix SICORTEX ASUM/ZASUM and SUM/ZSUM for INCX <=0 ( #4640 )
* Exit early if INCX <= 0
1 year ago
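For illustration only, the early exit named in the commit above can be sketched in plain C. This is not the SICORTEX kernel itself (that code is not shown here); `asum_sketch` is a hypothetical name, and the behaviour it models is the reference BLAS convention that ASUM returns zero when N <= 0 or INCX <= 0.

```c
#include <stddef.h>

/* Hypothetical sketch, not the actual kernel: sum of absolute values of
   n elements of x taken with stride inc_x. */
static double asum_sketch(ptrdiff_t n, const double *x, ptrdiff_t inc_x)
{
    /* Exit early: a non-positive length or increment yields 0. */
    if (n <= 0 || inc_x <= 0)
        return 0.0;

    double sum = 0.0;
    for (ptrdiff_t i = 0; i < n; i++) {
        double v = x[i * inc_x];
        sum += (v < 0.0) ? -v : v;   /* absolute value without libm */
    }
    return sum;
}
```

Checking the increment before the loop means no element of `x` is ever read for an invalid stride, so the array is not walked at all in that case.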
Martin Kroeker
7cfd433d0c
revert the C/Z NRM2 kernels to the base NEON kernel as well
1 year ago
Martin Kroeker
93d975d8fd
Merge pull request #4593 from XiWeiGu/loongarch_add_buffer_offset
loongarch: Optimizing the performance of the GEMM on servers
1 year ago
gxw
d8c4ea8793
loongarch: Optimizing the performance of the GEMM on servers
1 year ago
Chen Yu
8e39c05efd
Get the l2 cache size via environment variable on confidential VM
The CPUID(leaf:2 or leaf:0x80000006) is not supported on some confidential
VMs. As a result the get_l2_size() returns the default 512M which brings
performance issues.
Introduce the environment variable OPENBLAS_L2_SIZE provided by the user
to get the l2 cache size.
Suggested-by: "Keshavamurthy, Anil S" <anil.s.keshavamurthy@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
1 year ago
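The fallback described in the commit above could look roughly like the following. This is a sketch under stated assumptions, not the code that was merged: only the variable name OPENBLAS_L2_SIZE comes from the commit message. The helper names, the unit of the value, the default of 512, and the rule that a valid environment value takes precedence over the detected one are all hypothetical.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical default used when nothing else is known (unit assumed). */
#define SKETCH_DEFAULT_L2_SIZE 512

/* Parse a strictly positive decimal integer; return fallback when the
   string is NULL, empty, non-numeric, has trailing junk, or is out of range. */
static int parse_size(const char *s, int fallback)
{
    if (s == NULL || *s == '\0')
        return fallback;

    errno = 0;
    char *end = NULL;
    long v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0' || v <= 0 || v > INT_MAX)
        return fallback;
    return (int)v;
}

/* Assumed precedence: a valid OPENBLAS_L2_SIZE wins, then the detected
   value (e.g. from CPUID) if positive, then the built-in default.
   detected <= 0 models a platform where CPUID reports nothing useful. */
static int resolve_l2_size(int detected)
{
    int from_env = parse_size(getenv("OPENBLAS_L2_SIZE"), 0);
    if (from_env > 0)
        return from_env;
    if (detected > 0)
        return detected;
    return SKETCH_DEFAULT_L2_SIZE;
}
```

Under these assumptions, a user on a confidential VM would export OPENBLAS_L2_SIZE before starting the program, and `resolve_l2_size(0)` would return that value instead of the default.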
Martin Kroeker
441c81026e
Add support for Cortex-A76
1 year ago
Martin Kroeker
9ead81bd39
Revert S/DNRM2 to the base NEON kernel to fix precision loss
1 year ago
gxw
96607cbb98
loongarch: Fixed dzamax
Initialize the registers to prevent sporadic errors.
1 year ago
gxw
50869f6ca8
loongarch: Fixed zrot LSX opt
1 year ago
gxw
b5eb9d6bac
loongarch: Fixed {sc/dz}amax LSX opt
1 year ago
gxw
ad13e04669
loongarch: Fixed {s/d/sc/dz}amin LSX opt
1 year ago
gxw
bbf82cb624
loongarch: Fixed {s/d}axpby LSX opt
1 year ago
gxw
ac460eb42a
loongarch: Fixed i{c/z}amin LSX opt
1 year ago
gxw
60e251a1f8
loongarch: Fixed {sc/dz}amax LASX opt
1 year ago
gxw
a10dde5554
loongarch: Fixed {s/d/sc/dz}amin LASX opt
1 year ago
gxw
6534d378b7
loongarch: Fixed {s/d/c/z}sum LASX opt
1 year ago
gxw
6159cffc58
loongarch: Fixed i{s/c/z}amin LASX opt
1 year ago
gxw
7d755912b9
loongarch: Fixed {s/d/c/z}axpby LASX opt
1 year ago
Martin Kroeker
cf80bd8500
Update nrm2_rvv.c
1 year ago
Martin Kroeker
9baa757905
Update nrm2_vector.c
1 year ago
Martin Kroeker
18a6db6862
Update nrm2_vector.c
1 year ago
Martin Kroeker
3752e73919
handle incx < 0
1 year ago
Martin Kroeker
db70c7f7fb
handle incx < 0
1 year ago
Martin Kroeker
dee8557d58
handle incx < 0
1 year ago
Martin Kroeker
d9dff17aec
handle incx < 0
1 year ago
Martin Kroeker
552c521353
remove another early exit for incx < 0
1 year ago
Martin Kroeker
ed532dc75b
remove another early exit for incx < 0
1 year ago
Martin Kroeker
6b89e1f1d7
fix loop condition for incx < 0
1 year ago
Martin Kroeker
20016a0096
fix loop condition for incx < 0
1 year ago
Martin Kroeker
09e84bd29a
fix loop condition for incx < 0
1 year ago
Martin Kroeker
f747aedb52
fix loop condition for incx < 0
1 year ago
Martin Kroeker
23796f8d31
fix loop condition for incx < 0
1 year ago
Martin Kroeker
bf93459746
fix loop condition for incx < 0
1 year ago
Martin Kroeker
e41d01bad9
remove early exit on negative inc_x
1 year ago
Martin Kroeker
02a025f9c1
remove early exit on negative inc_x
1 year ago
pengxu
680a77fafc
Optimized ssymv and dsymv kernel LSX for LoongArch
1 year ago
pengxu
6546600342
Optimized ssymv and dsymv kernel LASX for LoongArch
1 year ago
Chip-Kerchner
99384933ff
Revert "Merge pull request #4532 from austinpagan/cgemm_zgemm_c_code"
This reverts commit accea1555159d0928a6aa2db740c042c7e8f0dd3, reversing
changes made to b925353006.
1 year ago
Martin Kroeker
577d480c62
Merge pull request #4529 from ErnstPeng/feature-branch
Optimized sgemv and dgemv kernel LSX for LoongArch
1 year ago