Henry Chen
ef94b96530
Use ldc1 and sdc1 for the prologue and epilogue on LOONGSON3A
This fix is similar to 2d8064174c.
1 year ago
gxw
34b80ce03f
mips64: Fixed numpy CI failure
1 year ago
Martin Kroeker
bd47630bcf
exclude the alpha=0 branch as it does not handle NaN or Inf in x
1 year ago
Martin Kroeker
4c03ed437f
Fix SICORTEX ASUM/ZASUM and SUM/ZSUM for INCX <= 0 (#4640)
* Exit early if INCX <= 0
1 year ago
Martin Kroeker
e5d2725e5a
Merge pull request #4185 from XiWeiGu/mips_enable_msa
MIPS: Enable MSA
1 year ago
Martin Kroeker
7dd441d5db
Allow negative INCX (API change from version 3.10 of the reference implementation)
2 years ago
gxw
4d0f000db6
MIPS: Enable MSA
2 years ago
Martin Kroeker
f6f35a4288
fix copyobj declarations to work with DYNAMIC_ARCH
3 years ago
Martin Kroeker
b1d69fb3ac
Add MIPS64_GENERIC as a copy of GENERIC
3 years ago
gxw
365936ae1b
MIPS64: Using the macro MTC rather than MTC1
3 years ago
Jiaxun Yang
a50b29c540
Provide a fallback MIPS64_GENERIC target
It is really dangerous to fall back to a Loongson core on other
MIPS64 processors.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
3 years ago
gxw
cce4b1d956
MIPS64: Fix dnrm2_tiny testcase failure
3 years ago
gxw
4b548857d6
Add msa support for loongson
1. Using core loongson3r3 and loongson3r4 for loongson
2. Add DYNAMIC_ARCH for loongson
Change-Id: I1c6b54dbeca3a0cc31d1222af36a7e9bd6ab54c1
4 years ago
gxw
8d07cf9b67
Fix compilation problem on loongson platform
Using "make TARGET=GENERIC" on loongson platform will get the following
error messages:
"make[1]: *** No rule to make target 'sgemm_incopy.o', needed by 'libs'"
Add kernel/mips64/KERNEL.generic to solve the problem.
5 years ago
Martin Kroeker
07454bf4d5
Add proper defaults for IxMIN/IxMAX kernels
the fallbacks from Makefile.L1 assume a combined source for absolute value and non-absolute (with ifdef USE_ABS) but here we have separate implementations
5 years ago
Martin Kroeker
688fa9201c
Add MIPS64 implementation of ?sum
as trivial copy of ?asum with the fabs replaced by mov to preserve code structure
6 years ago
Martin Kroeker
6c7b691083
Really revert xDOT changes from 1832
neglected to rebase #1892 on merging
6 years ago
Martin Kroeker
5f4c550c27
Merge pull request #1892 from martin-frbg/mipsdot
revert MIPS64 xDOT kernel changes from #1832
6 years ago
Martin Kroeker
95a5542e3c
Revert DOT kernel changes from #1834
as the failures seen on Loongson3A appear to be limited to DSDOT/SDSDOT (i.e. my hackish "fix" from #1684)
6 years ago
Martin Kroeker
7a2e1bc804
Use generic kernel for DSDOT/SDSDOT
as discussed in #1834
6 years ago
fengruilin
43bb386b10
fix dot problem on 64bit mips
6 years ago
fengrl
2d8064174c
register push/pop command change
The 64-bit register push/pop commands should be used. Otherwise, data will be lost.
7 years ago
fengruilin
6fc85a6359
test_axpy fails on the LOONGSON3A platform (#1777)
7 years ago
Martin Kroeker
4e103c822c
typo fix
7 years ago
Martin Kroeker
d2142760e0
Fix precision problem in DSDOT
7 years ago
Martin Kroeker
2fbfc64da8
Use C kernels for default c/zAXPY, xROT, c/zSWAP
7 years ago
Shivraj Patil
e3d844b062
Added mips I6500 core
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
8 years ago
Shivraj Patil
beb1d076a4
Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Aleksey Kuleshov
fca66262c4
mips64/axpy: fix error when INCY == 0
9 years ago
Shivraj Patil
2c3dfe2bf3
MIPS P5600(32 bit) and I6400(64 bit) cores support added.
Separated mips and mips64 files.
Configurations support for mips 32 bit.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
2fb02626da
Update organization info.
11 years ago
Timothy Gu
6c2ead30f0
Remove all trailing whitespace except lapack-netlib
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
11 years ago
Wang Qian
8e53b57bb2
Appending gemmkernel and trmmkernel C code in kernel/generic; this code can be used on a new platform which does not have an optimized assembly kernel.
13 years ago
Wang Qian
66904fc4e8
BLAS3 used standard MIPS instructions without extensions on Loongson 3B.
14 years ago
Xianyi Zhang
0884f6b78d
Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3b
14 years ago
traz
2d78fb05c8
Add conjugate condition to gemv.
14 years ago
Xianyi Zhang
b95ad4cfaf
Support detecting ICT Loongson-3B CPU.
14 years ago
traz
a32e56500a
Fix the compute error of gemv when incx and incy are negative numbers.
14 years ago
traz
c1e618ea2d
Add complete gemv function on Loongson3a platform.
14 years ago
traz
e08cfaf9ca
Complete all the complex single-precision functions of level3, but the performance needs further improvement.
14 years ago
traz
ee4bb8bd25
Add ctrmm part in cgemm_kernel_loongson3a_4x2_ps.S.
14 years ago
traz
7fa3d23dd9
Complete cgemm function, but no optimization.
14 years ago
traz
9679dd077e
Fix some compute errors.
14 years ago
traz
d238a768ab
Use ps instructions in cgemm.
14 years ago
traz
74d4cdb81a
Fix an illegal instruction for strmm_RTLU.
14 years ago
traz
7906146836
Fix an error for strmm_LLTN.
14 years ago
traz
3274ff47b8
Fix an error for strmm_LLTN.
14 years ago
traz
a059c553a1
Fix a compute error for strmm.
14 years ago
traz
23e182ca7c
Fix stack-pointer bug for strmm.
14 years ago
traz
a15bc95824
Add strmm part.
14 years ago