Martin Kroeker
b3c90564d7
resync with the generic arm version for inf/nan handling
3 months ago
tingbo.liao
3c8df6358f
Further rearranged the rotm kernel for the different architectures.
Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>
8 months ago
gxw
f6d6c14a96
mips: Fixed numpy CI failure
1 year ago
Martin Kroeker
a11f086c17
Update sscal_msa.c
1 year ago
Martin Kroeker
541e1b6959
disable the fast path for inc=1, alpha=0 as it does not handle x=NaN or Inf
1 year ago
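Note on the commit above: a minimal sketch of why a zero-fill fast path for alpha = 0 loses NaN and Inf inputs. This is not the code from sscal_msa.c; the function name scal_sketch and the zero_fill_fast_path flag are hypothetical and exist only for illustration. Under IEEE 754 arithmetic, 0 * NaN and 0 * Inf both evaluate to NaN, so storing a literal zero gives a different result from performing the multiplication.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical sketch, not the OpenBLAS kernel: scale n contiguous
 * floats by alpha. When zero_fill_fast_path is nonzero and alpha is 0,
 * the vector is overwritten with zeros instead of being multiplied. */
static void scal_sketch(size_t n, float alpha, float *x, int zero_fill_fast_path)
{
    if (zero_fill_fast_path && alpha == 0.0f) {
        for (size_t i = 0; i < n; i++)
            x[i] = 0.0f;         /* NaN and Inf inputs silently become 0 */
        return;
    }
    for (size_t i = 0; i < n; i++)
        x[i] = alpha * x[i];     /* 0 * NaN and 0 * Inf evaluate to NaN */
}
```

With the fast path disabled, NaN and Inf entries in x stay visible in the output as NaN, which appears to be the behavior the commit restores.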
Martin Kroeker
c08113c279
fix special cases of x= NAN or INF
1 year ago
Martin Kroeker
5ed4f24d6e
Handle corner cases with INF and NAN arguments
1 year ago
Martin Kroeker
8c05765a5a
fix other corner cases where x=INF
1 year ago
gxw
9c39e969f5
mips64: Fixed MSA optimization bugs for zgemv and cgemv
1 year ago
Martin Kroeker
09e84bd29a
fix loop condition for incx < 0
1 year ago
Martin Kroeker
f747aedb52
fix loop condition for incx < 0
1 year ago
Martin Kroeker
e5d2725e5a
Merge pull request #4185 from XiWeiGu/mips_enable_msa
MIPS: Enable MSA
1 year ago
Martin Kroeker
7df363e1e2
temporarily disable the MSA C/ZSCAL kernels
1 year ago
Martin Kroeker
25b0c48082
Update zscal.c
1 year ago
Martin Kroeker
5e7f714e93
Update zscal.c
1 year ago
Martin Kroeker
acf17a825d
Handle NAN in input
1 year ago
Martin Kroeker
f692178792
Allow negative INCX (API change from version 3.10 of the reference implementation)
2 years ago
gxw
4d0f000db6
MIPS: Enable MSA
2 years ago
gxw
edea1bcfaf
MIPS64: Fixed failed utest dsdot:dsdot_n_1 when TARGET=I6500
3 years ago
Martin Kroeker
b7df500106
Add generic mips32 target
3 years ago
gxw
4b548857d6
Add msa support for loongson
1. Using core loongson3r3 and loongson3r4 for loongson
2. Add DYNAMIC_ARCH for loongson
Change-Id: I1c6b54dbeca3a0cc31d1222af36a7e9bd6ab54c1
4 years ago
Martin Kroeker
7f11e33e8d
Merge pull request #3025 from TiredNotTear/develop
MIPS: Fix two bugs
4 years ago
Hao Chen
ad38bd0e89
Fix failed cgemv and zgemv test case after using msa optimization
The cgemv and zgemv test cases call cgemv_n/t_msa.c and zgemv_n/t_msa.c in a MIPS environment.
When the macro CONJ is defined, the result is wrong because OP2 is defined incorrectly.
This patch corrects the definition of OP2, and the corresponding tests now pass.
4 years ago
Hao Chen
47b639cc9b
Fix failed sswap and dswap case by using msa optimization
The swap test case calls sswap_msa.c and dswap_msa.c in a MIPS environment.
When inc_x or inc_y is equal to zero, both functions return wrong results.
This patch adds handling for inc_x or inc_y equal to zero, and the swap test case now passes.
4 years ago
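Note on the commit above: a minimal scalar sketch of a swap loop that stays well defined when a stride is zero. It is not the code from sswap_msa.c or dswap_msa.c, and whether the patch handles the case this way is an assumption. The name swap_ref is hypothetical, and the sketch only covers non-negative strides. Because elements are exchanged one pair at a time, a zero stride simply revisits the same element on every iteration.

```c
#include <stddef.h>

/* Hypothetical scalar reference, not the MSA kernel: exchange n
 * element pairs of x and y, advancing by inc_x and inc_y (both
 * assumed to be >= 0 in this sketch). */
static void swap_ref(size_t n, float *x, ptrdiff_t inc_x,
                     float *y, ptrdiff_t inc_y)
{
    ptrdiff_t ix = 0, iy = 0;
    for (size_t i = 0; i < n; i++) {
        float tmp = x[ix];
        x[ix] = y[iy];
        y[iy] = tmp;
        ix += inc_x;   /* a zero stride keeps pointing at the same element */
        iy += inc_y;
    }
}
```

For example, with n = 3, inc_x = 0, inc_y = 1, x = {1} and y = {2, 3, 4}, the loop leaves x = {4} and y = {1, 2, 3}.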
Jin Bo
65de6f5957
Fix test errors reported by cblas_cgemm & cblas_ctrmm
The file cgemm_kernel_8x4_msa.c holds the MSA-optimized
code for cblas_cgemm and cblas_ctrmm. It defines two
macros: CGEMM_SCALE_1X2 and CGEMM_TRMM_SCALE_1X2. The pc1
array indices in the two macros should be 0 and 1.
4 years ago
Martin Kroeker
e55ec82bb9
Delete KERNEL.1004K
5 years ago
Martin Kroeker
7353ea5afc
Delete KERNEL.24K
5 years ago
Martin Kroeker
6a04efb122
Rename KERNEL files to include MIPS prefix
5 years ago
Martin Kroeker
d712ea724c
Add MIPS24K support
5 years ago
Martin Kroeker
cdbe0f0235
Add MIPS implementation of ?sum
as trivial copy of ?asum with the fabs calls removed
6 years ago
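Note on the commit above: a minimal sketch of the relationship it describes, where the plain sum is the absolute-value sum with the fabs calls removed. The function names asum_sketch and sum_sketch are hypothetical, and the sketch assumes a positive stride and double precision. It is not the actual MIPS kernel source.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical sketch: sum of absolute values of n strided elements. */
static double asum_sketch(size_t n, const double *x, size_t inc_x)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += fabs(x[i * inc_x]);
    return s;
}

/* The same loop with the fabs call removed gives the plain sum. */
static double sum_sketch(size_t n, const double *x, size_t inc_x)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i * inc_x];
    return s;
}
```

For x = {1, -2, 3} with stride 1, asum_sketch returns 6 and sum_sketch returns 2.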
Martin Kroeker
86a824c97f
Fix wrong comparison that made IMIN identical to IMAX
as reported by aarnez in #1990
6 years ago
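Note on the commit above: a minimal sketch of how a single comparison operator separates an index-of-minimum routine from an index-of-maximum routine. The names imin_sketch and imax_sketch are hypothetical, and the 1-based return value with 0 for an empty vector is an assumption of this sketch. It is not the kernel code that was fixed.

```c
#include <stddef.h>

/* Hypothetical sketch: 1-based index of the first smallest element,
 * or 0 when n == 0. */
static size_t imin_sketch(size_t n, const float *x)
{
    if (n == 0)
        return 0;
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (x[i] < x[best])    /* '<' selects the minimum */
            best = i;
    return best + 1;
}

/* Identical loop with the comparison reversed: index of the maximum. */
static size_t imax_sketch(size_t n, const float *x)
{
    if (n == 0)
        return 0;
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (x[i] > x[best])    /* '>' selects the maximum */
            best = i;
    return best + 1;
}
```

If the comparison in the minimum routine is written as '>', both functions return the same index, which matches the symptom reported in the commit message.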
Martin Kroeker
8dd3515fa2
Merge pull request #1565 from martin-frbg/mipstypo
Remove extraneous brace from previous commit of mips dsdot fix
7 years ago
Martin Kroeker
95f7f0229c
Remove extraneous brace from previous commit
7 years ago
Martin Kroeker
893b535540
Use correct data type for initializers of v2f64, v4f32
Fixes #1561
7 years ago
Martin Kroeker
9d5098dbc9
Add MIPS 1004K target (Mediatek MT7621 SOC)
7 years ago
Martin Kroeker
954f1832de
Merge pull request #1540 from martin-frbg/mips32-zasum
Fix typo in MIPS P5600 complex ASUM code selection
7 years ago
Martin Kroeker
941ad280a8
Fix typo in MIPS P5600 complex ASUM code selection
7 years ago
Martin Kroeker
0fe434598b
Fix precision of mips dsdot
7 years ago
Andrew
13e137fbc9
Initialize uninitialized variables (cppcheck)
7 years ago
Shivraj Patil
a4d97d980f
Added rot functions.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
8 years ago
kaustubh
1480f3df71
Add msa optimization for AXPY, COPY, SCALE, SWAP
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
8 years ago
kaustubh
88afb3bc94
Add msa optimization for AXPY, COPY, SCALE, SWAP
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
8 years ago
Shivraj Patil
a9bf8a781a
Added prefetch to CGEMV and ZGEMV.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
8 years ago
kaustubh
5f93aa5f87
Updated data prefetch in TRSM, ASUM, DOT functions
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
8 years ago
kaustubh
9db451acd0
Updated data prefetch in TRSM, ASUM, DOT functions
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
8 years ago
kaustubh
3eaff85191
Updated data prefetch in TRSM, ASUM, DOT functions
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
8 years ago
kaustubh
00abce3b93
Add data prefetch in DOT and ASUM functions
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
9 years ago
kaustubh
f3419e634c
SGEMM, DGEMM, CGEMM, ZGEMM functions data prefetch
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
9 years ago
kaustubh
90e2321ac3
STRSM, DTRSM functions data prefetch
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
9 years ago
Martin Kroeker
91610f3835
Update zdot_msa.c
9 years ago