Martin Kroeker f4194fc65f
Merge branch 'develop' into la64_fixed_cscal_zscal
3 months ago
KERNEL Further rearranged the rotm kernel for the different architectures. 8 months ago
KERNEL.LA64_GENERIC Rename KERNEL.LOONGSONGENERIC to KERNEL.LA64_GENERIC 1 year ago
KERNEL.LA264 LoongArch64: Rename core 1 year ago
KERNEL.LA464 LoongArch64: Opt somatcopy_ct with LASX 11 months ago
KERNEL.generic Further rearranged the rotm kernel for the different architectures. 8 months ago
Makefile Add support for LOONGARCH64 4 years ago
amax.S Add support for LOONGARCH64 4 years ago
amax_lasx.S Loongarch64: fixed amax_lasx 5 months ago
amax_lsx.S LoongArch64: Fixed amax_lsx.S 7 months ago
amin.S Add support for LOONGARCH64 4 years ago
amin_lasx.S loongarch: Fixed {s/d/sc/dz}amin LASX opt 1 year ago
amin_lsx.S loongarch: Fixed {s/d/sc/dz}amin LSX opt 1 year ago
asum.S Delete the macro instruction "li" and use "li.d" instead 4 years ago
asum_lasx.S Loongarch64: fixed asum_lasx 5 months ago
asum_lsx.S loongarch64: Add and Refine asum optimization functions. 1 year ago
axpby_lasx.S loongarch: Fixed {s/d/c/z}axpby LASX opt 1 year ago
axpby_lsx.S loongarch: Fixed {s/d}axpby LSX opt 1 year ago
axpy_lasx.S loongarch64: Refine and add axpy optimization functions. 1 year ago
axpy_lsx.S loongarch64: Refine and add axpy optimization functions. 1 year ago
camax_lasx.S loongarch: Fixed dzamax 1 year ago
camax_lsx.S loongarch: Fixed dzamax 1 year ago
camin_lasx.S loongarch: Fixed {s/d/sc/dz}amin LASX opt 1 year ago
camin_lsx.S loongarch: Fixed {s/d/sc/dz}amin LSX opt 1 year ago
casum_lasx.S loongarch64: Add and Refine asum optimization functions. 1 year ago
casum_lsx.S loongarch64: Add and Refine asum optimization functions. 1 year ago
caxpby_lasx.S loongarch: Fixed {s/d/c/z}axpby LASX opt 1 year ago
caxpby_lsx.S LoongArch64: Opt {c/z}axpby 1 year ago
caxpy_lasx.S loongarch64: Refine and add axpy optimization functions. 1 year ago
caxpy_lsx.S loongarch64: Refine and add axpy optimization functions. 1 year ago
ccopy_lasx.S loongarch64: Add c/zcopy optimization functions. 1 year ago
ccopy_lsx.S loongarch64: Add c/zcopy optimization functions. 1 year ago
cdot_lasx.S Loongarch64: fixed cdot_lasx 5 months ago
cdot_lsx.S loongarch64: Add c/zdot optimization functions. 1 year ago
cgemm_kernel_2x2_lasx.S loongarch64: Add zgemm and cgemm optimization 1 year ago
cgemm_kernel_2x2_lsx.S loongarch64: Add zgemm and cgemm optimization 1 year ago
cgemm_kernel_8x4_lsx.S Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch 1 year ago
cgemm_kernel_16x4_lasx.S expressly use fld.d/fst.d for floating point registers instead of LD/ST macros 1 year ago
cgemm_ncopy_2_lasx.S loongarch64: Add zgemm and cgemm optimization 1 year ago
cgemm_ncopy_2_lsx.S loongarch64: Add zgemm and cgemm optimization 1 year ago
cgemm_ncopy_4_lasx.S Optimized cgemm kernel 16x4 LASX for LoongArch 1 year ago
cgemm_ncopy_4_lsx.S Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch 1 year ago
cgemm_ncopy_8_lsx.S Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch 1 year ago
cgemm_ncopy_16_lasx.S Loongarch64: fixed cgemm_ncopy_16_lasx 4 months ago
cgemm_tcopy_2_lasx.S loongarch64: Add zgemm and cgemm optimization 1 year ago
cgemm_tcopy_2_lsx.S loongarch64: Add zgemm and cgemm optimization 1 year ago
cgemm_tcopy_4_lasx.S Optimized cgemm kernel 16x4 LASX for LoongArch 1 year ago
cgemm_tcopy_4_lsx.S Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch 1 year ago
cgemm_tcopy_8_lsx.S Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch 1 year ago
cgemm_tcopy_16_lasx.S Optimized cgemm kernel 16x4 LASX for LoongArch 1 year ago
cgemv_n_4_lsx.S loongarch64: Fixed clang compilation issues 1 year ago
cgemv_n_8_lasx.S loongarch64: Fixed clang compilation issues 1 year ago
cgemv_t_4_lsx.S loongarch64: Fixed clang compilation issues 1 year ago
cgemv_t_8_lasx.S loongarch64: Fixed clang compilation issues 1 year ago
cnrm2.S Allow negative INCX (API change from version 3.10 of the reference implementation) 2 years ago
cnrm2_lasx.S Loongarch64: fixed cnrm2_lasx 5 months ago
cnrm2_lsx.S LoongArch64: Fixed snrm2_lsx.S and cnrm2_lsx.S 7 months ago
copy.S Delete the macro instruction "li" and use "li.d" instead 4 years ago
copy_lasx.S Loongarch64: fixed copy_lasx 5 months ago
copy_lsx.S LoongArch64: Fixed copy_lsx.S 7 months ago
crot_lasx.S loongarch64: Add c/zrot optimization functions. 1 year ago
crot_lsx.S LoongArch64: Fixed rot_lsx.S and crot_lsx.S 7 months ago
cscal_lasx.S Merge branch 'develop' into la64_fixed_cscal_zscal 3 months ago
cscal_lsx.S LoongArch64: Fixed LSX version of cscal and zscal 8 months ago
csum_lasx.S loongarch: Fixed {s/d/c/z}sum LASX opt 1 year ago
csum_lsx.S loongarch64: Add {c/z}swap and {c/z}sum optimization 1 year ago
cswap_lasx.S loongarch64: Add {c/z}swap and {c/z}sum optimization 1 year ago
cswap_lsx.S loongarch64: Add {c/z}swap and {c/z}sum optimization 1 year ago
dgemm_kernel_8x4.S Add dgemm_kernel_8x4.S file. 1 year ago
dgemm_kernel_16x4.S expressly use fld.d/fst.d for floating point registers instead of LD/ST macros 1 year ago
dgemm_kernel_16x6.S loongarch64: Update dgemm_kernel_16x4 to dgemm_kernel_16x6 1 year ago
dgemm_ncopy_4.S loongarch64: Optimize dgemm_kernel 3 years ago
dgemm_ncopy_4_lsx.S Optimize copy functions with lsx. 1 year ago
dgemm_ncopy_8_lsx.S loongarch64: Fixed utest fork:safety 1 year ago
dgemm_ncopy_16.S loongarch64: Fixed utest fork:safety 1 year ago
dgemm_small_kernel_nn_lasx.S LoongArch: DGEMM small matrix opt 1 year ago
dgemm_small_kernel_nt_lasx.S LoongArch: DGEMM small matrix opt 1 year ago
dgemm_small_kernel_tn_lasx.S LoongArch: DGEMM small matrix opt 1 year ago
dgemm_small_kernel_tt_lasx.S LoongArch: DGEMM small matrix opt 1 year ago
dgemm_small_matrix_permit.c LoongArch: DGEMM small matrix opt 1 year ago
dgemm_tcopy_4.S loongarch64: Optimize dgemm_kernel 3 years ago
dgemm_tcopy_4_lsx.S loongarch64: Fixed clang compilation issues 1 year ago
dgemm_tcopy_6.S loongarch64: Update dgemm_kernel_16x4 to dgemm_kernel_16x6 1 year ago
dgemm_tcopy_8_lsx.S loongarch64: Fixed clang compilation issues 1 year ago
dgemm_tcopy_16.S loongarch64: Optimize dgemm_kernel 3 years ago
dgemv_n_8_lasx.S loongarch64: Fixed clang compilation issues 1 year ago
dgemv_n_lsx.S Optimized sgemv and dgemv kernel LSX for LoongArch 1 year ago
dgemv_t_8_lasx.S loongarch64: Fixed clang compilation issues 1 year ago
dgemv_t_lsx.S Optimized sgemv and dgemv kernel LSX for LoongArch 1 year ago
dnrm2.S Allow negative INCX (API change from version 3.10 of the reference implementation) 2 years ago
dnrm2_lasx.S loongarch64: Refine copy,swap,nrm2,sum optimization. 1 year ago
dnrm2_lsx.S loongarch64: Refine copy,swap,nrm2,sum optimization. 1 year ago
dot.S Delete the macro instruction "li" and use "li.d" instead 4 years ago
dot_lasx.S Loongarch64: fixed dot_lasx 5 months ago
dot_lsx.S LoongArch64: Fixed dot_lsx.S 7 months ago
dscal_lasx.S loongarch64: Add optimizations for scal. 1 year ago
dscal_lsx.S loongarch64: Add optimizations for scal. 1 year ago
dsymv_L_lasx.S LoongArch64: Fix dsymv and ssymv LASX version 8 months ago
dsymv_L_lsx.S LoongArch64: Update dsymv LSX version 8 months ago
dsymv_U_lasx.S LoongArch64: Fix dsymv and ssymv LASX version 8 months ago
dsymv_U_lsx.S LoongArch64: Update dsymv LSX version 8 months ago
dtrsm_kernel_LN_16x4_lasx.S loongarch64: Fixed clang compilation issues 1 year ago
dtrsm_kernel_LT_16x4_lasx.S loongarch64: Fixed clang compilation issues 1 year ago
dtrsm_kernel_RN_16x4_lasx.S loongarch64: Fixed clang compilation issues 1 year ago
dtrsm_kernel_RT_16x4_lasx.S loongarch64: Fixed clang compilation issues 1 year ago
dtrsm_kernel_macro.S LoongArch64: Add dtrsm kernel 2 years ago
gemm_kernel.S Add support for LOONGARCH64 4 years ago
gemm_ncopy_6.prefx.c loongarch64: Update dgemm_kernel_16x4 to dgemm_kernel_16x6 1 year ago
gemv_n.S Delete the macro instruction "li" and use "li.d" instead 4 years ago
gemv_t.S Delete the macro instruction "li" and use "li.d" instead 4 years ago
iamax.S Delete the macro instruction "li" and use "li.d" instead 4 years ago
iamax_lasx.S Loongarch64: fixed iamax_lasx 5 months ago
iamax_lsx.S LoongArch64: Fixed iamax_lsx.S 7 months ago
iamin.S Delete the macro instruction "li" and use "li.d" instead 4 years ago
iamin_lasx.S loongarch: Fixed i{s/c/z}amin LASX opt 1 year ago
iamin_lsx.S loongarch64: Refine iamin optimization. 1 year ago
icamax_lasx.S Loongarch64: fixed icamax_lasx 5 months ago
icamax_lsx.S loongarch64: Fixed icamax_lsx 1 year ago
icamin_lasx.S loongarch: Fixed i{s/c/z}amin LASX opt 1 year ago
icamin_lsx.S loongarch: Fixed i{c/z}amin LSX opt 1 year ago
imax_lasx.S loongarch64: Refine imax optimization. 1 year ago
imax_lsx.S loongarch64: Refine imax optimization. 1 year ago
imin_lasx.S loongarch64: Refine imin optimization. 1 year ago
imin_lsx.S loongarch64: Refine imin optimization. 1 year ago
izamax.S Delete the macro instruction "li" and use "li.d" instead 4 years ago
izamin.S Delete the macro instruction "li" and use "li.d" instead 4 years ago
loongarch64_asm.S loongarch64: Fixed clang compilation issues 1 year ago
max.S Add support for LOONGARCH64 4 years ago
max_lasx.S loongarch64: Refine amax,amin,max,min optimization. 1 year ago
max_lsx.S loongarch64: Refine amax,amin,max,min optimization. 1 year ago
min.S Add support for LOONGARCH64 4 years ago
min_lasx.S loongarch64: Refine amax,amin,max,min optimization. 1 year ago
min_lsx.S loongarch64: Refine amax,amin,max,min optimization. 1 year ago
rot_lasx.S Loongarch64: fixed rot_lasx 5 months ago
rot_lsx.S LoongArch64: Fixed rot_lsx.S and crot_lsx.S 7 months ago
scal.S LoongArch: Fixed numpy CI failure 1 year ago
scal_lasx.S LoongArch: Fixed numpy CI failure 1 year ago
scal_lsx.S LoongArch: Fixed numpy CI failure 1 year ago
sgemm_kernel_16x8_lasx.S loongarch64: Fixed clang compilation issues 1 year ago
sgemm_ncopy_8_lasx.S loongarch64: Fixed clang compilation issues 1 year ago
sgemm_ncopy_16_lasx.S loongarch64: Fixed clang compilation issues 1 year ago
sgemm_tcopy_8_lasx.S loongarch64: Fixed clang compilation issues 1 year ago
sgemm_tcopy_16_lasx.S loongarch64: Fixed clang compilation issues 1 year ago
sgemv_n_8_lasx.S loongarch64: Fixed clang compilation issues 1 year ago
sgemv_n_lsx.S Optimized sgemv and dgemv kernel LSX for LoongArch 1 year ago
sgemv_t_8_lasx.S loongarch64: Fixed clang compilation issues 1 year ago
sgemv_t_lsx.S Optimized sgemv and dgemv kernel LSX for LoongArch 1 year ago
snrm2.S Allow negative INCX (API change from version 3.10 of the reference implementation) 2 years ago
snrm2_lasx.S Loongarch64: fixed snrm2_lasx 5 months ago
snrm2_lsx.S LoongArch64: Fixed snrm2_lsx.S and cnrm2_lsx.S 7 months ago
somatcopy_cn_lasx.c LoongArch64: Opt somatcopy_cn with LASX 11 months ago
somatcopy_ct_lasx.c LoongArch64: Opt somatcopy_ct with LASX 11 months ago
somatcopy_rn_lasx.c LoongArch64: Opt somatcopy_rn with LASX 11 months ago
somatcopy_rt_lasx.c LoongArch64: Opt somatcopy_rt with LASX 11 months ago
ssymv_L_lasx.S LoongArch64: Fix dsymv and ssymv LASX version 8 months ago
ssymv_L_lsx.S LoongArch64: Update ssymv LSX version 8 months ago
ssymv_U_lasx.S LoongArch64: Fix dsymv and ssymv LASX version 8 months ago
ssymv_U_lsx.S LoongArch64: Update ssymv LSX version 8 months ago
sum_lasx.S loongarch: Fixed {s/d/c/z}sum LASX opt 1 year ago
sum_lsx.S loongarch64: Refine copy,swap,nrm2,sum optimization. 1 year ago
swap.S Delete the macro instruction "li" and use "li.d" instead 4 years ago
swap_lasx.S Loongarch64: fixed swap_lasx 5 months ago
swap_lsx.S LoongArch64: Fixed swap_lsx.S 7 months ago
trsm_kernel_LN.S Add support for LOONGARCH64 4 years ago
trsm_kernel_LN_UNROLLN6.c loongarch64: Update dgemm_kernel_16x4 to dgemm_kernel_16x6 1 year ago
trsm_kernel_LT.S Add support for LOONGARCH64 4 years ago
trsm_kernel_LT_UNROLLN6.c loongarch64: Update dgemm_kernel_16x4 to dgemm_kernel_16x6 1 year ago
trsm_kernel_RN_UNROLLN6.c loongarch64: Update dgemm_kernel_16x4 to dgemm_kernel_16x6 1 year ago
trsm_kernel_RT.S Add support for LOONGARCH64 4 years ago
trsm_kernel_RT_UNROLLN6.c loongarch64: Update dgemm_kernel_16x4 to dgemm_kernel_16x6 1 year ago
zamax.S Add support for LOONGARCH64 4 years ago
zamin.S Add support for LOONGARCH64 4 years ago
zasum.S Add support for LOONGARCH64 4 years ago
zcopy.S Delete the macro instruction "li" and use "li.d" instead 4 years ago
zdot.S Delete the macro instruction "li" and use "li.d" instead 4 years ago
zgemm3m_kernel.S Add support for LOONGARCH64 4 years ago
zgemm_kernel.S Add support for LOONGARCH64 4 years ago
zgemm_kernel_2x2.S loongarch64: Add zgemm and cgemm optimization 1 year ago
zgemm_kernel_2x2_lasx.S loongarch64: Add zgemm and cgemm optimization 1 year ago
zgemm_kernel_4x4_lsx.S Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch 1 year ago
zgemm_kernel_8x4_lasx.S expressly use fld.d/fst.d for floating point registers instead of LD/ST macros 1 year ago
zgemm_ncopy_2_lasx.S loongarch64: Add zgemm and cgemm optimization 1 year ago
zgemm_ncopy_4_lasx.S Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch 1 year ago
zgemm_ncopy_4_lsx.S Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch 1 year ago
zgemm_ncopy_8_lasx.S Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch 1 year ago
zgemm_tcopy_2_lasx.S loongarch64: Add zgemm and cgemm optimization 1 year ago
zgemm_tcopy_4_lasx.S Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch 1 year ago
zgemm_tcopy_4_lsx.S Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch 1 year ago
zgemm_tcopy_8_lasx.S Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch 1 year ago
zgemv_n.S Delete the macro instruction "li" and use "li.d" instead 4 years ago
zgemv_n_2_lsx.S loongarch64: Fixed clang compilation issues 1 year ago
zgemv_n_4_lasx.S loongarch64: Fixed clang compilation issues 1 year ago
zgemv_t.S Delete the macro instruction "li" and use "li.d" instead 4 years ago
zgemv_t_2_lsx.S loongarch64: Fixed clang compilation issues 1 year ago
zgemv_t_4_lasx.S loongarch64: Fixed clang compilation issues 1 year ago
znrm2.S Allow negative INCX (API change from version 3.10 of the reference implementation) 2 years ago
znrm2_lasx.S loongarch64: Add c/znrm2 optimization functions. 1 year ago
znrm2_lsx.S loongarch64: Add c/znrm2 optimization functions. 1 year ago
zscal.S LoongArch64: Fixed scalar version of cscal and zscal 8 months ago
ztrsm_kernel_LT.S Add support for LOONGARCH64 4 years ago
ztrsm_kernel_RT.S Add support for LOONGARCH64 4 years ago