2411 Commits (f66ca05b313cf936eb8f75cc7ea1a87549e5b2a9)

Author SHA1 Message Date
  Vaisakh K V f66ca05b31 Merge branch 'develop' into topic/sgemm_direct_sme1 7 months ago
  Vaisakh K V d23eb3b93e Support for SME1 based sgemm_direct kernel for cblas_sgemm level 3 API 10 months ago
  Martin Kroeker 8d487ef6eb Merge pull request #5124 from XiWeiGu/LoongArch64-LA264-lapack-fixed 7 months ago
  Martin Kroeker 81eed868b6 Restore the non-vectorized code from before PR4880 for POWER8 7 months ago
  Martin Kroeker 98b5ef929c Restore the non-vectorized code from before PR4880 for POWER8 7 months ago
  gxw 2c4a5cc6e6 LoongArch64: Fixed snrm2_lsx.S and cnrm2_lsx.S 7 months ago
  gxw 9e75d6b3d1 LoongArch64: Fixed swap_lsx.S 7 months ago
  gxw e8c740368c LoongArch64: Fixed rot_lsx.S and crot_lsx.S 7 months ago
  Hao Chen c2212d0abd LoongArch64: Fixed copy_lsx.S 7 months ago
  Hao Chen 7f1ebc7ae6 LoongArch64: Fixed iamax_lsx.S 7 months ago
  Hao Chen 31d326f895 LoongArch64: Fixed dot_lsx.S 8 months ago
  Hao Chen 5d6356bc16 LoongArch64: Fixed amax_lsx.S 8 months ago
  Ye Tao c748e6a338 optimized sbgemm kernel for neoverse-v1 (sve-256) 10 months ago
  Aditya Tewari 4379a6fbe3 * checkpoint sbgemm for SVE-256 11 months ago
  Martin Kroeker d7036cfd74 Remove trailing blanks that break the cmake parser 8 months ago
  Martin Kroeker 6e393a5599 Merge branch 'develop' into gemv_t 8 months ago
  Martin Kroeker 876ba58e28 Merge pull request #5091 from goplanid/develop 8 months ago
  Martin Kroeker 180ba5e7d0 Merge pull request #5069 from tingboliao/dev_rotm_20250107 8 months ago
  Deeksha Goplani d1bfa979f7 small gemm kernel packing modifications 8 months ago
  Martin Kroeker 1a6a9fb22f add another generator line for rotm 8 months ago
  Martin Kroeker 4924319c50 fix position of srotm, qrotm 8 months ago
  Martin Kroeker b58cba9eb6 fix qrotm build rules 8 months ago
  tingbo.liao 3c8df6358f Further rearranged the rotm kernel for the different architectures. 8 months ago
  Annop Wongwathanarat c0318cea6e Simplify gemv_t_sve_v1x3 kernel 8 months ago
  Martin Kroeker 87083fdbf6 [WIP] Work around assembler limitations in current LLVM for Windows on Arm (#5076) 8 months ago
  tingbo.liao ef7f54b357 Optimized the gemm_tcopy_8_rvv to be compatible with the vlens 128 and 256. 8 months ago
  gxw e0a8216554 LoongArch64: Update dsymv LSX version 8 months ago
  gxw a9070ba3f9 LoongArch64: Update ssymv LSX version 8 months ago
  Xi Ruoyao af10c132b8 LoongArch64: Fix dsymv and ssymv LASX version 8 months ago
  Martin Kroeker d74eb02954 Merge pull request #5057 from martin-frbg/issue5050 8 months ago
  Martin Kroeker 30f7a4120b Merge pull request #5056 from tingboliao/dev_omatcopy_20250108 8 months ago
  gxw 20a8e48f25 LoongArch64: Update ssymv LASX version 8 months ago
  gxw e0748588b8 LoongArch64: Update dsymv LASX version 8 months ago
  Martin Kroeker d91d4fa6e9 convert the beta=0 branch to a for loop as well 8 months ago
  Martin Kroeker 09e75f1588 fix absurd typo 8 months ago
  Martin Kroeker 2891fd8d6d Replace while loop with for 8 months ago
  tingbo.liao 0a5dbf13d3 Optimize the omatcopy_cn and zomatcopy_cn kernels with RVV 1.0 intrinsic. 8 months ago
  Sergey Fedorov 229efa42ff scal.S: use r11 on 32-bit Darwin on powerpc 9 months ago
  Sergey Fedorov 81e1be8d90 Revert "temporarily disable the default S/DSCAL kernel" 9 months ago
  Martin Kroeker 9b9c0aa5c9 temporarily disable the default S/DSCAL kernel 9 months ago
  tingbo.liao c37509c213 Optimize the nrm2_rvv function to further improve performance. 9 months ago
  tingbo.liao 0bea1cfd9d Optimize the zgemm_tcopy_4_rvv function to be compatible with the situations where the vector lengths(vlens) are 128 and 256. 9 months ago
  tingbo.liao d00cc400b1 Replaced the __riscv_vid_v_i32m2 and __riscv_vid_v_i64m2 with __riscv_vid_v_u32m2 and __riscv_vid_v_u64m2 for riscv64-unknown-linux-gnu-gcc compiling. 9 months ago
  Martin Kroeker 229d8a025e Merge pull request #4959 from CDAC-Bengaluru/level-1-sve 9 months ago
  SushilPratap04 3368a4e697 Update swap_kernel_sve.c 9 months ago
  CDAC-SSDG dd71e4234a Added Updated swap and rot sve kernels. 9 months ago
  CDAC-SSDG 06ffd411a5 Update KERNEL.ARMV8SVE 9 months ago
  CDAC-SSDG 765850194e Delete kernel/arm64/swap_kernel_sve.c 9 months ago
  CDAC-SSDG c17c19fbcf Delete kernel/arm64/swap_kernel_c.c 9 months ago
  CDAC-SSDG f6416c0e37 Delete kernel/arm64/swap.c 9 months ago