2430 Commits (61b9339d3a1fd7a4c4d91fce92ac55e41f80a08a)

Author SHA1 Message Date
  Egbert Eich ea6515c4b3 On zarch don't produce objects from assembler with a writable stack section 8 months ago
  Ye Tao f27ba5efd1 fix bugs in aarch64 sbgemv_n kernel 8 months ago
  Annop Wongwathanarat edef2e4441 Fix bug in ARM64 sbgemv_t 8 months ago
  Martin Kroeker b55ca71d5b Merge pull request #5182 from annop-w/sgemm_ncopy 8 months ago
  Martin Kroeker 2f778554b8 Merge pull request #5181 from taoye9/change_sbgemn_cast_bf16 8 months ago
  Annop Wongwathanarat 9807f56580 Optimize aarch64 sgemm_ncopy 8 months ago
  Martin Kroeker a3e7b16072 Merge pull request #5157 from manaalmj/feature 8 months ago
  Ye Tao 4c00099ed6 replace customize bf16_to_fp32 with arm neon vcvtah_f32_bf16 8 months ago
  Annop Wongwathanarat a085b6c9ec Fix aarch64 sbgemv_t compilation error for GCC < 13 8 months ago
  manjam01 5c4e38ab17 Optimize gemv_n_sve kernel 8 months ago
  Martin Kroeker 1d5ed5c46b Merge pull request #5168 from taoye9/add_sbgemvn_on_neonversen2 8 months ago
  Ye Tao 6b8b35cdf2 fix minor issues of redeclaration of float x0,x1 in sbgemv_n_neon.c 8 months ago
  Ye Tao 38ee7c9301 Add dispatch of SBGEMVNKERNEL for NEOVERSEN2 and NEOVERSEV2 8 months ago
  Martin Kroeker 2b941c44b5 Merge branch 'develop' into sbgemv_n_neon 8 months ago
  Ye Tao 35bdbca153 Add sbgemv_n_neon kernel for arm64. 8 months ago
  Annop Wongwathanarat edaf51dd99 Add sbgemv_t_bfdot kernel for ARM64 9 months ago
  Martin Kroeker 77fba0f400 Fix "dummy2" flag handling 9 months ago
  Martin Kroeker eb84aac7ad Merge pull request #5084 from quic/topic/sgemm_direct_sme1 9 months ago
  Martin Kroeker b9ae246f20 define USE_TRMM for RISCV64 targets as well 9 months ago
  Vaisakh K V f66ca05b31 Merge branch 'develop' into topic/sgemm_direct_sme1 9 months ago
  Vaisakh K V d23eb3b93e Support for SME1 based sgemm_direct kernel for cblas_sgemm level 3 API 11 months ago
  Martin Kroeker 8d487ef6eb Merge pull request #5124 from XiWeiGu/LoongArch64-LA264-lapack-fixed 9 months ago
  Martin Kroeker 81eed868b6 Restore the non-vectorized code from before PR4880 for POWER8 9 months ago
  Martin Kroeker 98b5ef929c Restore the non-vectorized code from before PR4880 for POWER8 9 months ago
  gxw 2c4a5cc6e6 LoongArch64: Fixed snrm2_lsx.S and cnrm2_lsx.S 9 months ago
  gxw 9e75d6b3d1 LoongArch64: Fixed swap_lsx.S 9 months ago
  gxw e8c740368c LoongArch64: Fixed rot_lsx.S and crot_lsx.S 9 months ago
  Hao Chen c2212d0abd LoongArch64: Fixed copy_lsx.S 9 months ago
  Hao Chen 7f1ebc7ae6 LoongArch64: Fixed iamax_lsx.S 9 months ago
  Hao Chen 31d326f895 LoongArch64: Fixed dot_lsx.S 10 months ago
  Hao Chen 5d6356bc16 LoongArch64: Fixed amax_lsx.S 10 months ago
  Ye Tao c748e6a338 optimized sbgemm kernel for neoverse-v1 (sve-256) 11 months ago
  Aditya Tewari 4379a6fbe3 * checkpoint sbgemm for SVE-256 1 year ago
  Martin Kroeker d7036cfd74 Remove trailing blanks that break the cmake parser 10 months ago
  Martin Kroeker 6e393a5599 Merge branch 'develop' into gemv_t 10 months ago
  Martin Kroeker 876ba58e28 Merge pull request #5091 from goplanid/develop 10 months ago
  Martin Kroeker 180ba5e7d0 Merge pull request #5069 from tingboliao/dev_rotm_20250107 10 months ago
  Deeksha Goplani d1bfa979f7 small gemm kernel packing modifications 10 months ago
  Martin Kroeker 1a6a9fb22f add another generator line for rotm 10 months ago
  Martin Kroeker 4924319c50 fix position of srotm, qrotm 10 months ago
  Martin Kroeker b58cba9eb6 fix qrotm build rules 10 months ago
  tingbo.liao 3c8df6358f Further rearranged the rotm kernel for the different architectures. 10 months ago
  Annop Wongwathanarat c0318cea6e Simplify gemv_t_sve_v1x3 kernel 10 months ago
  Martin Kroeker 87083fdbf6 [WIP] Work around assembler limitations in current LLVM for Windows on Arm (#5076) 10 months ago
  tingbo.liao ef7f54b357 Optimized the gemm_tcopy_8_rvv to be compatible with the vlens 128 and 256. 10 months ago
  gxw e0a8216554 LoongArch64: Update dsymv LSX version 10 months ago
  gxw a9070ba3f9 LoongArch64: Update ssymv LSX version 10 months ago
  Xi Ruoyao af10c132b8 LoongArch64: Fix dsymv and ssymv LASX version 10 months ago
  Martin Kroeker d74eb02954 Merge pull request #5057 from martin-frbg/issue5050 10 months ago
  Martin Kroeker 30f7a4120b Merge pull request #5056 from tingboliao/dev_omatcopy_20250108 10 months ago