2558 Commits (06c09deee94e4d03ab814d576da95fb047acbdda)

Author SHA1 Message Date
  CDAC-SSDG f62519cc87 Delete kernel/arm64/rot_kernel_sve.c 11 months ago
  CDAC-SSDG 10857c9df4 Delete kernel/arm64/rot_kernel_c.c 11 months ago
  CDAC-SSDG b9f51a5cf7 Delete kernel/arm64/rot.c 11 months ago
  Martin Kroeker 81666de4ef Merge pull request #5007 from martin-frbg/issue5006 11 months ago
  Martin Kroeker 3345007d8f retire the thunderx2 NRM2 kernels due to reported inaccuracies and NAN 11 months ago
  Martin Kroeker 5fe983db29 retire the thunderx2 nrm2 kernels for now due to NAN and inaccuracies 11 months ago
  Iha, Taisei 4918beecbe Loop-unrolled transposed [SD]GEMV kernels for A64FX and Neoverse V1 11 months ago
  Juliya32 3b2421cba0 Add files via upload 1 year ago
  Juliya32 012fe4da36 Delete kernel/arm64/rot_kernel_sve.c 1 year ago
  Juliya32 d90ee00f85 Delete kernel/arm64/rot_kernel_c.c 1 year ago
  Juliya32 668e28adc4 Delete kernel/arm64/rot.c 1 year ago
  SushilPratap04 fa880ab1cf Update KERNEL.ARMV8SVE 1 year ago
  SushilPratap04 7822ae9617 Added sve kernels for rot routine. 1 year ago
  SushilPratap04 b8bc2a752e Added sve optimized kernels for swap routine 1 year ago
  CDAC-SSDG 0667cf6c92 Added optimized scal routine files 1 year ago
  gxw 73c6a28073 x86_64: opt somatcopy_ct with AVX 1 year ago
  Ayappan Perumal 020cce1068 Fix build issues with gcc compiler as well 1 year ago
  Ayappan Perumal b6ec73e77c Fix AIX build 1 year ago
  Martin Kroeker 016bdb9b0b Merge pull request #4946 from XiWeiGu/la64_omatcopy_lasx 1 year ago
  Chip Kerchner ab71a1edf2 Better VSX. 1 year ago
  gxw bb31bbef52 LoongArch64: Opt somatcopy_ct with LASX 1 year ago
  gxw b37129341b LoongArch64: Opt somatcopy_cn with LASX 1 year ago
  gxw acf6cab304 LoongArch64: Opt somatcopy_rn with LASX 1 year ago
  gxw 15edb441bf LoongArch64: Opt somatcopy_rt with LASX 1 year ago
  Chip Kerchner 36bd3eeddf Vectorize BF16 GEMV (VSX & MMA). Use GEMM_GEMV_FORWARD_BF16 (for Power). 1 year ago
  Martin Kroeker e52d9b4cf1 Merge pull request #4928 from austinpagan/czgemm_in_c 1 year ago
  Gordon Fossum 0b7fb5c791 CGEMM & ZGEMM using C code. 1 year ago
  Martin Kroeker 9783dd07ab Rename KERNEL.LOONGSONGENERIC to KERNEL.LA64_GENERIC 1 year ago
  Martin Kroeker c9e92348a6 Handle inf/nan if dummy2 flag is set 1 year ago
  Martin Kroeker d714013ab9 change sgemm kernel to 4x4 as the 16x4 altivec goes out of bounds 1 year ago
  Martin Kroeker de421b7764 Merge pull request #4904 from XiWeiGu/la64_cross_cmake 1 year ago
  gxw 30af9278dc LoongArch64: Enable cmake cross-compilation 1 year ago
  gxw 48698b2b1d LoongArch64: Rename core 1 year ago
  Deeksha Goplani 4894c54055 Improve TN case with further unrolling 1 year ago
  Martin Kroeker e05d98d00a expressly use fld.d/fst.d for floating point registers instead of LD/ST macros 1 year ago
  Chip Kerchner a0aeba631d Merge branch 'develop' into betterPowerGEMVTail 1 year ago
  Chip Kerchner 083faf7556 Merge branch 'develop' into betterPowerGEMVTail 1 year ago
  Chip Kerchner 75472b830a Merge branch 'develop' into betterPowerGEMVTail 1 year ago
  Henry Chen ef94b96530 Use ldc1 and sdc1 for the prologue and epilogue on LOONGSON3A 1 year ago
  Martin Kroeker 7ca835a82c address clang array overflow warning 1 year ago
  Martin Kroeker 46e331a917 remove the unworkable GEMM3M restriction from GENERIC again 1 year ago
  Martin Kroeker ccc23338d7 have the dummy GEMM3M kernel at least forward to regular GEMM 1 year ago
  Martin Kroeker f1c9803f9a add proper return statement 1 year ago
  Martin Kroeker 60abcc3991 add proper return statement 1 year ago
  Chip Kerchner 1a7b8c650d Merge branch 'develop' into betterPowerGEMVTail 1 year ago
  Martin Kroeker 9afd0c8afd Merge pull request #4814 from Mousius/gemv-proxy 1 year ago
  Martin Kroeker edbf093c98 Update zarch SCAL kernels to handle INF and NAN arguments (#4829) 1 year ago
  Chris Sidebottom ba2e989c67 Add accumulators to AArch64 GEMV Kernels 1 year ago
  Martin Kroeker a875304eb0 fix inverted conditional for NAN handling 1 year ago
  Martin Kroeker 24acdd6bbb correct offset 1 year ago