318 Commits (06c09deee94e4d03ab814d576da95fb047acbdda)

Author SHA1 Message Date
  Chris Sidebottom 114316f361 Optimize SBGEMM / BGEMM for NEOVERSEV1 further 1 month ago
  Masato Nakagawa 7e29f11396 Multi-thread GEMM Performance Improvement on NeoverseV1 (DIVIDE_RATE=1) 2 months ago
  Martin Kroeker c504aedca1 Merge pull request #5400 from Mousius/neoversev2-target 2 months ago
  Chris Sidebottom 87247daadc Add NEOVERSEV2 target support 2 months ago
  Chris Sidebottom ea2faf0c9a Add optimized BGEMM for NEOVERSEN2 target 2 months ago
  Chris Sidebottom 740efd71c4 Add optimized BGEMM kernel for NEOVERSEV1 target 2 months ago
  Chris Sidebottom f95e7b0e32 Add infrastructure for BGEMM 3 months ago
  Masato Nakagawa 5253c8f165 Multi-thread Performance Improvement of GEMM with DIVIDE_RATE=1 for 3 months ago
  h-motoki bba75d5e45 GEMM_PREFERED_SIZE parameter has been changed for A64FX. 3 months ago
  Martin Kroeker d96daa220d Merge pull request #5290 from Srangrang/develop 3 months ago
  davidz-ampere aa90ab4142 Add support for Ampere AmpereOne processors 3 months ago
  davidz-ampere be68ef03b4 Add support for Ampere processors 3 months ago
  gkdddd 670ec6f757 Added shgemm_kernel_8x8 for RISCV64_ZVL128B and shgemm_kernel_16x8 for RISCV64_ZVL256B 4 months ago
  Srangrang 0a967797a1 Add FP16 support for RISCV 4 months ago
  Martin Kroeker a34b487f22 Remove spurious cast from Alpha and Cell's DEFAULT_ALIGN 5 months ago
  Vaisakh K V f66ca05b31 Merge branch 'develop' into topic/sgemm_direct_sme1 7 months ago
  Vaisakh K V d23eb3b93e Support for SME1 based sgemm_direct kernel for cblas_sgemm level 3 API 10 months ago
  Ye Tao c748e6a338 optimized sbgemm kernel for neoverse-v1 (sve-256) 10 months ago
  Aditya Tewari 4379a6fbe3 * checkpoint sbgemm for SVE-256 11 months ago
  Martin Kroeker 926e56e389 Align GEMM3M parameters for GENERIC with ZGEMM and add P/Q/R 10 months ago
  Martin Kroeker a47b3c8867 Fix unroll parameter selection for MIPS64_GENERIC 11 months ago
  Martin Kroeker 7c4f3638fd switch PPCG4 SGEMM kernel to 4x4 1 year ago
  gxw 48698b2b1d LoongArch64: Rename core 1 year ago
  Chip Kerchner b1737698db Fix DEFAULTS in SBGEMM for POWER10. Also comparisons for SBGEMM unit test can be exactly due to epilison differences. 1 year ago
  Piotr Kubaj 4c12090776 Fix build on FreeBSD/powerpc64* 1 year ago
  gxw 6017ad7146 loongarch64: Update dgemm_kernel_16x4 to dgemm_kernel_16x6 1 year ago
  Usui, Tetsuzo ca673ca774 Add GEMM_PREFERED_SIZE parameter for Neoverse V1 1 year ago
  Martin Kroeker 93d975d8fd Merge pull request #4593 from XiWeiGu/loongarch_add_buffer_offset 1 year ago
  gxw d8c4ea8793 loongarch: Optimizing the performance of the GEMM on servers 1 year ago
  Martin Kroeker ba6d485102 Adjust SWITCH_RATIO for ZEN and apply GEMM_PREFERRED_SIZE 1 year ago
  Martin Kroeker 584e87661d set SWITCH_RATIO for Cortex-A76 1 year ago
  Martin Kroeker b925f61fb0 Add support for Cortex-A76 1 year ago
  Rajalakshmi Srinivasaraghavan f5b2a877e2 POWER9: Use default param values from POWER8 on AIX 1 year ago
  pengxu 4787a55c64 Optimized cgemm kernel 16x4 LASX for LoongArch 1 year ago
  pengxu fe3da43b7d Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch 1 year ago
  Martin Kroeker e5d2725e5a Merge pull request #4185 from XiWeiGu/mips_enable_msa 1 year ago
  Sergei Lewis 1093def0d1 Merge branch 'risc-v' into develop 1 year ago
  Martin Kroeker 889c5d026a Merge pull request #4456 from kseniyazaytseva/riscv-rvv10 1 year ago
  kseniyazaytseva b193ea3d7b Fix BLAS and LAPACK tests for RVV 1.0 target, update to 0.12.0 intrincics 1 year ago
  Dirreke ec89466e14 Add CSKY support 1 year ago
  Martin Kroeker 504f9b0c5e Increase S/D GEMM PQ to match typical L2 size as forNeoverseV1 1 year ago
  Martin Kroeker 2802478449 revert change to Loongson2k1000 zgemm 1 year ago
  Martin Kroeker 44b5b9e39f Update C/ZGEMM MN for Loongson2k1000 1 year ago
  Martin Kroeker 519b40fad9 Merge pull request #4398 from yinshiyou/la-dev 1 year ago
  pengxu a5d0d21378 loongarch64: Add zgemm and cgemm optimization 1 year ago
  Hao Chen 179ed51d3b Add dgemm_kernel_8x4.S file. 1 year ago
  Darshan Patel dab0da8243 Update GEMM param for NEOVERSEV1 1 year ago
  Octavian Maghiar e4586e81b8 [RISC-V] Add RISC-V Vector 128-bit target 1 year ago
  Rajalakshmi Srinivasaraghavan 980f702f72 POWER: AIX: Make use of power10 optimization 1 year ago
  gxw 553cc1372f LoongArch64: Add sgemm_kernel 2 years ago