82 Commits (cd8ac192a901b38980755583faaa35559df7910a)

Author SHA1 Message Date
  Martin Kroeker 61d803547a Apply USE_TRMM to MIPS64_GENERIC as to GENERIC 2 years ago
  Martin Kroeker 898cf5faf3 Add Elbrus e2k architecture support 3 years ago
  Bine Brank b6a445cfd8 adapt Makefile for SVE trsm 3 years ago
  Bine Brank bb33446b40 fix makefile.L3 3 years ago
  Bine Brank 07fa6fa3b1 configure Makefile for sve 3 years ago
  Bine Brank 0140373802 add sve ztrmm 3 years ago
  Bine Brank 774267fdac adjust Makefile.L3 for SVE 4 years ago
  Bine Brank 86ae89bf33 add sgemm kernel and copy functions for sgemm and ssymm 4 years ago
  Bine Brank 9b9cb90bb1 modify Makefile for SVE copy 4 years ago
  Bine Brank 9388f05a3c configure SVE Makefile 4 years ago
  Wangyang Guo 3dc6052c7e initial support for Sapphire Rapids platform 4 years ago
  Martin Kroeker f1e3305974 Add workaround for Windows10 macro name clash 4 years ago
  Wangyang Guo 619588fbab sbgemm: remove unnecessary b0 files 4 years ago
  Wangyang Guo 1d83ca4bca Small Matrix: support BFLOAT16 data type 4 years ago
  Wangyang Guo 989e6bbdd3 Small Matrix: reduce generic kernel source files 4 years ago
  Wangyang Guo 5dc7c3c8e5 Small Matrix: add GEMM_SMALL_MATRIX_PERMIT to tune small matrics case 4 years ago
  Xianyi Zhang 57ed58cefe Refs #2587 Add small matrix optimization reference kernel for c/zgemm. 5 years ago
  Xianyi Zhang 17d32a4a82 Change a1b0 gemm to b0 gemm. 5 years ago
  Xianyi Zhang 59cb5de46b Refs #2587 Fix typos. 5 years ago
  Xianyi Zhang be3349405d Add alpha=1.0 beta=0.0 for small gemm. 5 years ago
  Xianyi Zhang 0a2077901c Add small marix optimization kernel interface. 5 years ago
  Martin Kroeker c4da892ba0 Only filter out -mavx on Sandybridge ZGEMM/ZTRMM kernels 4 years ago
  Martin Kroeker bd60fb6ffc filter out -mavx flag on zgemm kernels as it can cause problems with older gcc 4 years ago
  gxw 4b548857d6 Add msa support for loongson 5 years ago
  Zhang Xianyi d7ba7679b6 Merge branch 'develop' into risc-v 5 years ago
  Rajalakshmi Srinivasaraghavan b5d30b390d Fix build issues with bfloat16 5 years ago
  Martin Kroeker 3aecafad80 Change "HALF" and "sh" to "BFLOAT16" and "sb" 5 years ago
  Martin Kroeker 6b6adf8a4a Allow compiling only a subset of kernels for specific variable types 5 years ago
  Martin Kroeker 9ee21a0a39 Merge pull request #2780 from Guobing-Chen/CPL_build_support 5 years ago
  Martin Kroeker 75eeb265d7 [WIP] Refactor the driver code for direct SGEMM (#2782) 5 years ago
  Chen, Guobing e740c4873d Enable COOPERLAKE build target 5 years ago
  Rajalakshmi Srinivasaraghavan 475b5c95b9 Remove extra symbol in Makefile 5 years ago
  Martin Kroeker da17abec87 fix trailing whitespace 5 years ago
  Martin Kroeker b144423f0f Do not define USE_TRMM for 32bit POWER8 5 years ago
  Martin Kroeker ed7e155c35 Merge branch 'develop' into aix 5 years ago
  Martin Kroeker c854ef5471 Fix variable names in conditional 5 years ago
  Martin Kroeker c0afc11742 Fix POWERPC builds on AIX (gcc/gfortran 7) 5 years ago
  Kavana Bhat df4ade070f Fix for #2671 5 years ago
  Rajalakshmi Srinivasaraghavan 9fe930f205 powerpc: Add support for future processor 5 years ago
  Martin Kroeker 5dd14e3d48 Make building the bfloat16 functions conditional on option BUILD_HALF (#2590) 5 years ago
  Rajalakshmi Srinivasaraghavan ff010f496e Build shgemm for all architecture 5 years ago
  Rajalakshmi Srinivasaraghavan 7eb55504b1 RFC : Add half precision gemm for bfloat16 in OpenBLAS 5 years ago
  Xianyi Zhang 4aa2d89217 Merge branch 'develop' into risc-v 5 years ago
  Martin Kroeker 1a6ea8ee6d Merge pull request #2338 from kavanabhat/aix_mod 6 years ago
  Kavana Bhat 6baa9b07d7 AIX changes for Power8 6 years ago
  Kavana Bhat 3938e59569 AIX changes for Power8 6 years ago
  Martin Kroeker e7c4d6705a Revert #2051 and replace with a better fix (#2261) 6 years ago
  Kavana Bhat 3dc6b26eff AIX changes for Power8 6 years ago
  Martin Kroeker 7c51cc8527 Merge branch 'develop' into develop 6 years ago
  AbdelRauf 853a18bc17 power9 makefile. dgemm based on power8 kernel with following changes : 32x unrolled 16x4 kernel and 8x4 kernel using (lxv stxv butterfly rank1 update). improvement from 17 to 22-23gflops. dtrmm cases were added into dgemm itself 6 years ago