79 Commits (740efd71c4ad0f6f56371cabd23f086b985e0602)

Author SHA1 Message Date
  Chris Sidebottom 740efd71c4 Add optimized BGEMM kernel for NEOVERSEV1 target 2 months ago
  Martin Kroeker 343830c26f Add BGEMM parameter tables 2 months ago
  Chris Sidebottom f95e7b0e32 Add infrastructure for BGEMM 3 months ago
  gkdddd 670ec6f757 Added shgemm_kernel_8x8 for RISCV64_ZVL128B and shgemm_kernel_16x8 for RISCV64_ZVL256B 4 months ago
  Martin Kroeker 5141a90993 Fix ARMV9SME target in DYNAMIC_ARCH and add SME query code for MacOS (#5222) 4 months ago
  Vaisakh K V f66ca05b31 Merge branch 'develop' into topic/sgemm_direct_sme1 7 months ago
  Vaisakh K V d23eb3b93e Support for SME1 based sgemm_direct kernel for cblas_sgemm level 3 API 10 months ago
  Martin Kroeker 4924319c50 fix position of srotm, qrotm 8 months ago
  tingbo.liao 3c8df6358f Further rearranged the rotm kernel for the different architectures. 8 months ago
  gxw 48698b2b1d LoongArch64: Rename core 1 year ago
  Mark Ryan 3b715e6162 Add autodetection for riscv64 1 year ago
  Martin Kroeker 93d975d8fd Merge pull request #4593 from XiWeiGu/loongarch_add_buffer_offset 1 year ago
  gxw d8c4ea8793 loongarch: Optimizing the performance of the GEMM on servers 1 year ago
  Chen Yu 8e39c05efd Get the l2 cache size via environment variable on confidential VM 1 year ago
  Honglin Zhu 90f041e348 Invoke the syscall to allow the use of amx tiles 2 years ago
  Martin Kroeker 437c0bf2b4 Merge pull request #3843 from Mousius/switch-ratio 2 years ago
  Chris Sidebottom 32f2fafde7 Propagate SWITCH_RATIO to DYNAMIC_ARCH builds 2 years ago
  Martin Kroeker 38d6fb4225 Fix dependencies in builds with specified subsets of precision types 2 years ago
  Martin Kroeker 5481c328e8 fix DYNAMIC_ARCH builds that use only a subset of precisions 2 years ago
  Martin Kroeker c9d78dc3b2 Remove excess initializer (leftover from rework of PR 3793) 2 years ago
  Honglin Zhu 4989e039a5 Define SBGEMM_ALIGN_K for DYNAMIC_ARCH build 2 years ago
  Honglin Zhu 843e9fd0b9 Fix typo error 2 years ago
  Honglin Zhu b00d5b9746 New sbgemm implementation for Neoverse N2 2 years ago
  gxw fbfe1daf6e LoongArch64: Add DYNAMIC_ARCH support 3 years ago
  Martin Kroeker 40302558ed Remove extraneous (and wrong) definition of sbgemm_r on x86_64 3 years ago
  Martin Kroeker d9894f45d3 Define sbgemm_r to fix DYNAMIC_ARCH builds 3 years ago
  Wangyang Guo 3dc6052c7e initial support for Sapphire Rapids platform 4 years ago
  Wangyang Guo 1d83ca4bca Small Matrix: support BFLOAT16 data type 4 years ago
  Wangyang Guo 478d1086c1 Small Matrix: support DYNAMIC_ARCH build 4 years ago
  gxw 4b548857d6 Add msa support for loongson 4 years ago
  Chen, Guobing a7b1f9b1bb Implementation of BF16 based gemv 5 years ago
  Martin Kroeker 10379fc83b Use ifdef instead of if 5 years ago
  Martin Kroeker 3aecafad80 Change "HALF" and "sh" to "BFLOAT16" and "sb" 5 years ago
  Martin Kroeker 6b6adf8a4a Allow compiling only a subset of kernels for specific variable types 5 years ago
  Martin Kroeker dfbc62ef7e Support building only a subset of types 5 years ago
  Chen, Guobing deaeb6c5b8 Add bfloat16 based dot and conversion with single/double 5 years ago
  Martin Kroeker 9ee21a0a39 Merge pull request #2780 from Guobing-Chen/CPL_build_support 5 years ago
  Martin Kroeker 75eeb265d7 [WIP] Refactor the driver code for direct SGEMM (#2782) 5 years ago
  Chen, Guobing e740c4873d Enable COOPERLAKE build target 5 years ago
  Martin Kroeker 5dd14e3d48 Make building the bfloat16 functions conditional on option BUILD_HALF (#2590) 5 years ago
  Rajalakshmi Srinivasaraghavan 67cc4b9e16 Fix warnings in clang and export symbol 5 years ago
  Rajalakshmi Srinivasaraghavan a87793e03c Fix DYNAMIC_ARCH compilation errors 5 years ago
  Rajalakshmi Srinivasaraghavan 7eb55504b1 RFC : Add half precision gemm for bfloat16 in OpenBLAS 5 years ago
  int_13h 96ad579428 add in runtime cpu detection for zarch (#2349) 5 years ago
  Martin Kroeker ccfb7ead15 Merge pull request #2072 from martin-frbg/sum 6 years ago
  Rashmica Gupta bcdf1d4917 Add in runtime CPU detection for POWER. 6 years ago
  Martin Kroeker b9f4943a14 Add ?sum 6 years ago
  Ashwin Sekhar T K d5aeff636f ARM64: Enable DYNAMIC_ARCH 7 years ago
  Ashwin Sekhar T K e7b66cd36e ARM64: Fix DYNAMIC_ARCH compilation for cores which dont use GEMM3M 7 years ago
  Martin Kroeker 6f71c0fce4 Return a somewhat sane default value for L2 cache size if cpuid retur… (#1611) 7 years ago