114 Commits (616cc28d82da8981b853cccf365439905f8efdba)

Author SHA1 Message Date
  yamazaki-mitsufumi 51ab1903e7 Expanding the scop of 2D thread distribution 1 year ago
  shivammonaka d49ebc54e1 Merge branch 'shivam-develop' into shivam-Locks 1 year ago
  shivammonaka bc191015e3 Using OpenMP locks with NUM_PARALLEL 1 year ago
  Martin Kroeker c4bd4a2e5d fix improper function prototypes (empty parentheses) 2 years ago
  Chris Sidebottom 32f2fafde7 Propagate SWITCH_RATIO to DYNAMIC_ARCH builds 2 years ago
  Honglin Zhu 4989e039a5 Define SBGEMM_ALIGN_K for DYNAMIC_ARCH build 2 years ago
  Honglin Zhu b00d5b9746 New sbgemm implementation for Neoverse N2 2 years ago
  Wangyang Guo 3dc6052c7e initial support for Sapphire Rapids platform 4 years ago
  Martin Kroeker 2f8220d757 Add sbgemm 4 years ago
  Martin Kroeker 307c4c0786 Fix typo 4 years ago
  Martin Kroeker e83df93975 Work around another recent macro name collision with winnt.h 4 years ago
  Martin Kroeker a554712439 remove extra/intermediate size step for min_jj introduced in PR747 4 years ago
  Martin Kroeker 5d26223f4a remove extra/intermediate size step of min_jj from PR747 4 years ago
  Martin Kroeker d3ff1f889f Convert ifndefs to ifneq 4 years ago
  Rajalakshmi Srinivasaraghavan b5d30b390d Fix build issues with bfloat16 5 years ago
  Martin Kroeker 006c7f6671 Change "HALF" and "sh" to "BFLOAT16" and "sb" 5 years ago
  Martin Kroeker 886a8e3190 Adapt for supporting only a subset of variable types 5 years ago
  Martin Kroeker ac653c94f3 Merge branch 'develop' into issue2588-cmake 5 years ago
  Martin Kroeker 988a6f429e Add BUILD_vartype defines 5 years ago
  Martin Kroeker e5e2fbd593 Support building only selected types 5 years ago
  y00512012 06cf73a239 fix a bug of trmm 5 years ago
  Martin Kroeker ddec244a5a Merge pull request #2838 from austinpagan/gordon_trmm 5 years ago
  fossum dfeca46098 Adding performance patch for trmm, just like #2836 5 years ago
  fossum 274d6e015b Fixing a performance bug in trsm_[LR].c. 5 years ago
  Martin Kroeker 330044d821 Fix potentiol domain error in sqrt 5 years ago
  Chen, Guobing e740c4873d Enable COOPERLAKE build target 5 years ago
  Martin Kroeker ce45af8151 Update conditional for atomics to use HAVE_C11 5 years ago
  Martin Kroeker 6f38de06d2 Update conditional for atomics to use HAVE_C11 5 years ago
  Martin Kroeker 5dd14e3d48 Make building the bfloat16 functions conditional on option BUILD_HALF (#2590) 5 years ago
  Rajalakshmi Srinivasaraghavan 7eb55504b1 RFC : Add half precision gemm for bfloat16 in OpenBLAS 5 years ago
  Ali Saidi 97ce6bbce2 Fix barriers in level3_thread 5 years ago
  wjc404 2f96a2c55b Update trmm_R.c 5 years ago
  wjc404 833bd0f8ff Update trmm_L.c 5 years ago
  wjc404 77b8f49556 Update level3_thread.c 5 years ago
  wjc404 1c3e20ce48 Update level3.c 5 years ago
  wjc404 e9fb8f62b1 Update level3_gemm3m_thread.c 5 years ago
  wjc404 4c35b8dbaa Update gemm3m_level3.c 5 years ago
  Martin Kroeker f3065a0eed Fix race conditions in multithreaded GEMM3M 5 years ago
  Martin Kroeker f343ed65b5 Avoid taking the root of a negative number 6 years ago
  Martin Kroeker f72fdf525c Merge pull request #1875 from martin-frbg/issue1851 6 years ago
  Martin Kroeker 113cb00b95 fix missing parenthesis 6 years ago
  Martin Kroeker 5192651706 Add CriticalSection handling instead of mutexes for Windows 6 years ago
  Martin Kroeker 2e6fae2aad Serialize accesses to parallelized level3 functions from multiple callers 6 years ago
  Arjan van de Ven 5b708e5eb1 sgemm/dgemm: add a way for an arch kernel to specify prefered sizes 7 years ago
  Martin Kroeker 5f2a3c05cd Revert "Rewrite &= -> = and simplify the initial blocking phase." 7 years ago
  Craig Donner 0144068537 Rewrite &= -> = and simplify the initial blocking phase. 7 years ago
  Arjan van de Ven 73de17664d Add missing barriers in gemm scheduler 7 years ago
  Arjan van de Ven d148ec4ea1 Don't use _Atomic for jobs sometimes... 7 years ago
  Arjan van de Ven 9e162146a9 Only initialize the part of the jobs array that will get used 7 years ago
  Martin Kroeker a91f1587b9 Work around name clash with Windows10's winnt.h 7 years ago