35 Commits (02ea3db8e720b0ffb3e212d76bffeb285c325c87)

Author SHA1 Message Date
  Wangyang Guo 3dc6052c7e initial support for Sapphire Rapids platform 4 years ago
  Martin Kroeker a554712439 remove extra/intermediate size step for min_jj introduced in PR747 4 years ago
  Chen, Guobing e740c4873d Enable COOPERLAKE build target 5 years ago
  Rajalakshmi Srinivasaraghavan 7eb55504b1 RFC : Add half precision gemm for bfloat16 in OpenBLAS 5 years ago
  Ali Saidi 97ce6bbce2 Fix barriers in level3_thread 5 years ago
  wjc404 77b8f49556 Update level3_thread.c 5 years ago
  Martin Kroeker f72fdf525c Merge pull request #1875 from martin-frbg/issue1851 6 years ago
  Martin Kroeker 113cb00b95 fix missing parenthesis 6 years ago
  Martin Kroeker 5192651706 Add CriticalSection handling instead of mutexes for Windows 6 years ago
  Martin Kroeker 2e6fae2aad Serialize accesses to parallelized level3 functions from multiple callers 6 years ago
  Arjan van de Ven 5b708e5eb1 sgemm/dgemm: add a way for an arch kernel to specify prefered sizes 7 years ago
  Martin Kroeker 5f2a3c05cd Revert "Rewrite &= -> = and simplify the initial blocking phase." 7 years ago
  Craig Donner 0144068537 Rewrite &= -> = and simplify the initial blocking phase. 7 years ago
  Arjan van de Ven 73de17664d Add missing barriers in gemm scheduler 7 years ago
  Arjan van de Ven d148ec4ea1 Don't use _Atomic for jobs sometimes... 7 years ago
  Arjan van de Ven 9e162146a9 Only initialize the part of the jobs array that will get used 7 years ago
  Zhiyong Dang 3716267124 Change _STDC_VERSION__ to __STDC_VERSION__ 7 years ago
  Martin Kroeker 6a99fcce94 Use _Atomic instead of volatile for thread safety where C11 is supported 7 years ago
  Andrew 11a627c54e remove surplus parentheses to silence clang5 7 years ago
  Tim Moon 30486a356c Reduce number of data partitions in n. 8 years ago
  Tim Moon 9de52b489a Cleaning up and documenting multi-threaded GEMM code. 8 years ago
  Tim Moon 860dcfc703 Use 2D thread distribution for small GEMMs. 8 years ago
  Tim Moon 6aaa107865 Reducing threads for multi-threaded GEMMs on small matrices. 8 years ago
  Werner Saar a2672d5589 prepared driver/level3 functions for UNROLL values, that are not a power of two 8 years ago
  Werner Saar b07d733a71 added updates for syrk and syr2k 9 years ago
  Ralph Campbell fbc21266e6 Minor C code fixes in driver/ 10 years ago
  wernsaar 1d33547222 optimized zgemm kernel for haswell 11 years ago
  Timothy Gu 6c2ead30f0 Remove all trailing whitespace except lapack-netlib 11 years ago
  wernsaar c947ab85dc changed level3.c 12 years ago
  wernsaar 2840d56aeb added dgemm_kernel for Piledriver 12 years ago
  Zhang Xianyi 32d2ca3035 Refs #214, #221, #246. Fixed the getrf overflow bug on Windows. 12 years ago
  wernsaar 6f008abcef replaced defined(DOUBLE) by !defined(XDOUBLE) 12 years ago
  Zhang Xianyi 5d3312142a Refs #221 #246. Fixed the overflowing stack bug in mutlithreading BLAS3. 12 years ago
  wernsaar 25491e42f9 New dgemm kernel for BULLDOZER: dgemm_kernel_8x2_bulldozer.S 12 years ago
  Xianyi Zhang 342bbc3871 Import GotoBLAS2 1.13 BSD version codes. 14 years ago