752 Commits (20bdb658828e62a01dcc0b97edf14cb56f3ea6a8)

Author SHA1 Message Date
  Martin Kroeker 753c7ebe17 Merge pull request #4835 from martin-frbg/revertwin4359 1 year ago
  Martin Kroeker 50397e017a Merge pull request #4838 from martin-frbg/fix4662-3 1 year ago
  Martin Kroeker 5257f807a9 fix invalid ifdef syntax in HUGETLB handling 1 year ago
  Martin Kroeker 2aed90171a Add riscv sources for DYNAMIC_ARCH 1 year ago
  Martin Kroeker 6468dc1142 restore the coarse locking of the pre-4359 version 1 year ago
  yamazaki-mitsufumi 821ef34635 Add A64FX to the list of CPUs supported by DYNAMIC_ARCH 1 year ago
  Martin Kroeker a815594fd1 Merge pull request #4801 from markdryan/markdryan/riscv-dynamic-arch 1 year ago
  Martin Kroeker a373d0f107 Improve the error message for thread creation failure 1 year ago
  Mark Ryan 3b715e6162 Add autodetection for riscv64 1 year ago
  Martin Kroeker d0b9948b23 Guard against invalid thread_status.queue 1 year ago
  Martin Kroeker 7e9a4ba427 Merge pull request #4741 from shivammonaka/Pthread_Scalability_Improvement 1 year ago
  Martin Kroeker 9b2a0c79cb Add Zhaoxin KX7000 1 year ago
  shivammonaka 9e22d70957 Dynamic locking in Pthread Backend to allow multiple BLAS calls to be executed parallelly 1 year ago
  Martin Kroeker db070a9223 add gemm_batch drivers 1 year ago
  Martin Kroeker d0794f88dc add gemm_batch driver 1 year ago
  Martin Kroeker 0073affe63 Merge pull request #4693 from goplanid/locks-improvement 1 year ago
  Martin Kroeker 6ca9ffa7f5 Merge pull request #4655 from yamazakimitsufumi/update_2d_thread_distribution 1 year ago
  Deeksha Goplani 0dc80a5c8d locks improvement 1 year ago
  Martin Kroeker 8da6f7e5f2 Merge pull request #4686 from XiWeiGu/loongarch64_dgemm_kernel_16x6 1 year ago
  gxw 637c650f4f loongarch64: Add buffer offset for target LOONGSON3R5 1 year ago
  Martin Kroeker 5500b4ab26 Merge pull request #4680 from theAeon/develop 1 year ago
  Martin Kroeker f0f1ff7820 fix HUGETLB allocation for TLS mode as well 1 year ago
  Andrew Robbins edfe1aa471 Expose whether locking is enabled in get_config 1 year ago
  Martin Kroeker dc99b61380 sort unwanted interdependencies of alloc_shm and alloc_hugetlb 1 year ago
  Martin Kroeker ddcd7d6fa8 Merge branch 'develop' into Threading_Callback 1 year ago
  yamazaki-mitsufumi 51ab1903e7 Expanding the scop of 2D thread distribution 1 year ago
  gxw d8c4ea8793 loongarch: Optimizing the performance of the GEMM on servers 1 year ago
  shivammonaka 7102367fde Introduced callback to Pthread, Win32 and OpenMP backend 1 year ago
  Mark Seminatore b0ad8a78ff code to fix lost work in case of re-entrant calls to exec_blas_async() 1 year ago
  Martin Kroeker 88b5330ae7 Restore outer loop of blas_buffer_inuse setup 1 year ago
  shivammonaka d49ebc54e1 Merge branch 'shivam-develop' into shivam-Locks 1 year ago
  shivammonaka bc191015e3 Using OpenMP locks with NUM_PARALLEL 1 year ago
  Mark Seminatore b29fd48998 Merge branch 'develop' into win_tidy 1 year ago
  Mark Seminatore 98c56a7314 more cleanup 1 year ago
  Chip Kerchner d408ecedba Add environment variable to display coretype for dynamic arch. 1 year ago
  Chip Kerchner ac6b4b7aa4 Make sure CPU ID works for all POWER_10 conditions 1 year ago
  Chip Kerchner 08ce6b1c1c Add missing CPU ID definitions for old versions of AIX. 1 year ago
  Martin Kroeker a4fde2c5ac Merge pull request #4451 from martin-frbg/overflow_reset 1 year ago
  Martin Kroeker e61d96303d Fix missing NO_AVX2 fallback for SapphireRapids 1 year ago
  Mark Seminatore 42cb567f0f more cleanup 1 year ago
  Mark Seminatore 0d7fe5ea61 clean up whitespace 1 year ago
  Martin Kroeker d938aed7fe reset "mem structure overflowed" state on shutdown 1 year ago
  Chris Sidebottom aaf65210cc Add dynamic support for Arm(R) Neoverse(TM) V2 processor 1 year ago
  Martin Kroeker 152a6c43b6 Add blas_omp_threads_local 1 year ago
  Martin Kroeker 8a9d492af7 Add default for blas_omp_threads_local 1 year ago
  Martin Kroeker 87d31af2ae Add openblas_set_num_threads_local() 1 year ago
  Martin Kroeker e7a895e714 Add Apple M as NeoverseN1 1 year ago
  Chris Sidebottom dc20a78188 Use functionally equivalent dynamic targets 1 year ago
  Mark Seminatore 6bd7c54af5 introduce MT_TRACE to clean up SMP_DEBUG code 1 year ago
  Mark Seminatore edac80d7e8 some cleanup, dynamically scale threads, add missing WIN_CASE defn 1 year ago