445 Commits (5e43ba948c3cb35864c7b0953b8dd02374dd3967)

Author SHA1 Message Date
  Martin Kroeker 30d11bc92c Adjust multithreading threshold and add an intermediate step 2 months ago
  Martin Kroeker a9e8fa06bf Introduce a (crude) threshold to multithreading 2 months ago
  Martin Kroeker 965463f177 Include float-bfloat conversion functions in ONLY_CBLAS builds as well 2 months ago
  youcai 41f9701ebc Fix cmake building with cblas_bgemm 2 months ago
  Martin Kroeker 30dbca5051 fix misleading indentation to silence a gcc warning 2 months ago
  Martin Kroeker 39c90f9859 Merge pull request #5380 from quic/topic/sgemm_direct_sme1_alpha_beta 2 months ago
  Rajendra Prasad Matcha eae0abfdb6 SME1 based direct kernel with alpha and beta for cblas_sgemm level 3 API. 2 months ago
  Chris Sidebottom 947d7af4c9 Fix CMake references to bscal and bgemv 2 months ago
  Chris Sidebottom e105411460 Add infrastructure for bgemv/bscal 2 months ago
  Chris Sidebottom 740efd71c4 Add optimized BGEMM kernel for NEOVERSEV1 target 2 months ago
  Chris Sidebottom 66d9185ebe Fix CMake support 2 months ago
  Chris Sidebottom f95e7b0e32 Add infrastructure for BGEMM 3 months ago
  Usui, Tetsuzo 14107e37d9 Add parallel laed3 3 months ago
  Martin Kroeker d96daa220d Merge pull request #5290 from Srangrang/develop 3 months ago
  Srangrang ec14e1648c fix: resolve non-RISCV host build failed issue 3 months ago
  Martin Kroeker 5e393f207c fix source file used for sbgemmt/sbgemmtr 3 months ago
  Martin Kroeker 11ff18bb0f Merge pull request #5081 from XiWeiGu/kernel_generic_fixed_cscal_zscal 3 months ago
  gkdddd 670ec6f757 Added shgemm_kernel_8x8 for RISCV64_ZVL128B and shgemm_kernel_16x8 for RISCV64_ZVL256B 4 months ago
  Martin Kroeker 42b7d1f897 Fix addressing of alpha in CBLAS 4 months ago
  Martin Kroeker 6680e0592f Fix conditional inclusion of SGEMM_KERNEL_DIRECT 4 months ago
  Martin Kroeker 70865a894e Merge pull request #5180 from ywwry66/openmp_use_cmake 5 months ago
  Ruiyang Wu 02fd1df10b CMake: Pass `OpenMP` compiler and linker flags through CMake targets 6 months ago
  Martin Kroeker 51c1fb1f93 Fix ?spmv build and misinterpretation of NO_LAPACK=0 6 months ago
  shubham.chaudhari 8e289ecddc Simplified thread throttling function in gemv 6 months ago
  shubham.chaudhari 189dbbc04f Add thread throttling for dynamic arch neoversev1 7 months ago
  shubham.chaudhari b6cb5ece58 Add thread throttling profile for DGEMV on NEOVERSEV1 7 months ago
  Martin Kroeker 7338a473a7 Merge pull request #5150 from Harishmcw/WoA-Experiments 7 months ago
  Martin Kroeker 09ba099461 make throttling code conditional on SMP 7 months ago
  Harishmcw 030ae1fd97 Redefined threading logic for WoA 7 months ago
  Martin Kroeker c03a81b927 Merge pull request #5141 from michalowski-arm/fork-throttle 7 months ago
  Martin Kroeker 75b958a018 Transform the B array back if necessary before returning 7 months ago
  Marek Michalowski 650a062e19 Add thread throttling profile for SGEMV on `NEOVERSEV2` 7 months ago
  Marek Michalowski b723c1b7b7 Add thread throttling profile for SGEMM on `NEOVERSEV2` 7 months ago
  Vaisakh K V f66ca05b31 Merge branch 'develop' into topic/sgemm_direct_sme1 7 months ago
  Vaisakh K V d23eb3b93e Support for SME1 based sgemm_direct kernel for cblas_sgemm level 3 API 10 months ago
  Harish-Gits daf16b8229 Adjusted GESV threading logic for optimal performance on WoA 7 months ago
  Martin Kroeker 60d0be0e97 Update nrm2.c 7 months ago
  Martin Kroeker 0fd5448b2c Handle INCX=0 7 months ago
  Martin Kroeker db7e5f1fa7 Update gemmt.c 7 months ago
  Martin Kroeker ff30ac9666 Update Makefile 7 months ago
  Martin Kroeker 7c3e169b67 Update gemmt.c 7 months ago
  Martin Kroeker 09414a4187 Ensure that GEMMTR name appears in XERBLA if gemmt was called as such 7 months ago
  Marek Michalowski 838bb57e27 Merge branch 'develop' into develop 8 months ago
  Martin Kroeker a54f9a9c69 Merge pull request #5071 from annop-w/sgemm_throttling 8 months ago
  Marek Michalowski 4d5b13f765 Add thread throttling profile for SGEMV on `NEOVERSEV1` 8 months ago
  tingbo.liao 3c8df6358f Further rearranged the rotm kernel for the different architectures. 8 months ago
  gxw e114880dc4 kernel/generic: Fixed cscal and zscal 8 months ago
  Annop Wongwathanarat c8cd8da496 Add thread throttling profile for SGEMM on NEOVERSEV1 8 months ago
  Martin Kroeker a1075477c3 Merge pull request #4994 from martin-frbg/issue4886 9 months ago
  Martin Kroeker 0c440f8a27 disable multithreading for small workloads 10 months ago