Martin Kroeker
70865a894e
Merge pull request #5180 from ywwry66/openmp_use_cmake
CMake: Pass `OpenMP` compiler and linker flags through CMake targets
5 months ago
Ruiyang Wu
02fd1df10b
CMake: Pass `OpenMP` compiler and linker flags through CMake targets
Using `OpenMP::OpenMP_LANG` targets for CMake is less error-prone than
passing the compiler and linker flags manually. Furthermore, it allows
the user to customize those flags by setting `OpenMP_LANG_FLAGS`,
`OpenMP_LANG_LIB_NAMES`, and `OpenMP_omp_LIBRARY`.
6 months ago
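The approach described in the commit above can be sketched as follows. This is a minimal illustration, not the project's actual build code: the project name `omp_target_demo`, the target `demo_blas`, and the source file `demo.c` are hypothetical, and C stands in for the `LANG` placeholder.

```cmake
cmake_minimum_required(VERSION 3.16)
project(omp_target_demo C)  # hypothetical project name

# Locate OpenMP for the C compiler; on success this defines the
# imported target OpenMP::OpenMP_C.
find_package(OpenMP REQUIRED COMPONENTS C)

# Hypothetical library target, used only for illustration.
add_library(demo_blas STATIC demo.c)

# Linking against the imported target carries both the compile flags
# and the link libraries, so neither has to be appended by hand.
target_link_libraries(demo_blas PRIVATE OpenMP::OpenMP_C)
```

With this layout, a user who needs a non-default OpenMP runtime could override detection at configure time, for example with `-DOpenMP_C_FLAGS=-fopenmp -DOpenMP_C_LIB_NAMES=omp -DOpenMP_omp_LIBRARY=/path/to/libomp.so`, where the library path is a placeholder.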
Martin Kroeker
51c1fb1f93
Fix ?spmv build and misinterpretation of NO_LAPACK=0
6 months ago
shubham.chaudhari
8e289ecddc
Simplified thread throttling function in gemv
6 months ago
shubham.chaudhari
189dbbc04f
Add thread throttling for dynamic arch neoversev1
7 months ago
shubham.chaudhari
b6cb5ece58
Add thread throttling profile for DGEMV on NEOVERSEV1
7 months ago
Martin Kroeker
7338a473a7
Merge pull request #5150 from Harishmcw/WoA-Experiments
Redefined threading logic for GESV and GEMV on WoA
7 months ago
Martin Kroeker
09ba099461
make throttling code conditional on SMP
7 months ago
Harishmcw
030ae1fd97
Redefined threading logic for WoA
7 months ago
Martin Kroeker
c03a81b927
Merge pull request #5141 from michalowski-arm/fork-throttle
Add throttling profile for SGEMM and SGEMV on `NEOVERSEV2`
7 months ago
Martin Kroeker
75b958a018
Transform the B array back if necessary before returning
7 months ago
Marek Michalowski
650a062e19
Add thread throttling profile for SGEMV on `NEOVERSEV2`
7 months ago
Marek Michalowski
b723c1b7b7
Add thread throttling profile for SGEMM on `NEOVERSEV2`
7 months ago
Vaisakh K V
f66ca05b31
Merge branch 'develop' into topic/sgemm_direct_sme1
7 months ago
Vaisakh K V
d23eb3b93e
Support for SME1 based sgemm_direct kernel for cblas_sgemm level 3 API
* Added ARMV9SME target
* Added SGEMM_DIRECT kernel based on SME1
10 months ago
Harish-Gits
daf16b8229
Adjusted GESV threading logic for optimal performance on WoA
7 months ago
Martin Kroeker
60d0be0e97
Update nrm2.c
7 months ago
Martin Kroeker
0fd5448b2c
Handle INCX=0
7 months ago
Martin Kroeker
db7e5f1fa7
Update gemmt.c
7 months ago
Martin Kroeker
ff30ac9666
Update Makefile
7 months ago
Martin Kroeker
7c3e169b67
Update gemmt.c
7 months ago
Martin Kroeker
09414a4187
Ensure that GEMMTR name appears in XERBLA if gemmt was called as such
7 months ago
Marek Michalowski
838bb57e27
Merge branch 'develop' into develop
8 months ago
Martin Kroeker
a54f9a9c69
Merge pull request #5071 from annop-w/sgemm_throttling
Add thread throttling profile for SGEMM on NEOVERSEV1
8 months ago
Marek Michalowski
4d5b13f765
Add thread throttling profile for SGEMV on `NEOVERSEV1`
8 months ago
tingbo.liao
3c8df6358f
Further rearranged the rotm kernel for the different architectures.
Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>
8 months ago
Annop Wongwathanarat
c8cd8da496
Add thread throttling profile for SGEMM on NEOVERSEV1
8 months ago
Martin Kroeker
a1075477c3
Merge pull request #4994 from martin-frbg/issue4886
Disable multithreading in ?TRTRI for small workloads
9 months ago
Martin Kroeker
0c440f8a27
disable multithreading for small workloads
10 months ago
Martin Kroeker
2a290dfc2c
forward GEMM3M calls for GENERIC targets to the regular C/ZGEMM for now
10 months ago
Martin Kroeker
0cf656fd3e
Add copies of GEMMT under its new name GEMMTR
11 months ago
Chris Daley
cb48505251
optimize gemv forwarding on ARM64 systems
11 months ago
Chip Kerchner
36bd3eeddf
Vectorize BF16 GEMV (VSX & MMA). Use GEMM_GEMV_FORWARD_BF16 (for Power).
11 months ago
Chip Kerchner
1d51ca5798
Change multi-threading logic for SBGEMV to be the same as SGEMV.
11 months ago
Martin Kroeker
9762464718
Fix CBLAS interface filling in the wrong triangle for Row-Major
11 months ago
gxw
48698b2b1d
LoongArch64: Rename core
Use microarchitecture names instead of meaningless strings to name the cores;
the legacy core names are still retained.
1. Rename LOONGSONGENERIC to LA64_GENERIC
2. Rename LOONGSON3R5 to LA464
3. Rename LOONGSON2K1000 to LA264
1 year ago
Martin Kroeker
7878976236
disable forwarding from SBGEMM to SBGEMV for now
1 year ago
Chris Sidebottom
b26424c6a2
Allow opt into GEMM -> GEMV forwarding
1 year ago
Chris Sidebottom
90eb863d4b
Re-add accidentally removed code
1 year ago
Chris Sidebottom
28b5334f22
Complete implementation of GEMV forwarding
1 year ago
Martin Kroeker
3db5dbc88e
forward to GEMV when one argument is actually a vector
1 year ago
gxw
f3cebb3ca3
x86: Fixed numpy CI failure when the target is ZEN.
1 year ago
Martin Kroeker
2f12a47405
fix build options for CAXPYC/ZAXPYC
1 year ago
Martin Kroeker
db9f7bc552
fix float array types to include bfloat16
1 year ago
Martin Kroeker
076766df4e
Update CMakeLists.txt
1 year ago
Martin Kroeker
ff6670cb83
don't generate non-cblas files for gemm_batch
1 year ago
Martin Kroeker
362a063396
remove return value
1 year ago
Martin Kroeker
89c7bbcba6
add cblas_?gemm_batch
1 year ago
Martin Kroeker
2957281275
Introduce a lower limit for multithreading
1 year ago
Martin Kroeker
5fd871d7ea
Introduce a lower limit for multithreading
1 year ago