Martin Kroeker
30d11bc92c
Adjust multithreading threshold and add an intermediate step
2 months ago
Martin Kroeker
a9e8fa06bf
Introduce a (crude) threshold to multithreading
2 months ago
Martin Kroeker
965463f177
Include float-bfloat conversion functions in ONLY_CBLAS builds as well
2 months ago
youcai
41f9701ebc
Fix cmake building with cblas_bgemm
2 months ago
Martin Kroeker
30dbca5051
fix misleading indentation to silence a gcc warning
2 months ago
Martin Kroeker
39c90f9859
Merge pull request #5380 from quic/topic/sgemm_direct_sme1_alpha_beta
SME1 based direct kernel (with alpha and beta) for cblas_sgemm level 3
2 months ago
Rajendra Prasad Matcha
eae0abfdb6
SME1 based direct kernel with alpha and beta for cblas_sgemm level 3 API.
2 months ago
Chris Sidebottom
947d7af4c9
Fix CMake references to bscal and bgemv
2 months ago
Chris Sidebottom
e105411460
Add infrastructure for bgemv/bscal
- Sets up all the various entrypoints for `bgemv`
- Adds `bscal` for use in the `bgemv` interface
- Adds test cases for comparing `sgemv` and `bgemv`
- Adds generic kernels for `bgemv_n` and `bgemv_t` which are accurate
enough to pass the above tests
2 months ago
Chris Sidebottom
740efd71c4
Add optimized BGEMM kernel for NEOVERSEV1 target
This also improves the testing and generic kernel by re-using the BF16
conversion functions.
Built on top of https://github.com/OpenMathLib/OpenBLAS/pull/5357 and derived from https://github.com/OpenMathLib/OpenBLAS/pull/5287
Co-authored-by: Ye Tao <ye.tao@arm.com>
2 months ago
Chris Sidebottom
66d9185ebe
Fix CMake support
2 months ago
Chris Sidebottom
f95e7b0e32
Add infrastructure for BGEMM
Sets up all the infrastructure for BGEMM support in OpenBLAS; hopefully I found all the right places.
Derived mostly from the previous work done in https://github.com/OpenMathLib/OpenBLAS/pull/5287
Co-authored-by: Ye Tao <ye.tao@arm.com>
3 months ago
Usui, Tetsuzo
14107e37d9
Add parallel laed3
3 months ago
Martin Kroeker
d96daa220d
Merge pull request #5290 from Srangrang/develop
Add support for FP16 to openBLAS and shgemm on RISCV
3 months ago
Srangrang
ec14e1648c
fix: resolve build failure on non-RISCV hosts
- adjust interface to disable "small matrix" pathway
- separate HFLOAT16 from BFLOAT16
- remove SHGEMM_UNROLL_M and SHGEMM_UNROLL_N equal conditions
Related to PR#5290
Co-authored-by Martin
3 months ago
Martin Kroeker
5e393f207c
fix source file used for sbgemmt/sbgemmtr
3 months ago
Martin Kroeker
11ff18bb0f
Merge pull request #5081 from XiWeiGu/kernel_generic_fixed_cscal_zscal
kernel/generic: Fixed cscal and zscal
3 months ago
gkdddd
670ec6f757
Added shgemm_kernel_8x8 for RISCV64_ZVL128B and shgemm_kernel_16x8 for RISCV64_ZVL256B
Added HFLOAT16 support for RISCV64
Added shgemm_kernel_8x8 for RISCV64_ZVL128B and shgemm_kernel_16x8 for RISCV64_ZVL256B based on HFLOAT16
The instruction sets used are ZVFH and ZFH, which need to be supported by RVV1.0
Related to issue #5279
Co-authored-by Linjin Li <linjin_li@163.com>
4 months ago
Martin Kroeker
42b7d1f897
Fix addressing of alpha in CBLAS
4 months ago
Martin Kroeker
6680e0592f
Fix conditional inclusion of SGEMM_KERNEL_DIRECT
4 months ago
Martin Kroeker
70865a894e
Merge pull request #5180 from ywwry66/openmp_use_cmake
CMake: Pass `OpenMP` compiler and linker flags through CMake targets
5 months ago
Ruiyang Wu
02fd1df10b
CMake: Pass `OpenMP` compiler and linker flags through CMake targets
Using `OpenMP::OpenMP_LANG` targets for CMake is less error-prone than
passing the compiler and linker flags manually. Furthermore, it allows
the user to customize those flags by setting `OpenMP_LANG_FLAGS`,
`OpenMP_LANG_LIB_NAMES`, and `OpenMP_omp_LIBRARY`.
6 months ago
Martin Kroeker
51c1fb1f93
Fix ?spmv build and misinterpretation of NO_LAPACK=0
6 months ago
shubham.chaudhari
8e289ecddc
Simplified thread throttling function in gemv
6 months ago
shubham.chaudhari
189dbbc04f
Add thread throttling for dynamic arch neoversev1
7 months ago
shubham.chaudhari
b6cb5ece58
Add thread throttling profile for DGEMV on NEOVERSEV1
7 months ago
Martin Kroeker
7338a473a7
Merge pull request #5150 from Harishmcw/WoA-Experiments
Redefined threading logic for GESV and GEMV on WoA
7 months ago
Martin Kroeker
09ba099461
make throttling code conditional on SMP
7 months ago
Harishmcw
030ae1fd97
Redefined threading logic for WoA
7 months ago
Martin Kroeker
c03a81b927
Merge pull request #5141 from michalowski-arm/fork-throttle
Add throttling profile for SGEMM and SGEMV on `NEOVERSEV2`
7 months ago
Martin Kroeker
75b958a018
Transform the B array back if necessary before returning
7 months ago
Marek Michalowski
650a062e19
Add thread throttling profile for SGEMV on `NEOVERSEV2`
7 months ago
Marek Michalowski
b723c1b7b7
Add thread throttling profile for SGEMM on `NEOVERSEV2`
7 months ago
Vaisakh K V
f66ca05b31
Merge branch 'develop' into topic/sgemm_direct_sme1
7 months ago
Vaisakh K V
d23eb3b93e
Support for SME1 based sgemm_direct kernel for cblas_sgemm level 3 API
* Added ARMV9SME target
* Added SGEMM_DIRECT kernel based on SME1
10 months ago
Harish-Gits
daf16b8229
Adjusted GESV threading logic for optimal performance on WoA
7 months ago
Martin Kroeker
60d0be0e97
Update nrm2.c
7 months ago
Martin Kroeker
0fd5448b2c
Handle INCX=0
7 months ago
Martin Kroeker
db7e5f1fa7
Update gemmt.c
7 months ago
Martin Kroeker
ff30ac9666
Update Makefile
7 months ago
Martin Kroeker
7c3e169b67
Update gemmt.c
7 months ago
Martin Kroeker
09414a4187
Ensure that GEMMTR name appears in XERBLA if gemmt was called as such
7 months ago
Marek Michalowski
838bb57e27
Merge branch 'develop' into develop
8 months ago
Martin Kroeker
a54f9a9c69
Merge pull request #5071 from annop-w/sgemm_throttling
Add thread throttling profile for SGEMM on NEOVERSEV1
8 months ago
Marek Michalowski
4d5b13f765
Add thread throttling profile for SGEMV on `NEOVERSEV1`
8 months ago
tingbo.liao
3c8df6358f
Further rearranged the rotm kernel for the different architectures.
Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>
8 months ago
gxw
e114880dc4
kernel/generic: Fixed cscal and zscal
8 months ago
Annop Wongwathanarat
c8cd8da496
Add thread throttling profile for SGEMM on NEOVERSEV1
8 months ago
Martin Kroeker
a1075477c3
Merge pull request #4994 from martin-frbg/issue4886
Disable multithreading in ?TRTRI for small workloads
9 months ago
Martin Kroeker
0c440f8a27
disable multithreading for small workloads
10 months ago