Martin Kroeker
0fd5448b2c
Handle INCX=0
7 months ago
Martin Kroeker
db7e5f1fa7
Update gemmt.c
7 months ago
Martin Kroeker
ff30ac9666
Update Makefile
7 months ago
Martin Kroeker
7c3e169b67
Update gemmt.c
7 months ago
Martin Kroeker
09414a4187
Ensure that GEMMTR name appears in XERBLA if gemmt was called as such
7 months ago
Marek Michalowski
838bb57e27
Merge branch 'develop' into develop
8 months ago
Martin Kroeker
a54f9a9c69
Merge pull request #5071 from annop-w/sgemm_throttling
Add thread throttling profile for SGEMM on NEOVERSEV1
8 months ago
Marek Michalowski
4d5b13f765
Add thread throttling profile for SGEMV on `NEOVERSEV1`
8 months ago
tingbo.liao
3c8df6358f
Further rearranged the rotm kernel for the different architectures.
Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>
8 months ago
Annop Wongwathanarat
c8cd8da496
Add thread throttling profile for SGEMM on NEOVERSEV1
8 months ago
Martin Kroeker
a1075477c3
Merge pull request #4994 from martin-frbg/issue4886
Disable multithreading in ?TRTRI for small workloads
9 months ago
Martin Kroeker
0c440f8a27
disable multithreading for small workloads
10 months ago
Martin Kroeker
2a290dfc2c
forward GEMM3M calls for GENERIC targets to the regular C/ZGEMM for now
10 months ago
Martin Kroeker
0cf656fd3e
Add copies of GEMMT under its new name GEMMTR
11 months ago
Chris Daley
cb48505251
optimize gemv forwarding on ARM64 systems
11 months ago
Chip Kerchner
36bd3eeddf
Vectorize BF16 GEMV (VSX & MMA). Use GEMM_GEMV_FORWARD_BF16 (for Power).
11 months ago
Chip Kerchner
1d51ca5798
Change multi-threading logic for SBGEMV to be the same as SGEMV.
11 months ago
Martin Kroeker
9762464718
Fix CBLAS interface filling in the wrong triangle for Row-Major
11 months ago
gxw
48698b2b1d
LoongArch64: Rename core
Use microarchitecture names instead of meaningless strings to name the cores;
the legacy core names are still retained.
1. Rename LOONGSONGENERIC to LA64_GENERIC
2. Rename LOONGSON3R5 to LA464
3. Rename LOONGSON2K1000 to LA264
1 year ago
Martin Kroeker
7878976236
disable forwarding from SBGEMM to SBGEMV for now
1 year ago
Chris Sidebottom
b26424c6a2
Allow opt into GEMM -> GEMV forwarding
1 year ago
Chris Sidebottom
90eb863d4b
Re-add accidental removal
1 year ago
Chris Sidebottom
28b5334f22
Complete implementation of GEMV forwarding
1 year ago
Martin Kroeker
3db5dbc88e
forward to GEMV when one argument is actually a vector
1 year ago
gxw
f3cebb3ca3
x86: Fixed numpy CI failure when the target is ZEN.
1 year ago
Martin Kroeker
2f12a47405
fix build options for CAXPYC/ZAXPYC
1 year ago
Martin Kroeker
db9f7bc552
fix float array types to include bfloat16
1 year ago
Martin Kroeker
076766df4e
Update CMakeLists.txt
1 year ago
Martin Kroeker
ff6670cb83
don't generate non-cblas files for gemm_batch
1 year ago
Martin Kroeker
362a063396
remove return value
1 year ago
Martin Kroeker
89c7bbcba6
add cblas_?gemm_batch
1 year ago
Martin Kroeker
2957281275
Introduce a lower limit for multithreading
1 year ago
Martin Kroeker
5fd871d7ea
Introduce a lower limit for multithreading
1 year ago
gxw
637c650f4f
loongarch64: Add buffer offset for target LOONGSON3R5
1 year ago
Martin Kroeker
93d975d8fd
Merge pull request #4593 from XiWeiGu/loongarch_add_buffer_offset
loongarch: Optimizing the performance of the GEMM on servers
1 year ago
gxw
d8c4ea8793
loongarch: Optimizing the performance of the GEMM on servers
1 year ago
Martin Kroeker
d277c6d15b
Merge pull request #4585 from martin-frbg/issue1881
Cap the number of parallel threads for GEMM, GETRF and POTRF to ensure sensible workloads on big systems
1 year ago
Igor Zhuravlov
22d305e2df
fix dtrtrs_ and ztrtrs_ to accept case-insensitive parameters uplo and diag
Changes to be committed:
modified: interface/lapack/trtrs.c
modified: interface/lapack/ztrtrs.c
1 year ago
Martin Kroeker
68ab5185d0
Update potrf.c
1 year ago
Martin Kroeker
19b29b3448
Update getrf.c
1 year ago
Martin Kroeker
a3354a7630
Cap the number of parallel threads
1 year ago
Martin Kroeker
5da4c93ef2
Cap the number of parallel threads
1 year ago
Martin Kroeker
496106642f
Cap the number of parallel threads
1 year ago
Martin Kroeker
cb8131cfd9
Merge pull request #4499 from kseniyazaytseva/new-tests
Tests for BLAS-like and BLAS API
1 year ago
Martin Kroeker
baf88564bc
Fix potential buffer overflow
1 year ago
kseniyazaytseva
7e9b1c0807
fix uninitialized data usage
1 year ago
kseniyazaytseva
c6f30fd414
check for zero inc
1 year ago
kseniyazaytseva
5e9ead09ac
fix info return
1 year ago
Martin Kroeker
500ac4de5e
fix incompatible pointer types
1 year ago
Martin Kroeker
d4db6a9f16
Separate the interface for SBGEMMT from GEMMT due to differences in GEMV arguments
1 year ago