Chip Kerchner
|
f8e113f27b
|
Replace types with include file.
|
11 months ago |
Chip Kerchner
|
a53a197934
|
Merge remote-tracking branch 'origin/develop' into vectorizeBF16GEMV
|
11 months ago |
Martin Kroeker
|
3184b7f209
|
Merge pull request #4933 from ChipKerchner/thread_sbgemv
Change multi-threading logic for SBGEMV to be the same as SGEMV.
|
11 months ago |
Chip Kerchner
|
0082240044
|
Merge branch 'thread_sbgemv' into vectorizeBF16GEMV
|
11 months ago |
Chip Kerchner
|
1d51ca5798
|
Change multi-threading logic for SBGEMV to be the same as SGEMV.
|
11 months ago |
Chip Kerchner
|
c8f53b85ce
|
Merge remote-tracking branch 'origin/develop' into vectorizeBF16GEMV
|
11 months ago |
Martin Kroeker
|
18a23c23f7
|
Merge pull request #4929 from martin-frbg/issue4905
Fix CBLAS_?GEMMT filling in the wrong triangle for Row-Major
|
11 months ago |
Martin Kroeker
|
5a79446bdb
|
Merge pull request #4918 from HaoZeke/testFixes
TST,BUG: Explicitly allow running tests multiple times
|
11 months ago |
Martin Kroeker
|
7ba6591ff2
|
Merge branch 'OpenMathLib:develop' into issue4905
|
11 months ago |
Martin Kroeker
|
550bc77832
|
Fix expectation values for CblasRowMajor order
|
11 months ago |
Martin Kroeker
|
e0ad20f72b
|
Merge pull request #4932 from martin-frbg/cirrusosxndk
Update Android NDK install path for M1/armv7 crossbuild on CirrusCI
|
11 months ago |
Martin Kroeker
|
e4bc5e4718
|
remove stray quote
|
11 months ago |
Martin Kroeker
|
b89fb9632f
|
Update Android NDK install path for M1/armv7 crossbuild
|
11 months ago |
Martin Kroeker
|
e52d9b4cf1
|
Merge pull request #4928 from austinpagan/czgemm_in_c
CGEMM & ZGEMM using C code, Power only, P10 only.
|
11 months ago |
Martin Kroeker
|
dbd83762f9
|
Merge pull request #4926 from NickelWenzel/fix_arm64_windows_and_uwp
fix: add missing NO_AFFINITY checks
|
11 months ago |
Martin Kroeker
|
9762464718
|
Fix CBLAS interface filling in the wrong triangle for Row-Major
|
11 months ago |
Gordon Fossum
|
0b7fb5c791
|
CGEMM & ZGEMM using C code.
|
11 months ago |
NickelWenzel
|
bee123e8e3
|
fix: add missing NO_AFFINITY checks
|
11 months ago |
Martin Kroeker
|
7ac5b9011f
|
Merge pull request #4923 from martin-frbg/zen5
Add preliminary cpu autodetection for Zen5/5c
|
11 months ago |
gxw
|
3ab8b1408e
|
LoongArch64: Update README.md
|
11 months ago |
Martin Kroeker
|
2c3b87a082
|
Add preliminary cpu autodetection for Zen5/5c
|
11 months ago |
Martin Kroeker
|
73c1882129
|
Merge pull request #4922 from martin-frbg/issue4904-2
Update names of Loongarch64 targets in cmake cross-building
|
1 year ago |
Martin Kroeker
|
bc0691a556
|
Merge pull request #4920 from martin-frbg/issue4917
Fix potential inaccuracy in multithreaded level3 related to SWITCH_RATIO
|
1 year ago |
Martin Kroeker
|
b0346e72f4
|
update names of loongarch64 targets for cross-compilation
|
1 year ago |
Martin Kroeker
|
9c707dc6b9
|
Update dynamic arch list to new target scheme
|
1 year ago |
Martin Kroeker
|
9783dd07ab
|
Rename KERNEL.LOONGSONGENERIC to KERNEL.LA64_GENERIC
|
1 year ago |
Martin Kroeker
|
0dfe42d62a
|
Merge pull request #4919 from martin-frbg/issue4916-2
Handle inf/nan in ppc440 s/dscal
|
1 year ago |
Chip Kerchner
|
d6bb8dcfd1
|
Common code.
|
1 year ago |
Martin Kroeker
|
8a1710dd0d
|
don't apply switch_ratio to tail of loop
|
1 year ago |
Martin Kroeker
|
c9e92348a6
|
Handle inf/nan if dummy2 flag is set
|
1 year ago |
Rohit Goswami
|
d9f368dfe6
|
TST: Signal abort for ctest failures correctly
|
1 year ago |
Rohit Goswami
|
722e4ae07a
|
MAINT: Explicitly replace instead of unknown
|
1 year ago |
Rohit Goswami
|
a6b7751881
|
BUG: Allow tests to be run multiple times
Without failures due to existing files
|
1 year ago |
Chip Kerchner
|
9ac0fb0111
|
Merge branch 'develop' into vectorizeBF16GEMV
|
1 year ago |
Martin Kroeker
|
624e9d110e
|
Merge pull request #4916 from martin-frbg/issue4901
Fix SIGILL/SIGSEGV in PPCG4 SGEMM and fix NAN handling in PPCG4 SSCAL/DSCAL
|
1 year ago |
Martin Kroeker
|
d714013ab9
|
change sgemm kernel to 4x4 as the 16x4 altivec goes out of bounds
|
1 year ago |
Martin Kroeker
|
7c4f3638fd
|
switch PPCG4 SGEMM kernel to 4x4
|
1 year ago |
Chip Kerchner
|
915a6d6e44
|
Add casting.
|
1 year ago |
Chip Kerchner
|
7ec3c16d82
|
Remove beta from optimized functions.
|
1 year ago |
Martin Kroeker
|
54afc24e4d
|
Merge pull request #4906 from XiWeiGu/arm64_cmake_small_matrix_opt
ARM64: Enable SMALL_MATRIX_OPT when compiling with CMake
|
1 year ago |
Martin Kroeker
|
b4495a8fb8
|
Merge branch 'develop' into arm64_cmake_small_matrix_opt
|
1 year ago |
Martin Kroeker
|
68eefe60b9
|
Merge pull request #4915 from martin-frbg/issue4907
Support LoongArch64 compilation with LLVM
|
1 year ago |
Martin Kroeker
|
4f00f02567
|
Do not add -mabi flags for Loongson when the compiler is flang
|
1 year ago |
Martin Kroeker
|
f817f26062
|
Add simpler EPILOGUE for clang
|
1 year ago |
Martin Kroeker
|
a492181665
|
filter out Loongarch -mabi options for flang-new
|
1 year ago |
Martin Kroeker
|
de421b7764
|
Merge pull request #4904 from XiWeiGu/la64_cross_cmake
LoongArch64: Enable cmake cross-compilation
|
1 year ago |
Martin Kroeker
|
edaf5933c4
|
Merge pull request #4913 from martin-frbg/issue4912
Declare the input array in CBLAS_?GEADD as const in cblas.h
|
1 year ago |
Martin Kroeker
|
71131406ae
|
Declare the input array in CBLAS_?GEADD as const
|
1 year ago |
Chip Kerchner
|
7cc00f68c9
|
Remove more duplicate.
|
1 year ago |
Chip Kerchner
|
e238a68c03
|
Remove duplicate.
|
1 year ago |