gxw
48698b2b1d
LoongArch64: Rename core
Use microarchitecture names instead of meaningless strings to name the cores;
the legacy core names are still retained.
1. Rename LOONGSONGENERIC to LA64_GENERIC
2. Rename LOONGSON3R5 to LA464
3. Rename LOONGSON2K1000 to LA264
1 year ago
Martin Kroeker
fca86e359c
Merge pull request #4887 from goplanid/develop
Small GEMM improvements for AArch64 with SVE
1 year ago
Martin Kroeker
60c1519e01
Merge pull request #4896 from martin-frbg/update_azure_mac_hpc
AzureCI: Update Intel oneAPI download for Mac to final version
1 year ago
Martin Kroeker
c8313d9d80
Merge pull request #4895 from martin-frbg/update_homebrewjob
CI: Update nightly-homebrew workflow
1 year ago
Martin Kroeker
b588e922a1
Update oneAPI download location for Mac to final
1 year ago
Martin Kroeker
4178905fa7
Update version of upload-artifacts following deprecation
1 year ago
Martin Kroeker
5f70e245a2
Merge pull request #4894 from martin-frbg/issue4893
Fix function definition in the f2c-converted ctest and remove suppression of gcc14 error
1 year ago
Martin Kroeker
383e0b133e
remove suppression of gcc14's incompatible pointer error
1 year ago
Martin Kroeker
869a169c57
Fix ZAXPYTEST prototype
1 year ago
Deeksha Goplani
4894c54055
Improve TN case with further unrolling
1 year ago
Martin Kroeker
485027563e
Merge pull request #4883 from ChipKerchner/fixSGEMMUnitTestZeroSize
Fix SBGEMM unit test to handle zero elements.
1 year ago
Chip Kerchner
89702e1f4a
Fix zero element GEMV test.
1 year ago
Chip Kerchner
77f85c7c00
GEMV tests don't like zero elements.
1 year ago
Chip Kerchner
868aa857bc
Change malloc zero to return one byte and update the SBGEMM test to again use sizes of zero.
1 year ago
Chip Kerchner
b1802f4dc8
Fix unit test to start at 1 instead of 0, since a malloc of zero bytes fails on some systems.
1 year ago
Martin Kroeker
f61930eb11
Merge pull request #4882 from martin-frbg/issue4805-3
Restore the workaround in the POTRS utest as it is reportedly still needed on 3C6000/gcc14.2
1 year ago
Martin Kroeker
dfba3f8841
restore the pragma as it is reportedly still needed on 3C6000/gcc14.2
1 year ago
Martin Kroeker
7129a64d87
Merge pull request #4881 from martin-frbg/issue4805-2
Use fld.d/fst.d in PROLOGUE/EPILOGUE in LOONGSON3R5 GEMM
1 year ago
Martin Kroeker
49080b631e
remove optimizer pragma again
1 year ago
Martin Kroeker
e05d98d00a
expressly use fld.d/fst.d for floating point registers instead of LD/ST macros
1 year ago
Martin Kroeker
3ee9e9d8d0
Merge pull request #4879 from martin-frbg/issue4868-2
Ensure a memory buffer has been allocated for each thread before invoking it (take 2)
1 year ago
Martin Kroeker
dd71df8fab
Merge pull request #4880 from ChipKerchner/betterPowerGEMVTail
[POWER] Vectorize SGEMV transpose reduce stage
1 year ago
Martin Kroeker
a8d6b0219a
Merge pull request #4877 from XiWeiGu/fixed_undefined_blas_set_parameter
Fixed the undefined reference to blas_set_parameter
1 year ago
Martin Kroeker
d24b3cf393
properly fix buffer allocation and assignment
1 year ago
Chip Kerchner
a0aeba631d
Merge branch 'develop' into betterPowerGEMVTail
1 year ago
Martin Kroeker
eba8615c11
Merge pull request #4876 from martin-frbg/granite
Add autodetection support for Intel Granite Rapids as Sapphire Rapids
1 year ago
Martin Kroeker
bc80e7f02d
Merge pull request #4878 from martin-frbg/cirrus-androidndk
Cirrus CI: fix installation of NDK in armv7 crossbuild
1 year ago
Martin Kroeker
94c9e0b7ad
Update ndk version number
1 year ago
Martin Kroeker
ed0321563a
fix installation of NDK in armv7 crossbuild
1 year ago
gxw
fd033467ac
Fixed the undefined reference to blas_set_parameter
Fixed the undefined reference to blas_set_parameter when
enabling USE_OPENMP and DYNAMIC_ARCH.
1 year ago
Martin Kroeker
1b8e40874e
Add autodetection support for Intel Granite Rapids as Sapphire Rapids
1 year ago
Martin Kroeker
4944148e66
Merge pull request #4875 from ChipKerchner/addGEMVtoBF16Test
Add GEMV to SBGEMx vs SGEMx testing
1 year ago
Martin Kroeker
a388c4b834
Merge pull request #4872 from chenx97/ls3a-fix-stack-fpr-len
Use ldc1 and sdc1 for the prologue and epilogue on LOONGSON3A
1 year ago
Martin Kroeker
f24b521709
Merge pull request #4787 from vlad0x00/patch-1
Update cross compile info
1 year ago
Vladimir Nikolić
2d84ed7e76
Update README.md
1 year ago
Chip Kerchner
083faf7556
Merge branch 'develop' into betterPowerGEMVTail
1 year ago
Chip Kerchner
c23897f585
Add GEMV testing to SBGEMx vs SGEMx testing.
1 year ago
Martin Kroeker
0d8ee96f1e
Merge pull request #4874 from martin-frbg/issue4869
Fix handling of deprecated ?GELQS/?GEQRS in building the shared library
1 year ago
Martin Kroeker
b80671d896
Merge pull request #4871 from martin-frbg/issue4868
Ensure a buffer has been allocated for each thread before invoking it
1 year ago
Martin Kroeker
6452f7b46d
Merge pull request #4873 from ChipKerchner/fixSBGEMMDefaults
[POWER] Problem with multi-threaded SBGEMM
1 year ago
Chip Kerchner
75472b830a
Merge branch 'develop' into betterPowerGEMVTail
1 year ago
Martin Kroeker
ca7777de18
Merge pull request #4870 from chenx97/fix-recursive-make-var
Fix recursive variable expansion in Makefiles for LOONGSON3A
1 year ago
Martin Kroeker
f6469e21bc
move gelqs and geqrs to lapack-deprecated
1 year ago
Chip Kerchner
31226740d6
Cleanup of SBGEMM unit test.
1 year ago
Henry Chen
ef94b96530
Use ldc1 and sdc1 for the prologue and epilogue on LOONGSON3A
This fix is similar to 2d8064174c.
1 year ago
Martin Kroeker
23b5d66a86
Ensure a memory buffer has been allocated for each thread before invoking it
1 year ago
Henry Chen
20bdb65882
Fix recursive variable expansion in Makefiles for LOONGSON3A
1 year ago
Chip Kerchner
b1737698db
Fix DEFAULTS in SBGEMM for POWER10. Also, comparisons in the SBGEMM unit test cannot be exact due to epsilon differences.
1 year ago
Martin Kroeker
e5525036e7
Merge pull request #4865 from martin-frbg/issue4856
Tweak LAPACK STFSM test threshold a little more to cover POWER10 fma
1 year ago
Martin Kroeker
fd52d09490
Merge pull request #4864 from martin-frbg/issue4862
Spell out function prototypes in the SYRK calls of potrf_parallel
1 year ago