Martin Kroeker
7e939fb831
Fix handling of additional buffer structures in case of overflow
2 years ago
Martin Kroeker
bb2f1ec3b0
Merge pull request #4222 from dev-zero/bugfix/correct-thread-warning
memory: show correct number of max threads
2 years ago
Martin Kroeker
466e6115d3
Merge pull request #4230 from martin-frbg/lapack907
Increase work array size in S/DTGEX2 to avoid overflow (Reference-LAPACK PR 907)
2 years ago
Martin Kroeker
1285b53e39
Make IWORK array larger to avoid overflow
2 years ago
Martin Kroeker
7779bb6fb1
Make IWORK array larger to avoid overflow
2 years ago
Martin Kroeker
0606102460
Merge pull request #4229 from martin-frbg/issue4228
Add la_constants.o to SCLAUX/DZLAUX in LAPACK Makefile
2 years ago
Martin Kroeker
fb97cc4d5e
Add la_constants.o to SCLAUX/DZLAUX
2 years ago
Tiziano Müller
6a611db560
memory: show correct number of max threads
2 years ago
Martin Kroeker
6bc079687f
Merge pull request #4218 from XiWeiGu/loongarch64_sgemv
LoongArch64: Add sgemv kernel
2 years ago
Martin Kroeker
cd36b8fff7
Merge pull request #4214 from martin-frbg/issue4212
Disable SVE targets in DYNAMIC_ARCH when compiler is gcc on macOS
2 years ago
Martin Kroeker
09911f077e
Disable SVE targets for DYNAMIC_ARCH when compiling with (homebrew) gcc on macOS/arm64
2 years ago
Martin Kroeker
c3f2a3c0ca
Update version to 0.3.24.dev
2 years ago
Martin Kroeker
4867cf5dd7
Update version to 0.3.24.dev
2 years ago
gxw
f2cf929374
LoongArch64: Add sgemv kernel
2 years ago
Martin Kroeker
f29a0d1a7d
Merge pull request #4211 from xianyi/release-0.3.0
merge release-0.3.24 back into develop to copy tag
2 years ago
Martin Kroeker
9f815cf1bf
Update version to 0.3.24
2 years ago
Martin Kroeker
3c49711f1e
Update version to 0.3.24
2 years ago
Martin Kroeker
2c68822cde
Merge pull request #4210 from xianyi/develop
merge develop into 0.3.0 for 0.3.24
2 years ago
Martin Kroeker
3c51bd0fbf
Merge pull request #4209 from martin-frbg/changelog0324
Update Changelog for 0.3.24
2 years ago
Martin Kroeker
5d73041068
Update Changelog for 0.3.24
2 years ago
Martin Kroeker
8e6d93359d
Merge pull request #4196 from TiborGY/obsolete_inlines
Modernize obsolete inline order
2 years ago
Martin Kroeker
33797c44fc
Merge pull request #4143 from martin-frbg/issue4130
Update to use safe scaling algorithm from Reference-LAPACK PR 527
2 years ago
Martin Kroeker
ee310e3533
Merge pull request #4208 from XiWeiGu/loongarch64_toolchain
LoongArch64: Compatible with early internal toolchain
2 years ago
Martin Kroeker
42909ce57d
Merge branch 'xianyi:develop' into issue4130
2 years ago
Martin Kroeker
a2a184572c
update zrotg
2 years ago
gxw
394a1fd1bf
LoongArch64: Compatible with early internal toolchain
__loongarch_grlen and __loongarch_frlen were introduced in gcc version 8.3.0
(Loongnix 8.3.0-6.lnd.vec.31) internally within Loongson to standardize the
general and floating-point register widths. However, previous versions did
not have them, requiring additional checks to be added.
2 years ago
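The commit message above describes falling back when `__loongarch_grlen` and `__loongarch_frlen` are not predefined. A minimal sketch of such a check follows, not the code from the commit itself. The names `OB_GRLEN`, `OB_FRLEN`, `ob_grlen` and `ob_frlen` are hypothetical, and the fallback assumes that an older toolchain defining `__loongarch64` implies 64-bit general and floating-point registers.

```c
/* Sketch only: OB_GRLEN / OB_FRLEN are hypothetical names, not OpenBLAS macros. */

/* General-purpose register width in bits. */
#if defined(__loongarch_grlen)
#define OB_GRLEN __loongarch_grlen   /* newer toolchains report the width directly */
#elif defined(__loongarch64)
#define OB_GRLEN 64                  /* assumption: legacy macro implies 64-bit GPRs */
#else
#define OB_GRLEN 0                   /* not a LoongArch target, or width unknown */
#endif

/* Floating-point register width in bits. */
#if defined(__loongarch_frlen)
#define OB_FRLEN __loongarch_frlen
#elif defined(__loongarch64)
#define OB_FRLEN 64                  /* assumption: early toolchains target 64-bit FPRs */
#else
#define OB_FRLEN 0
#endif

/* Small accessors so the result can be inspected at run time. */
int ob_grlen(void) { return OB_GRLEN; }
int ob_frlen(void) { return OB_FRLEN; }
```

On a non-LoongArch host both accessors return 0, so the fragment compiles everywhere and only changes behaviour where one of the macros is present.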
Martin Kroeker
12d8f219d6
Merge pull request #4207 from martin-frbg/issue4174-2
Clarify the comment on the out-of-bounds check in ?GETF2
2 years ago
Martin Kroeker
9c4ae4d4fb
Merge pull request #4206 from martin-frbg/issue4201-2
Work around miscompilation of zdot_thunderx2t99 by the current NVIDIA HPC compiler
2 years ago
Martin Kroeker
3bb70b8ca4
Merge pull request #4205 from martin-frbg/fixintmain
Fix missing type declaration for main() in converted LAPACK files
2 years ago
Martin Kroeker
3b6050ac04
clarify the comment on the out-of-bounds check from #723
2 years ago
Martin Kroeker
22a402bc2c
clarify the comment on the out-of-bounds check from #723
2 years ago
Martin Kroeker
88435104c8
Merge pull request #4204 from martin-frbg/llvm17-2
Work around LLVM17 miscompiling the AVX512 microkernels for CASUM/ZASUM
2 years ago
Martin Kroeker
fc8894dd98
Work around miscompilation by NVIDIA nvc
2 years ago
Martin Kroeker
be57c595aa
Merge pull request #4203 from martin-frbg/issue4201
Add support for building arm64 SVE kernels with the NVIDIA HPC compiler
2 years ago
Martin Kroeker
7a6203ffa1
restore default Neoverse SVE build instructions for non-NVIDIA compilers
2 years ago
Martin Kroeker
7f7d3896dd
Fix missing type declaration for main
2 years ago
Martin Kroeker
2c3034ff7f
Disable the C/ZASUM AVX512 microkernels when compiling with LLVM17 as well
2 years ago
Martin Kroeker
49689fbef7
Add support for compiling SVE kernels with the NVIDIA HPC compiler
2 years ago
Martin Kroeker
8794544b43
Add support for compiling the Neoverse SVE kernels with the NVIDIA HPC compiler
2 years ago
Martin Kroeker
e9f1b2d26f
Expand the SVE compatibility check for the NVIDIA HPC compiler
2 years ago
Martin Kroeker
d69f57c8c2
Merge pull request #4200 from XiWeiGu/loongarch64_sgemm
LoongArch64: Add sgemm_kernel
2 years ago
gxw
553cc1372f
LoongArch64: Add sgemm_kernel
2 years ago
Martin Kroeker
12ede72ab7
Merge pull request #4192 from imciner2/im/clangfix
Fix cooperlake and sapphire rapids march flags on clang
2 years ago
Martin Kroeker
8d9f701fbf
Merge pull request #4195 from TiborGY/BF16_ignore
Add junk from BF16 test to .gitignore
2 years ago
Martin Kroeker
7f67ba9147
Merge pull request #4198 from martin-frbg/issue4197
Correct INFO returned for too small lda in non-CBLAS s/dgeadd
2 years ago
Martin Kroeker
214be14c1d
Correct INFO returned for lda in non-CBLAS s/dgeadd
2 years ago
Martin Kroeker
1b09f4b2bb
Merge pull request #4193 from imciner2/im/ppcgnu
Fix power10 gcc intrinsic check
2 years ago
Ian McInerney
79c15db348
Fix power10 gcc intrinsic check
__builtin_vsx_assemble_pair was only in GCC 10-11.2 and was replaced by
__builtin_vsx_build_pair thereafter.
2 years ago
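The version rule stated in the commit message above can be written down as a small selection function. This is a sketch of the rule as described there, not the actual check from the commit. The enum and the function `pair_builtin_for` are hypothetical names, and the 11.2/11.3 boundary is taken from the message's "GCC 10-11.2" wording rather than verified against GCC release notes.

```c
/* Sketch only: the enum and function names are hypothetical. */
enum pair_builtin {
    PAIR_NONE,      /* GCC older than 10: neither builtin assumed available */
    PAIR_ASSEMBLE,  /* __builtin_vsx_assemble_pair (GCC 10 through 11.2)   */
    PAIR_BUILD      /* __builtin_vsx_build_pair (later releases)           */
};

/* Pick the builtin for a given GCC major/minor version. */
enum pair_builtin pair_builtin_for(int major, int minor)
{
    if (major < 10)
        return PAIR_NONE;
    if (major == 10 || (major == 11 && minor <= 2))
        return PAIR_ASSEMBLE;
    return PAIR_BUILD;
}

/* Apply the rule to the compiler building this file. clang also defines
   __GNUC__, so it is excluded here and treated as "unknown". */
enum pair_builtin pair_builtin_for_this_compiler(void)
{
#if defined(__GNUC__) && !defined(__clang__)
    return pair_builtin_for(__GNUC__, __GNUC_MINOR__);
#else
    return PAIR_NONE;
#endif
}
```

In a real build the same comparison would more likely sit in the preprocessor or in a configure-time probe; the function form is used here only so the rule can be exercised with explicit version numbers.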
TGY
b5ba95a6c0
Modernize obsolete inline order
2 years ago
TiborGY
0d30daa772
Add junk from BF16 test to .gitignore
2 years ago