Martin Kroeker
dd38b4e811
Merge pull request #5225 from annop-w/gemv_n
Improve performance for SGEMVN on NEOVERSEN1
7 months ago
Martin Kroeker
3a088de2d1
Merge pull request #5228 from martin-frbg/cmakecrossarm
Update and amend parameters for Neoverse cpus in CMake crossbuilds
7 months ago
Martin Kroeker
0241d516f6
Merge pull request #5220 from iha-taisei/sdgemv_n_unroll
Further performance improvements to non-transposed [SD]GEMV kernels for A64FX and Neoverse V1.
7 months ago
Martin Kroeker
afb664527f
Merge pull request #5221 from tetsuzo-usui/tune_symv_for_arm64
Add AArch64-optimized SYMV kernels
7 months ago
Annop Wongwathanarat
d535728803
Improve performance for SGEMVN on NEOVERSEN1
7 months ago
Martin Kroeker
d9369bda1e
Update and amend parameters for Neoverse cpus
7 months ago
zanpeeters
acef78c778
Reset buffer length before every call to sysctlbyname.
7 months ago
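The length argument of `sysctlbyname` is an in/out parameter: the caller passes the buffer size and the call overwrites it with the number of bytes written, so reusing the variable across queries without resetting it can hand a too-small size to the next call. The following is a minimal sketch of the pattern in C, not the code from the commit; the helper name `query_sysctl_values` is hypothetical, and on non-Apple platforms the sketch simply reports that nothing was read.

```c
#include <stddef.h>
#include <stdint.h>
#ifdef __APPLE__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Hypothetical helper: query `count` sysctl names and store each value in
 * out[i]. Returns how many queries succeeded. Entries that could not be
 * read are set to 0. */
size_t query_sysctl_values(const char *const *names, size_t count,
                           uint64_t *out)
{
    size_t ok = 0;
    for (size_t i = 0; i < count; i++) {
        out[i] = 0;
#ifdef __APPLE__
        /* Zero-initialised so that a 4-byte result still reads correctly
         * on a little-endian machine. */
        uint64_t value = 0;
        /* Reset the length on every iteration: sysctlbyname overwrites it
         * with the number of bytes it actually returned. */
        size_t len = sizeof(value);
        if (sysctlbyname(names[i], &value, &len, NULL, 0) == 0) {
            out[i] = value;
            ok++;
        }
#else
        (void)names; /* no sysctlbyname here; report nothing read */
#endif
    }
    return ok;
}
```

Calling it with names such as `"hw.cachelinesize"` and `"hw.l1dcachesize"` fills one slot per name. Declaring `len` inside the loop makes the reset hard to forget.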
zanpeeters
d1c2528aed
Add L1_DATA_LINESIZE for ifdef __APPLE__
7 months ago
zanpeeters
7b66330dea
hw.perflevel[01].cpusperl changed to hw.perflevel[01].cpusperl2
7 months ago
Usui, Tetsuzo
d711906e3e
Add symv kernels for arm64
7 months ago
Iha, Taisei
f1e628b889
Further performance improvements to [SD]GEMV.
7 months ago
Martin Kroeker
39718cd28e
Merge pull request #5218 from martin-frbg/lapacke_mangling
lapacke_mangling.h is no longer generated, so don't delete on make clean
7 months ago
Martin Kroeker
211dfd0754
disable the CooperLake microkernel as it produces wrong results
7 months ago
Martin Kroeker
fd3afef122
lapacke_mangling.h is no longer generated, so don't delete on make clean
7 months ago
Martin Kroeker
b30dc9701f
Merge pull request #5215 from annop-w/gemv_t
Use SVE kernel for S/DGEMVT for SVE machines
7 months ago
Martin Kroeker
2893d0add4
Merge pull request #5211 from guoyuanplct/develop
Optimizing the Implementation of GEMV on the RISC-V V Extension
7 months ago
Martin Kroeker
ed1e470663
Merge pull request #5217 from haampie/hs/fix/darwin-gcc
test_potrs.c: do not use GCC pragma on darwin-aarch64
7 months ago
Harmen Stoppels
3d6d026fe1
no-gcse when loongarch64
7 months ago
Harmen Stoppels
51ba70f47b
test_potrs.c: remove pragma darwin-aarch64 support
Using GCC 14.2.0 on Darwin, the pragma ultimately causes a linker error
"ld: invalid r_symbolnum=". The current workaround is to use the old
linker, but (a) it's deprecated and (b) it can produce libraries that
are subsequently not linkable with the newer linker in dependents: the
new ld64 does not link to libraries with duplicate rpaths created by the
classic linker.
7 months ago
Annop Wongwathanarat
ec146157d3
Use SVE kernel for S/DGEMVT for SVE machines
7 months ago
Martin Kroeker
de2380e5a6
Merge pull request #5214 from martin-frbg/issue5200
Remove spurious cast from Alpha and Cell's DEFAULT_ALIGN
7 months ago
Martin Kroeker
a34b487f22
Remove spurious cast from Alpha and Cell's DEFAULT_ALIGN
7 months ago
Martin Kroeker
1b3e7cc491
Merge pull request #5212 from martin-frbg/lapack1119
Fix incomplete error message in EIG test (Reference-LAPACK PR 1119)
7 months ago
Martin Kroeker
4270d5bc43
Merge pull request #5204 from martin-frbg/issue4692
Repeat the libs target's "ln" in the all target to ensure completeness of copy on Windows
7 months ago
Martin Kroeker
880e43ee54
Merge pull request #5198 from martin-frbg/woadlldebug
Fix pdb file creation in debug dll builds with CMake on Windows/WoA
7 months ago
Martin Kroeker
70865a894e
Merge pull request #5180 from ywwry66/openmp_use_cmake
CMake: Pass `OpenMP` compiler and linker flags through CMake targets
7 months ago
Martin Kroeker
f0f274725d
Merge pull request #5207 from martin-frbg/issue5202
Fix MacOS compilation with xcode16.3/clang17/gcc14
7 months ago
Martin Kroeker
94fb7033a4
Fix incomplete error message (Reference-LAPACK PR 1119)
7 months ago
lglglglgy
1ff303f36e
Optimizing the Implementation of GEMV on the RISC-V V Extension
Specialized some scenarios, performed loop unrolling, and reduced the
number of multiplications.
7 months ago
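As a rough illustration of what loop unrolling and fewer multiplications can mean for a non-transposed GEMV, here is a scalar C sketch. It is not the RISC-V V kernel from this commit: the function names, the column-major layout, and the unroll factor of 4 are assumptions chosen for the example. The reference version multiplies by `alpha` once per matrix element; the unrolled version computes `alpha * x[j]` once per column and updates `y` from four columns per pass.

```c
#include <stddef.h>

/* Reference: y += alpha * A * x, with A stored column-major (m rows,
 * n columns, leading dimension lda). Two multiplications per element. */
void gemv_n_ref(size_t m, size_t n, float alpha, const float *a,
                size_t lda, const float *x, float *y)
{
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < m; i++)
            y[i] += alpha * a[i + j * lda] * x[j];
}

/* Sketch: same result, unrolled over four columns with alpha folded into
 * the x values once per column instead of once per matrix element. */
void gemv_n_unrolled(size_t m, size_t n, float alpha, const float *a,
                     size_t lda, const float *x, float *y)
{
    size_t j = 0;
    for (; j + 4 <= n; j += 4) {
        const float t0 = alpha * x[j];
        const float t1 = alpha * x[j + 1];
        const float t2 = alpha * x[j + 2];
        const float t3 = alpha * x[j + 3];
        const float *c0 = a + j * lda;
        const float *c1 = c0 + lda;
        const float *c2 = c1 + lda;
        const float *c3 = c2 + lda;
        /* One load and one store of y[i] now serve four columns. */
        for (size_t i = 0; i < m; i++)
            y[i] += t0 * c0[i] + t1 * c1[i] + t2 * c2[i] + t3 * c3[i];
    }
    /* Remainder columns when n is not a multiple of 4. */
    for (; j < n; j++) {
        const float t = alpha * x[j];
        const float *c = a + j * lda;
        for (size_t i = 0; i < m; i++)
            y[i] += t * c[i];
    }
}
```

The two versions can differ in the last bits for general floating-point input because the summation order changes. A real kernel would also vectorize the inner loop.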
Martin Kroeker
fc8090b607
Move additional omp dependency to EXTRALIB
7 months ago
Martin Kroeker
1c5d0d5539
move libomp to extralib
7 months ago
Martin Kroeker
67c5bdd639
Azure CI: Update flang call in OSX_LLVM_flangnew job (#5208)
* Update flang call in OSX_LLVM_flangnew job
7 months ago
Martin Kroeker
1ed962d259
Fix compilation with xcode16.3/clang17/gcc14
7 months ago
Martin Kroeker
f0008f50cc
Merge pull request #5206 from ColumbusAI/develop
Update zsum.c -- fixed spelling error to successfully compile
7 months ago
ColumbusAI
7bf848454d
Update zsum.c -- fixed spelling error to successfully compile
Fixes a spelling error: zsum_kernel was used where it should be zasum_kernel. The file will not compile without this fix.
7 months ago
Martin Kroeker
0aa5ef29ec
Repeat the libs target's "ln" in the all target to ensure completeness
7 months ago
Martin Kroeker
f90eff306d
Merge pull request #5197 from e4t/z-arch-exec-stack
On zarch don't produce objects from assembler with a writable stack s…
7 months ago
Vaisakh K V
04915be829
Add vector registers to clobber list to prevent compiler optimization.
The SME-based SGEMMDIRECT kernel uses the vector registers (z), and adding them to the
clobber list informs the compiler not to optimize these registers.
7 months ago
Martin Kroeker
3fc15ad81c
Fix pdb file creation in debug dll builds with CMake on Windows/WoA
7 months ago
Egbert Eich
61b9339d3a
getarch/cpuid.S: Fix warning about executable stack
When using the GNU toolchain, a warning is printed about an executable
stack:
/usr/lib64/gcc/.../x86_64-suse-linux/bin/ld: warning: /tmp/ccyG3xBB.o: missing .note.GNU-stack section implies executable stack
/usr/lib64/gcc/.../x86_64-suse-linux/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
To prevent this warning, add:
```
.section .note.GNU-stack,"",@progbits
```
Signed-off-by: Egbert Eich <eich@suse.com>
8 months ago
Egbert Eich
ea6515c4b3
On zarch don't produce objects from assembler with a writable stack section
On z-series, the current version of the GNU toolchain produces warnings
such as:
```
/usr/lib64/gcc/[...]/s390x-suse-linux/bin/ld: warning: ztrmm_kernel_RC_Z14.o: missing .note.GNU-stack section implies executable stack
/usr/lib64/[...]/s390x-suse-linux/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
```
To prevent this message and make sure we are future-proof, add
```
.section .note.GNU-stack,"",@progbits
```
Also add the `.size` directive to give the asm-defined functions a proper size
in the symbol table.
Signed-off-by: Egbert Eich <eich@suse.com>
8 months ago
Martin Kroeker
f33943d73e
Merge pull request #5196 from martin-frbg/issue5193
Fix misinterpretation of NO_LAPACK=0 and SPMV settings in CMake builds
8 months ago
Ruiyang Wu
251c3f857d
gh m1: fix mixed linkage when built with OpenMP and clang+gfortran
8 months ago
Ruiyang Wu
1b0c0f00e9
CMake: Avoid mixed OpenMP linkage
8 months ago
Ruiyang Wu
02fd1df10b
CMake: Pass `OpenMP` compiler and linker flags through CMake targets
Using `OpenMP::OpenMP_LANG` targets for CMake is less error-prone than
passing the compiler and linker flags manually. Furthermore, it allows
the user to customize those flags by setting `OpenMP_LANG_FLAGS`,
`OpenMP_LANG_LIB_NAMES`, and `OpenMP_omp_LIBRARY`.
8 months ago
Martin Kroeker
8b35534201
Merge pull request #5195 from martin-frbg/update-gensymbolpl
Re-synchronize gensymbol.pl with the posix shell version
8 months ago
Martin Kroeker
51c1fb1f93
Fix ?spmv build and misinterpretation of NO_LAPACK=0
8 months ago
Martin Kroeker
3ca1ba1be3
resynchronize with the posix shell version
8 months ago
Martin Kroeker
72f0abeed5
Merge pull request #5191 from Harishmcw/CMake_Symbol_Fix
Fix DLL symbol name pre/postfixing in CMake builds on Windows
8 months ago
Harishmcw
1724b3f104
DLL symbol pre/postfixing in CMake builds
8 months ago