davidz-ampere | 84730068af | reduce duplicate kernel code | 3 months ago
davidz-ampere | be68ef03b4 | Add support for Ampere processors | 3 months ago
Martin Kroeker | f1097d1cba | Merge pull request #5306 from martin-frbg/lapack1131: Fix missing initialization leading to bypassing corner cases in C/ZGEQP3RK (Reference-LAPACK PR #1131) | 3 months ago
Martin Kroeker | bad47bd024 | Fix too strict leading dimensions check in LAPACKE_?gesdd_work (Reference-LAPACK PR #1126) (#5307): relax leading dimensions check (Reference-LAPACK PR #1126) | 3 months ago
Martin Kroeker | 7f3093a0ad | Merge pull request #5305 from martin-frbg/lapack1135: Fix 2nd dimension used by LAPACKE_c/zunmlq in NaN check and transposition (Reference-LAPACK PR #1135) | 3 months ago
Martin Kroeker | 1804ff58d7 | fix missing initialization | 3 months ago
Martin Kroeker | 906b9df316 | fix missing initialization | 3 months ago
Martin Kroeker | f4e5177050 | fix dimension used in nancheck (Reference-LAPACK PR 1135) | 3 months ago
Martin Kroeker | 2a6beac88f | fix dimension used in transposition (Reference-LAPACK PR 1135) | 3 months ago
Martin Kroeker | d8a2324699 | fix dimension used in nancheck (Reference-LAPACK PR 1135) | 3 months ago
Martin Kroeker | 874744976c | fix dimension used in nancheck (Reference-LAPACK PR 1135) | 3 months ago
Martin Kroeker | 0ea173ec8c | Merge pull request #5304 from martin-frbg/fixgemmtr_if: fix source file used for sbgemmt/sbgemmtr in CMake builds | 3 months ago
Martin Kroeker | 5e393f207c | fix source file used for sbgemmt/sbgemmtr | 3 months ago
Martin Kroeker | dbd5643d37 | Merge pull request #5302 from martin-frbg/zscal_mips_3: mips64 SICORTEX: temporarily change default C/ZSCAL to the non-asm implementation | 3 months ago
Martin Kroeker | e338d34ce1 | fix path | 3 months ago
Martin Kroeker | d36093d084 | temporarily change default C/ZSCAL to the non-asm implementation | 3 months ago
Martin Kroeker | cc4b04a684 | Merge pull request #5301 from martin-frbg/zscal_mips_2: kernel/mips(64): Fix cscal and zscal | 3 months ago
Martin Kroeker | b3c90564d7 | resync with the generic arm version for inf/nan handling | 3 months ago
Martin Kroeker | 6bdc7f9eb7 | Merge pull request #5300 from martin-frbg/fixup5296: kernel/riscv64: Fix cscal/zscal for riscv64_generic | 3 months ago
Martin Kroeker | 63272b6c82 | Merge pull request #5299 from martin-frbg/x86_64-ssezscal: Disable the default SSE kernels for x86_64 CSCAL/ZSCAL for now | 3 months ago
Martin Kroeker | 73af02b89f | use dummy2 as Inf/NAN handling flag | 3 months ago
Martin Kroeker | 549a9f1dbb | Disable the default SSE kernels for CSCAL/ZSCAL for now | 3 months ago
Martin Kroeker | ca1ce84ee5 | Merge pull request #5298 from martin-frbg/fixup5281: Fix PR5281 "kernel/arm64: fix cscal/zscal" | 3 months ago
Martin Kroeker | 58eeb9041c | fix handling of dummy2 | 3 months ago
Martin Kroeker | 7c77537b25 | Merge pull request #5297 from martin-frbg/zscal_x86_sparc: kernel/(x86|sparc): Fix cscal and zscal by reverting to the generic C kernels | 3 months ago
Martin Kroeker | 63287e1855 | Merge pull request #5296 from martin-frbg/zscal_riscv: kernel/riscv64: Fix cscal and zscal | 3 months ago
Martin Kroeker | d2855d3dab | Merge pull request #5285 from martin-frbg/zscal_zarch: kernel/zarch: Fix cscal and zscal | 3 months ago
Martin Kroeker | 1408be5fe0 | Merge pull request #5282 from martin-frbg/zscal_power: kernel/power: Fixed cscal and zscal | 3 months ago
Martin Kroeker | 1589d0b21e | Merge pull request #5281 from martin-frbg/zscal_arm64: kernel/arm64: fixed cscal and zscal | 3 months ago
Martin Kroeker | a86419fb66 | Merge pull request #5280 from martin-frbg/zscal_x86_64: kernel/x86_64: fixed cscal and zscal | 3 months ago
Martin Kroeker | 11ff18bb0f | Merge pull request #5081 from XiWeiGu/kernel_generic_fixed_cscal_zscal: kernel/generic: Fixed cscal and zscal | 3 months ago
Martin Kroeker | 2e2691b34b | Merge pull request #5078 from XiWeiGu/la64_fixed_cscal_zscal: LoongArch64: fixed cscal and zscal | 3 months ago
Martin Kroeker | f4194fc65f | Merge branch 'develop' into la64_fixed_cscal_zscal | 3 months ago
Martin Kroeker | e12132abd4 | Use generic C/ZSCAL kernels to address inf/nan handling for now | 3 months ago
Martin Kroeker | 1cefbea7ea | Use generic SCAL kernels to address inf/nan handling for now | 3 months ago
Martin Kroeker | f18b7a46bf | add dummy2 flag handling for inf/nan agnostic zeroing | 3 months ago
Martin Kroeker | fe220a0d7d | Merge pull request #5291 from guoyuanplct/develop: kernel/riscv64:fixed the performance problem in RISCV64_ZVL256 when OPENBLAS_K is small | 3 months ago
Martin Kroeker | bbdc265798 | Merge pull request #5294 from arnej27959/arnej/fix-arm64-register: Accumulate results in output register explicitly | 3 months ago
Arne Juul | 5442aff218 | Accumulate results in output register explicitly | 3 months ago
guoyuanplct | 83fcab7578 | Merge branch 'develop' of https://github.com/guoyuanplct/OpenBLAS into develop | 3 months ago
guoyuanplct | 2ae019161a | fixed the performance problem in RISCV64_ZVL256 when OPENBLAS_K is small | 3 months ago
Martin Kroeker | 02267d86f5 | Merge pull request #5288 from guoyuanplct/develop: kernel/riscv64:Optimized the implementation of axpby on TARGET=RISCV64_ZVL256B. | 4 months ago
guoyuanplct | d2003dc886 | del lines | 4 months ago
guoyuanplct | 45fd2d9b07 | Optimized the axpby function. | 4 months ago
Martin Kroeker | fb8dc8ff5c | Add dummy2 flag handling | 4 months ago
Martin Kroeker | cf06250d36 | add handling of dummy2 flag | 4 months ago
Martin Kroeker | 28f8fdaf0f | support flag for NaN/Inf handling and fix scaling of NaN/Inf values | 4 months ago
Martin Kroeker | 669c847ceb | support extra flag for NaN handling | 4 months ago
Martin Kroeker | 0163143fdd | Merge pull request #5278 from martin-frbg/fixup5276: Fix compilation with pre-C99 compilers | 4 months ago
Martin Kroeker | 20f2ba0141 | Move declaration of i for pre-C99 compilers | 4 months ago