Martin Kroeker
2332ea7e7a
fix misleading indentation
11 months ago
Martin Kroeker
73e13b0273
flesh out HERK prototype
1 year ago
Martin Kroeker
824306baab
flesh out HERK prototype
1 year ago
Mark Ryan
3b715e6162
Add autodetection for riscv64
Implement DYNAMIC_ARCH support for riscv64. Three cpu types are
supported, riscv64_generic, riscv64_zvl256b, riscv64_zvl128b.
The two non-generic kernels require CPU support for RVV 1.0 to
function correctly. Detecting that a riscv64 device supports
RVV 1.0 is a little complicated as there are some boards on the
market that advertise support for V via hwcap but only support
RVV 0.7.1, which is not binary compatible with RVV 1.0. The
approach taken is to first try hwprobe. If hwprobe is not
available, we fall back to hwcap + an additional check to distinguish
between RVV 1.0 and RVV 0.7.1.
Tested on a VM with VLEN=256, a CanMV K230 with VLEN=128 (with only
the big core enabled), a Lichee Pi with RVV 0.7.1 and a VF2 with no
vector.
A compiler with RVV 1.0 support must be used to build OpenBLAS for
riscv64 when DYNAMIC_ARCH=1.
Signed-off-by: Mark Ryan <markdryan@rivosinc.com>
1 year ago
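The detection order described in the commit message above (hwprobe first, then hwcap plus an extra check that separates RVV 1.0 from RVV 0.7.1) can be sketched as pure decision logic. This is only an illustrative sketch, not the code from the commit: the names `probe_result_t`, `has_rvv_1_0` and `select_core` are hypothetical, the results of the actual hwprobe and hwcap queries are passed in as plain values, and the mapping from VLEN to core name is an assumption based on the three core names and the tested VLEN values mentioned in the message.

```c
#include <stdbool.h>

/* Hypothetical summary of what a hwprobe query reported. */
typedef enum {
    PROBE_UNAVAILABLE, /* hwprobe not supported by the running kernel */
    PROBE_NO_V,        /* hwprobe works and reports no V extension */
    PROBE_HAS_V        /* hwprobe works and reports the V extension */
} probe_result_t;

/* Decide whether RVV 1.0 kernels may be used.
 * hwprobe is trusted when available. Otherwise the hwcap V bit alone is
 * not enough, because some boards advertise V but implement RVV 0.7.1,
 * so an additional check (passed in here as a flag) must also succeed. */
static bool has_rvv_1_0(probe_result_t hwprobe, bool hwcap_v,
                        bool passes_rvv10_check)
{
    if (hwprobe != PROBE_UNAVAILABLE)
        return hwprobe == PROBE_HAS_V;
    return hwcap_v && passes_rvv10_check;
}

/* Pick one of the three core names from the commit message.
 * Assumption: the core is chosen by vector length in bits. */
static const char *select_core(bool rvv10, unsigned vlen_bits)
{
    if (!rvv10)
        return "riscv64_generic";
    if (vlen_bits >= 256)
        return "riscv64_zvl256b";
    if (vlen_bits >= 128)
        return "riscv64_zvl128b";
    return "riscv64_generic";
}
```

Under these assumptions, a board without hwprobe that sets the hwcap V bit but fails the extra check (the RVV 0.7.1 case from the message) ends up on `riscv64_generic`.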
Martin Kroeker
2dda40d280
use atomic operations as in the corresponding getrf
1 year ago
Dirreke
ec89466e14
Add CSKY support
1 year ago
Martin Kroeker
1d4aa8d7d5
fix improper function prototypes (empty parentheses)
2 years ago
Martin Kroeker
f4f31fb53b
fix improper function prototypes (empty parentheses)
2 years ago
gxw
d15e0a055c
LoongArch64: Fixed compilation issues when enable DYNAMIC_ARCH
2 years ago
Martin Kroeker
3b6050ac04
clarify the comment on the out-of-bounds check from #723
2 years ago
Martin Kroeker
22a402bc2c
clarify the comment on the out-of-bounds check from #723
2 years ago
Martin Kroeker
437c0bf2b4
Merge pull request #3843 from Mousius/switch-ratio
Propagate SWITCH_RATIO to DYNAMIC_ARCH builds
2 years ago
Chris Sidebottom
32f2fafde7
Propagate SWITCH_RATIO to DYNAMIC_ARCH builds
Previously dynamic builds were either using the default SWITCH_RATIO
or one from the higher level architecture; this patch ensures the
dynamic builds can use this parameter as well.
2 years ago
Martin Kroeker
6c431239da
Split test condition in LU computation - non-denormal for computation, exact zero for reporting singularity
2 years ago
Martin Kroeker
12aabb9f9b
fix conditional
2 years ago
Martin Kroeker
f3d21039ce
Improve fix from PR3924 (#3941)
* compare denominator against DBL_MIN rather than a somewhat arbitrary small number near it
2 years ago
Martin Kroeker
3d27cbd9a3
avoid overflow in division
2 years ago
Martin Kroeker
a39ced0551
avoid overflow in division
2 years ago
Martin Kroeker
aa2a2d9c01
Conditionally compile files that may get replaced by ReLAPACK
2 years ago
Martin Kroeker
7656aba00e
Merge pull request #3493 from martin-frbg/casts+cleanup
WIP casts and cleanups
3 years ago
Martin Kroeker
40003f8edb
Fix pivot offset calculation for negative incx
3 years ago
Martin Kroeker
57e2a72f40
Fix pivot offset calculation for negative incx
3 years ago
Martin Kroeker
3b6293f5a0
Fix offset calculation for negative incx
3 years ago
Martin Kroeker
afa0cece5c
Fix pivot offset calculation for negative incx
3 years ago
Martin Kroeker
eca2f50b48
Fix pivot offset calculation for negative incx
3 years ago
Martin Kroeker
0e9e951306
Fix pivot offset calculation for negative incx
3 years ago
Martin Kroeker
1b49ef8dcf
Fix pivot index for negative increments
3 years ago
Martin Kroeker
6b407a16cb
fix function typecasts
3 years ago
Martin Kroeker
aecb4a5e8d
fix function typecasts
3 years ago
Martin Kroeker
c49d46f25f
fix function typecast
3 years ago
gxw
af0a69f355
Add support for LOONGARCH64
4 years ago
Zhang Xianyi
d7ba7679b6
Merge branch 'develop' into risc-v
5 years ago
Martin Kroeker
4bb73c0171
Rename "HALF" type to "BFLOAT16"
5 years ago
Martin Kroeker
32733ded04
Rename "HALF" and "sh" to "BFLOAT16" and "sb"
5 years ago
Martin Kroeker
b27ca78a21
Adapt to having only a subset of variable types supported
5 years ago
Martin Kroeker
93454022a9
Adapt to having only a subset of variable types supported
5 years ago
Martin Kroeker
20cf1d773f
Adapt to having only a subset of variable types supported
5 years ago
Martin Kroeker
5c657fffad
Adapt to having only a subset of variable types supported
5 years ago
Martin Kroeker
b262058059
Adapt to having only a subset of variable types supported
5 years ago
Martin Kroeker
bc319cee82
Adapt to having only a subset of variable types supported
5 years ago
Martin Kroeker
e5966f8606
Adapt to having only a subset of variable types supported
5 years ago
Martin Kroeker
9df12eb08f
Adapt to having only a subset of variable types supported
5 years ago
Martin Kroeker
cf53970bcb
Adapt to having only a subset of variable types supported
5 years ago
Martin Kroeker
dcd51d5c72
Adapt to having only a subset of variable types supported
5 years ago
Martin Kroeker
b8f95354c7
Adapt to having only a subset of variable types supported
5 years ago
Martin Kroeker
f194ad59e1
Use _Atomic instead of volatile where available (file moved from ../getrf)
must have misplaced this in ../getrf when I made that change in March 2018 (40160ff)
the only changes since then were
* RFC : Add half precision gemm for bfloat16 in OpenBLAS (Rajalakshmi Srinivasaraghavan committed on 14 Apr 2020 as 7ebbb50)
* Change _STDC_VERSION__ to __STDC_VERSION__ (Zhiyong Dang committed on 11 May 2018 as 3716267)
5 years ago
Martin Kroeker
4fda217f99
Delete potrf_parallel.c (moving it to ../potrf)
5 years ago
Martin Kroeker
bbe119ee3b
Update conditional for atomics to use HAVE_C11
5 years ago
Martin Kroeker
f4f74941bd
Update conditional for atomics to use HAVE_C11
5 years ago
Rajalakshmi Srinivasaraghavan
22bb50fb81
cmake fixes
5 years ago