Martin Kroeker
|
31fd13d048
|
MIPS: make HAVE_MSA reflect cpu capability and NO_MSA software/env
|
2 years ago |
Chris Sidebottom
|
2fb096315e
|
Set SWITCH_RATIO for Arm(R) Neoverse(TM) V1 CPUs
From testing this yields better results than the default of `2`.
|
2 years ago |
Honglin Zhu
|
4989e039a5
|
Define SBGEMM_ALIGN_K for DYNAMIC_ARCH build
|
2 years ago |
Jiaxun Yang
|
a50b29c540
|
Provide a fallback MIPS64_GENERIC target
It is really dangerous to fall back to a Loongson core on other
MIPS64 processors.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
|
3 years ago |
gxw
|
fbfe1daf6e
|
LoongArch64: Add DYNAMIC_ARCH support
|
3 years ago |
gxw
|
3573306a69
|
LoongArch64: Add core LOONGSON2K1000 and LOONGSONGENERIC
|
3 years ago |
Honglin Zhu
|
123e0dfb62
|
Neoverse N2 sbgemm:
1. Modify the algorithm to resolve multithreading failures
2. No memory allocation in sbgemm kernel
3. Optimize when alpha == 1.0f
|
3 years ago |
Honglin Zhu
|
55d686d41e
|
neoverse n2 sbgemm:
implement ncopy tcopy kernel_8x4
|
3 years ago |
Martin Kroeker
|
dac14a5f7d
|
revert "switch DGEMM parameters for SkylakeX if DYNAMIC_ARCH"
|
3 years ago |
Martin Kroeker
|
a55a06c269
|
Update param.h
|
3 years ago |
Martin Kroeker
|
d93cf7f23c
|
fix defines for CORTEX-X
|
3 years ago |
Martin Kroeker
|
09b8545fc5
|
Add initial support for M1 on Linux, Phytium FT2xxx series, ARM Cortex 510/710/X1/X2
|
3 years ago |
Martin Kroeker
|
8d0f7f0176
|
Revert accidental change of generic ARMV8 DGEMM parameters from #3425
|
3 years ago |
Martin Kroeker
|
c1c0d5ce1d
|
Merge pull request #3492 from binebrank/arm_sve_zgemm
SVE zgemm&cgemm (and other BLAS 3 complex)
|
3 years ago |
Bine Brank
|
b6a445cfd8
|
adapt Makefile for SVE trsm
|
3 years ago |
Martin Kroeker
|
499ae5e8f7
|
Merge pull request #3510 from martin-frbg/issue3505
Fix recent SkylakeX/DYNAMIC_ARCH DGEMM breakage
|
3 years ago |
Martin Kroeker
|
b6b024232d
|
Merge pull request #3508 from snadampal/v1_n2
OpenBLAS: aarch64: Add neoverse-v1/n2 architecture specifics
|
3 years ago |
Martin Kroeker
|
15d4b37913
|
SkylakeX: match parameters to dgemm kernels for dyn/non-dyn
|
3 years ago |
Sunita Nadampalli
|
19c8f615dc
|
OpenBLAS: aarch64: Add neoverse-v1/n2 architecture specifics
|
3 years ago |
Bine Brank
|
39ab219704
|
sve copy functions for cgemm chemm zsymm
|
3 years ago |
gxw
|
8d9b9c6b2a
|
loongarch64: Optimize dgemm_kernel
|
3 years ago |
Martin Kroeker
|
697e2752d7
|
Merge pull request #3464 from binebrank/arm_sve_sgemm
Add sgemm part for Arm SVE
|
3 years ago |
Bine Brank
|
a8f62a347b
|
fix UNROLL_MN and add to targets for SVE
|
3 years ago |
Martin Kroeker
|
f7f7fea0dc
|
Merge pull request #3472 from kavanabhat/p10_aixas_p8
Fallback for Power kernels
|
3 years ago |
kavanabhat
|
eee3381cbe
|
Fallback for Power kernels
|
3 years ago |
Martin Kroeker
|
dd1f645371
|
switch DGEMM unroll parameters for SkylakeX if DYNAMIC_ARCH
|
3 years ago |
Bine Brank
|
86ae89bf33
|
add sgemm kernel and copy functions for sgemm and ssymm
|
3 years ago |
Martin Kroeker
|
454edd741c
|
Merge pull request #3425 from binebrank/arm_sve_dgemm
Add dgemm kernel for arm64 SVE
|
3 years ago |
Bine Brank
|
f4da23dcb6
|
reduced dgemm_unroll_m to work with 128-bit sve
|
3 years ago |
Bine Brank
|
9388f05a3c
|
configure SVE Makefile
|
3 years ago |
Martin Kroeker
|
52a3f004a0
|
Fix unintended reversion of recent CortexA53 changes
|
3 years ago |
Martin Kroeker
|
19ccef5fb1
|
Add generic MIPS32 target
|
3 years ago |
Jia-Chen
|
302f22693a
|
MOD: optimize normal DGEMM on ARMV8 cortex-A53 & cortex-A55
|
3 years ago |
Martin Kroeker
|
46947efb83
|
Ignore compiler support for MIPS MSA if the cpu lacks this capability
|
3 years ago |
Bine Brank
|
ab7917910d
|
add v2x8 kernel + fix sve dtrmm
|
3 years ago |
Bine Brank
|
7093372e32
|
add ARMV8SVE target
|
3 years ago |
Wangyang Guo
|
7b2f5cb3b7
|
sbgemm: spr: enlarge P to 256 for performance
|
4 years ago |
Wangyang Guo
|
0abbcd19c1
|
sbgemm: spr: tuning for blocking params
|
4 years ago |
Wangyang Guo
|
3dc6052c7e
|
initial support for Sapphire Rapids platform
|
4 years ago |
Martin Kroeker
|
24233b7c49
|
Use "big arm server" GEMM defaults for Vortex
|
4 years ago |
kavanabhat
|
fe3c778c51
|
AIX changes for P10 with GNU Compiler
|
4 years ago |
Wangyang Guo
|
8356a604f0
|
sbgemm: cooperlake: tuning for block params
|
4 years ago |
Niyas Sait
|
7cddbf99b1
|
Make explicit conversion condition on _WIN64 flag
|
4 years ago |
Niyas Sait
|
d1ed72fa87
|
[win/arm64]: Explicit casting for GEMM_DEFAULT_ALIGN to create 64-bit value
Win64 uses the LLP64 data model, in which unsigned long is only 32 bits wide. On a
64-bit architecture a 64-bit mask is needed to generate addresses correctly.
|
4 years ago |
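The reasoning in the [win/arm64] entry above can be illustrated with a minimal C sketch. It is not the code from that commit: the mask value and the function names are hypothetical, and the 32-bit `unsigned long` of LLP64 is modelled with `uint32_t` so the effect shows up on any platform. It compares rounding an address up to an alignment boundary with a 32-bit mask and with a 64-bit mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical alignment mask value, chosen only for illustration. */
#define EXAMPLE_ALIGN_MASK 0x3fffu

/* Models a mask held in a 32-bit type, as unsigned long is under LLP64. */
static uint64_t align_up_mask32(uint64_t addr) {
    uint32_t mask = EXAMPLE_ALIGN_MASK;
    uint32_t inverted = (uint32_t)~mask;   /* 0xffffc000, only 32 bits wide */
    /* Zero-extension to 64 bits clears the upper half of the address. */
    return (addr + mask) & (uint64_t)inverted;
}

/* Same computation with the mask widened to 64 bits before inversion. */
static uint64_t align_up_mask64(uint64_t addr) {
    uint64_t mask = EXAMPLE_ALIGN_MASK;
    return (addr + mask) & ~mask;
}

/* Self-check on an address above 4 GiB. */
static void check_alignment_example(void) {
    uint64_t addr = 0x100000001ULL;
    assert(align_up_mask32(addr) == 0x4000ULL);       /* upper bits lost */
    assert(align_up_mask64(addr) == 0x100004000ULL);  /* upper bits kept */
}
```

With the 32-bit mask, any address above 4 GiB is silently truncated, which is why an explicit widening cast matters when `unsigned long` is narrower than a pointer.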
gxw
|
af0a69f355
|
Add support for LOONGARCH64
|
4 years ago |
Martin Kroeker
|
a6351e32f0
|
Remove BLASLONG casts from SPARC entries
in response to https://github.com/xianyi/OpenBLAS/pull/3266#issuecomment-878637675
|
4 years ago |
User User-User
|
b7da75e4fd
|
WiP CORTEX A55 support
|
4 years ago |
Martin Kroeker
|
7dfc45e840
|
Remove casts for PPC/POWER and complete parameters for POWER3/4
|
4 years ago |
Gordon Fossum
|
198adea961
|
Changed default P/Q values for CGEMM and ZGEMM (Power10 only)
|
4 years ago |
Martin Kroeker
|
8cdf0825de
|
Add workaround for older gcc on ppc64be not supporting casts in defines
|
4 years ago |