Martin Kroeker
e02df9fc55
Propagate BUILD_BFLOAT16 to CFLAGS
4 years ago
Martin Kroeker
1c0a8a714a
Add defaults for SBGEMV kernels
4 years ago
Martin Kroeker
af19cda65a
Add "recursive" option for IBM xlf compiler ( #3359 )
* Add correct "recursive" option for xlf (from reference-lapack issue 606)
4 years ago
Martin Kroeker
bec9d9f63d
Merge pull request #3335 from guowangy/small-matrix-latest
Add GEMM optimization for small matrix and single/double kernel for skylakex
4 years ago
cianciosa
4c766cd11f
Fix a small syntax error. A ( was accidentally deleted.
4 years ago
cianciosa
c28560129f
Check the total number of arguments passed instead of whether ARGV# is defined. This fixes a problem when compiling OpenBLAS as a subproject of another project.
4 years ago
Wangyang Guo
76ea8db4da
Small Matrix: enable by default for x86_64 arch
If no customized GEMM_SMALL_M_PERMIT kernel is defined, it will just bypass to the normal path.
4 years ago
Wangyang Guo
fee5abd84b
Small Matrix: support cmake build
4 years ago
gxw
0b8f7c8c10
Add cmake support for LOONGARCH64
4 years ago
Martin Kroeker
47ba85f314
Fix regex to match kernels suffixed with cpuname too
4 years ago
Martin Kroeker
30f23be0f9
Rework setting of -mfma to only apply it where necessary
4 years ago
User User-User
91e2b11d3c
add to cmake listings too
4 years ago
Martin Kroeker
13fa9f737d
Modify defines for CR and RC to work around name collision on Windows
4 years ago
Martin Kroeker
db50b24a4a
Add entries for the new Householder Reconstruction functions from 3.9.1
4 years ago
Martin Kroeker
40000d1f64
Add entries for Householder reconstruction functions from 3.9.1
4 years ago
刘雨培
725432efaa
pass NO_AVX512 macro def
4 years ago
Jake Arkinstall
d7a77091a3
Addressed issue #3100 , removing an unnecessary write to the include directory
4 years ago
Martin Kroeker
33b5670122
Merge pull request #3096 from martin-frbg/fixclangcmake
Fix Cooperlake/DYNAMIC_ARCH builds with clang on Windows
4 years ago
Martin Kroeker
95e19e2e23
fix case in compiler name check
Co-authored-by: xoviat <49173759+xoviat@users.noreply.github.com>
4 years ago
Martin Kroeker
99ac042702
remove spurious lines (probably editor malfunction)
4 years ago
Martin Kroeker
774b9f8653
handle AppleClang in Cooperlake support condition
4 years ago
Martin Kroeker
eb1d2344f7
Fix compiler version check for Intel Cooperlake support (clang-cl does not accept -dumpversion)
4 years ago
Martin Kroeker
0cc36770f1
Merge pull request #3073 from xoviat/embedded
add embedded option
4 years ago
Martin Kroeker
cb61d3b46b
Add DYNAMIC_LIST support for ARM64
4 years ago
xoviat
b60de4447a
add cortex-m platform
4 years ago
Martin Kroeker
89ae305e11
Workaround for cmake having its own C_COMPILER variable
4 years ago
Martin Kroeker
ec4d77c47c
Add -mfma for HAVE_FMA3 in the non-DYNAMIC_ARCH case as well
4 years ago
Martin Kroeker
a29338aaa6
Remove extraneous quotes that caused a cmake policy warning
4 years ago
Martin Kroeker
438a8e5624
Fix placement of getarch call and spurious cpu property accumulation in DYNAMIC_ARCH builds
4 years ago
Martin Kroeker
0155cd53a3
Add -msse3 where needed for DYNAMIC_ARCH builds
4 years ago
Martin Kroeker
a9f9354296
Fix target test
4 years ago
Martin Kroeker
b9bc76aec4
Add files via upload
4 years ago
Martin Kroeker
e5f8c2bf8a
typo fix
4 years ago
Martin Kroeker
6baf8af658
Disable EXPRECISION for the combination of DYNAMIC_CORE and GENERIC target
4 years ago
Chen, Guobing
a7b1f9b1bb
Implementation of BF16 based gemv
1. Add a new API -- sbgemv to support bfloat16 based gemv
2. Implement a generic kernel for sbgemv
3. Implement an avx512-bf16 based kernel for sbgemv
Signed-off-by: Chen, Guobing <guobing.chen@intel.com>
5 years ago
Martin Kroeker
eddc65c7b7
Add POWER10 support flag (unconditionally for now)
5 years ago
Martin Kroeker
f5902ab0a1
Support cross-compiling for Apple Vortex
5 years ago
Martin Kroeker
f64243ff57
Add compiler options for sse/sse2/ssse3/sse4.1
5 years ago
Martin Kroeker
786c0a3ce8
Add sse options for use of intrinsics with older compilers
5 years ago
Martin Kroeker
756802df61
Merge pull request #2890 from martin-frbg/s-d-sum
Revert special handling of Windows xNRM2 and enable C+intrinsics kern…
5 years ago
Martin Kroeker
75e3a92df6
Add express -mavx and -msse options (and fix a stray = for cooperlake)
5 years ago
Martin Kroeker
e3a29f6b58
Change "HALF" and "sh" to "BFLOAT16" and "sb"
5 years ago
Martin Kroeker
68e6823d36
Adapt for supporting only a subset of variable types
5 years ago
Martin Kroeker
88928650c4
Merge pull request #2883 from martin-frbg/issue2872
Minor CMAKE fixes
5 years ago
Martin Kroeker
82a497ec5d
restore PRESCOTT default for DYNAMIC_LIST
5 years ago
Martin Kroeker
de27e4f5fb
Stop DYNAMIC_ARCH build if the toplevel source contains a stray config_kernel.h from a gmake build
This is unlikely to happen in practice, but if it does, the rogue file would get included instead of the dynamically generated version for each target_core. That leads to very confusing errors like "invalid operands (undefined UND and ABS sections)" when compiling the assembly kernels, because macros like PREFETCH would remain undefined.
5 years ago
Martin Kroeker
e1b7123bbe
Merge pull request #2867 from Qiyu8/usimd-floatdot
Optimize the performance of dot by using universal intrinsics in X86/ARM
5 years ago
Qiyu8
f32d34a015
add sse3 compiler flag
5 years ago
Martin Kroeker
a5feea6611
make BLAS3_MEM_ALLOC_THRESHOLD configurable on non-Windows
5 years ago
Martin Kroeker
2367726578
Remove redundant status message
5 years ago