Martin Kroeker
85fd3c4279
Support compilation with the Cray C and Fortran compilers (#3712)
* Add support for the Cray Fortran compiler
3 years ago
Martin Kroeker
18b19d135b
C_LAPACK: Fixes to make it compile with MSVC (#3605)
* Fix f2c-like support functions to compile with MSVC, and
re-enable C_LAPACK for MSVC in CMAKE
* Add MSVC&flang build to Azure CI in order to check C_LAPACK correctness
3 years ago
Martin Kroeker
b7873605d4
Use f2c translations of LAPACK when no Fortran compiler is available (#3539)
* Add C equivalents of the Fortran routines from Reference-LAPACK as fallbacks, and C_LAPACK variable to trigger their use
3 years ago
Rafael Cardoso Fernandes Sousa
d38110a5ce
Use CMake variables instead of as
3 years ago
Rafael Cardoso Fernandes Sousa
214fbcee15
Fix cmake for power
3 years ago
Markus Mützel
de2ed66596
cmake: Set SUFFIX64 also for NOFORTRAN
3 years ago
Wangyang Guo
3dc6052c7e
initial support for Sapphire Rapids platform
4 years ago
Martin Kroeker
e02df9fc55
Propagate BUILD_BFLOAT16 to CFLAGS
4 years ago
Wangyang Guo
76ea8db4da
Small Matrix: enable by default for x86_64 arch
If no customized GEMM_SMALL_M_PERMIT kernel is defined, it will just bypass to the normal path.
4 years ago
Wangyang Guo
fee5abd84b
Small Matrix: support cmake build
4 years ago
Martin Kroeker
30f23be0f9
Rework setting of -mfma to only apply it where necessary
4 years ago
User User-User
91e2b11d3c
add to cmake listings too
4 years ago
刘雨培
725432efaa
pass NO_AVX512 macro def
4 years ago
Martin Kroeker
33b5670122
Merge pull request #3096 from martin-frbg/fixclangcmake
Fix Cooperlake/DYNAMIC_ARCH builds with clang on Windows
4 years ago
Martin Kroeker
95e19e2e23
fix case in compiler name check
Co-authored-by: xoviat <49173759+xoviat@users.noreply.github.com>
4 years ago
Martin Kroeker
99ac042702
remove spurious lines (probably editor malfunction)
4 years ago
Martin Kroeker
774b9f8653
handle AppleClang in Cooperlake support condition
4 years ago
Martin Kroeker
eb1d2344f7
Fix compiler version check for Intel Cooperlake support (clang-cl does not accept -dumpversion)
4 years ago
xoviat
b60de4447a
add cortex-m platform
4 years ago
Martin Kroeker
438a8e5624
Fix placement of getarch call and spurious cpu property accumulation in DYNAMIC_ARCH builds
4 years ago
Martin Kroeker
0155cd53a3
Add -msse3 where needed for DYNAMIC_ARCH builds
4 years ago
Martin Kroeker
b9bc76aec4
Add files via upload
4 years ago
Martin Kroeker
f64243ff57
Add compiler options for sse/sse2/ssse3/sse4.1
5 years ago
Martin Kroeker
e3a29f6b58
Change "HALF" and "sh" to "BFLOAT16" and "sb"
5 years ago
Martin Kroeker
68e6823d36
Adapt for supporting only a subset of variable types
5 years ago
Martin Kroeker
e1b7123bbe
Merge pull request #2867 from Qiyu8/usimd-floatdot
Optimize the performance of dot by using universal intrinsics in X86/ARM
5 years ago
Qiyu8
f32d34a015
add sse3 compiler flag
5 years ago
Martin Kroeker
a5feea6611
make BLAS3_MEM_ALLOC_THRESHOLD configurable on non-Windows
5 years ago
Martin Kroeker
c4aeeeb9f4
Activate all BUILD_ options if none was specified
5 years ago
Martin Kroeker
26792d2096
Copy BUILD_* directives to the compiler options to allow ifdef in tests
5 years ago
Martin Kroeker
68b1713c30
Merge pull request #2811 from martin-frbg/issue2806
Make NO_AVX512 option override the AVX512 compile test in CMAKE builds as well
5 years ago
Martin Kroeker
bd3207b4b4
Update system.cmake
5 years ago
Martin Kroeker
b8ebfc9335
Update system.cmake
5 years ago
Martin Kroeker
71d33c952d
Typo fix
5 years ago
Martin Kroeker
6a3c074786
-march=cooperlake requires gcc10
5 years ago
Chen, Guobing
e740c4873d
Enable COOPERLAKE build target
Enable new build target platform -- COOPERLAKE. This target platform
supports all the SKYLAKEX supported ISAs + avx512bf16. So all the
SKYLAKEX specific kernels/drivers and related code are now extended
to be also active on COOPERLAKE. Besides, new BF16 related kernels
are active under this target.
5 years ago
Martin Kroeker
6876221cf3
Remove optimization level limit for flang again and add -fno-unroll-loops for AOCC flang 2.x instead
5 years ago
Martin Kroeker
3ce469a34f
Limit optimization level to O1 for flang and add -frecursive
5 years ago
Martin Kroeker
bb12c2c854
Limit MAX_STACK_ALLOC availability to non-Windows
5 years ago
Martin Kroeker
6e97df7b47
Add CMAKE support for MAX_STACK_ALLOC setting
5 years ago
Rajalakshmi Srinivasaraghavan
7eb55504b1
RFC : Add half precision gemm for bfloat16 in OpenBLAS
This patch adds support for bfloat16 data type matrix multiplication kernel.
For architectures that don't support bfloat16, it is defined as unsigned short
(2 bytes). Default unroll sizes can be changed as per architecture as done for
SGEMM and for now 8 and 4 are used for M and N. Size of ncopy/tcopy can be
changed as per architecture requirement and for now, size 2 is used.
Added shgemm in kernel/power/KERNEL.POWER9 and tested in powerpc64le and
powerpc64. For reference, added a small test compare_sgemm_shgemm.c to compare
sgemm and shgemm output.
This patch does not cover OpenBLAS test, benchmark and lapack tests for shgemm.
Complex type implementation can be discussed and added once this is approved.
5 years ago
Martin Kroeker
7f0d523b42
Make BUFFER_SIZE configurable
5 years ago
Martin Kroeker
e3d846ab57
Do not use -march=native with the PGI compiler
6 years ago
Martin Kroeker
f69a0be712
Add getarch flags to disable AVX on x86
(and other small fixes to match Makefile behaviour)
6 years ago
Michael Lass
7a9a4dbc4f
Fix detection of AVX512 capable compilers in getarch
21eda8b5 introduced a check in getarch.c to test if the compiler is capable of
AVX512. This check currently fails, since the used __AVX2__ macro is only
defined if getarch itself was compiled with AVX2/AVX512 support. Make sure this
is the case by building getarch with -march=native on x86_64. It is only
supposed to run on the build host anyway.
6 years ago
Martin Kroeker
1e52572be3
Add option USE_LOCKING for single-threaded build with locking support
6 years ago
luz.paz
daf2fec12d
Misc. typo fixes
Found via `codespell -q 3 -w -L ith,als,dum,nd,amin,nto,wis,ba -S ./relapack,./kernel,./lapack-netlib`
6 years ago
Martin Kroeker
5952e586ce
Support DYNAMIC_LIST option in cmake
e.g. cmake -DDYNAMIC_ARCH=1 -DDYNAMIC_LIST="NEHALEM;HASWELL;ZEN" ..
original issue was #1639
6 years ago
Martin Kroeker
58dd7e4501
Change ARMV8 target to ARMV7 for BINARY=32
6 years ago
Martin Kroeker
76b4b8980f
Use -dumpversion with gcc only
6 years ago