Martin Kroeker
736f0146c3
Revert "Fix undefined CC in f_check (again)"
4 years ago
Martin Kroeker
897fc2b6ef
Merge pull request #3118 from martin-frbg/issue3018-2
Fix undefined CC in f_check (again)
4 years ago
Martin Kroeker
441c116105
fix undefined CC again
4 years ago
Martin Kroeker
8ecd80a34a
Merge pull request #14 from xianyi/develop
rebase
4 years ago
Martin Kroeker
4ba53db0da
Merge pull request #3117 from haampie/fix-perl
use /usr/bin/env perl
4 years ago
Martin Kroeker
6c365ff648
Merge pull request #3114 from martin-frbg/issue3113
Fix dll_callback and p_process_term signatures for USE_TLS on Windows x64
4 years ago
Martin Kroeker
e33bcdbb7b
Merge pull request #3115 from martin-frbg/issue2532
Replace unoptimized OMATCOPY_RT with 4x4 blocked version
4 years ago
Harmen Stoppels
ec6b354c32
use /usr/bin/env perl
4 years ago
Martin Kroeker
292d1af1a0
Update omatcopy_rt.c
4 years ago
Martin Kroeker
325b398e3c
Update omatcopy_rt.c
4 years ago
Martin Kroeker
6f5667b4d4
Enable optimized S/D OMATCOPY_RT
4 years ago
Martin Kroeker
cceeee7806
Add optimized omatcopy_rt
4 years ago
Martin Kroeker
0a4546b742
Typo fix
4 years ago
Martin Kroeker
b1eed27a54
Replace naive omatcopy_rt with 4x4 blocked implementation
as suggested by MigMuc in issue 2532
4 years ago
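The idea behind a 4x4 blocked transposed copy can be sketched as below. This is a minimal illustration, not the code from the commit: the function name `omatcopy_rt_sketch`, the `double` element type, and the argument order are assumptions. It computes B = alpha * A^T for row-major matrices by walking A in 4x4 tiles, so that reads and writes stay within a small region at a time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, not the OpenBLAS kernel.
 * A is rows x cols, row-major, leading dimension lda.
 * B is cols x rows, row-major, leading dimension ldb.
 * Result: B[j][i] = alpha * A[i][j]. */
void omatcopy_rt_sketch(size_t rows, size_t cols, double alpha,
                        const double *a, size_t lda,
                        double *b, size_t ldb)
{
    enum { BLK = 4 }; /* tile edge, matching the 4x4 blocking in the message */

    assert(a != NULL && b != NULL);
    assert(lda >= cols && ldb >= rows);

    for (size_t i0 = 0; i0 < rows; i0 += BLK) {
        /* Clamp the tile at the bottom edge when rows is not a multiple of 4. */
        size_t i1 = (i0 + BLK < rows) ? i0 + BLK : rows;
        for (size_t j0 = 0; j0 < cols; j0 += BLK) {
            /* Clamp the tile at the right edge likewise. */
            size_t j1 = (j0 + BLK < cols) ? j0 + BLK : cols;
            for (size_t i = i0; i < i1; i++) {
                for (size_t j = j0; j < j1; j++) {
                    b[j * ldb + i] = alpha * a[i * lda + j];
                }
            }
        }
    }
}
```

A production kernel would typically unroll the full 4x4 tile by hand and treat the edge tiles separately; the clamped loops here trade that speed for brevity.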
Martin Kroeker
1a3ad4b670
Fix signatures of the TLS-mode dll_callback and p_process_term functions for Win64
4 years ago
Martin Kroeker
86a5f98e4a
Merge pull request #13 from xianyi/develop
rebase
4 years ago
Martin Kroeker
1caa44bea9
Merge pull request #3111 from hawkinsp/forkrace
Fix race in blas_thread_shutdown.
4 years ago
Peter Hawkins
dbbf92c1d1
Fix race in blas_thread_shutdown.
blas_server_avail was read without holding server_lock. If multiple threads call blas_thread_shutdown simultaneously, for example, by calling fork(), then they can attempt to shut down multiple times. This can lead to a segmentation fault.
4 years ago
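The pattern described in this message, checking the availability flag only while holding the lock, can be sketched as follows. This is an illustration under assumptions rather than the actual patch: `server_lock` and the flag mirror the names in the message, while `thread_shutdown_sketch` and `teardown_count` are hypothetical names introduced here.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t server_lock = PTHREAD_MUTEX_INITIALIZER;
static int server_avail = 1;   /* stands in for blas_server_avail */
static int teardown_count = 0; /* hypothetical: counts real shutdowns */

/* Returns 1 if this call performed the shutdown, 0 if the server was
 * already down. The flag is read and cleared under server_lock, so
 * concurrent callers cannot both see it set. */
int thread_shutdown_sketch(void)
{
    int rc = pthread_mutex_lock(&server_lock);
    assert(rc == 0);
    (void)rc;

    if (!server_avail) {
        /* Another caller already shut the server down. */
        pthread_mutex_unlock(&server_lock);
        return 0;
    }

    /* Placeholder for the real teardown work (joining worker threads,
     * freeing buffers), which is not shown here. */
    teardown_count++;
    server_avail = 0;

    pthread_mutex_unlock(&server_lock);
    return 1;
}
```

Reading the flag before taking the lock would let two callers both observe it as set and both run the teardown, which is the double shutdown the message describes.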
Martin Kroeker
cb429d6b12
Merge pull request #3110 from martin-frbg/issue3108
Fix get_num_procs() in the USE_TLS branch for non-glibc systems
4 years ago
Martin Kroeker
b0bded3f2f
Fix get_num_procs() in the USE_TLS branch for non-glibc systems
4 years ago
Martin Kroeker
f9aaf22fc3
Merge pull request #3105 from martin-frbg/tigerlake
Recognize Intel Tiger Lake CPUID as SkylakeX
4 years ago
Martin Kroeker
35ff3c731d
Merge pull request #3106 from RajalakshmiSR/ppcbe
Fix build issue on POWER8 with DYNAMIC_ARCH
4 years ago
Rajalakshmi Srinivasaraghavan
63fa6c832e
Fix build issue on POWER8 with DYNAMIC_ARCH
Running make DYNAMIC_ARCH=1 on POWER8 BE with gcc 10.2 gives
the following error due to the difference in UNROLL_M/N:
'No rule to make target 'dgemm_incopy_POWER10.o', needed by kernel'
4 years ago
Martin Kroeker
e4e5042e38
Recognize Intel Tiger Lake as SkylakeX
4 years ago
Martin Kroeker
ae53e3e233
Recognize Intel Tiger Lake as SkylakeX
4 years ago
Martin Kroeker
074d9bff7f
Merge pull request #3104 from martin-frbg/issue3103
Enable optimized Haswell/AVX2 kernels for sasum/dasum and srot/drot on Ryzen
4 years ago
Martin Kroeker
f36862603a
Merge pull request #3101 from jake-arkinstall/issue-3100
Addressed issue #3100 - removing an unnecessary write to the include directory
4 years ago
Martin Kroeker
47691c031f
Use Haswell optimizations for Zen as well
4 years ago
Martin Kroeker
ce7ddd8921
Use Haswell optimizations for Zen as well
4 years ago
Martin Kroeker
950c047b49
Use Haswell optimizations for Zen as well
4 years ago
Martin Kroeker
46509953a9
Use Haswell optimizations for Zen as well
4 years ago
Martin Kroeker
db348dcff2
Enable optimized srot/drot kernels from Haswell
4 years ago
Martin Kroeker
a33f471065
Merge pull request #3102 from martin-frbg/issue3099
Strip pkgversion info from compiler version string before comparing
4 years ago
Martin Kroeker
ece3ce581e
Strip parenthesized (pkgversion) data from GCC version string to avoid misinterpretation
4 years ago
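Why the parenthesized part matters can be shown with a small sketch. It is written in C for illustration only and is not the code changed by this commit; the function names and the sample version strings are assumptions. A distributor's package version inside the parentheses may contain digits of its own, so taking the first number in the line can pick up the wrong value unless the parenthesized segment is dropped first.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Copies src into dst while dropping every parenthesized segment,
 * including the parentheses themselves. A stray ')' with no matching
 * '(' is dropped as well. Output is truncated to fit dst_size. */
void strip_parenthesized(const char *src, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL && dst_size > 0);

    size_t n = 0;
    int depth = 0; /* current parenthesis nesting level */
    for (; *src != '\0' && n + 1 < dst_size; src++) {
        if (*src == '(') {
            depth++;
        } else if (*src == ')') {
            if (depth > 0) depth--;
        } else if (depth == 0) {
            dst[n++] = *src;
        }
    }
    dst[n] = '\0';
}

/* Hypothetical helper: returns the first integer found after stripping,
 * or -1 if the line contains no digits outside parentheses. */
int major_version_sketch(const char *version_line)
{
    char buf[256];
    strip_parenthesized(version_line, buf, sizeof buf);

    const char *p = buf;
    while (*p != '\0' && !isdigit((unsigned char)*p)) p++;
    return (*p != '\0') ? (int)strtol(p, NULL, 10) : -1;
}
```

With a made-up line such as `gcc (Foo 4.8 compat) 10.2.0`, scanning the raw string would find 4 first, whereas the stripped string yields 10.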
Martin Kroeker
8189a98d85
Merge pull request #12 from xianyi/develop
rebase
4 years ago
Jake Arkinstall
d7a77091a3
Addressed issue #3100, removing an unnecessary write to the include directory
4 years ago
Martin Kroeker
3e1e74fca6
Merge pull request #3094 from xoviat/patch-1
build openmp on appveyor
4 years ago
Martin Kroeker
33b5670122
Merge pull request #3096 from martin-frbg/fixclangcmake
Fix Cooperlake/DYNAMIC_ARCH builds with clang on Windows
4 years ago
Martin Kroeker
95e19e2e23
fix case in compiler name check
Co-authored-by: xoviat <49173759+xoviat@users.noreply.github.com>
4 years ago
Martin Kroeker
99ac042702
remove spurious lines (probably editor malfunction)
4 years ago
Martin Kroeker
774b9f8653
handle AppleClang in Cooperlake support condition
4 years ago
Martin Kroeker
eb1d2344f7
Fix compiler version check for Intel Cooperlake support (clang-cl does not accept -dumpversion)
4 years ago
xoviat
6fa9860dbe
appveyor: cleanup and add openmp run
4 years ago
Martin Kroeker
0cc36770f1
Merge pull request #3073 from xoviat/embedded
add embedded option
4 years ago
Martin Kroeker
558cd543bf
Merge pull request #3093 from martin-frbg/fix3064
fix copy-paste error in build rules for cblas_crotg and cblas_zrotg
4 years ago
Martin Kroeker
bd906e3410
fix copy-paste error in build rules for cblas_crotg and cblas_zrotg
4 years ago
Martin Kroeker
35086cb501
Merge pull request #3092 from RajalakshmiSR/cscal_p10
Optimize cscal function for POWER10
4 years ago
Rajalakshmi Srinivasaraghavan
2056ffc227
Optimize cscal function for POWER10
This patch makes use of new POWER10 vector pair instructions for
loads and stores.
4 years ago
Martin Kroeker
7745439312
Merge pull request #3091 from martin-frbg/lapack477-2
Fix calculation of the non-exceptional shift values in LAPACK complex QZ
4 years ago
Martin Kroeker
c4b5abbe43
fix data type
4 years ago