Martin Kroeker
d0ec4325cf
Add cpuid for AMD Ryzen 2
7 years ago
Martin Kroeker
3f73e8b8cf
Add cpuid for AMD Ryzen 2
for #1664
7 years ago
Martin Kroeker
61659f8765
Merge pull request #1648 from martin-frbg/nofort
Handle NOFORTRAN=0
7 years ago
Martin Kroeker
3d3c19717c
Merge pull request #1655 from martin-frbg/issue1641
Fix apparent off-by-one error in calculation of MAX_ALLOCATING_THREADS
7 years ago
Martin Kroeker
24e344038d
Merge pull request #1654 from martin-frbg/avx512check
Add compiler option to avx512 test and hide test output
7 years ago
Martin Kroeker
4e9c34018e
Fix apparent off-by-one error in calculation of MAX_ALLOCATING_THREADS
fixes #1641
7 years ago
Martin Kroeker
f5243e8e1f
Add compiler option to avx512 test and hide test output
7 years ago
Martin Kroeker
ba8388cee0
Merge pull request #1651 from martin-frbg/avx512-nodgemm
Disable the 16x2 DTRMM kernel on SkylakeX as well
7 years ago
Martin Kroeker
6e54b0a027
Disable the 16x2 DTRMM kernel on SkylakeX as well
7 years ago
Martin Kroeker
40c8cbc3bf
Merge pull request #1650 from martin-frbg/avx512-nodgemm
Disable the AVX512 DGEMM kernel for now
7 years ago
Martin Kroeker
d3c9eb4c7d
Merge pull request #1639 from martin-frbg/dyn_list
Add DYNAMIC_LIST option for user-defined list of dynamic targets
7 years ago
Martin Kroeker
f0a8dc2eec
Disable the AVX512 DGEMM kernel for now
due to #1643
7 years ago
Martin Kroeker
cc92257ea6
Update Makefile
7 years ago
Martin Kroeker
2aba1b1658
Merge branch 'develop' into nofort
7 years ago
Martin Kroeker
8396e9e777
Handle NOFORTRAN=0
7 years ago
Martin Kroeker
bfad307ed7
Merge pull request #1647 from martin-frbg/armv7-dot
Remove premature exits from ARMV7 xdot codes
7 years ago
Martin Kroeker
b83e4c60c7
Remove premature exit for INC_X or INC_Y zero
7 years ago
Martin Kroeker
e344db269b
Remove premature exit for INC_X or INC_Y zero
7 years ago
Martin Kroeker
545b82efd3
Remove premature exit for INC_X or INC_Y zero
7 years ago
Martin Kroeker
e322a951fe
Remove premature exit for INC_X or INC_Y zero
7 years ago
Martin Kroeker
ff2f171036
Merge pull request #1644 from martin-frbg/revert-filterout
Revert changes to NOFORTRAN handling in Makefile
7 years ago
Martin Kroeker
092175cfec
Revert changes to NOFORTRAN handling from 952541e
7 years ago
Martin Kroeker
750162a05f
Try gradual fallback for cores not in the dynamic core list
7 years ago
Martin Kroeker
e6d93f20f1
Merge pull request #2 from martin-frbg/develop
merge develop
7 years ago
Martin Kroeker
c38c65eb65
Merge pull request #1 from xianyi/develop
Merge xianyi:develop into develop
7 years ago
Martin Kroeker
ce3651516f
Merge pull request #1642 from oon3m0oo/develop
Rewrite &= -> = and simplify the initial blocking phase.
7 years ago
Craig Donner
0144068537
Rewrite &= -> = and simplify the initial blocking phase.
7 years ago
Martin Kroeker
1833a67071
Add support for a user-defined list of dynamic targets
7 years ago
Martin Kroeker
0b2b83d9ed
Add support for a user-defined list of dynamic targets
7 years ago
Martin Kroeker
62cf769aa6
Merge pull request #1638 from martin-frbg/issue1637
Expose the CBLAS interface to the IxAMIN functions and have make build it
7 years ago
Martin Kroeker
eb71d61c7c
Expose CBLAS interface to BLAS extensions iXamin
7 years ago
Martin Kroeker
9cf22b7d91
Build cblas_iXamin interfaces
7 years ago
Martin Kroeker
cc66743b66
Merge pull request #1634 from oon3m0oo/develop
Fix data races reported by TSAN.
7 years ago
oon3m0oo
2aa0a5804e
Use BLAS rather than CBLAS in test_fork.c ( #1626 )
This is handy for people not using lapack.
7 years ago
Craig Donner
28c28ed275
Fix data races reported by TSAN.
7 years ago
oon3m0oo
a399d00425
Further improvements to memory.c. ( #1625 )
- Compiler TLS is now only used when the compiler supports it
- If compiler TLS is unsupported, we use platform-specific TLS
- Only one variable (an index) is now in TLS
- We only access TLS once per alloc, and never when freeing
- Allocation / release info is now stored within the allocation itself, by
over-allocating; this saves having external structures do the bookkeeping, and
reduces some of the redundant data that was being stored (such as addresses)
- We never hit the alloc lock when not using SMP or when using OpenMP (that was
my fault)
- Now that there are fewer tracking structures I think this is a bit easier to
read than before
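
The over-allocation point above can be sketched as follows. This is a minimal illustration and not the code in memory.c: the `alloc_header` type and the `tracked_alloc`, `tracked_size` and `tracked_free` names are hypothetical, and plain `malloc` stands in for whatever allocator the library actually uses.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping header stored in front of the user data.
 * The union with max_align_t (C11) keeps the user pointer suitably aligned. */
typedef union {
    struct {
        size_t size;   /* number of bytes the caller asked for */
        int in_use;    /* 1 while the block is handed out */
    } info;
    max_align_t align;
} alloc_header;

/* Allocate size bytes plus room for the header; return the user region. */
void *tracked_alloc(size_t size) {
    alloc_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->info.size = size;
    h->info.in_use = 1;
    return h + 1;   /* user data starts right after the header */
}

/* Recover the bookkeeping by stepping back from the user pointer. */
size_t tracked_size(const void *p) {
    return ((const alloc_header *)p - 1)->info.size;
}

/* Release the whole block, header included. */
void tracked_free(void *p) {
    if (p != NULL)
        free((alloc_header *)p - 1);
}
```

Because the header travels with the block, releasing it needs no lookup in an external table, which is the saving the message describes.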
7 years ago
Martin Kroeker
f66b9c8826
Merge pull request #1630 from martin-frbg/x86-march
Add -march=skylake-avx512 to flags if target is skylake x
7 years ago
Martin Kroeker
2946c46024
Merge pull request #1631 from oon3m0oo/stack
Avoid declaring arrays of size 0 when making large stack allocations.
7 years ago
Craig Donner
05978528c3
Avoid declaring arrays of size 0 when making large stack allocations.
7 years ago
Martin Kroeker
ef6f0b645e
Merge pull request #1629 from martin-frbg/issue1628
Make gfortran link libomp for clang in the tests; avoid two typical gotchas with NOFORTRAN
7 years ago
Martin Kroeker
0c5b7b400b
Add -march=skylake-avx512 to flags if target is skylake x
7 years ago
Martin Kroeker
952541e840
Need to use filter-out to handle NOFORTRAN not set
7 years ago
Martin Kroeker
9369d3e6e5
Modify NOFORTRAN tests to always check the value; fix rewriting of NO_FORTRAN
7 years ago
Martin Kroeker
10b70c904d
Handle erroneous user settings NOFORTRAN=0 and NO_FORTRAN
7 years ago
Martin Kroeker
6a5ab083b7
Handle special case of gfortran+clang+OpenMP
7 years ago
Martin Kroeker
1f9e4f3193
Handle special case of gfortran+clang+OpenMP
7 years ago
Martin Kroeker
5a6a2bed9a
Merge pull request #1623 from fenrus75/fast-thread
Initialize only the required subset of the jobs array, fix barriers and improve switch ratio on SkylakeX and Haswell. For issue #1622
7 years ago
Martin Kroeker
2d8cc7193a
Support upcoming Intel Cannon Lake CPUs as Skylake X ( #1621 )
* Support upcoming Cannon Lake as Skylake X
7 years ago
Arjan van de Ven
2ddc96c9e5
make WMB / MB safer on x86-64
make it so that
    if (foo)
        RMB;
    else
        MB;
is always done correctly and without syntax surprises
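
One common way to get this behaviour is the `do { ... } while (0)` macro idiom, sketched below. Whether the commit uses exactly this form is an assumption; `DEMO_RMB`, `DEMO_MB`, `branch_taken` and `pick` are hypothetical names, and the macro bodies only record which branch ran instead of issuing a barrier.

```c
/* Records which macro ran, so the effect is observable. */
static int branch_taken = 0;

/* A macro defined as a bare { ... } block would expand to "{ ... };" and
 * the stray semicolon would detach the else from its if. Wrapping the body
 * in do { ... } while (0) makes the macro behave like one statement. */
#define DEMO_RMB do { branch_taken = 1; } while (0)
#define DEMO_MB  do { branch_taken = 10; } while (0)

/* Uses the macros in the unbraced if/else form from the commit message. */
int pick(int foo) {
    branch_taken = 0;
    if (foo)
        DEMO_RMB;
    else
        DEMO_MB;
    return branch_taken;
}
```

With these definitions `pick(1)` returns 1 and `pick(0)` returns 10, and the unbraced `if`/`else` compiles without a syntax error.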
7 years ago
Arjan van de Ven
7e39ffe113
On x86-64, make MB/WMB compiler barriers
While on x86(64) one does not normally need full memory barriers, it's
good practice to at least use compiler barriers for places where on other
architectures memory barriers are used; this prevents the compiler
from over-optimizing.
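
A compiler barrier of this kind can be sketched with GCC/Clang extended asm. The `COMPILER_BARRIER` macro and the `publish`/`consume` helpers are hypothetical names for illustration, not the definitions in the OpenBLAS headers, and the snippet assumes a GCC-compatible compiler.

```c
/* Emits no machine instruction. The "memory" clobber tells the compiler
 * that memory may have changed, so it must not reorder or cache memory
 * accesses across this point. It does not constrain the CPU itself. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

static int payload;
static int ready;

/* Store the data, then raise the flag; the barrier keeps the compiler
 * from sinking the payload store below the flag store. */
void publish(int value) {
    payload = value;
    COMPILER_BARRIER();
    ready = 1;
}

/* Return the payload once the flag is set, or -1 if nothing is published. */
int consume(void) {
    if (!ready)
        return -1;
    COMPILER_BARRIER();
    return payload;
}
```

This only restricts the compiler. On architectures with weaker hardware ordering a real fence instruction is still needed, which is why the other ports use full memory barriers at these points.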
7 years ago