maamountki
|
81daf6bc38
|
[ZARCH] Format source code, Fix constraints
|
6 years ago |
maamountki
|
a38aa56e76
|
Merge pull request #1 from xianyi/develop
Update
|
6 years ago |
Martin Kroeker
|
729e925174
|
Merge pull request #1996 from quickwritereader/develop
NBMAX=4096 for gemvn, added sgemvn 8x8 for future
|
6 years ago |
Ubuntu
|
498ac98581
|
Note for unused kernels
|
6 years ago |
Ubuntu
|
cd9ea45463
|
NBMAX=4096 for gemvn, added sgemvn 8x8 for future
|
6 years ago |
Martin Kroeker
|
f9c5023e04
|
Merge pull request #1994 from quickwritereader/develop
sgemv cgemv pairs
|
6 years ago |
Ubuntu
|
4abc375a91
|
sgemv cgemv pairs
|
6 years ago |
Martin Kroeker
|
874df65491
|
Fix incorrect sgemv results for IBM z14
part of PR #1993 that was inadvertently misplaced into the toplevel directory
|
6 years ago |
Martin Kroeker
|
1f4b61f572
|
Delete misplaced file sgemv_t_4.c
from #1993, file should have gone into kernel/zarch
|
6 years ago |
Martin Kroeker
|
282230c303
|
Merge pull request #1993 from martin-frbg/aarnes-zarch
Various fixes for the new Z14 target
|
6 years ago |
Martin Kroeker
|
cce574c3e0
|
Improve the z14 SGEMVT kernel
from patch provided by aarnez in #991
|
6 years ago |
Martin Kroeker
|
877023e1e1
|
Fix precision of zarch DSDOT
from patch provided by aarnez in #991
|
6 years ago |
Martin Kroeker
|
265142edd5
|
Fix typo in the zarch min/max kernels
from patch provided by aarnez in #991
|
6 years ago |
Martin Kroeker
|
885a3c4350
|
USE_TRMM on Z14
from patch provided by aarnez in #991
|
6 years ago |
Martin Kroeker
|
4b512f84dd
|
Add cache sizes for Z14
from patch provided by aarnez in #991
|
6 years ago |
Martin Kroeker
|
72d3e7c9b4
|
Add FORCE Z14
from patch provided by aarnez in #991
|
6 years ago |
Martin Kroeker
|
bdc73a49e0
|
Add parameters for Z14
from patch provided by aarnez in #991
|
6 years ago |
Martin Kroeker
|
1249ee1fd0
|
Add Z14 target
from patch provided by aarnez in #991
|
6 years ago |
Martin Kroeker
|
42df9efa0c
|
Merge pull request #1991 from maamountki/z14
[ZARCH] Z14 Support, BLAS 1/2 single precision implementations
|
6 years ago |
maamountki
|
82124729af
|
Merge branch 'develop' into z14
|
6 years ago |
maamountki
|
29416cb5a3
|
[ZARCH] Add Z13 version for max/min functions
|
6 years ago |
maamountki
|
48b9b94f7f
|
[ZARCH] Improve loading performance for camax/icamax
|
6 years ago |
Martin Kroeker
|
86a824c97f
|
Fix wrong comparison that made IMIN identical to IMAX
as reported by aarnez in #1990
|
6 years ago |
Martin Kroeker
|
808410c2c7
|
Fix wrong comparison that made IMIN identical to IMAX
as suggested in #1990
|
6 years ago |
maamountki
|
eaf20f0e7a
|
Remove ztest
|
6 years ago |
maamountki
|
fcd814a8d2
|
[ZARCH] Fix bug in max/min functions
|
6 years ago |
maamountki
|
dc4d3bccd5
|
[ZARCH] Fix icamax/icamin
|
6 years ago |
maamountki
|
c7143c1019
|
[ZARCH] Fix iamax/imax single precision
|
6 years ago |
maamountki
|
04873bb174
|
[ZARCH] Undo the last commit
|
6 years ago |
maamountki
|
c8ef9fb220
|
[ZARCH] Fix bug in iamax/iamin/imax/imin
|
6 years ago |
Martin Kroeker
|
5be61f4b47
|
Merge pull request #1985 from martin-frbg/issue1984
Correct naming of getrf_parallel object
|
6 years ago |
Martin Kroeker
|
3d155cff83
|
Merge pull request #1981 from edisongustavo/develop
Fix include directory of exported targets
|
6 years ago |
Martin Kroeker
|
7d47f0a82d
|
Merge pull request #1978 from danielgindi/feature/msvc_cmake
Better support for MSVC/Windows in CMake (v0.3.x)
|
6 years ago |
Martin Kroeker
|
a529c71a74
|
Merge pull request #1962 from brada4/r
Modernize R benchmarks slightly
|
6 years ago |
Martin Kroeker
|
89b60dab8a
|
Merge pull request #1987 from martin-frbg/issue1961
Change ARMV8 target with BINARY=32 to ARMV7 automatically
|
6 years ago |
Martin Kroeker
|
58dd7e4501
|
Change ARMV8 target to ARMV7 for BINARY=32
|
6 years ago |
Martin Kroeker
|
36b844af88
|
Change ARMV8 target to ARMV7 when BINARY32 is set
fixes #1961
|
6 years ago |
Martin Kroeker
|
e882b239aa
|
Correct naming of getrf_parallel object
fixes #1984
|
6 years ago |
Martin Kroeker
|
3f7bb87a2a
|
Merge pull request #1971 from martin-frbg/trsm-threshold
Shift transition to multithreading towards larger matrix sizes
|
6 years ago |
Edison Gustavo Muenz
|
e908ac2a51
|
Fix include directory of exported targets
|
6 years ago |
Martin Kroeker
|
8533aca964
|
Avoid penalizing tall skinny matrices
|
6 years ago |
Martin Kroeker
|
16494cb7c4
|
Merge pull request #1980 from martin-frbg/issue1979
Report SkylakeX as Haswell if compiler does not support AVX512
|
6 years ago |
Martin Kroeker
|
b56b34a75c
|
Syntax fix
|
6 years ago |
Martin Kroeker
|
21eda8b577
|
Report SkylakeX as Haswell if compiler does not support AVX512
... or make was invoked with NO_AVX512=1
|
6 years ago |
Daniel Cohen Gindi
|
24288803b3
|
Adjust test script for correct deployment
|
6 years ago |
Martin Kroeker
|
f0d834b824
|
Use VERSION_LESS for comparisons involving software version numbers
|
6 years ago |
Daniel Cohen Gindi
|
63bbd7b0d7
|
Better support for MSVC/Windows in CMake
|
6 years ago |
maamountki
|
b111829226
|
[ZARCH] Update max/min functions
|
6 years ago |
Martin Kroeker
|
010d59bfee
|
Merge pull request #1973 from martin-frbg/issue1464
Increase Zen SWITCH_RATIO to 16
|
6 years ago |
Martin Kroeker
|
83b5c6b92d
|
Fix compilation with NO_AVX=1 set
fixes #1974
|
6 years ago |