Martin Kroeker
|
b824fa70eb
|
Fix declaration of assembly arguments in SSYMV and DSYMV microkernels
Arguments 0 and 1 are both input and output
|
6 years ago |
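A recurring fix in this log is tagging assembly operands as both input and output. The sketch below is not taken from any of the kernels touched by these commits. It is a hypothetical helper (`sum_i64`, an assumed name) that only illustrates the pattern with GCC-style extended asm: the loop counter (operand 0) and the pointer (operand 1) are modified inside the asm, so they are declared with the `"+r"` read-write constraint instead of being listed as plain inputs. The asm path assumes x86_64 and a GCC-compatible compiler; other targets fall back to plain C.

```c
#include <assert.h>

/* Hypothetical helper, for illustration only: sums n 64-bit integers. */
static long long sum_i64(const long long *x, long long n) {
    long long acc = 0;
    assert(n >= 0);
#if defined(__x86_64__) && defined(__GNUC__)
    if (n > 0) {
        __asm__ __volatile__(
            "1:                 \n\t"
            "addq (%1), %2      \n\t" /* acc += *x                     */
            "addq $8, %1        \n\t" /* advance the pointer (written) */
            "decq %0            \n\t" /* decrement the counter (written) */
            "jnz 1b             \n\t"
            /* Operands 0 and 1 are changed by the asm, so they are
               declared read-write ("+r") rather than input-only. */
            : "+r"(n), "+r"(x), "+r"(acc)
            :
            : "cc", "memory");
    }
#else
    /* Portable fallback for other targets and compilers. */
    for (long long i = 0; i < n; i++) acc += x[i];
#endif
    return acc;
}
```

If operands 0 and 1 were declared as inputs only, the compiler would be entitled to assume their registers still hold the original values after the asm statement, which can lead to miscompiled code at higher optimization levels.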
Martin Kroeker
|
91481a3e4e
|
Fix declaration of input arguments in inline assembly
Argument 0 is modified as it doubles as a counter
|
6 years ago |
Martin Kroeker
|
dc6ac9eab0
|
Fix declaration of input arguments in the x86_64 s/dGEMV_T and s/dGEMV_N kernels
Arguments 0 and 1 need to be tagged as both input and output
|
6 years ago |
maamountki
|
f583674109
|
[ZARCH] Fix cgemv_t_4
|
6 years ago |
maamountki
|
77fe70019f
|
[ZARCH] Fix constraints and source code formatting
|
6 years ago |
maamountki
|
7039770165
|
[ZARCH] Undo the last commit
|
6 years ago |
maamountki
|
11a43e8116
|
[ZARCH] Set alignment hint for vl/vst
|
6 years ago |
maamountki
|
61526480f9
|
[ZARCH] Fix copy constraint
|
6 years ago |
maamountki
|
81daf6bc38
|
[ZARCH] Format source code, Fix constraints
|
6 years ago |
Martin Kroeker
|
729e925174
|
Merge pull request #1996 from quickwritereader/develop
NBMAX=4096 for gemvn, added sgemvn 8x8 for future
|
6 years ago |
Ubuntu
|
498ac98581
|
Note for unused kernels
|
6 years ago |
Ubuntu
|
cd9ea45463
|
NBMAX=4096 for gemvn, added sgemvn 8x8 for future
|
6 years ago |
Martin Kroeker
|
f9c5023e04
|
Merge pull request #1994 from quickwritereader/develop
sgemv cgemv pairs
|
6 years ago |
Ubuntu
|
4abc375a91
|
sgemv cgemv pairs
|
6 years ago |
Martin Kroeker
|
874df65491
|
Fix incorrect sgemv results for IBM z14
part of PR #1993 that was inadvertently misplaced into the toplevel directory
|
6 years ago |
Martin Kroeker
|
877023e1e1
|
Fix precision of zarch DSDOT
from patch provided by aarnez in #991
|
6 years ago |
Martin Kroeker
|
265142edd5
|
Fix typo in the zarch min/max kernels
from patch provided by aarnez in #991
|
6 years ago |
Martin Kroeker
|
885a3c4350
|
USE_TRMM on Z14
from patch provided by aarnez in #991
|
6 years ago |
maamountki
|
82124729af
|
Merge branch 'develop' into z14
|
6 years ago |
maamountki
|
29416cb5a3
|
[ZARCH] Add Z13 version for max/min functions
|
6 years ago |
maamountki
|
48b9b94f7f
|
[ZARCH] Improve loading performance for camax/icamax
|
6 years ago |
Martin Kroeker
|
86a824c97f
|
Fix wrong comparison that made IMIN identical to IMAX
as reported by aarnez in #1990
|
6 years ago |
Martin Kroeker
|
808410c2c7
|
Fix wrong comparison that made IMIN identical to IMAX
as suggested in #1990
|
6 years ago |
maamountki
|
fcd814a8d2
|
[ZARCH] Fix bug in max/min functions
|
6 years ago |
maamountki
|
dc4d3bccd5
|
[ZARCH] Fix icamax/icamin
|
6 years ago |
maamountki
|
c7143c1019
|
[ZARCH] Fix iamax/imax single precision
|
6 years ago |
maamountki
|
04873bb174
|
[ZARCH] Undo the last commit
|
6 years ago |
maamountki
|
c8ef9fb220
|
[ZARCH] Fix bug in iamax/iamin/imax/imin
|
6 years ago |
maamountki
|
b111829226
|
[ZARCH] Update max/min functions
|
6 years ago |
Martin Kroeker
|
32b0f1168e
|
Fix declaration of input arguments in the Sandybridge GER microkernels (#1967)
* Tag arguments 0 and 1 as both input and output
|
6 years ago |
Martin Kroeker
|
b495e54310
|
Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966)
* Tag arguments 0 and 1 as both input and output (see #1964)
|
6 years ago |
Martin Kroeker
|
d5e6940253
|
Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965)
* Tag operands 0 and 1 as both input and output
For #1964 (basically a continuation of coding problems first seen in #1292)
|
6 years ago |
Ubuntu
|
43a4572038
|
crot fix
|
6 years ago |
Abdelrauf
|
a034e65512
|
Merge branch 'develop' into develop
|
6 years ago |
Ubuntu
|
8c3386be87
|
Added missing Blas1 single fp {saxpy, caxpy, cdot, crot (refactored version of srot), isamax, isamin, icamax, icamin},
Fixed idamin and icamin to choose the first-occurrence index among equal minima
|
6 years ago |
maamountki
|
b815a04c87
|
[ZARCH] fix a bug in max/min functions
|
6 years ago |
maamountki
|
1a7925b3a3
|
[ZARCH] Update dgemv_n_4.c
|
6 years ago |
maamountki
|
406f835f00
|
[ZARCH] update cgemv_n_4.c
|
6 years ago |
maamountki
|
621dedb37b
|
[ZARCH] Update cgemv_t_4.c
|
6 years ago |
maamountki
|
b731e8246f
|
Update sgemv_t_4.c
|
6 years ago |
maamountki
|
ecc31b743f
|
Update dgemv_t_4.c
|
6 years ago |
maamountki
|
5d89d6b143
|
[ZARCH] fix sgemv_n_4.c
|
6 years ago |
maamountki
|
67432b23c2
|
[ZARCH] fix cgemv_n_4.c
|
6 years ago |
maamountki
|
be66f5d5c2
|
[ZARCH] fix data prefetch type in sdot
|
6 years ago |
maamountki
|
c2ffef8156
|
[ZARCH] fix data prefetch type in ddot
|
6 years ago |
maamountki
|
e7455f500c
|
[ZARCH] fix dsdot.c
|
6 years ago |
maamountki
|
3eafcfa650
|
[ZARCH] fix cgemv_n_4.c
|
6 years ago |
maamountki
|
94cd946b96
|
[ZARCH] fix cgemv_n_4.c
|
6 years ago |
maamountki
|
1aa840a0a2
|
[ZARCH] fix sgemv_t_4.c
|
6 years ago |
Arjan van de Ven
|
795285c587
|
Fix thinko in skylake beta handling
casting ints is cheaper, but it has a rounding effect, not a memory-casting effect,
resulting in an invalid outcome
|
7 years ago |
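The distinction behind the last commit message, a conversion that changes the numeric value versus a cast that only reinterprets memory, can be sketched in plain C. This is not the Skylake kernel code. The function names are hypothetical, and the bit pattern mentioned below assumes IEEE 754 single precision.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Value conversion: the number itself is converted, so the fractional
   part is lost (a C cast truncates toward zero). */
static int32_t convert_value(float x) {
    return (int32_t)x;
}

/* Bit reinterpretation: the four bytes of the float are copied unchanged
   into an integer, so no numeric conversion takes place. */
static uint32_t reinterpret_bits(float x) {
    uint32_t bits;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &x, sizeof bits);
    return bits;
}
```

For example, `convert_value(1.5f)` yields 1, while `reinterpret_bits(1.0f)` yields `0x3F800000` on IEEE 754 platforms. Using one operation where the other is intended silently produces wrong results.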