Arjan van de Ven
99c7bba8e4
Initial support for SkylakeX / AVX512
This patch adds the basic infrastructure for adding the SkylakeX (Intel Skylake server)
target. The SkylakeX target will use the AVX512 (AVX512VL level) instruction set,
which brings 2 basic things:
1) 512 bit wide SIMD (2x width of AVX2)
2) 32 SIMD registers (2x the number on AVX2)
This initial patch only contains a trivial transformation of the Haswell SGEMM kernel
to AVX512VL; more will follow later but this patch aims to get the infrastructure
in place for this "later".
Full performance tuning has not been done yet; with more registers and wider SIMD
it's in theory possible to retune the kernels but even without that there's an
interesting enough performance increase (30-40% range) with just this change.
7 years ago
Martin Kroeker
d94d7baf7e
Add mips32r2 api target
7 years ago
Shivraj Patil
e3d844b062
Added mips I6500 core
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
8 years ago
Gian-Carlo Pascutto
832a272784
Revert Zen param.h to Haswell values (instead of Excavator).
8 years ago
Denis Steckelmacher
c9ff735da6
Add ZEN support (tested for auto-detected static backend)
8 years ago
Martin Kroeker
cd135e2b59
Merge pull request #1130 from quickwritereader/develop
Blas 3 for single precision
8 years ago
Abdurrauf
08786c4b95
strmm and ctrmm
8 years ago
Abdurrauf
82e80fa82b
initial strmm(sgemm). not tuned yet
8 years ago
Martin Kroeker
ffc1d6c468
Merge pull request #1108 from ashwinyes/develop_20170203_thunderx2t99
Optimized Implementations for ThunderX2T99
8 years ago
Ashwin Sekhar T K
19ba133383
THUNDERX2T99: Add Optimized ZGEMM Implementation
8 years ago
Abdurrauf
0d96b0e2a7
Merge branch 'z13' into develop
8 years ago
Abdurrauf
848cb27b1e
ztrmm kernel.
8 years ago
Ashwin Sekhar T K
2757b49767
THUNDERX2T99: Add Optimized CGEMM Implementation
8 years ago
Ashwin Sekhar T K
f279ff4789
THUNDERX2T99: Add Optimized SGEMM Implementation
8 years ago
Ashwin Sekhar T K
4b55fae337
ARM64: Add Cavium THUNDERX2T99 Target
8 years ago
Andrew Pinski
fb200c7245
ARM64: Add Cavium THUNDERX Target
8 years ago
Ashwin Sekhar T K
4713e7c47f
ARM64: Add the VULCAN Target
9 years ago
Zhang Xianyi
b678471d65
Merge branch 'z13' into develop
Conflicts:
CONTRIBUTORS.md
8 years ago
Abdurrauf
6418667818
dtrmm and dgemm for z13
8 years ago
Shivraj Patil
9687437928
MIPS n32 ABI and build time mips simd support check
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Shivraj Patil
d1c6469283
MIPS n32 ABI support, MSA support detection and rename ARCH, ARCHFLAGS
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Shivraj Patil
beb1d076a4
Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
8a592ee386
Merge pull request #924 from ashwinyes/develop_aarch64_improvements_20160714
Improvements to Aarch64 kernels
9 years ago
Ashwin Sekhar T K
0a5ff9f9f9
Improvements to TRMM and GEMM kernels
9 years ago
Shivraj Patil
57df7956ee
Added CGEMM, ZGEMM, STRMM, DTRMM, CTRMM, ZTRMM. Updated macros in SGEMM, DGEMM, STRMM.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Shivraj Patil
c4ba40e308
SGEMM optimization for MIPS P5600 and I6400 using MSA. Unrolled k loop in DGEMM kernel function
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Werner Saar
88011f625d
Merge pull request #876 from wernsaar/develop
optimized dgemm on power8 for 20 threads
9 years ago
Werner Saar
8310d4d3f7
optimized dgemm for 20 threads
9 years ago
Shivraj Patil
085cf236c2
conflict resolved by syncing with 'xianyi:develop'
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Shivraj Patil
b7b3d8ec8e
DGEMM optimization for MIPS P5600 and I6400 using MSA
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
cd7af5260a
Merge pull request #847 from sva-img/develop
MIPS P5600(32 bit) and I6400(64 bit) cores support added.
9 years ago
Werner Saar
782f75ba94
optimized param.h for POWER8
9 years ago
Werner Saar
0d0c6f7d7d
optimized dgemm for POWER8
9 years ago
Werner Saar
40ac64ae4f
updated param.h for EXCAVATOR
9 years ago
Werner Saar
089aad57f7
updated param.h for POWER8
9 years ago
Werner Saar
879a51165f
Optimized zgemm and tested zgemm again
9 years ago
Shivraj Patil
2c3dfe2bf3
MIPS P5600(32 bit) and I6400(64 bit) cores support added.
Separated mips and mips64 files.
Configuration support for mips 32 bit.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Werner Saar
3c6294ca3d
added optimized sgemm_tcopy for power8
9 years ago
Zhang Xianyi
dd43661cfd
Init IBM z system (s390x) porting.
9 years ago
Werner Saar
e173c51c04
updated zgemm- and ztrmm-kernel for POWER8
9 years ago
Werner Saar
9c42f0374a
Updated cgemm- and sgemm-kernel for POWER8 SMP
9 years ago
Werner Saar
a51102e9b7
bugfixes for sgemm- and cgemm-kernel
9 years ago
Werner Saar
c5b1fbcb2e
updated optimized cgemm- and ctrmm-kernel for POWER8
9 years ago
Werner Saar
6a9bbfc227
updated sgemm- and strmm-kernel for POWER8
9 years ago
Werner Saar
e1df5a6e23
fixed sgemm- and strmm-kernel
9 years ago
Werner Saar
5c658f8746
add optimized cgemm- and ctrmm-kernel for POWER8
9 years ago
Werner Saar
96284ab295
added sgemm- and strmm-kernel for POWER8
9 years ago
Werner Saar
91e1c5080c
modified configuration, to use power6 sgemm kernel for power8
9 years ago
Werner Saar
b752858d6c
added dgemm-, dtrmm-, zgemm- and ztrmm-kernel for power8
9 years ago
Zhang Xianyi
3e8d6ea74f
Init POWER8 kernels by POWER6.
10 years ago