Ashwin Sekhar T K
4713e7c47f
ARM64: Add the VULCAN Target
9 years ago
Zhang Xianyi
b678471d65
Merge branch 'z13' into develop
Conflicts:
CONTRIBUTORS.md
8 years ago
Abdurrauf
6418667818
dtrmm and dgemm for z13
9 years ago
Shivraj Patil
9687437928
MIPS n32 ABI and build time mips simd support check
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Shivraj Patil
d1c6469283
MIPS n32 ABI support, MSA support detection and rename ARCH, ARCHFLAGS
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Shivraj Patil
beb1d076a4
Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
8a592ee386
Merge pull request #924 from ashwinyes/develop_aarch64_improvements_20160714
Improvements to Aarch64 kernels
9 years ago
Ashwin Sekhar T K
0a5ff9f9f9
Improvements to TRMM and GEMM kernels
9 years ago
Shivraj Patil
57df7956ee
Added CGEMM, ZGEMM, STRMM, DTRMM, CTRMM, ZTRMM. Updated macros in SGEMM, DGEMM, STRMM.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Shivraj Patil
c4ba40e308
SGEMM optimization for MIPS P5600 and I6400 using MSA. Unrolled k loop in DGEMM kernel function
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Werner Saar
88011f625d
Merge pull request #876 from wernsaar/develop
optimized dgemm on power8 for 20 threads
9 years ago
Werner Saar
8310d4d3f7
optimized dgemm for 20 threads
9 years ago
Shivraj Patil
085cf236c2
conflict resolved by syncing with 'xianyi:develop'
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Shivraj Patil
b7b3d8ec8e
DGEMM optimization for MIPS P5600 and I6400 using MSA
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
cd7af5260a
Merge pull request #847 from sva-img/develop
MIPS P5600(32 bit) and I6400(64 bit) cores support added.
9 years ago
Werner Saar
782f75ba94
optimized param.h for POWER8
9 years ago
Werner Saar
0d0c6f7d7d
optimized dgemm for POWER8
9 years ago
Werner Saar
40ac64ae4f
updated param.h for EXCAVATOR
9 years ago
Werner Saar
089aad57f7
updated param.h for POWER8
9 years ago
Werner Saar
879a51165f
Optimized zgemm and tested zgemm again
9 years ago
Shivraj Patil
2c3dfe2bf3
MIPS P5600(32 bit) and I6400(64 bit) cores support added.
Separated mips and mips64 files.
Configurations support for mips 32 bit.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Werner Saar
3c6294ca3d
added optimized sgemm_tcopy for power8
9 years ago
Zhang Xianyi
dd43661cfd
Init IBM z system (s390x) porting.
9 years ago
Werner Saar
e173c51c04
updated zgemm- and ztrmm-kernel for POWER8
9 years ago
Werner Saar
9c42f0374a
Updated cgemm- and sgemm-kernel for POWER8 SMP
9 years ago
Werner Saar
a51102e9b7
bugfixes for sgemm- and cgemm-kernel
9 years ago
Werner Saar
c5b1fbcb2e
updated optimized cgemm- and ctrmm-kernel for POWER8
9 years ago
Werner Saar
6a9bbfc227
updated sgemm- and strmm-kernel for POWER8
9 years ago
Werner Saar
e1df5a6e23
fixed sgemm- and strmm-kernel
9 years ago
Werner Saar
5c658f8746
add optimized cgemm- and ctrmm-kernel for POWER8
9 years ago
Werner Saar
96284ab295
added sgemm- and strmm-kernel for POWER8
9 years ago
Werner Saar
91e1c5080c
modified configuration, to use power6 sgemm kernel for power8
9 years ago
Werner Saar
b752858d6c
added dgemm-, dtrmm-, zgemm- and ztrmm-kernel for power8
9 years ago
Zhang Xianyi
3e8d6ea74f
Init POWER8 kernels by POWER6.
10 years ago
Werner Saar
b07d733a71
added updates for syrk and syr2k
9 years ago
Ashwin Sekhar T K
39937d15cd
Change BUFFER_SIZE for Cortex A57 to 20 MB
Change the GEMM_P, GEMM_Q, GEMM_R values for Cortex A57
10 years ago
Ashwin Sekhar T K
1397b47197
Optimized zgemm kernel for CORTEXA57
10 years ago
Ashwin Sekhar T K
45f78963ac
Optimized cgemm kernel for CORTEXA57
Also, add a generic ztrmm 4x4 kernel
10 years ago
Ashwin Sekhar T K
402443bf9c
Optimized dgemm kernel for CORTEXA57
10 years ago
Ashwin Sekhar T K
f2f8a0fe8b
Adding arm64 target CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Werner Saar
9bd962f655
modified haswell parameter dgemm_unroll_n
10 years ago
Zhang Xianyi
51ff17d46e
Add AMD Excavator target.
10 years ago
Zhang Xianyi
229ce2ccd1
Add cortex-a9 and cortex-a15 targets.
11 years ago
Werner Saar
ddf983d643
added optimizations for steamroller
11 years ago
Werner Saar
4319769b79
added target processor STEAMROLLER
11 years ago
Werner Saar
587e16fba3
Ref #458 : Backport, sandybridge uses nehalem zgemm kernel
11 years ago
Zhang Xianyi
2fb02626da
Update organization info.
11 years ago
Zhang Xianyi
a85c2785ae
Refs #467 . Added generic kernel file for x86_64.
11 years ago
Benedikt Huber
58c90d5937
Optimizations for APM's xgene-1 (aarch64).
1) general system updates to support armv8 better. Make all did not work; one needed to supply TARGET=ARMV8.
2) sgemm 4x4 kernel in assembler using SIMD, and configuration changes to use it.
3) strmm 4x4 kernel in C. Since the sgemm kernel does 4x4, the trmm kernel must also do 4xN.
Added Dave Nuechterlein to the contributors list.
11 years ago
wernsaar
9d7057366d
bugfix for GEMM3M functions
11 years ago