Werner Saar
b07d733a71
added updates for syrk and syr2k
9 years ago
Ashwin Sekhar T K
39937d15cd
Change BUFFER_SIZE for Cortex A57 to 20 MB
Change the GEMM_P, GEMM_Q, GEMM_R values for Cortex A57
10 years ago
Ashwin Sekhar T K
1397b47197
Optimized zgemm kernel for CORTEXA57
10 years ago
Ashwin Sekhar T K
45f78963ac
Optimized cgemm kernel for CORTEXA57
Also, add a generic ztrmm 4x4 kernel
10 years ago
Ashwin Sekhar T K
402443bf9c
Optimized dgemm kernel for CORTEXA57
10 years ago
Ashwin Sekhar T K
f2f8a0fe8b
Adding arm64 target CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Werner Saar
9bd962f655
modified haswell parameter dgemm_unroll_n
10 years ago
Zhang Xianyi
51ff17d46e
Add AMD Excavator target.
10 years ago
Zhang Xianyi
229ce2ccd1
Add cortex-a9 and cortex-a15 targets.
10 years ago
Werner Saar
ddf983d643
added optimizations for steamroller
10 years ago
Werner Saar
4319769b79
added target processor STEAMROLLER
10 years ago
Werner Saar
587e16fba3
Ref #458 : Backport, sandybridge uses nehalem zgemm kernel
10 years ago
Zhang Xianyi
2fb02626da
Update organization info.
11 years ago
Zhang Xianyi
a85c2785ae
Refs #467 . Added generic kernel file for x86_64.
11 years ago
Benedikt Huber
58c90d5937
Optimizations for APM's xgene-1 (aarch64).
1) general system updates to support armv8 better. Make all did not work, one needed to supply TARGET=ARMV8.
2) sgemm 4x4 kernel in assembler using SIMD, and configuration changes to use it.
3) strmm 4x4 kernel in C. Since the sgemm kernel does 4x4, the trmm kernel must also do 4xN.
Added Dave Nuechterlein to the contributors list.
11 years ago
wernsaar
9d7057366d
bugfix for GEMM3M functions
11 years ago
wernsaar
7aae4a62e7
enabled use of GEMM3M functions
11 years ago
wernsaar
5087096711
optimization of sandybridge cgemm-kernel
11 years ago
wernsaar
1cc02b4337
optimized sgemm kernel for haswell
11 years ago
wernsaar
125610d23b
allow to set custom value for ?GEMM_DEFAULT_UNROLL_MN, optimizations for syrk
11 years ago
Zhang Xianyi
99efbbbad5
Fixed #395 . Enable optimized cgemm for Sandybridge. Added optimized sdot kernel.
Fixed c/zgemm, zgemv computational error of haswell, piledriver, bulldozer, and
barcelona on Windows.
Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop
Conflicts:
kernel/Makefile.L1
kernel/x86_64/KERNEL
param.h
11 years ago
Timothy Gu
6c2ead30f0
Remove all trailing whitespace except lapack-netlib
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
11 years ago
wernsaar
365e8de346
added optimized cgemm-kernel for SANDYBRIDGE
11 years ago
wernsaar
dabab2b5f4
added new optimized sgemm kernel for SANDYBRIDGE
11 years ago
wernsaar
aa2709c4e0
enabled optimized dgemm kernel for NEHALEM
11 years ago
wernsaar
d83373db61
added parameter for gemm3m kernels
11 years ago
wernsaar
43fbdb7a5a
added ARMV5 as reference platform
11 years ago
wernsaar
5f3b68b4d4
replaced sgemm and cgemm kernels because of lapack bugs
11 years ago
wernsaar
2424af62fd
replaced dgemm-kernel because of a bug in lapack
11 years ago
wernsaar
47b22763f8
reduced stack usage on windows to 16K
11 years ago
wernsaar
aae75b2461
modified param.h
12 years ago
wernsaar
b3254eecaf
Merge remote branch 'origin/haswell' into develop
12 years ago
wernsaar
ecbc85b954
modified param.h
12 years ago
wernsaar
afe44b0241
tests and code cleanup of gemm_kernels for HASWELL
12 years ago
wernsaar
a77c71eaf5
added highly optimized dgemm_kernel for HASWELL
12 years ago
wernsaar
fe8c5666f9
optimized dgemm_kernel for HASWELL
12 years ago
Zhang Xianyi
2638370844
Init code base for Intel Haswell.
12 years ago
Zhang Xianyi
886cbaf4e4
Support AMD Piledriver by bulldozer kernels.
12 years ago
Zhang Xianyi
6e8501c8a1
Fixed #239 bug in param.h about BARCELONA and BULLDOZER.
12 years ago
wernsaar
f67fa62851
added dgemv_n_bulldozer.S
12 years ago
wernsaar
d65bbec99b
added new sgemm kernel for BULLDOZER
12 years ago
wernsaar
ba800f0883
correct GEMM_THREAD in param.h
12 years ago
wernsaar
25491e42f9
New dgemm kernel for BULLDOZER: dgemm_kernel_8x2_bulldozer.S
12 years ago
wernsaar
731220f870
changed DGEMM_DEFAULT_P and DGEMM_DEFAULT_Q to 248 for BULLDOZER 64bit
12 years ago
Zhang Xianyi
b7c0fa6bd2
Init AMD Bulldozer codebase.
13 years ago
Sébastien Villemot
01e3c984ce
Fix compilation with TARGET=GENERIC
Patch applied to Debian package
13 years ago
Sylvestre Ledru
3692b4d631
Improve the detection of sparc
13 years ago
Xianyi Zhang
b39c51195b
Fixed the build bug about Sandy Bridge on 32-bit.
We used Nehalem/Penryn codes on Sandy Bridge 32-bit.
13 years ago
Xianyi Zhang
996dc6d1c8
Fixed dynamic_arch building bug.
13 years ago
wangqian
f76f952547
Refs #83 #53 . Adding Intel Sandy Bridge (AVX supported) kernel codes for BLAS level 3 functions.
13 years ago