Ashwin Sekhar T K
19fdbee291
Improve the sgemm kernel for CORTEXA57
10 years ago
Ashwin Sekhar T K
3b0cdfab1e
Optimized gemv kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
46efa6a1da
Optimized swap kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
ea1465cdf8
Optimized scal kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
fb4be3b3eb
Optimized rot kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
6c2f4ddbcd
Optimized nrm2 kernels for CORTEXA57
10 years ago
Ashwin Sekhar T K
870c4d49c0
Optimized dot kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
cd7684097c
Optimized copy kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
2690b71b1f
Optimized axpy kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
3e4acedf0e
Optimized asum kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
2610752dbb
Optimized iamax kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
dbb213655e
Optimized amax kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
f2f8a0fe8b
Adding arm64 target CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ralph Campbell
c053559ed9
Minor C code fixes in kernel/arm
10 years ago
Ralph Campbell
55e4332f00
Remove duplicate -D args in kernel/Makefile.L1
10 years ago
Zhang Xianyi
3e8d6ea74f
Initialize POWER8 kernels from POWER6.
10 years ago
Zhang Xianyi
69363622a8
Fix DYNAMIC_ARCH=1 bug.
10 years ago
Zhang Xianyi
53b6023a6c
Fix cmake bug on MSVC 32-bit.
10 years ago
Zhang Xianyi
309875de3c
Fix cmake bug on x86 32-bit.
For example, to build 32-bit on 64-bit Linux:
cmake -DBINARY=32
10 years ago
Zhang Xianyi
8fade093aa
Fixed cmake bug on Visual Studio.
10 years ago
Zhang Xianyi
96f0bbe067
Fixed cmake bug on haswell.
10 years ago
Zhang Xianyi
d8392c1245
Fix cmake config bugs.
10 years ago
Zhang Xianyi
94b125255f
Merge branch 'develop' into cmake
Conflicts:
driver/others/memory.c
10 years ago
Martin Koehler
711ca33bc6
Improved Ximatcopy when lda==ldb.
The Ximatcopy functions create a copy of the input matrix
even though they appear to work in place. The new routines
XIMATCOPY_K_YY perform the operations in place if the leading
dimension does not change.
10 years ago
Zhang Xianyi
7df0820160
Use C kernels for s/dgemv on x86.
10 years ago
Zhang Xianyi
f874465bb8
Use cmake to build OpenBLAS GENERIC Target on MSVC x86 64-bit.
Disable CBLAS and LAPACK.
10 years ago
Zhang Xianyi
898fc7552a
Merge pull request #612 from ibmsoe/ppc64le
ppc64le platform support (ELF ABI v2)
10 years ago
Zhang Xianyi
ab0a0a75fc
Merge branch 'develop' into cmake
10 years ago
Zhang Xianyi
1cf2b10224
Use pure C generic target on x86 and x86_64.
make TARGET=GENERIC
?gemm3m is unimplemented on generic target.
10 years ago
Zhang Xianyi
7ac7e147d4
Fixed cmake building bugs on Linux. Disable LAPACK by default.
10 years ago
Matthew Brandyberry
7ba4fe5afb
ppc64le platform support (ELF ABI v2)
10 years ago
Zhang Xianyi
dcd5ba4443
Merge branch 'cmake' of https://github.com/hpanderson/OpenBLAS into hpanderson_cmake
10 years ago
Werner Saar
e7c969e164
added optimized dtrmm_kernel for haswell
10 years ago
Werner Saar
9bd962f655
modified haswell parameter dgemm_unroll_n
10 years ago
Werner Saar
24f58c8bb1
added optimized cscal and zscal kernels for steamroller
10 years ago
Werner Saar
95b1faf667
added optimized cscal and zscal kernels for steamroller and piledriver
10 years ago
Werner Saar
2d9e406050
added optimized cscal kernel for sandybridge
10 years ago
Werner Saar
59083e3ce1
added optimized cscal kernel for bulldozer
10 years ago
wernsaar
685be40339
Merge pull request #571 from wernsaar/develop
added optimized cscal and zscal functions
10 years ago
Werner Saar
31c9e399e9
added optimized cscal kernel for haswell
10 years ago
Werner Saar
7de6bb9889
added optimized zscal kernel for bulldozer
10 years ago
Werner Saar
d63034303b
added optimized zscal kernel for haswell
10 years ago
Zhang Xianyi
51ff17d46e
Add AMD Excavator target.
10 years ago
Werner Saar
18e90ee2e3
bugfix: added static to functions
10 years ago
Werner Saar
e00cccc41e
added optimized dscal kernel for piledriver
10 years ago
Werner Saar
73f09bf64f
optimized dscal kernel for increment != 1
10 years ago
Werner Saar
02e772c7e4
added optimized dscal kernel for haswell
10 years ago
Werner Saar
7aee913991
added optimized dscal kernel for sandybridge
10 years ago
Werner Saar
e50a933037
added optimized dscal kernel for bulldozer
10 years ago
Werner Saar
133c11a156
updated dgemv_n kernel for nehalem
10 years ago