Ashwin Sekhar T K | 2757b49767 | THUNDERX2T99: Add Optimized CGEMM Implementation | 8 years ago
Ashwin Sekhar T K | f279ff4789 | THUNDERX2T99: Add Optimized SGEMM Implementation | 8 years ago
Zhang Xianyi | 0863a0d4b4 | Merge pull request #1061 from ashwinyes/develop_aarch64_vulcan_thunderx_patch; Add new targets for ARM64 | 8 years ago
Werner Saar | c1c5a63d3c | prepared parameter.c for UNROLL values, that are not a power of two | 8 years ago
Ashwin Sekhar T K | 4b55fae337 | ARM64: Add Cavium THUNDERX2T99 Target | 8 years ago
Ashwin Sekhar T K | 0b8e876d89 | VULCAN: Add optimized DGEMM implementation | 8 years ago
Ashwin Sekhar T K | 4713e7c47f | ARM64: Add the VULCAN Target | 9 years ago
Werner Saar | 78b05f6476 | bugfix for EXCAVATOR and DYNAMIC_ARCH | 9 years ago
Zhang Xianyi | 05196a8497 | Refs #716. Only call getenv at init function. | 9 years ago
Werner Saar | 4319769b79 | added target processor STEAMROLLER | 10 years ago
wernsaar | a64fe9bcc9 | added optimized sgemv_n kernel for sandybridge | 11 years ago
wernsaar | 2021d0f9d6 | experimentally removed expensive function calls | 11 years ago
wernsaar | 50e99a52ea | added definitions for PILEDRIVER and HASWELL | 11 years ago
Zhang Xianyi | 7a8949e0ce | Merge branch 'develop' of https://github.com/TimothyGu/OpenBLAS into TimothyGu-develop; Conflicts: driver/others/memory.c | 11 years ago
Timothy Gu | 6c2ead30f0 | Remove all trailing whitespace except lapack-netlib; Signed-off-by: Timothy Gu <timothygu99@gmail.com> | 11 years ago
Jameson Nash | f41f03ab83 | fix #394. this cleans up some handles after using them, and doesn't disable ALL process privileges upon success | 11 years ago
Zhang Xianyi | bfaaa975e6 | Added BULLDOZER target. So far it uses barcelona kernels. | 13 years ago
Zhang Xianyi | d3b67d0bd8 | Refs #113. Fixed the typo BOBCATE -> BOBCAT | 13 years ago
Zhang Xianyi | d6cab3f37e | Refs #113. Support AMD Bobcate using Barcelona kernel codes. Replace 3DNow! with MMX. | 13 years ago
Xianyi Zhang | 19a48b82cf | Init Sandybridge codes based on Nehalem. | 13 years ago
Wang Qian | 8163ab7e55 | Change the block size on Loongson 3B. | 14 years ago
Xianyi Zhang | b95ad4cfaf | Support detecting ICT Loongson-3B CPU. | 14 years ago
traz | 831858b883 | Modify aligned address of sa and sb to improve the performance of multi-threads. | 14 years ago
Xianyi Zhang | 16fc083322 | Refs #47. Fixed the seting parameter bug on Loongson 3A single thread version. | 14 years ago
Xianyi Zhang | 4727fe8abf | Refs #47. On Loongson 3A, set DGEMM_R parameter depending on different number of threads. It would improve double precision BLAS3 on multi-threads. | 14 years ago
Xianyi Zhang | 342bbc3871 | Import GotoBLAS2 1.13 BSD version codes. | 14 years ago