Ralph Campbell
fbc21266e6
Minor C code fixes in driver/
10 years ago
wernsaar
1d33547222
optimized zgemm kernel for haswell
11 years ago
Timothy Gu
6c2ead30f0
Remove all trailing whitespace except lapack-netlib
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
11 years ago
wernsaar
c947ab85dc
changed level3.c
12 years ago
wernsaar
2840d56aeb
added dgemm_kernel for Piledriver
12 years ago
Zhang Xianyi
32d2ca3035
Refs #214, #221, #246. Fixed the getrf overflow bug on Windows.
I used a smaller threshold since the stack size is 1 MB on Windows.
12 years ago
wernsaar
6f008abcef
replaced defined(DOUBLE) by !defined(XDOUBLE)
12 years ago
Zhang Xianyi
5d3312142a
Refs #221, #246. Fixed the stack overflow bug in multithreaded BLAS3.
It occurs when NUM_THREADS (MAX_CPU_NUMBER) is very large, e.g. 256:
typedef struct {
volatile BLASLONG working[MAX_CPU_NUMBER][CACHE_LINE_SIZE * DIVIDE_RATE];
} job_t;
job_t job[MAX_CPU_NUMBER];
The job array is then about 8 MB, which overflows the stack.
Thus, we use malloc instead of stack allocation.
12 years ago
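The 8 MB figure in the commit above can be checked with a short sketch. The values CACHE_LINE_SIZE = 8 and DIVIDE_RATE = 2, the 64-bit BLASLONG typedef, and the helper names job_array_bytes and alloc_jobs are assumptions chosen for illustration, not taken from the commit itself; with them, 256 * 256 * 16 * 8 bytes comes to exactly 8 MiB, which is why a stack-allocated array of that size is unsafe.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed values for illustration; real builds take these from the configuration. */
#define MAX_CPU_NUMBER  256
#define CACHE_LINE_SIZE 8
#define DIVIDE_RATE     2

typedef long long BLASLONG;  /* assumption: 64-bit integer type */

typedef struct {
  volatile BLASLONG working[MAX_CPU_NUMBER][CACHE_LINE_SIZE * DIVIDE_RATE];
} job_t;

/* Total bytes needed for one job_t per possible thread. */
static size_t job_array_bytes(void) {
  return sizeof(job_t) * MAX_CPU_NUMBER;
}

/* Hypothetical helper: allocate the job array on the heap rather than
 * declaring `job_t job[MAX_CPU_NUMBER];` as a local variable.
 * Returns NULL on failure; the caller must free() the result. */
static job_t *alloc_jobs(void) {
  return (job_t *)calloc(MAX_CPU_NUMBER, sizeof(job_t));
}
```

A caller would check the result of alloc_jobs() for NULL, use it in place of the former stack array, and free() it before returning.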
wernsaar
25491e42f9
New dgemm kernel for BULLDOZER: dgemm_kernel_8x2_bulldozer.S
12 years ago
Xianyi Zhang
342bbc3871
Import GotoBLAS2 1.13 BSD version code.
14 years ago