Zhang Xianyi
|
a124637329
|
Merge pull request #560 from sebastien-villemot/develop
Fix detection of ARM architectures in c_check.
|
10 years ago |
Sébastien Villemot
|
642aaba2e0
|
Fix detection of ARM architectures in c_check.
This is necessary to avoid the false detection of a cross-compiling environment.
|
10 years ago |
wernsaar
|
4c616173e4
|
Merge pull request #558 from wernsaar/develop
optimizations for sandybridge
|
10 years ago |
Werner Saar
|
5e83d80725
|
optimized dger kernel for sandybridge
|
10 years ago |
Werner Saar
|
b2e1797dc6
|
added optimized sger kernel for sandybridge
|
10 years ago |
Werner Saar
|
e216f686cb
|
optimized saxpy and daxpy for sandybridge
|
10 years ago |
Zhang Xianyi
|
e42652f772
|
Merge pull request #554 from wernsaar/develop
added benchmarks for zgeru and cgeru
|
10 years ago |
Werner Saar
|
e77db2af31
|
add benchmarks for zgeru and cgeru
|
10 years ago |
Zhang Xianyi
|
37b00841ac
|
Merge pull request #552 from jeromerobert/develop
gemv: Ensure stack buffer is large enough to handle memory alignment
|
10 years ago |
Werner Saar
|
fc0e0391f3
|
bugfixes: replaced int with BLASLONG
|
10 years ago |
wernsaar
|
da0f27b9ac
|
Merge pull request #553 from wernsaar/develop
optimized some blas level1 kernels for increments != 1
|
10 years ago |
Werner Saar
|
c22068c406
|
optimized sdot.c for increments != 1
|
10 years ago |
Werner Saar
|
dee100d0e4
|
optimized saxpy.c for increments != 1
|
10 years ago |
Werner Saar
|
0273966abb
|
optimized daxpy kernel for increments != 1
|
10 years ago |
Werner Saar
|
3a67daa954
|
optimized ddot.c for increments != 1
|
10 years ago |
Jerome Robert
|
ab567d8443
|
gemv: Ensure stack buffer is large enough to handle memory alignment
Ref #478
|
10 years ago |
wernsaar
|
3c09cea4b2
|
Merge pull request #550 from wernsaar/develop
added optimized ssymv kernels for haswell and sandybridge
|
10 years ago |
Werner Saar
|
b4f2153dcd
|
added optimized ssymv kernels for sandybridge
|
10 years ago |
Werner Saar
|
1c4b0eeae3
|
added optimized ssymv kernels for haswell
|
10 years ago |
wernsaar
|
406d9d64e9
|
Merge pull request #549 from wernsaar/develop
added optimized dsymv kernels for haswell and sandybridge
|
10 years ago |
Werner Saar
|
1bec9abb9a
|
added optimized dsymv kernels for sandybridge
|
10 years ago |
Werner Saar
|
3814bf60d3
|
added optimized dsymv kernels for haswell
|
10 years ago |
Zhang Xianyi
|
847e19c04e
|
Refs #478, #482. Enable stack alloc for s/dgemv_t (revert 9798481).
|
10 years ago |
Werner Saar
|
46c7b4d5c8
|
added asum benchmark
|
10 years ago |
Werner Saar
|
8e05d291b5
|
added scal benchmark
|
10 years ago |
wernsaar
|
9da555e5f7
|
Merge pull request #546 from wernsaar/develop
added optimized zaxpy-kernels
|
10 years ago |
Werner Saar
|
6d0db0151f
|
added optimized zaxpy-kernels
|
10 years ago |
Zhang Xianyi
|
37b9033c90
|
Merge pull request #543 from jeromerobert/develop
Fix a buffer overflow with MAX_STACK_ALLOC size in dgemv_t
|
10 years ago |
wernsaar
|
59e7a518c6
|
Merge pull request #544 from wernsaar/develop
Optimized caxpy-kernels
|
10 years ago |
Werner Saar
|
13889515b3
|
added optimized caxpy-kernel for sandybridge
|
10 years ago |
Werner Saar
|
248c9340c3
|
added optimized caxpy-kernel for haswell
|
10 years ago |
Werner Saar
|
e9f33b4ca7
|
added optimized caxpy-kernel for steamroller
|
10 years ago |
Werner Saar
|
f5d847122a
|
updated caxpy_microk_bulldozer-2.c and caxpy.c
|
10 years ago |
Jerome Robert
|
a4c96eca67
|
Fix a buffer overflow with MAX_STACK_ALLOC size in dgemv_t
Refs #478, #482, 9798481, fd9fd42
|
10 years ago |
wernsaar
|
fb02cb0a41
|
Merge pull request #540 from wernsaar/develop
Optimized dot- and axpy-kernels
|
10 years ago |
Werner Saar
|
baa0363ea2
|
add optimized ddot-kernel for piledriver
|
10 years ago |
Werner Saar
|
34ba66606a
|
add optimized daxpy-kernel for piledriver
|
10 years ago |
Werner Saar
|
f615dc7603
|
added optimized saxpy kernel for steamroller
|
10 years ago |
Werner Saar
|
331c417637
|
optimized saxpy for piledriver
|
10 years ago |
Zhang Xianyi
|
6c3a0b5d46
|
Enable MAX_STACK_ALLOC by default.
|
10 years ago |
Zhang Xianyi
|
fd9fd42936
|
Refs #478, #482. Fixed bug on previous commit.
|
10 years ago |
Zhang Xianyi
|
9798481979
|
Refs #478, #482. Fix segfault bug for gemv_t with MAX_STACK_ALLOC flag.
For gemv_t, directly use malloc to create the buffer.
|
10 years ago |
Werner Saar
|
d7a17ad85d
|
optimized sdot-kernel for piledriver
|
10 years ago |
Werner Saar
|
d35f6c63c2
|
add optimized daxpy-kernel for steamroller
|
10 years ago |
Werner Saar
|
166d76e864
|
added optimized sdot-kernel for steamroller
|
10 years ago |
Werner Saar
|
f9f127d838
|
added optimized ddot kernel for steamroller
|
10 years ago |
wernsaar
|
62231ab337
|
Merge pull request #538 from wernsaar/develop
Added optimized cdot- and zdot-kernels
|
10 years ago |
Werner Saar
|
3119def9a7
|
updated cdot and zdot
|
10 years ago |
Werner Saar
|
33b332372a
|
add optimized cdot- and zdot-kernel for sandybridge
|
10 years ago |
Werner Saar
|
fd838c75bc
|
add optimized cdot- and zdot-kernel for haswell
|
10 years ago |