Werner Saar
19b8fd2aed
smp lock bugfix
10 years ago
wernsaar
0cc5212741
Merge pull request #580 from wernsaar/develop
added blas level1 swap benchmark
10 years ago
Werner Saar
c47c8e8cf5
added blas level1 swap benchmark
10 years ago
Zhang Xianyi
a11555c715
Support Android NDK armeabi-v7a-hard ABI. (-mfloat-abi=hard)
e.g.
make HOSTCC=gcc CC=arm-linux-androideabi-gcc NO_LAPACK=1 TARGET=ARMV7
In Android NDK, it uses armeabi-v7a-hard ABI.
TARGET_CFLAGS += -mhard-float -D_NDK_MATH_NO_SOFTFP=1
TARGET_LDFLAGS += -Wl,--no-warn-mismatch -lm_hard
For more information, please check hard-float example at
android_ndk/tests/device/hard-float/jni/.
10 years ago
wernsaar
897d03518e
Merge pull request #578 from wernsaar/develop
added blas level1 copy benchmark
10 years ago
Werner Saar
23fbc5728e
added blas level1 copy benchmark
10 years ago
Zhang Xianyi
6d40fa587f
Fix f_check bug.
10 years ago
wernsaar
22dcd79959
Merge pull request #577 from wernsaar/develop
Bugfix for armv6 memory barrier
10 years ago
Werner Saar
ea4df0aad3
Ref #574 : Bugfix for armv6 memory barrier
10 years ago
Zhang Xianyi
e127fb8fd8
1) Refs #575 . Remove g77 from compiler list.
2) If OpenBLAS cannot find a Fortran compiler, it will only build BLAS
(without LAPACK).
10 years ago
wernsaar
7fb718a7d8
Merge pull request #572 from wernsaar/develop
added optimized cscal and zscal functions for steamroller
10 years ago
Werner Saar
24f58c8bb1
added optimized cscal and zscal kernels for steamroller
10 years ago
Werner Saar
95b1faf667
added optimized cscal and zscal kernels for steamroller and piledriver
10 years ago
Werner Saar
2d9e406050
added optimized cscal kernel for sandybridge
10 years ago
Werner Saar
59083e3ce1
added optimized cscal kernel for bulldozer
10 years ago
wernsaar
685be40339
Merge pull request #571 from wernsaar/develop
added optimized cscal and zscal functions
10 years ago
Werner Saar
31c9e399e9
added optimized cscal kernel for haswell
10 years ago
Werner Saar
7de6bb9889
added optimized zscal kernel for bulldozer
10 years ago
Werner Saar
d63034303b
added optimized zscal kernel for haswell
10 years ago
Zhang Xianyi
51ff17d46e
Add AMD Excavator target.
10 years ago
wernsaar
905534942a
Merge pull request #568 from wernsaar/develop
added optimized dscal kernel
10 years ago
Werner Saar
18e90ee2e3
bugfix: added static to functions
10 years ago
Werner Saar
e00cccc41e
added optimized dscal kernel for piledriver
10 years ago
Werner Saar
73f09bf64f
optimized dscal kernel for increment != 1
10 years ago
Werner Saar
02e772c7e4
added optimized dscal kernel for haswell
10 years ago
Werner Saar
7aee913991
added optimized dscal kernel for sandybridge
10 years ago
Werner Saar
e50a933037
added optimized dscal kernel for bulldozer
10 years ago
Zhang Xianyi
5f9011d6ef
Merge pull request #566 from powderluv/develop
Fix build with ALLOC_SHM=0 (Android NDK)
10 years ago
powderluv
ebb9eba987
Fix build with ALLOC_SHM=0 (Android NDK)
Refactor such that you can build with ALLOC_SHM=0. HugeTLB
implicitly depends on ALLOC_SHM=1. This patch allows
building for Android NDK r10d.
10 years ago
Zhang Xianyi
8e5a1083bb
Refs #532 . Improve gemv parallelization for the small m and large n case.
Split the matrix and do a reduction.
10 years ago
Zhang Xianyi
6743beb748
Refs #565 . Fix the bug in generating FEXTRALIB.
10 years ago
Zhang Xianyi
bcabf72c08
Refs #565 . Merge branch 'andreasnoack-anj/bench' into develop
10 years ago
Andreas Noack
cda29f183b
Add vecLib benchmarks
10 years ago
wernsaar
e52d36450a
Merge pull request #564 from wernsaar/develop
Use only 1 thread in trsm if m or n < 2*GEMM_MULTITHREAD_THRESHOLD
10 years ago
Werner Saar
f8f2e261fe
use only 1 thread if m or n < 2*GEMM_MULTITHREAD_THRESHOLD
10 years ago
Werner Saar
be3c843700
added loops to trsm.c
10 years ago
wernsaar
e6f57db846
Merge pull request #563 from wernsaar/develop
Bugfix for gemm3m tests
10 years ago
Werner Saar
9bfd267d51
bugfix for gemm3m tests
10 years ago
Werner Saar
924bc5372e
removed gemm3m functions from normal checks
10 years ago
wernsaar
2b83a69650
Merge pull request #561 from wernsaar/develop
updated dgemv_n sgemv_n kernels
10 years ago
Werner Saar
133c11a156
updated dgemv_n kernel for nehalem
10 years ago
Werner Saar
30f52d53df
optimized dgemv_n kernel for haswell
10 years ago
Zhang Xianyi
a124637329
Merge pull request #560 from sebastien-villemot/develop
Fix detection of ARM architectures in c_check.
10 years ago
Sébastien Villemot
642aaba2e0
Fix detection of ARM architectures in c_check.
This is necessary to avoid the false detection of a cross-compiling environment.
10 years ago
wernsaar
4c616173e4
Merge pull request #558 from wernsaar/develop
optimizations for sandybridge
10 years ago
Werner Saar
5e83d80725
optimized dger kernel for sandybridge
10 years ago
Werner Saar
b2e1797dc6
added optimized sger kernel for sandybridge
10 years ago
Werner Saar
e216f686cb
optimized saxpy and daxpy for sandybridge
10 years ago
Zhang Xianyi
e42652f772
Merge pull request #554 from wernsaar/develop
added benchmarks for zgeru and cgeru
10 years ago
Werner Saar
e77db2af31
add benchmarks for zgeru and cgeru
10 years ago