Zhang Xianyi
85636ff1a0
Merge branch 'develop'
9 years ago
Zhang Xianyi
821affb9a0
Update doc for 0.2.19.
9 years ago
Zhang Xianyi
515bc56ea9
Refs #946. Use nrm2 reference implementation for Power8.
9 years ago
Zhang Xianyi
ae70b916f4
Refs #929. Deal with zero and NaNs for scale.
9 years ago
Zhang Xianyi
9ea0144482
Merge pull request #941 from sva-img/develop
MIPS n32 ABI support, MSA support detection and rename ARCH, ARCHFLAGS
9 years ago
Zhang Xianyi
1f217a6175
Merge pull request #943 from ibmsoe/IBMMASS_Support
Added support of IBM's MASS library that optimizes performance on Pow…
9 years ago
nishidha@us.ibm.com
78348a2853
Added support of IBM's MASS library that optimizes performance on Power architectures
9 years ago
Shivraj Patil
9687437928
MIPS n32 ABI and build time mips simd support check
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Shivraj Patil
d1c6469283
MIPS n32 ABI support, MSA support detection and rename ARCH, ARCHFLAGS
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
b544be914d
Merge pull request #933 from ashwinyes/develop_aarch64_20160726_Dgemm_8x4_Opts
Cortex A57: Improvements to DGEMM 8x4 kernel
9 years ago
Ashwin Sekhar T K
c54a29bb48
Cortex A57: Improvements to DGEMM 8x4 kernel
9 years ago
Zhang Xianyi
ff4c5deafa
Merge pull request #930 from sva-img/develop
P6600/I6400 Build fix.
9 years ago
Shivraj Patil
22b9c2747d
P6600/I6400 Build fix. Reverted the changes that were made to support the MIPS n32 ABI
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
27b5211ccd
Merge pull request #927 from sva-img/develop
Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
9 years ago
Shivraj Patil
beb1d076a4
Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
9e44f3ddd0
Refs #917 Avoid detecting gfortran bug on IBM POWER + Ubuntu
9 years ago
Zhang Xianyi
eece9fd889
Merge pull request #926 from vriera/develop
Complete support for MIPS n32 ABI
9 years ago
Zhang Xianyi
5dfa0712c3
Merge pull request #925 from martin-frbg/develop
Update zgetrf2.f, cpuid_x86.c, dynamic.c
9 years ago
Zhang Xianyi
8a592ee386
Merge pull request #924 from ashwinyes/develop_aarch64_improvements_20160714
Improvements to Aarch64 kernels
9 years ago
Zhang Xianyi
7f2409a8e1
Merge pull request #918 from sva-img/develop
Added CGEMM, ZGEMM, STRMM, DTRMM, CTRMM, ZTRMM.
9 years ago
Vicente Olivert Riera
7f28cd1f88
Complete support for MIPS n32 ABI
Signed-off-by: Vicente Olivert Riera <Vincent.Riera@imgtec.com>
9 years ago
Martin Kroeker
154729908e
Update cpuid_x86.c
9 years ago
Martin Kroeker
97bd1e42c8
Update cpuid_x86.c
9 years ago
Martin Kroeker
7de829f713
Update dynamic.c
Add Braswell (extended model 4, model 12) N3150 as Nehalem
9 years ago
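The "extended model 4, model 12" wording in the commit above refers to fields of the CPUID leaf-1 signature. As a rough illustration of how those fields combine, the sketch below decodes a signature value in Python. The function name `decode_cpuid_signature` and the sample value are hypothetical; the sample is built from the two fields named in the commit message plus an assumed family of 6, and it is not taken from dynamic.c.

```python
def decode_cpuid_signature(eax):
    """Split a CPUID leaf-1 EAX value into its family/model fields."""
    stepping = eax & 0xF
    model = (eax >> 4) & 0xF
    family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended model extends the model number only for family 6 and 15.
    if family in (6, 15):
        display_model = (ext_model << 4) + model
    else:
        display_model = model

    # The extended family is added only when the base family is 15.
    display_family = family + ext_family if family == 15 else family

    return {
        "stepping": stepping,
        "model": model,
        "family": family,
        "ext_model": ext_model,
        "display_model": display_model,
        "display_family": display_family,
    }


# Hypothetical signature: extended model 4, family 6 (assumed), model 12.
sample = (4 << 16) | (6 << 8) | (12 << 4)
fields = decode_cpuid_signature(sample)
print(fields["ext_model"], fields["model"], hex(fields["display_model"]))
```

A dispatch table keyed on the (extended model, model) pair, as the commit message describes it, would see this part as the pair (4, 12).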
Martin Kroeker
9b69d8a8e5
Update zgetrf2.f
Trivial typo correction (ZERBLA => XERBLA) to fix #910
9 years ago
Ashwin Sekhar T K
0a5ff9f9f9
Improvements to TRMM and GEMM kernels
9 years ago
Ashwin Sekhar T K
8a40f1355e
Improvements to GEMV kernels
9 years ago
Ashwin Sekhar T K
78782485b6
Improvements to COPY and IAMAX kernels
9 years ago
Ashwin Sekhar T K
8d86d14d3f
Add time prints in benchmark output
9 years ago
Ashwin Sekhar T K
925d4e1dc6
Add IAMAX and NRM2 benchmarks
9 years ago
Shivraj Patil
57df7956ee
Added CGEMM, ZGEMM, STRMM, DTRMM, CTRMM, ZTRMM. Updated macros in SGEMM, DGEMM, STRMM.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
437c7d64f2
Merge pull request #913 from dpfoose/develop
Small change to allow compiling with USE_OPENMP on MSVC
9 years ago
Zhang Xianyi
ca5c25c870
Merge pull request #907 from jeromerobert/bug786
Fix z/ctrmv stack allocation on AMD bulldozer and barcelona target
9 years ago
Zhang Xianyi
4a30a2584a
Merge pull request #897 from ksraste/develop
STRSM optimized for MSA
9 years ago
Daniel Patrick Foose
a94f2b7848
Change to allow compiling with USE_OPENMP on MSVC
MSVC treats the declaration of omp_in_parallel and omp_get_num_procs without the modifiers __declspec(dllimport) and __cdecl as a redefinition.
9 years ago
Jerome Robert
d346c533b1
Fix z/ctrmv stack allocation on AMD bulldozer and barcelona target
* Hopefully, because this was found by trial and error (dark magic)
* Ref #786
9 years ago
Werner Saar
f04af36ad0
Merge pull request #898 from wernsaar/develop
added experimental support for optimized lapack fortran functions
9 years ago
Werner Saar
41000c8443
added directory for optimized lapack fortran codes and added dlaqr5.f
9 years ago
Kaustubh Raste
011431b9d7
STRSM optimized for MSA
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
9 years ago
Kaustubh Raste
c8a7860eb3
STRSM optimized
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
9 years ago
Zhang Xianyi
2daad2bcb5
Merge pull request #893 from biddisco/develop
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PRO…
9 years ago
Zhang Xianyi
bac478d17e
Merge pull request #891 from rndfax/develop
mips64/axpy: fix error when INCY == 0
9 years ago
John Biddiscombe
053044ae4d
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR
If OpenBLAS is built using add_subdirectory(OpenBlas) as part of another project
then the paths set by CMAKE_XXX_DIR are relative to the parent project
and not the OpenBLAS project.
9 years ago
Aleksey Kuleshov
fca66262c4
mips64/axpy: fix error when INCY == 0
9 years ago
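The commit above does not say what went wrong when INCY == 0. For context, the sketch below shows how a reference-style AXPY loop (y := alpha*x + y) behaves with a zero stride on y: every update lands on the same element. This is an assumption about the expected semantics, written in Python for brevity; the name `axpy_reference` is hypothetical and this is not the mips64 kernel.

```python
def axpy_reference(n, alpha, x, incx, y, incy):
    """Reference-style AXPY on Python lists, using 0-based indexing."""
    if n <= 0 or alpha == 0:
        return y

    # Negative strides start from the far end, as in the reference BLAS.
    ix = 0 if incx >= 0 else (1 - n) * incx
    iy = 0 if incy >= 0 else (1 - n) * incy

    for _ in range(n):
        y[iy] += alpha * x[ix]
        ix += incx
        iy += incy  # with incy == 0 this never moves, so y[0] accumulates
    return y


y = [1.0]
axpy_reference(3, 2.0, [1.0, 2.0, 3.0], 1, y, 0)
print(y)  # [13.0], i.e. 1 + 2 * (1 + 2 + 3)
```

A vectorized or unrolled kernel that loads several y elements at once cannot assume distinct destinations in this case, which is one plausible reason a zero stride needs separate handling.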
Werner Saar
412bcd187a
optimized dtrsm_logic_LT_16x4_power8.S and dtrsm_macros_LT_16x4_power8.S
9 years ago
Werner Saar
bd06b246cc
Merge pull request #890 from wernsaar/develop
optimized dtrsm_kernel_LT for POWER8
9 years ago
Werner Saar
8b140220c8
optimized dtrsm_kernel_LT for POWER8
9 years ago
Werner Saar
318cad9c37
added trsm benchmarks for POWER8 to benchmark/Makefile
9 years ago
Werner Saar
8fb5a1aaff
added optimized dtrsm_LT kernel for POWER8
9 years ago
Zhang Xianyi
7d0358475d
Merge the patch for musl libc.
9 years ago