Zhiyong Dang
3716267124
Change _STDC_VERSION__ to __STDC_VERSION__
Change-Id: Id3fa4e8d9eedd4ef7230df69b611e7f397301a42
7 years ago
Martin Kroeker
20c6c38e51
Merge branch 'develop' into atomic
7 years ago
Martin Kroeker
8ec28ff461
Remove unguarded use of _Atomic and fix tabbing
7 years ago
Martin Kroeker
bb9876db33
Fix thread races and infinite looping on systems with many cpus
On systems with more than 64 CPUs, blas_quickdivide will sometimes return zero, which creates bogus workloads when used for the stride calculation. This then leads to threads spinning incessantly, waiting for a status change that never happens, as seen in #1497.
This patch also fixes several data races that were found by helgrind and/or tsan while debugging the issue.
7 years ago
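The failure mode described in bb9876db33 can be sketched in a few lines of C. This is not OpenBLAS code: `demo_divide` is a hypothetical stand-in for a divide helper that can return zero, and `demo_partition` is an assumed partitioning loop. It only illustrates why a zero width has to be rejected before it is used as a stride.

```c
#include <assert.h>

/* Hypothetical stand-in for a fast divide that misbehaves for large
 * divisors: it returns 0 once the divisor exceeds 64. */
static unsigned demo_divide(unsigned x, unsigned y) {
    return (y > 64u) ? 0u : x / y;
}

/* Split n items over at most nthreads workers. Writes the boundaries
 * into range[] (which needs nthreads + 1 entries) and returns the
 * number of workers that actually received work. */
static int demo_partition(unsigned n, unsigned nthreads, unsigned *range) {
    unsigned i = 0, used = 0;
    assert(nthreads > 0 && range != 0);
    range[0] = 0;
    while (i < n && used < nthreads) {
        unsigned left = nthreads - used;
        unsigned width = demo_divide(n - i + left - 1, left);
        if (width == 0) width = 1;          /* guard: never advance by zero */
        if (width > n - i) width = n - i;   /* do not run past the end */
        i += width;
        range[++used] = i;
    }
    return (int)used;
}
```

Without the guard, a zero width would leave `i` unchanged and hand out empty ranges, which is the kind of bogus workload the commit message describes.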
Martin Kroeker
40160ff3c1
Use _Atomic instead of volatile for thread safety where C11 is supported
7 years ago
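A minimal sketch of the guarded `_Atomic` pattern that 40160ff3c1 and the follow-up commits refer to, assuming a hypothetical `demo_slot_t` type and `DEMO_ATOMIC` macro (neither is taken from OpenBLAS). Note that an undefined identifier evaluates to 0 inside `#if`, so a misspelled `_STDC_VERSION__` would silently select the fallback branch.

```c
#include <assert.h>

/* Use the C11 _Atomic qualifier only when the compiler advertises C11
 * and does not opt out of atomics; otherwise fall back to volatile. */
#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 201112L) && \
    !defined(__STDC_NO_ATOMICS__)
#define DEMO_ATOMIC _Atomic
#define DEMO_HAS_C11_ATOMICS 1
#else
#define DEMO_ATOMIC volatile
#define DEMO_HAS_C11_ATOMICS 0
#endif

/* Hypothetical per-thread slot whose status is polled by other threads. */
typedef struct {
    DEMO_ATOMIC int status;
} demo_slot_t;

static void demo_set_status(demo_slot_t *slot, int value) {
    assert(slot != 0);
    slot->status = value;   /* atomic store when _Atomic is in effect */
}

static int demo_get_status(demo_slot_t *slot) {
    assert(slot != 0);
    return slot->status;    /* atomic load when _Atomic is in effect */
}
```

The `volatile` fallback keeps older compilers building, but it does not provide the ordering or atomicity guarantees of the `_Atomic` branch.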
Andrew
9fa986337d
add missing brackets to silence indentation warnings gcc721
7 years ago
Andrew
d602b99386
LAPACK helpers in C that need care too
7 years ago
Martin Kroeker
c7a8512d12
CMake fixes for DYNAMIC_ARCH builds and whitespace in path names (#1323)
* prebuild.cmake: Put quotes around path names that may contain whitespace
(Copied from alexkaratakis' PR #1295)
* kernel/CMakeLists.txt: Fix common_lapack header inclusion and DYNAMIC_ARCH generation of ?neg_tcopy and ?laswp_ncopy files
* lapack/CMakeLists.txt: Use correct template for ?laswp_(plus,minus) functions
8 years ago
Sacha Refshauge
37858d1146
Fix threading usage in CMake: s/SMP/USE_THREAD/
8 years ago
Isuru Fernando
d245caa49a
Support out-of-source build
8 years ago
Dan Horák
56762d5e4c
add lapack laswp for zarch
8 years ago
Ashwin Sekhar T K
3918d17025
LAPACK: Fix lapack-test errors in ARM64 threaded version
8 years ago
Werner Saar
209b63197e
prepared lapack/lauum for UNROLL values that are not a power of two
8 years ago
Werner Saar
c81dc6322f
prepared lapack/potrf functions for UNROLL values that are not a power of two
8 years ago
Werner Saar
3e1bbd6b5f
prepared lapack/getrf functions for UNROLL values that are not a power of two
8 years ago
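The three commits above do not show their diffs here. One common reason such a change is needed is that block sizes are often rounded up to a multiple of the unroll factor with a bit mask, which is only correct for powers of two. The sketch below illustrates that assumption with hypothetical helper names; it is not taken from the lapack sources.

```c
#include <assert.h>

/* Mask-based rounding: valid only when unroll is a power of two. */
static long demo_round_up_pow2(long n, long unroll) {
    assert(n >= 0 && unroll > 0 && (unroll & (unroll - 1)) == 0);
    return (n + unroll - 1) & ~(unroll - 1);
}

/* Division-based rounding: valid for any positive unroll. */
static long demo_round_up_any(long n, long unroll) {
    assert(n >= 0 && unroll > 0);
    return ((n + unroll - 1) / unroll) * unroll;
}
```

For example, with `n = 10` and an unroll of 6, the mask expression `(10 + 5) & ~5` gives 10, which is not a multiple of 6, while the division-based form gives 12.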
John Biddiscombe
053044ae4d
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR
If OpenBLAS is built using add_subdirectory(OpenBlas) as part of another project
then the paths set by CMAKE_XXX_DIR are relative to the parent project
and not the OpenBLAS project.
9 years ago
Werner Saar
956be69e1d
optimized getrf_single.c for POWER8
9 years ago
Werner Saar
6a2bde7a2d
optimized dgemm and dgetrf for POWER8
9 years ago
Shivraj Patil
2c3dfe2bf3
Added support for MIPS P5600 (32-bit) and I6400 (64-bit) cores.
Separated mips and mips64 files.
Added configuration support for 32-bit MIPS.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
buffer51
4e1b521e27
Fix lapack complex implementation of lauu2 and potf2 for Android (use FLOAT instead of FLOAT[2] as imaginary part is not used).
10 years ago
Zhang Xianyi
13f0f8c10e
Refs #723. Avoid out-of-bounds access in getf2.
9 years ago
Hank Anderson
0553476fba
Added TRANS defines for complex sources in lapack.
10 years ago
Hank Anderson
0d8e227ea7
Changed strategy for setting preprocessor definitions.
Instead of generating separate object files for each permutation of
defines for a source file, GenerateNamedObjects now writes an entirely
new source file and inserts the defines as C #define statements.
This solves a problem I ran into with ar.exe where it was refusing to
link objects that had the same filename despite having different paths.
10 years ago
Hank Anderson
f3f2b3d768
Added complex and single netlib-lapack fortran sources to lapack.cmake.
10 years ago
Hank Anderson
67e39bd8fb
Added mangled complex filenames to interface and lapack CMakeLists.txt.
10 years ago
Hank Anderson
4662a0b13a
Changed generate functions to iterate through a list of float types.
This will generate obj files for SINGLE/DOUBLE/COMPLEX/DOUBLE COMPLEX.
10 years ago
Hank Anderson
e74462a3f5
Moved declarations to start of functions to satisfy MSVC C89 implementation.
10 years ago
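As a small illustration of the C89 constraint mentioned in e74462a3f5, the hypothetical function below declares every local at the top of its block, before the first statement. The function name and body are made up for the example.

```c
#include <assert.h>

/* C89 style: all declarations come before the first statement in the
 * block, so a C89-only compiler accepts the function. */
static int demo_sum_c89(const int *v, int n) {
    int i;
    int total = 0;

    assert(v != 0 || n == 0);
    for (i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}
```

Writing `for (int i = 0; ...)` or declaring `total` after the `assert` would require C99 or later.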
Hank Anderson
056ba26755
Changed a number of inline calls to use __inline.
MSVC doesn't implement C99, so can't use the inline keyword. __inline
appears to work in MSVC and GCC.
10 years ago
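A minimal sketch of the `__inline` spelling described in 056ba26755, using a hypothetical helper. GCC and Clang accept `__inline` as an alternate keyword, and MSVC provides it as its own extension, so the same line is expected to build with all three.

```c
#include <assert.h>

/* __inline instead of the C99 inline keyword, so that a compiler
 * without C99 support can still build the helper. */
static __inline int demo_min(int a, int b) {
    assert(a == a && b == b);   /* trivial check; keeps assert.h in use */
    return (a < b) ? a : b;
}
```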
Hank Anderson
3b20b62423
Fixed trti2 name.
10 years ago
Hank Anderson
6ddbfea700
Added generic laswp object.
10 years ago
Hank Anderson
e8c39138c6
Removed return value from GenerateNamedObjects.
It sets DBLAS_OBJS directly to save a bunch of list appending in the
CMakeLists.txt files.
10 years ago
Hank Anderson
13d2d48e67
Added yet another naming scheme for lapack functions.
10 years ago
Hank Anderson
373a1bdadb
Converted lapack/Makefile to cmake.
10 years ago
Timothy Gu
6c2ead30f0
Remove all trailing whitespace except lapack-netlib
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
11 years ago
wernsaar
c26bbee489
enabled and tested optimized trtri lapack functions
11 years ago
wernsaar
c4ccb3fbb2
removed lapack/getri because it was never used
11 years ago
wernsaar
a748d3a75d
enabled optimized trti2 lapack functions again
11 years ago
wernsaar
dbaeea7b59
enabled lauu2 and lauum lapack functions again
11 years ago
wernsaar
4f98f8c9b3
enabled and tested optimized potrf lapack functions
11 years ago
wernsaar
536875d463
enabled and tested optimized getrs lapack functions
11 years ago
wernsaar
ac029f81b3
enabled and tested optimized dgetrf function
11 years ago
wernsaar
a35a1a9ae7
changed makefiles for lapack development
11 years ago
wernsaar
4be4db590c
Merge remote branch 'origin/develop' into armv7
12 years ago
wernsaar
fe5f46c330
added experimental support for ARMV8
12 years ago
Zhang Xianyi
5048a80032
Refs #283. Fixed the incorrect usage of long data type for Windows 64.
12 years ago
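The `long` problem referenced in 5048a80032 comes from the 64-bit Windows data model, where `long` stays 32 bits wide while pointers are 64 bits. A minimal sketch of the usual remedy, assuming a hypothetical `demo_int_t` typedef rather than whatever type the actual fix used:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical index type: intptr_t is pointer-sized on both LP64
 * (Linux, macOS) and LLP64 (64-bit Windows), unlike long. */
typedef intptr_t demo_int_t;

/* Distance in bytes between two pointers into the same buffer. */
static demo_int_t demo_byte_offset(const char *base, const char *p) {
    assert(base != 0 && p != 0);
    return (demo_int_t)(p - base);
}
```

Storing such an offset in a `long` would truncate it on 64-bit Windows once a buffer grows past 2 GB.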
Zhang Xianyi
73770e60b8
Refs #309. Fixed trtri_U single thread computational bug.
12 years ago
wernsaar
95aedfa0ff
added missing file arm/Makefile in lapack/laswp
12 years ago
Zhang Xianyi
a07cc39571
Refs #266. Fixed the compilation bug with Open64 5.0.
12 years ago
Zhang Xianyi
fd0c388681
Refs #191. A workaround for dtrtri_U single thread bug.
This function caused the failure of ERKALE serial test.
I replaced it with LAPACK source code.
12 years ago
Zhang Xianyi
32d2ca3035
Refs #214, #221, #246. Fixed the getrf overflow bug on Windows.
I used a smaller threshold since the stack size is 1MB on Windows.
12 years ago