Martin Kroeker
c460027dbe
Merge pull request #1325 from grisuthedragon/patch-1
Update README.md to include POWER8
8 years ago
Martin Köhler
bfa9b9f6b2
Update README.md
Add POWER 8 to the list of additional architectures.
8 years ago
Martin Kroeker
c7a8512d12
CMake fixes for DYNAMIC_ARCH builds and whitespace in path names (#1323)
* prebuild.cmake: Put quotes around path names that may contain whitespace
(Copied from alexkaratakis' PR #1295)
* kernel/CMakeLists.txt: Fix common_lapack header inclusion and DYNAMIC_ARCH generation of ?neg_tcopy and ?laswp_ncopy files
* lapack/CMakeLists.txt: Use correct template for ?laswp_(plus,minus) functions
8 years ago
Martin Kroeker
db72ad8f6a
Merge pull request #1320 from timmoon10/develop
2D thread distribution for multi-threaded GEMMs
8 years ago
Martin Kroeker
97ecd4996a
Merge pull request #1319 from martin-frbg/issue601
Fix out-of-bounds memory accesses exposed by xccblat3 testcase
8 years ago
Martin Kroeker
1eb43cccad
Merge pull request #1317 from martin-frbg/power8-asm
Save and restore VSX registers
8 years ago
Martin Kroeker
9d92f526dd
Comment out a code block that performs out-of-bounds memory accesses
...and does not appear to be needed even when it stays within the bounds of the array
8 years ago
Martin Kroeker
514d237257
Merge pull request #1279 from xsacha/develop
CMake improvements
8 years ago
Tim Moon
30486a356c
Reduce number of data partitions in n.
8 years ago
Martin Kroeker
e1b2502840
Merge pull request #1316 from timmoon10/develop
Variable thread count for multi-threaded GEMMs
8 years ago
Tim Moon
9de52b489a
Cleaning up and documenting multi-threaded GEMM code.
8 years ago
Tim Moon
860dcfc703
Use 2D thread distribution for small GEMMs.
Allows maximum use of available cores if one of M and N is small and the other is large.
8 years ago
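Note on the commit above: the idea of a 2D thread distribution can be sketched as factoring the thread count into a grid over M and N, so that a small dimension does not cap the number of usable threads. The following is only an illustrative sketch, not the OpenBLAS implementation; the function name `thread_grid` and the "prefer near-square tiles" heuristic are assumptions made for this example.

```python
def thread_grid(m, n, nthreads):
    """Choose (threads_m, threads_n) with threads_m * threads_n == nthreads.

    Hypothetical heuristic: among all exact factorizations of nthreads,
    pick the one whose per-thread tile is closest to square.
    """
    best = (nthreads, 1)
    best_cost = None
    for tm in range(1, nthreads + 1):
        if nthreads % tm:
            continue  # only exact factorizations of the thread count
        tn = nthreads // tm
        tile_m = -(-m // tm)  # ceiling division: rows per thread
        tile_n = -(-n // tn)  # ceiling division: columns per thread
        cost = abs(tile_m - tile_n)
        if best_cost is None or cost < best_cost:
            best, best_cost = (tm, tn), cost
    return best


if __name__ == "__main__":
    # Tall, narrow problem: all threads end up splitting M.
    print(thread_grid(10000, 8, 8))
    # Roughly square problem: threads are spread over both M and N.
    print(thread_grid(1000, 1000, 4))
```

With a 1D split along N only, a problem with N = 8 could never keep more than 8 threads busy; letting the grid extend along M removes that limit.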
Martin Kroeker
f96afd94b0
Fix out-of-bounds accesses where the data should be zero anyway
8 years ago
Martin Kroeker
ebe84215e4
Merge pull request #1318 from pv/potrf-smoketest
Add trivial smoketest for xpotrf
8 years ago
Pauli Virtanen
845e6d750f
Add trivial smoketest for xpotrf
8 years ago
Tim Moon
a89d6711c6
Increasing flexibility of GEMM benchmark.
m, n, and k can be set to arbitrary constants. A and B matrices can be transposed independently.
8 years ago
Martin Kroeker
9c017a2218
Save and restore VSX registers
8 years ago
Tim Moon
0e6b11b708
Merge https://github.com/timmoon10/OpenBLAS into develop
8 years ago
Tim Moon
6aaa107865
Reducing threads for multi-threaded GEMMs on small matrices.
8 years ago
Martin Kroeker
00c42dc815
Merge pull request #1314 from martin-frbg/nofortran-fix-2
Rewrite NOFORTRAN conditionals
8 years ago
Martin Kroeker
79e754e548
Rewrite NOFORTRAN conditionals
... so that they do not trigger accidentally when NOFORTRAN is empty/unset
8 years ago
Martin Kroeker
2ccd7f6e0c
Merge pull request #1310 from sva-img/develop
Added mips I6500 core
8 years ago
Shivraj Patil
e3d844b062
Added mips I6500 core
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
8 years ago
Martin Kroeker
def146efed
Merge pull request #1308 from sebastien-villemot/develop
Add support for TARGET=ZARCH_GENERIC and TARGET=Z13
8 years ago
Sébastien Villemot
7543e578a4
Add support for TARGET=ZARCH_GENERIC and TARGET=Z13
8 years ago
Martin Kroeker
601c71fe54
Merge pull request #1304 from martin-frbg/aix-build-fixes
(Plain make) build system fixes for AIX
8 years ago
Martin Kroeker
3810a6fd99
(Plain make) build system fixes for AIX
- retry fortran compiler test with aix-specific option if generic -m32/-m64 fails
- pass any custom ARFLAGS to lapack
- no addition of -m32/-m64 to the CFLAGS and FFLAGS on AIX
8 years ago
Martin Kroeker
742f54c235
Merge pull request #1303 from martin-frbg/imatcopy-rowscols
Fix cols/rows mixup in omatcopy 2nd step for BlasTrans cases
8 years ago
Martin Kroeker
d674fbb4c7
Fix cols/rows mixup in omatcopy 2nd step for BlasTrans cases
Equivalent of #1244 (issue #899) for the non-complex cases. Fixes #1289
8 years ago
Martin Kroeker
2922c15f36
Merge pull request #1302 from martin-frbg/nofortran-fix
Remove default FEXTRALIBS in NOFORTRAN case
8 years ago
Martin Kroeker
3a245a376f
Remove default FEXTRALIBS in NOFORTRAN case
8 years ago
Martin Kroeker
46c9357c72
Merge pull request #1288 from quickwritereader/develop
Optimized standard BLAS Level 1 and 2 (excluding nrm2 functions) for z13 (double precision). Issue #884
8 years ago
Martin Kroeker
1c3e2d3dd5
Merge pull request #1293 from embray/cygwin/install
More canonical installation on Cygwin
8 years ago
Martin Kroeker
f66d908282
Merge pull request #1299 from martin-frbg/race_fixes
Fix thread data races uncovered by gcc thread sanitizer
8 years ago
Martin Kroeker
ba1f91f17b
Convert another caller of "allocation" to LOCK_COMMAND
... as the "allocation" code jumped to now does UNLOCK_COMMAND instead of blas_unlock
8 years ago
Martin Kroeker
f460776f0f
Fix thread data races
8 years ago
Martin Kroeker
e882f3d6f3
Fix thread data race in memory.c
8 years ago
Erik M. Bray
dddedbab5d
More canonical installation on Cygwin:
* The DLL is named cygopenblas.dll, not libopenblas.dll
* The import lib (still called libopenblas.dll.a) is installed
8 years ago
Abdurrauf
1cfdb2295d
Optimized standard BLAS Level 1 and 2 (excluding nrm2 functions) for z13 (double precision)
8 years ago
Martin Kroeker
00740c0e34
Merge pull request #1290 from martin-frbg/imatcopy
Use in-place transform shortcut only if matrix is square
8 years ago
Martin Kroeker
254db9bd7c
Use in-place transform shortcut only if matrix is square
8 years ago
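Note on the commit above: an element-swap transpose done in place is only valid when rows equal columns, because a non-square transpose changes the row stride of the storage. The sketch below illustrates that restriction on a flat row-major list; it is not the OpenBLAS imatcopy code, and the function name `transpose_flat` is hypothetical.

```python
def transpose_flat(a, rows, cols):
    """Transpose a row-major matrix stored as a flat list.

    Square matrices are transposed in place by swapping across the
    diagonal. Non-square matrices go through a separate output buffer,
    since the swap shortcut would index with the wrong stride.
    """
    if rows == cols:
        for i in range(rows):
            for j in range(i + 1, cols):
                a[i * cols + j], a[j * cols + i] = a[j * cols + i], a[i * cols + j]
        return a
    out = [0] * (rows * cols)
    for i in range(rows):
        for j in range(cols):
            # element (i, j) of the input becomes element (j, i) of the output,
            # whose row length is `rows`
            out[j * rows + i] = a[i * cols + j]
    return out


if __name__ == "__main__":
    print(transpose_flat([1, 2, 3, 4], 2, 2))
    print(transpose_flat([1, 2, 3, 4, 5, 6], 2, 3))
```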
Martin Kroeker
f2074f9ac1
Merge pull request #1286 from martin-frbg/baytrail
Fix coretype detection for Bay Trail Atom
8 years ago
Martin Kroeker
aece65ea29
Fix coretype detection for Bay Trail Atom
My earlier PR #982 appears to have been incomplete in this regard - fixes #1285
8 years ago
Sacha
ef64991506
Clean up config file writing.
8 years ago
Sacha
7a867082d8
Fix open_blas.config which was never working out-of-source. Remove need for gen_config_h.exe. If OpenMP is requested, do not silently ignore when it isn't available.
8 years ago
Sacha Refshauge
a1b87eac6b
Do not require Perl for MSVC if CMake >= 3.4
8 years ago
Sacha Refshauge
47ebce4d1a
Clean up, fix old typos. Simplify arch usages. Move system arch check to earlier position.
8 years ago
Sacha Refshauge
69b560751c
Improvements to previous commit (cross-compile).
Fix typos and bad if statements discovered in 0.2.20.
8 years ago
Sacha Refshauge
0a7a527a92
Add support for cross compiling.
Add support for not having host compiler as CMake cannot detect such a compiler.
Add support for not using getarch.
Successfully builds Android ARMV8. Any target can be added by supplying the TARGET_CORE config in prebuild.cmake.
8 years ago
Martin Kroeker
50715e8945
Merge pull request #1281 from sharkcz/armv8
fix detection of generic ARMv8 CPUs
8 years ago