AbdelRauf
cdbfb891da
new sgemm 8x16
6 years ago
AbdelRauf
148c4cc5fd
conflict resolve
6 years ago
AbdelRauf
d0c3543c3f
power9 zgemm ztrmm optimized
6 years ago
AbdelRauf
a469b32cf4
sgemm pipeline improved, zgemm rewritten without inner packs, ABI lxvx v20 fixed with vs52
6 years ago
AbdelRauf
8fe794f059
improved zgemm power9 based on power8
6 years ago
AbdelRauf
47f892198c
conflict resolve
6 years ago
AbdelRauf
628b335e83
Merge branch 'develop' of https://github.com/quickwritereader/OpenBLAS into develop
6 years ago
AbdelRauf
0f105dd8a5
sgemm/strmm
6 years ago
Martin Kroeker
7c51cc8527
Merge branch 'develop' into develop
6 years ago
AbdelRauf
853a18bc17
power9 makefile. dgemm based on the power8 kernel with the following changes: 32x unrolled 16x4 kernel and 8x4 kernel using (lxv stxv butterfly rank1 update). Improvement from 17 to 22-23 GFLOPS. dtrmm cases were added into dgemm itself
6 years ago
Martin Kroeker
3ae122e2c7
Merge pull request #2069 from aixoss/aix-asm-change
AIX asm syntax changes needed for shared object creation
6 years ago
Ayappan P
b043a5962e
AIX asm syntax changes needed for shared object creation
6 years ago
Martin Kroeker
8502030e5e
Merge pull request #2064 from embray/cygwin/use-tls-thread-memory-cleanup
Fix for #2063
6 years ago
Erik M. Bray
8ba9e2a61a
Also call CloseHandle on each thread, as well as on the event so as to not leak thread handles.
6 years ago
Erik M. Bray
4ad694eda1
Fix for #2063: The DllMain used in Cygwin did not run the thread memory
pool cleanup upon THREAD_DETACH which is needed when compiled with
USE_TLS=1.
6 years ago
Martin Kroeker
dff4a197a5
Merge pull request #2058 from xsacha/patch-3
Change 64-bit detection as explained in #2056
6 years ago
Martin Kroeker
a5425575b1
Merge pull request #2060 from embray/cygwin/readenv
Use POSIX getenv on Cygwin
6 years ago
Erik M. Bray
1006ff8a7b
Use POSIX getenv on Cygwin
The Windows-native GetEnvironmentVariable cannot be relied on, as
Cygwin does not always copy environment variables set through Cygwin
to the Windows environment block, particularly after fork().
6 years ago
Martin Kroeker
4fc17d0d75
Trivial typo fix
as suggested in #2022
6 years ago
Sacha
c3e30b2bc2
Change 64-bit detection as explained in #2056
6 years ago
Martin Kroeker
03d7110900
Merge pull request #2042 from maomao194313/develop
add TARGET support for HiSilicon tsv110 CPUs
6 years ago
Martin Kroeker
3ce28fb81a
Merge pull request #2055 from martin-frbg/atomid
Add CPUID data for Intel Denverton (as Nehalem)
6 years ago
Martin Kroeker
04f2226ea6
Add Intel Denverton
6 years ago
Martin Kroeker
b1393c7a97
Add Intel Denverton
for #2048
6 years ago
maomao194313
7e3eb9b25d
make DYNAMIC_ARCH=1 package work on TSV110
6 years ago
maomao194313
f074d7d146
make DYNAMIC_ARCH=1 package work on TSV110.
6 years ago
Martin Kroeker
f18ab6c17b
Merge pull request #2051 from martin-frbg/issue2048
Make TARGET=GENERIC compatible with DYNAMIC_ARCH=1
6 years ago
Martin Kroeker
946ec6c3b8
Merge pull request #2050 from kencu/PowerMacFix
PowerMac 970 fixes
6 years ago
Martin Kroeker
5b95534afc
Make TARGET=GENERIC compatible with DYNAMIC_ARCH=1
for issue #2048
6 years ago
ken-cunningham-webuse
f7a06463d9
common_power.h: force DCBT_ARG 0 on PPC970 Darwin
without this, we see
../kernel/power/gemv_n.S:427:Parameter syntax error
and many more similar entries
that relate to this assembly instruction:
dcbt 8, r24, r18
this change sets DCBT_ARG = 0
and OpenBLAS builds through to completion on PowerMac 970
Tests pass
6 years ago
ken-cunningham-webuse
b0c714ef60
param.h : enable defines for PPC970 on DarwinOS
fixes:
gemm.c: In function 'sgemm_':
../common_param.h:981:18: error: 'SGEMM_DEFAULT_P' undeclared (first use in this function)
#define SGEMM_P SGEMM_DEFAULT_P
^
6 years ago
Martin Kroeker
8d3d29e4d7
Merge pull request #2049 from Celelibi/fix_crash_sgemm_sse_x64
Fix crash in sgemm SSE/nano kernel on x86_64
6 years ago
Celelibi
b7f59da42d
Fix crash in sgemm SSE/nano kernel on x86_64
Fix bug #2047.
Signed-off-by: Celelibi <celelibi@gmail.com>
6 years ago
Martin Kroeker
db3dc9e282
Merge pull request #2046 from kencu/powermac
ctest.c : add __POWERPC__ for PowerMac
6 years ago
ken-cunningham-webuse
4290afdae2
ctest.c : add __POWERPC__ for PowerMac
6 years ago
Martin Kroeker
4741ce803b
Merge pull request #2045 from martin-frbg/2033-3
Do not compile in AVX512 check if AVX support is disabled
6 years ago
Martin Kroeker
11cfd0bd75
Do not compile in AVX512 check if AVX support is disabled
The xgetbv function depends on NO_AVX being undefined - we could change that too, but that combination is unlikely to work anyway
6 years ago
Martin Kroeker
651ab01d2b
Merge pull request #2044 from martin-frbg/issue2043
Fix module definition conflicts between LAPACK and ReLAPACK
6 years ago
Martin Kroeker
d7b2c53c0b
Merge pull request #2039 from brada4/meminit
Address warning in memory.c
6 years ago
Martin Kroeker
e4864a8933
Fix module definition conflicts between LAPACK and ReLAPACK
for #2043
6 years ago
Martin Kroeker
10d841d8b9
Merge pull request #2026 from martin-frbg/trmv_threads
Correct range limiting in trmv_thread and re-enable TRMV multithreading
6 years ago
Martin Kroeker
12f2b76748
Merge pull request #2038 from martin-frbg/issue2035
Improve handling of NO_STATIC and NO_SHARED
6 years ago
Martin Kroeker
6c83b878f6
Merge pull request #2040 from martin-frbg/locks2002
Restore locking optimizations for OpenMP case
6 years ago
maomao194313
fb4dae7124
add TARGET support for HiSilicon tsv110 CPUs
6 years ago
maomao194313
760842dda1
add TARGET support for HiSilicon tsv110 CPUs
6 years ago
maomao194313
53f482ee72
add TARGET support for HiSilicon tsv110 CPUs
6 years ago
maomao194313
783ba8058f
HiSilicon tsv110 CPUs optimization branch
add HiSilicon tsv110 CPUs optimization branch
6 years ago
Martin Kroeker
af480b02a4
Restore locking optimizations for OpenMP case
restore another accidentally dropped part of #1468 that was missed in #2004 to address performance regression reported in #1461
6 years ago
Andrew
e4a79be6bb
address warning introduced with #1814 et al
6 years ago
Andrew
e5c316c6b9
init
6 years ago