Andrew
47deec2c1a
fix couple of dead assignment warnings
8 years ago
Martin Kroeker
43c0622e7b
Retire Piledriver/Steamroller/Excavator daxpy microkernels as well
related to issue #1332
8 years ago
Martin Kroeker
0623636c98
Use Sandybridge daxpy kernel on Haswell and Zen for now
The testcase from #1332 exposes a problem in daxpy_microk_haswell-2.c that is not seen with
any of the other Intel x86_64 microkernels.
8 years ago
Andrew
281a2b952f
warning cleanup (#1380)
* dead increments in driver/level2
* dead increments in kernel/generic
* part dead increments in kernel/x86_64
8 years ago
Martin Kroeker
8213385ab8
Work around compiler warnings for unused variables in the generic zgemm3m_Xcopy kernels
8 years ago
Martin Kroeker
db00a51e6b
Merge pull request #1371 from martin-frbg/develop
Add trivially optimized DSDOT for POWER8
8 years ago
martin
7a4b3cfbf8
Add trivially optimized DSDOT for POWER8
8 years ago
Martin Kroeker
6c77b5f267
Merge pull request #1369 from martin-frbg/dsdot
Add optimized dsdot to all other x86_64 kernels that use sdot.c
8 years ago
Andrew
441a9c8385
more dead increments clang4 scan-build deadcode.deadstores
8 years ago
Andrew
1236dbe5a6
Eliminate 2-8 dead increments code
8 years ago
Martin Kroeker
c92cd6d162
Add trivially optimized dsdot based on sdot
8 years ago
Martin Kroeker
cae5d9a20b
Add trivially optimized dsdot based on sdot
8 years ago
Martin Kroeker
3d891c3106
Add trivially optimized dsdot based on sdot
8 years ago
Martin Kroeker
4fbdcfa823
Add trivially optimized dsdot based on sdot
8 years ago
Martin Kroeker
1bb6a96ebc
Add trivially optimized dsdot based on sdot
8 years ago
Martin Kroeker
6bd163f37a
Add trivially optimized dsdot based on sdot
8 years ago
Martin Kroeker
f0333333d1
Add trivially optimized dsdot based on sdot
8 years ago
Andrew
e89b979b2c
fix spurious compiler warning fix (no code change)
8 years ago
Andrew
7e9b29b9b8
fix spurious compiler warning (no code change)
8 years ago
Martin Kroeker
6157d0902a
Merge pull request #1358 from martin-frbg/unused_vars
Clean up spurious unused variables in the kernels
8 years ago
Martin Kroeker
3fea849bbf
Remove unused variables from Haswell dtrmm and Bulldozer dtrsm
8 years ago
Martin Kroeker
8f177621bc
Remove unused variables at0...at3 from ?symv_U
8 years ago
Martin Kroeker
5f402b7759
Remove unused (loop?) variable j from the gemv_n_4 implementations
8 years ago
Martin Kroeker
65bf0a343c
Remove unused variable btpr
8 years ago
Martin Kroeker
acf3d34bc5
Silence an unused variable warning with a cast
The L2 cache size is not universally needed to assign default unrolling limits, but neither putting its declaration inside an ifdef nor cloning it into all ifdef sections that need it really makes sense here.
8 years ago
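The cast idiom mentioned in the commit above can be sketched roughly as follows. This is an illustrative sketch only, not the code from that commit: the function name `default_unroll_limit`, the macro `USE_L2_TUNING`, and the numeric values are all hypothetical.

```c
/* Hypothetical example: a variable that only some #ifdef branches read.
 * Casting it to void counts as a use, so compilers stop reporting it as
 * unused in builds where none of those branches are compiled in. */
int default_unroll_limit(void)
{
    int l2 = 256;     /* assumed L2 cache size in KB (made-up value) */
    int limit = 64;   /* assumed default unrolling limit (made-up value) */

    (void)l2;         /* silence the unused-variable warning */

#ifdef USE_L2_TUNING  /* hypothetical macro, not an actual build option */
    limit = l2 / 2;
#endif

    return limit;
}
```

The cast generates no code; it avoids both wrapping the declaration in its own ifdef and duplicating it in every branch that needs it.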
Martin Kroeker
ab87ee6b48
Merge pull request #1329 from martin-frbg/dsdot
(Trivial) optimized dsdot implementation for HASWELL
8 years ago
Martin Kroeker
a07807caac
Eliminate loop code when called as/from dsdot
8 years ago
Ashwin Sekhar T K
a0128aa489
ARM64: Convert all labels to local labels
While debugging/profiling applications using perf or other tools, the
kernels appear scattered in the profile reports. This is because the labels
within the kernels are not local and each label is shown as a separate
function.
To avoid this, all the labels within the kernels are changed to local
labels.
8 years ago
Martin Kroeker
0e2cf102e1
Fix 32bit HASWELL
8 years ago
Martin Kroeker
5e3e91d0fc
Split the microkernel workload into chunks of 32 floats for dsdot mode to limit loss of precision
8 years ago
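The chunking idea in the commit above can be illustrated with a minimal sketch. It assumes a single-precision kernel that accumulates in `float`; the names `sdot_chunk` and `dsdot_chunked` are hypothetical stand-ins, not the actual microkernel interface.

```c
#include <stddef.h>

/* Hypothetical stand-in for a single-precision dot-product microkernel:
 * it multiplies and accumulates entirely in float. */
static float sdot_chunk(size_t n, const float *x, const float *y)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += x[i] * y[i];
    return acc;
}

/* dsdot-style wrapper: feed the float kernel at most 32 elements at a
 * time and add the partial results in double, so float rounding error
 * can only build up within one short chunk. */
double dsdot_chunked(size_t n, const float *x, const float *y)
{
    const size_t chunk = 32;  /* chunk size named in the commit message */
    double total = 0.0;
    size_t i = 0;

    while (i < n) {
        size_t len = (n - i < chunk) ? (n - i) : chunk;
        total += (double)sdot_chunk(len, x + i, y + i);
        i += len;
    }
    return total;
}
```

Smaller chunks keep more of the summation in double precision at the cost of more kernel calls; 32 is simply the value the commit message states.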
Martin Kroeker
28c3fa8950
Add dsdot
8 years ago
Martin Kroeker
8ac87c1cb6
Implement DSDOT with unchanged sdot microkernels
8 years ago
Martin Kroeker
c7a8512d12
Cmake fixes for DYNAMIC_ARCH builds and whitespace in path names (#1323)
* prebuild.cmake: Put quotes around path names that may contain whitespace
(Copied from alexkaratakis' PR #1295)
* kernel/CMakeLists.txt: Fix common_lapack header inclusion and DYNAMIC_ARCH generation of ?neg_tcopy and ?laswp_ncopy files
* lapack/CMakeLists.txt: Use correct template for ?laswp_(plus,minus) functions
8 years ago
Martin Kroeker
97ecd4996a
Merge pull request #1319 from martin-frbg/issue601
Fix out-of-bounds memory accesses exposed by xccblat3 testcase
8 years ago
Martin Kroeker
1eb43cccad
Merge pull request #1317 from martin-frbg/power8-asm
Save and restore VSX registers
8 years ago
Martin Kroeker
9d92f526dd
Comment out a code block that performs out-of-bounds memory accesses
...and does not appear to be needed even when it stays within the bounds of the array
8 years ago
Martin Kroeker
514d237257
Merge pull request #1279 from xsacha/develop
CMake improvements
8 years ago
Martin Kroeker
f96afd94b0
Fix out-of-bounds accesses where the data should be zero anyway
8 years ago
Martin Kroeker
9c017a2218
Save and restore VSX registers
8 years ago
Shivraj Patil
e3d844b062
Added mips I6500 core
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
8 years ago
Martin Kroeker
46c9357c72
Merge pull request #1288 from quickwritereader/develop
Optimized standard Blas Level-1,2 (excluding nrm2 functions) for z13 (double precision). Issue 884
8 years ago
Abdurrauf
1cfdb2295d
Optimized standard Blas Level-1,2 (excluding nrm2 functions) for z13 (double precision)
8 years ago
Sacha Refshauge
47ebce4d1a
Clean up, fix old typos. Simplify arch usages. Move system arch check to earlier position.
8 years ago
Sacha Refshauge
69b560751c
Improvements to previous commit (cross-compile).
Fix typos and bad if statements discovered in 0.2.20.
8 years ago
Sacha Refshauge
11911fd941
Add kernel/Makefile.LA to CMake
8 years ago
Isuru Fernando
d3b677fe87
Add commonobjs
8 years ago
Isuru Fernando
505b218829
Merge remote-tracking branch 'upstream/develop' into dyn
8 years ago
Isuru Fernando
d9346930dd
Merge remote-tracking branch 'upstream/develop' into develop
8 years ago
Ashwin Sekhar T K
4899d67f7d
THUNDERX2T99: Fix clang compilation
8 years ago
Isuru Fernando
1d1854032b
Add missing EXCAVATOR
8 years ago