2023 Commits (cd8ac192a901b38980755583faaa35559df7910a)

Author SHA1 Message Date
  Martin Kroeker dd04143d4a Merge pull request #2328 from martin-frbg/ppc9 6 years ago
  Martin Kroeker f3a6164bff Merge pull request #2324 from antonblanchard/power9_segv 6 years ago
  Martin Kroeker dedd822d1a Fix caxpy/caxpyc naming in localentry 6 years ago
  Martin Kroeker 2181fb7047 Fix caxpy/caxpyc naming in localentry 6 years ago
  Martin Kroeker a9b62c03f8 Substitute precompiled gcc7 codes only when gcc is older than 9.x 6 years ago
  Martin Kroeker 97762234f9 Add variable for gcc >=9 test 6 years ago
  wjc404 934e601e93 Update dgemm_kernel_4x8_skylakex_2.c 6 years ago
  Anton Blanchard cf2a8e410c Fix SEGV in cdot_power9 6 years ago
  wjc404 eb1e9c8c92 some optimizations 6 years ago
  Andreas Arnez d117dfd505 Change bad usage of "asum" to "sum" in ZARCH versions of ?sum 6 years ago
  Martin Kroeker b09b5be0a4 Merge pull request #2315 from ewanglong/develop 6 years ago
  Wang, Long bfb5fbdb4d revised fix windows compatible for #2313 6 years ago
  Martin Kroeker 08fa83aba2 Merge pull request #2312 from martin-frbg/power8be 6 years ago
  Wang, Long 1191db1a49 For the sake of windows compatible, used "unsigned long long" to ensure 64-bit length 6 years ago
  Wang, Long 0caf1434c9 Fix the integer overflow issue for large matrix size 6 years ago
  Martin Kroeker cad0d150db Define alternate kernels for big-endian POWER8 6 years ago
  Martin Kroeker eba0aeb7cd Fix compilation for big-endian POWER8 6 years ago
  Martin Kroeker 0c07c356c1 Define alternate kernels for big-endian PPC440 6 years ago
  Martin Kroeker 3e67017ac8 Merge pull request #2309 from martin-frbg/ppc970-be 6 years ago
  Martin Kroeker b3ac6ee222 Define alternate kernels for big-endian PPC970 6 years ago
  Martin Kroeker 71e96163db Merge pull request #2305 from wjc404/develop 6 years ago
  wjc404 819e852ae7 AVX512 CGEMM & ZGEMM kernels 6 years ago
  Martin Kroeker 4c6a457358 Merge pull request #2300 from wjc404/develop 6 years ago
  wjc404 836c414e22 optimizations of software prefetching 6 years ago
  Martin Kroeker 3cd97f1a80 Merge pull request #2301 from martin-frbg/ppc8be 6 years ago
  wjc404 430c11e135 Add files via upload 6 years ago
  wjc404 fbacd2605d optimizations via software prefetches 6 years ago
  Martin Kroeker 68597002ea The assembly microkernel is not safe to use on ELFv1 6 years ago
  Martin Kroeker d2a6285549 The assembly microkernel is not safe to use on ELFv1 6 years ago
  Martin Kroeker d999688d1a The assembly microkernel is not safe to use on ELFv1 6 years ago
  Martin Kroeker 928fe1b28e The assembly microkernel is not safe to use on ELFv1 6 years ago
  wjc404 1df9a2013d new sgemm kernel for skylakex 6 years ago
  Martin Kroeker 85ccdce8c4 Remove the IOS fallbacks to generic C kernels 6 years ago
  wjc404 6ff013bae0 native support for icopy_4 6 years ago
  wjc404 0d669e04bb Update dgemm_kernel_8x8_skylakex.c 6 years ago
  wjc404 17cdd9f9e1 some correction 6 years ago
  wjc404 6bcb06fcb1 make further changes to icopy_8 easier 6 years ago
  wjc404 b7315f8401 Add files via upload 6 years ago
  wjc404 9b19e9e1b0 Update dgemm_kernel_8x8_skylakex.c 6 years ago
  wjc404 6bd67ddbab Update dgemm_kernel_8x8_skylakex.c 6 years ago
  wjc404 844629af57 Add files via upload 6 years ago
  Martin Kroeker a448884a63 Remove automatic label postfixes from macro included only once 6 years ago
  Martin Kroeker 3a2df19db6 Fix accidental duplication of jump instruction 6 years ago
  Martin Kroeker d2093a40d3 Merge pull request #2277 from martin-frbg/issue2275 6 years ago
  Martin Kroeker 56837e9d92 Make local labels in macro compatible with the xcode assembler 6 years ago
  Martin Kroeker 5e244d80f2 Merge pull request #2271 from quickwritereader/strmm_fix 6 years ago
  AbdelRauf ede5efebab trmm fix 6 years ago
  Martin Kroeker 596a22325a Fix prologue of power9 assembly cdot(c) kernel to provide cdotc 6 years ago
  Martin Kroeker 7f58f3ad0e Fix mis-edits in the gcc-derived power8 caxpy kernel 6 years ago
  Martin Kroeker 673e5a0495 Replace several POWER8/9 C kernels with their gcc7-generated assembly versions (#2263) 6 years ago