289 Commits (develop)

Author SHA1 Message Date
  Rajalakshmi Srinivasaraghavan c24ba8b1dd Optimize saxpy for POWER10 5 years ago
  Martin Kroeker 34c3c407ef label always_inline function as inline to silence a gcc warning 5 years ago
  Rajalakshmi Srinivasaraghavan ad745c0bae Optimize scopy/ccopy for POWER10 5 years ago
  Martin Kroeker a61c086408 Fix spurious trailing whitespace in comment 5 years ago
  Martin Kroeker f1a4071d8c Clean up STACKSIZE redefinition 5 years ago
  Martin Kroeker 97cf10062f Clean up STACKSIZE redefinition 5 years ago
  Martin Kroeker 17e288e18d Clean up STACKSIZE redefinition 5 years ago
  Martin Kroeker c1422f3e46 Clean up STACKSIZE redefinition 5 years ago
  Martin Kroeker d85b24e103 Clean up STACKSIZE redefinition 5 years ago
  Rajalakshmi Srinivasaraghavan 0826d68f93 POWER10: Change the packing format for bfloat16 5 years ago
  Martin Kroeker 2061f7fdff Rename "HALF" and "sh" to "BFLOAT16" and "sb" 5 years ago
  Martin Kroeker 9ae80490e0 rename "HALF" and "sh" to "BFLOAT16" and "sb" 5 years ago
  Martin Kroeker d314d1f49f Rename shgemm_kernel_power10.c to sbgemm_kernel_power10.c 5 years ago
  Rajalakshmi Srinivasaraghavan 2df4235e00 Optimize dcopy/zcopy for POWER10 5 years ago
  Rajalakshmi Srinivasaraghavan be43d2cb96 Optimize daxpy/zaxpy for POWER10 5 years ago
  Rajalakshmi Srinivasaraghavan 317ff27cda POWER10: Avoid setting accumulators to zero in gemm kernels 5 years ago
  Rajalakshmi Srinivasaraghavan f77b6a83f4 dgemv optimization for POWER10 5 years ago
  Rajalakshmi Srinivasaraghavan d557584b71 Fix compilation issues with clang on POWER 5 years ago
  Rajalakshmi Srinivasaraghavan 9be2688c78 Fix to store results in correct order for POWER10 GEMM kernels 5 years ago
  Martin Kroeker 6a2a60038c Merge pull request #2720 from martin-frbg/issue2694 5 years ago
  Martin Kroeker 251a09ec90 Typo fix 5 years ago
  Martin Kroeker 95d37e1575 Regroup the 32 and 64bit sections and restore 64bit CAXPY 5 years ago
  Martin Kroeker 3523bb778e Merge pull request #2721 from martin-frbg/p8align 5 years ago
  Martin Kroeker ca3561cab9 Add ifdefs around call to altivec microkernel 5 years ago
  Martin Kroeker 21072e502a Typo fix 5 years ago
  Martin Kroeker 661c6bfa5a Exclude altivec code paths if the compiler does not support them 5 years ago
  Martin Kroeker 0033f8be0d Use vec_vsx_ld/st to fix misaligned accesses flagged by asan 5 years ago
  Martin Kroeker f308e741b2 remove debug output and revert changes to cdot and crot 5 years ago
  Martin Kroeker f8c2697701 Use POWER6 GEMM, TRMM and DTRSM on 32bit POWER8 5 years ago
  EGuesnet 634e1305f9 Update cgemm_kernel_8x4_power8.S 5 years ago
  Rajalakshmi Srinivasaraghavan d23419accc powerpc: Optimized SHGEMM kernel for POWER10 5 years ago
  Gordon Fossum bb2f52844b powerpc: Optimized ZGEMM kernel for POWER10 5 years ago
  Rajalakshmi Srinivasaraghavan 571eadb880 powerpc: Optimized SGEMM/DGEMM/CGEMM for POWER10 5 years ago
  Rajalakshmi Srinivasaraghavan 9fe930f205 powerpc: Add support for future processor 5 years ago
  Martin Kroeker b1ee81228a Change complex DOT and ROT to generic kernels and switch CGEMM 5 years ago
  Rajalakshmi Srinivasaraghavan bd9ff820bc Fix cmake compilation issue - POWER9 5 years ago
  Martin Kroeker 06208c8d01 Limit this fix to ELFv2 builds 5 years ago
  Martin Kroeker f5c4c28b98 Work around POWER8BE bugs on FreeBSD (ELFv2) 5 years ago
  Rajalakshmi Srinivasaraghavan 2afc074803 Fix DYNAMIC_ARCH build for POWER9 5 years ago
  Martin Kroeker 4f371b0fbf Use POWER8 kernels on big-endian POWER9 for now 5 years ago
  Martin Kroeker 4046985913 Add proper defaults for IxMIN/IxMAX kernels 5 years ago
  Martin Kroeker 0b39cf95b0 Fix endianness conditionals 5 years ago
  Martin Kroeker 9f39f0a2c3 Specify ismin/ismax assembly kernels for POWER8 directly 5 years ago
  Martin Kroeker d483e9270a Update KERNEL.POWER8 5 years ago
  Martin Kroeker 01834aee33 Merge pull request #29 from xianyi/develop 5 years ago
  Martin Kroeker d92bd5be24 Update KERNEL.POWER8 5 years ago
  Martin Kroeker 46e4b12946 Update KERNEL.POWER8 5 years ago
  Martin Kroeker cafdd999b8 Update caxpy_power8.S 5 years ago
  Martin Kroeker 92ca92a46c Update caxpy_power8.S 5 years ago
  Martin Kroeker 486c35c5dc Update icamin_power8.S 5 years ago