2023 Commits (cd8ac192a901b38980755583faaa35559df7910a)

Author SHA1 Message Date
  Arjan van de Ven d321448a63 dgemm: use dgemm_ncopy_8_skylakex.c also for Haswell 7 years ago
  Arjan van de Ven c43331ad0a dgemm: Use the skylakex beta function also for haswell 7 years ago
  Martin Kroeker c4e23dd016 Update Makefile 7 years ago
  Martin Kroeker cfc4acc221 typo 7 years ago
  Martin Kroeker 545c2b1bbb Add -mavx2 on Haswell only if the compiler supports it 7 years ago
  Arjan van de Ven 69d206440a Make the skylakex/haswell sgemm code compile and run even with compilers without avx2 support 7 years ago
  Martin Kroeker 3843e3e017 use -maxv2 on haswell 7 years ago
  Martin Kroeker fbcb14a74b should be core-avx2 7 years ago
  Martin Kroeker 2a3190dc76 fix elseifeq and use older option core2-avx for compatibility 7 years ago
  Martin Kroeker 1ebe5c0f49 Add -march=haswell to HASWELL part of DYNAMIC_ARCH build 7 years ago
  Arjan van de Ven 0586899a10 Use sgemm_ncopy_4_skylakex.c also for Haswell 7 years ago
  Arjan van de Ven 00dc09ad19 Use the skylake sgemm beta code also for haswell 7 years ago
  Arjan van de Ven cdc668d82b Add a "sgemm direct" mode for small matrixes 7 years ago
  Martin Kroeker 87718807f0 Merge pull request #1910 from martin-frbg/issue1909 7 years ago
  Martin Kroeker 51aec8e96b make sure the added march=skylake-avx512 does not cause problems on Windows 7 years ago
  Martin Kroeker 06f7d78d70 Add -march=skylake-avx512 to SkylakeX part of DYNAMIC_ARCH builds 7 years ago
  Martin Kroeker 7639f2e1f0 Rewrite the conditional for OSX to fix cmake parsing on others 7 years ago
  Martin Kroeker 2fc712469d Avoid creating spurious non-suffixed c/zgemm_kernels 7 years ago
  Martin Kroeker 6ba30e270d Fix typo that broke CNRM2 on ARMV8 since 0.3.0 7 years ago
  Martin Kroeker 701ea88347 Use p2align instead of align for OSX compatibility 7 years ago
  Martin Kroeker 6c7b691083 Really revert xDOT changes from 1832 7 years ago
  Martin Kroeker 5f4c550c27 Merge pull request #1892 from martin-frbg/mipsdot 7 years ago
  Martin Kroeker 95a5542e3c Revert DOT kernel changes from #1834 7 years ago
  Martin Kroeker 7a2e1bc804 Use generic kernel for DSDOT/SDSDOT 7 years ago
  Martin Kroeker 35653e38b3 Merge pull request #1834 from fengrl/develop 7 years ago
  Andrew 19c4bdd8b3 Add return value so that freebsd system clang does not err out 7 years ago
  Renato Golin 310ea55f29 Simplifying ARMv8 build parameters 7 years ago
  fengruilin 43bb386b10 fix dot problem on 64bit mips 7 years ago
  Arjan van de Ven dcc5d6291e skylakex: Make the sgemm/dgemm beta code robust for a N=0 or M=0 case 7 years ago
  fengrl 2d8064174c register push/pop command change 7 years ago
  Ashwin Sekhar T K d5aeff636f ARM64: Enable DYNAMIC_ARCH 7 years ago
  Ashwin Sekhar T K e7b66cd36e ARM64: Fix DYNAMIC_ARCH compilation for cores which dont use GEMM3M 7 years ago
  Ashwin Sekhar T K d50abc8903 ARM64: Move parameters from parameter.c to param.h 7 years ago
  Ashwin Sekhar T K 351a0c777c ARM64: Remove XGENE1 references 7 years ago
  Ashwin Sekhar T K 21f46a1cf2 ARM64: Use THUNDERX2T99 Neon Kernels for ARMV8 7 years ago
  Ashwin Sekhar T K caf339412f ARM64: Remove dependency of THUNDERX2T99 Makefile on CORTEXA57 Makefile 7 years ago
  Ashwin Sekhar T K 8001fdcd2a ARM64: Remove dependency of THUNDERX Makefile on ARMV8 Makefile 7 years ago
  Ashwin Sekhar T K 162e312832 ARM64: Remove dependency of CORTEXA57 Makefile on ARMV8 Makefile 7 years ago
  Ashwin Sekhar T K c3d93caa8d ARM64: Remove dependency of XGENE1 Makefile on ARMV8 Makefile 7 years ago
  Arjan van de Ven 55b244ca0d enable the SGEMM/SKX C based kernel 7 years ago
  Arjan van de Ven d4bad73834 Add a C+intrinsics version of the SGEMM/skylakex kernel 7 years ago
  Arjan van de Ven 582c589727 dgemm/skylakex: replace discrete mul/add with fma 7 years ago
  Arjan van de Ven adbf6afa25 Add vector optimizations for ncopy as well for dgemm/skylakex 7 years ago
  Arjan van de Ven 32bec8afbb add a skylakex optimized dgemm beta function 7 years ago
  Arjan van de Ven 20c5d668fe dgemm/avx512 simplify and speed up the 4x4 kernel 7 years ago
  Arjan van de Ven 6d43c51ccf undo slow dgemm/skylake microoptimization 7 years ago
  Arjan van de Ven d74dc39b0f Add optimized *copy versions for skylakex 7 years ago
  Arjan van de Ven 66b43affbc Add a 24x8 kernel to the skylakex dgemm implementation 7 years ago
  Arjan van de Ven 1938819c25 skylake dgemm: Add a 16x8 kernel 7 years ago
  Martin Kroeker b7496c3638 Function name needs to be CNAME, set from outside to allow suffixing for dynamic_arch 7 years ago