2023 Commits (cd8ac192a901b38980755583faaa35559df7910a)

Author SHA1 Message Date
  Rajalakshmi Srinivasaraghavan d23419accc powerpc: Optimized SHGEMM kernel for POWER10 5 years ago
  Martin Kroeker c854ef5471 Fix variable names in conditional 5 years ago
  Martin Kroeker c0afc11742 Fix POWERPC builds on AIX (gcc/gfortran 7) 5 years ago
  Gordon Fossum bb2f52844b powerpc: Optimized ZGEMM kernel for POWER10 5 years ago
  Rajalakshmi Srinivasaraghavan 571eadb880 powerpc: Optimized SGEMM/DGEMM/CGEMM for POWER10 5 years ago
  Kavana Bhat df4ade070f Fix for #2671 5 years ago
  Martin Kroeker 93592d1260 Merge pull request #2675 from wjc404/develop 5 years ago
  wjc404 086d87a302 AVX512 dgemm tcopy_16 function 5 years ago
  Rajalakshmi Srinivasaraghavan 9fe930f205 powerpc: Add support for future processor 5 years ago
  ZhangDanfeng bc6fd20a40 fix INIT8x4 5 years ago
  Martin Kroeker 89091e6b64 Merge pull request #2645 from martin-frbg/misc_fixes 5 years ago
  Martin Kroeker c3574ffe53 Merge pull request #2646 from wjc404/develop 5 years ago
  wjc404 0e3ac4a06b Add files via upload 5 years ago
  Martin Kroeker 7f60fb6b91 Delete spurious copy of common_param.h 5 years ago
  ZhangDanfeng 9b7877ccf1 sgemm copy source init 5 years ago
  ZhangDanfeng f82fa802d1 Insert prefetch 5 years ago
  Martin Kroeker b1ee81228a Change complex DOT and ROT to generic kernels and switch CGEMM 5 years ago
  张丹枫 9df79ae9a3 update sgemm and strmm kernel selecting strategy 5 years ago
  张丹枫 a1fc6041cd use general register to speedup 5 years ago
  张丹枫 edb423d772 align general register using to strmm_kernel_8x8 5 years ago
  zhangdanfeng 0e6eb8c247 sgemm kernel use sgemm_kernel_8x8_cortexa53 5 years ago
  zhangdanfeng d475db29c6 optimized for cortex-a53 5 years ago
  Marius Hillenbrand 89fe17f20e s390x: Use new sgemm kernel also for DGEMM and DTRMM on Z14 5 years ago
  Marius Hillenbrand bdd795ed03 s390x/GEMM: replace 0-init with peeled first iteration 5 years ago
  Marius Hillenbrand 2840432e49 s390x: improvise vector alignment hints for older compilers 5 years ago
  Marius Hillenbrand 1b0b4349a1 s390x/Z14: Change register blocking for SGEMM to 16x4 5 years ago
  Marius Hillenbrand 71b6eaf459 s390x: Use new sgemm kernel also for strmm on Z14 and newer 5 years ago
  Marius Hillenbrand 43c0d4f312 s390x: Add vectorized sgemm kernel for Z14 and newer 5 years ago
  Martin Kroeker 2271c3506b Work around excessive LAPACK test failures on Skylake-X 5 years ago
  Rajalakshmi Srinivasaraghavan bd9ff820bc Fix cmake compilation issue - POWER9 5 years ago
  Ashwin Sekhar T K 8353cb245a ARM64: Improve DAXPY for ThunderX2 5 years ago
  Martin Kroeker 90dba9f716 Duplicate earlier Clang 9.0.0 workaround for corresponding Apple Clang version 5 years ago
  Martin Kroeker 5dd14e3d48 Make building the bfloat16 functions conditional on option BUILD_HALF (#2590) 5 years ago
  Martin Kroeker 06208c8d01 Limit this fix to ELFv2 builds 5 years ago
  Martin Kroeker f5c4c28b98 Work around POWER8BE bugs on FreeBSD (ELFv2) 5 years ago
  Martin Kroeker fa42588e1f Merge pull request #2565 from martin-frbg/mips24k 5 years ago
  Martin Kroeker e55ec82bb9 Delete KERNEL.1004K 5 years ago
  Martin Kroeker 7353ea5afc Delete KERNEL.24K 5 years ago
  Martin Kroeker 6a04efb122 Rename KERNEL files to include MIPS prefix 5 years ago
  Martin Kroeker d712ea724c Add MIPS24K support 5 years ago
  Rajalakshmi Srinivasaraghavan 22bb50fb81 cmake fixes 5 years ago
  Rajalakshmi Srinivasaraghavan 67cc4b9e16 Fix warnings in clang and export symbol 5 years ago
  Rajalakshmi Srinivasaraghavan a87793e03c Fix DYNAMIC_ARCH compilation errors 5 years ago
  Rajalakshmi Srinivasaraghavan ff010f496e Build shgemm for all architecture 5 years ago
  Rajalakshmi Srinivasaraghavan 7eb55504b1 RFC : Add half precision gemm for bfloat16 in OpenBLAS 5 years ago
  Martin Kroeker 5b0093b5fe Convert aligned moves to unaligned 5 years ago
  Martin Kroeker e9bfa2291a Fix parameter overflow 5 years ago
  gxw 8d07cf9b67 Fix compilation problem on loongson platform 5 years ago
  Martin Kroeker 806f89166e Make ARMV7 compile with xcode and add a CI job for it (#2537) 5 years ago
  Martin Kroeker c6af9bbb32 Merge pull request #2534 from martin-frbg/issue2496 5 years ago