5425 Commits (89ae305e11dacb4622f58b03e48b4bb361acf94c)
 

Author SHA1 Message Date
  Martin Kroeker 89ae305e11
Workaround for cmake having its own C_COMPILER variable 4 years ago
  Martin Kroeker b716c0ef01
Add workaround for NVIDIA HPC 4 years ago
  Martin Kroeker 2efa3b70dc
Add workaround for NVIDIA HPC 4 years ago
  Martin Kroeker 49959d4f1c
Add workaround for NVIDIA HPC 4 years ago
  Martin Kroeker 0f27a03607
Add workaround for NVIDIA HPC mishandling of the asm DOT kernels 4 years ago
  Martin Kroeker c2a8ebfe69
Add workaround for NVIDIA HPC mishandling of the asm DOT kernels 4 years ago
  Martin Kroeker 43aac5bacc
Support NVIDIA HPC compiler 4 years ago
  Martin Kroeker bff2b7c94d
Support compilation with NVIDIA HPC compilers (which do not take gcc-style arch options) 4 years ago
  Martin Kroeker 2d45a262d9
Support compilation with nvfortran 4 years ago
  Martin Kroeker 018dec8588
Merge pull request #7 from xianyi/develop 4 years ago
  Martin Kroeker 5d6209e1f9
Merge pull request #3055 from RajalakshmiSR/swapp10 4 years ago
  Rajalakshmi Srinivasaraghavan 601b711c78
Optimize swap function for POWER10 4 years ago
  Martin Kroeker 78702753f2
Merge pull request #3053 from pkubaj/patch-1 4 years ago
  pkubaj 7aa1ff8ff6
Fix build on FreeBSD/powerpc64le 4 years ago
  Martin Kroeker d6c97cf010
Merge pull request #3052 from ashwinyes/arm64_fix_nrm2 4 years ago
  Ashwin Sekhar T K 1b2508362b
arm64: Fix nrm2 for input vectors with Inf 4 years ago
  Martin Kroeker cd898af59f
Merge pull request #3050 from aurel32/riscv64-openblas-supported 4 years ago
  Aurelien Jarno 0a535e58d8
getarch.c: define OPENBLAS_SUPPORTED for riscv64 4 years ago
  Martin Kroeker 9ce9e295fe
Merge pull request #3049 from martin-frbg/readme 4 years ago
  Martin Kroeker 9a38592c79
Add pointers to the netlib documentation and Gilbert Strang's linear algebra primers 4 years ago
  Martin Kroeker 9b3965b08c
Merge pull request #6 from xianyi/develop 4 years ago
  Martin Kroeker 531cb4f673
Merge pull request #3035 from Joshua-Ashton/patch-1 4 years ago
  Martin Kroeker 3559c5d7a2
Merge pull request #3048 from martin-frbg/issue2998 4 years ago
  Martin Kroeker 8631e2976a
Temporarily revert to the old nrm2 kernels 4 years ago
  Martin Kroeker 2768bc1764
Temporarily revert to the old nrm2 kernels 4 years ago
  Martin Kroeker 6f4698ee1f
Temporarily revert to the old nrm2 kernel 4 years ago
  Martin Kroeker 85e5165e98
Merge pull request #3046 from martin-frbg/nvidiasdk-ppc 4 years ago
  Martin Kroeker 17c16f2a71
Implement builtin_cpu_is and limit cpu choices to P8 and P9 for NVIDIA compilers 4 years ago
  Martin Kroeker 91c3f86c2b
NVIDIA compiler does not yet support POWER10 4 years ago
  Martin Kroeker 75b1f3becc
Limit POWERPC DYNAMIC_CORE list to P8 and P9 for NVIDIA compilers 4 years ago
  Martin Kroeker 07c5e549b2
Merge pull request #3045 from martin-frbg/nvidiasdk 4 years ago
  Martin Kroeker 114eb159a4
Disable FMA intrinsics in the srot kernel when the compiler is PGI/NVIDIA 4 years ago
  Martin Kroeker 005cce5507
Amend SkylakeX options to support the NVIDIA compiler 4 years ago
  Martin Kroeker b859b6e79d
Add nvfortran 4 years ago
  Martin Kroeker b212a2fb9f
Add/modify "PGI" compiler options for NVIDIA SDK 20.11 4 years ago
  Martin Kroeker e40416567a
Add version printout for PGI/NVIDIA compiler 4 years ago
  Martin Kroeker b37e5fa2f8
Merge pull request #5 from xianyi/develop 4 years ago
  Martin Kroeker 326469ef4a
Merge pull request #3042 from martin-frbg/develop 4 years ago
  Martin Kroeker c73d8ee40d
Conditionally add -mfma to compiler options where needed 4 years ago
  Martin Kroeker abef2ea770
Move -fma option setting to kernel/Makefile.L1 4 years ago
  Martin Kroeker b26e32c3af
Merge pull request #3040 from martin-frbg/fixfcheck 4 years ago
  Martin Kroeker 7822eff936
Merge pull request #3038 from martin-frbg/issue3037 4 years ago
  Martin Kroeker b03dc011be
Fix undefined CC variable in clang check 4 years ago
  Martin Kroeker 00ce35336e
Fix spurious removal of a trailing character from the hostarch string on x86_64 4 years ago
  Martin Kroeker 723776ddf7
Merge pull request #4 from xianyi/develop 4 years ago
  Martin Kroeker 5a77ec7f1c
Merge pull request #3036 from RajalakshmiSR/p10copyalign 4 years ago
  Rajalakshmi Srinivasaraghavan 2fb11f873b
POWER10: Improve copy performance 4 years ago
  Joshie ad63647446
Define BLAS acronym in README 4 years ago
  Martin Kroeker 87315e8a8d
Update version to 0.3.13.dev 4 years ago
  Martin Kroeker 9031ebd7d5
Update version to 0.3.13.dev 4 years ago