445 Commits (develop)

Author SHA1 Message Date
  Chen, Guobing deaeb6c5b8 Add bfloat16 based dot and conversion with single/double 5 years ago
  Martin Kroeker 75eeb265d7 [WIP] Refactor the driver code for direct SGEMM (#2782) 5 years ago
  Martin Kroeker fee361ae64 fix another source of NO_CBLAS=0 surprise 5 years ago
  Ashwin Sekhar T K 4e1be0e481 ARM64: Add THUNDERX3T110 Target 5 years ago
  Martin Kroeker 5dd14e3d48 Make building the bfloat16 functions conditional on option BUILD_HALF (#2590) 5 years ago
  Martin Kroeker 2db5178e2d enable cblas interfaces to GEMM3M in CMAKE builds 5 years ago
  Rajalakshmi Srinivasaraghavan 7eb55504b1 RFC : Add half precision gemm for bfloat16 in OpenBLAS 5 years ago
  Martin Kroeker 8229c163b7 Use runtime check for AVX512 (sgemm_direct) capability when using DYNAMIC_ARCH 5 years ago
  Martin Kroeker 6a14b34c20 Avoid calling DIRECT codepath in DYNAMIC_ARCH on non-SKX 5 years ago
  Martin Kroeker d65e9a2bbd Merge pull request #2253 from thrasibule/xerbla 5 years ago
  Guillaume Horel 2463938879 fix error message 6 years ago
  Guillaume Horel 5d6525c87c more bugfix 6 years ago
  Guillaume Horel 459bb9291d fix error codes 6 years ago
  Guillaume Horel 5997b6b491 bugfix 6 years ago
  Guillaume Horel 7ec7b999a5 add missing file 6 years ago
  Guillaume Horel af9ac0898a fix Makefile 6 years ago
  Guillaume Horel 9b2f0323d6 update Makefile 6 years ago
  Guillaume Horel ea747cf933 start working on ?trtrs 6 years ago
  luz.paz daf2fec12d Misc. typo fixes 6 years ago
  Martin Kroeker 268c28db7d Merge pull request #2095 from martin-frbg/trsm 6 years ago
  Martin Kroeker 0bd956fd21 Correct length of name string in xerbla call 6 years ago
  Martin Kroeker 79cfc24a62 Add interface for ?sum (derived from ?asum) 6 years ago
  Martin Kroeker c19a449096 Merge pull request #2071 from martin-frbg/issue2068 6 years ago
  Martin Kroeker 3d1e36d4cb Build CBLAS interfaces for I?MIN and I?MAX 6 years ago
  Martin Kroeker e29b0cfcc4 Allow multithreading TRMV again 6 years ago
  Martin Kroeker 8533aca964 Avoid penalizing tall skinny matrices 6 years ago
  Martin Kroeker cda81cfae0 Shift transition to multithreading towards larger matrix sizes 6 years ago
  Arjan van de Ven cdc668d82b Add a "sgemm direct" mode for small matrixes 6 years ago
  Martin Kroeker 5393759a98 Merge pull request #1869 from martin-frbg/axpy0 6 years ago
  Martin Kroeker c171b8ad13 Handle special case INCX=0,INCY=0 in the axpy interface 6 years ago
  Martin Kroeker 96d2f2c9b2 Merge pull request #1831 from brada4/hemv 7 years ago
  Andrew 2992e3886a disable threading in C/ZSWAP copying from S/DSWAP 7 years ago
  Martin Kroeker e3c262e5cf Merge pull request #1825 from brada4/hemv 7 years ago
  Andrew a293bdcd5e re-arrange new code for readability 7 years ago
  Andrew c7bbf9c987 Attempt to tame _hemv threading #1820 7 years ago
  Ashwin Sekhar T K 21f46a1cf2 ARM64: Use THUNDERX2T99 Neon Kernels for ARMV8 7 years ago
  Martin Kroeker b991570210 Merge pull request #1762 from martin-frbg/issue1710-2 7 years ago
  Martin Kroeker f3c262156e Add an explicit cast to silence a warning 7 years ago
  Martin Kroeker 30f5a69ab8 Add explicit cast to silence a warning 7 years ago
  Martin Kroeker 4a553e8678 Merge pull request #1713 from martin-frbg/issue1710 7 years ago
  Martin Kroeker 165f00c159 fabs -> fabsl 7 years ago
  Martin Kroeker 933896a1d0 Use blasabs to switch between abs and labs as needed for INTERFACE64 7 years ago
  Steven G. Johnson a4e321400b fabs -> fabsl 7 years ago
  Martin Kroeker 9cf22b7d91 Build cblas_iXamin interfaces 7 years ago
  Craig Donner c2545b0fd6 Fixed a few more unnecessary calls to num_cpu_avail. 7 years ago
  Craig Donner 66316b9f4c Improve performance of GEMM for small matrices when SMP is defined. 7 years ago
  Martin Kroeker e8880c1699 Use a single thread for small input size 7 years ago
  Martin Kroeker 1d27fa8507 Merge pull request #1539 from martin-frbg/ztrmv-1332 7 years ago
  Martin Kroeker a8ed428bab Disable multithreading in ztrmv 7 years ago
  Martin Kroeker 809fd0d451 Rewrite ROTMG to address cases not covered by the netlib algorithm (#1480) 7 years ago