104 Commits (develop)

Author SHA1 Message Date
  Martin Kroeker b66a01f909 Fix building of bgemm tests on GEMM3M-capable (x86) targets 2 months ago
  Chris Sidebottom ea2faf0c9a Add optimized BGEMM for NEOVERSEN2 target 2 months ago
  Chris Sidebottom 2c3cdaf74e Optimized BGEMV for NEOVERSEV1 target 2 months ago
  Martin Kroeker 38e6999295 format cleanup 2 months ago
  Martin Kroeker 3df503cafd portability fix and cleanup 2 months ago
  Chris Sidebottom e105411460 Add infrastructure for bgemv/bscal 2 months ago
  Chris Sidebottom 09a016fdf6 Split sbgemv test from sbgemm test 2 months ago
  Chris Sidebottom 3f110c8272 Improve bgemm and sbgemm testing 2 months ago
  Martin Kroeker aad97c7763 Fix return type declaration 2 months ago
  Chris Sidebottom 740efd71c4 Add optimized BGEMM kernel for NEOVERSEV1 target 2 months ago
  Martin Kroeker 9a272fece6 Re-enable the BGEMM tests 2 months ago
  Martin Kroeker b54aec804e remove spurious include 2 months ago
  Chris Sidebottom 8cd4be8d47 Temporarily disable test_bgemm 2 months ago
  Chris Sidebottom f95e7b0e32 Add infrastructure for BGEMM 3 months ago
  Martin Kroeker 70865a894e Merge pull request #5180 from ywwry66/openmp_use_cmake 5 months ago
  Martin Kroeker 1c5d0d5539 move libomp to extralib 5 months ago
  Ruiyang Wu 1b0c0f00e9 CMake: Avoid mixed OpenMP linkage 6 months ago
  Ye Tao 4346b91559 add beta and alpha testcase for sbgemv 7 months ago
  Chip Kerchner 36bd3eeddf Vectorize BF16 GEMV (VSX & MMA). Use GEMM_GEMV_FORWARD_BF16 (for Power). 11 months ago
  Rohit Goswami 722e4ae07a MAINT: Explicitly replace instead of unknown 1 year ago
  Rohit Goswami a6b7751881 BUG: Allow tests to be run multiple times 1 year ago
  Chip Kerchner 89702e1f4a Fix zero element GEMV test. 1 year ago
  Chip Kerchner 77f85c7c00 GEMV tests don't like zero elements. 1 year ago
  Chip Kerchner 868aa857bc Change malloc zero to return one byte and update the SBGEMM test to again use sizes of zero. 1 year ago
  Chip Kerchner b1802f4dc8 Fix unit test to start at 1 instead of 0 - since malloc zero bytes fails on some systems. 1 year ago
  Chip Kerchner c23897f585 Add GEMV testing to SBGEMx vs SGEMx testing. 1 year ago
  Martin Kroeker 6452f7b46d Merge pull request #4873 from ChipKerchner/fixSBGEMMDefaults 1 year ago
  Chip Kerchner 31226740d6 Cleanup of SBGEMM unit test. 1 year ago
  Henry Chen 20bdb65882 Fix recursive variable expansion in Makefiles for LOONGSON3A 1 year ago
  Chip Kerchner b1737698db Fix DEFAULTS in SBGEMM for POWER10. Also comparisons for SBGEMM unit test can be exactly due to epilison differences. 1 year ago
  Martin Kroeker 76db713e79 fix invocation of GEMM3M tests 1 year ago
  Vladimir Nikolić 56e1782ffb Add another missing parenthesis 1 year ago
  Chip Kerchner f708944fea Add all 4 variations of the SBGEMM to compare_sgemm_sbgemm 1 year ago
  Martin Kroeker edacf9b397 Work around spurious BLAS3 test errors on LOONGSON3R3/4 (#4667) 1 year ago
  Martin Kroeker 28f151808e Avoid overriding the global USE_GEMM3M 1 year ago
  Martin Kroeker ba201c1939 Enable GEMM3M tests on supported platforms 1 year ago
  Martin Kroeker 4adfe4d531 Avoid linking both libgomp and libomp in mixed clang/gfortran builds 1 year ago
  Martin Kroeker e9f480111e fix sbgemm bfloat16 conversion errors introduced in PR 4488 1 year ago
  Martin Kroeker fb99fc2e6e fix type conversion warnings 1 year ago
  Chip Kerchner 61c8e19f95 Fix Makefile to support OpenMP on AIX for xlc (clang) with xlf. 1 year ago
  Isuru Fernando 6b2651ece3 Fix building test_sbgemm 1 year ago
  Chip-Kerchner d46eba06a7 Pack structure only on AIX. 2 years ago
  Chip-Kerchner e98e3c4783 Fix float32_bits union so that it always the sizeof float. 2 years ago
  Chip-Kerchner 97a61d0577 Fix bfloat16_bits union so that it always the sizeof unsigned short. 2 years ago
  Martin Kroeker 2a9981a244 Add -lgomp when IBM xlf is combined with gcc in OPENMP builds 2 years ago
  Martin Kroeker 44e6e5479b Use the C compiler for the C SBGEMM test source 2 years ago
  Aiden Grossman b209915121 Fix build with clang 2 years ago
  Martin Kroeker 3d338b57de remove spurious loops 3 years ago
  Martin Kroeker d9dc015cfc Use blasint for INTERFACE64 compatibility 3 years ago
  Rajalakshmi Srinivasaraghavan 1d97405c02 POWER: Enable bfloat16 kernels by default 3 years ago