2525 Commits (df013c5e281c2b21e2a3aebee100b3b003eea19a)

Author SHA1 Message Date
  Iha, Taisei f7ad906b49 Performance improvements of [SD]DOT with loop-unrolling on A64FX 3 months ago
  Martin Kroeker d96daa220d Merge pull request #5290 from Srangrang/develop 3 months ago
  Martin Kroeker ee26caffb3 Merge pull request #5309 from davidz-ampere/dev-ampereone 3 months ago
  davidz-ampere aa90ab4142 Add support for Ampere AmpereOne processors 3 months ago
  Ian McInerney badef1d32e Update sbgemm_tcopy_4_neoversev1 kernel to use standard C types 3 months ago
  Martin Kroeker 3318a2b904 override CDOT and ZDOT with the generic C kernel 3 months ago
  davidz-ampere 84730068af reduce duplicate kernel code 3 months ago
  davidz-ampere be68ef03b4 Add support for Ampere processors 3 months ago
  Srangrang 9f13b2c6ac style: modify HALF to BFLOAT16 in benchmark folder 3 months ago
  Srangrang ec14e1648c fix: resolve non-RISCV host build failed issue 3 months ago
  Martin Kroeker e338d34ce1 fix path 3 months ago
  Martin Kroeker d36093d084 temporarily change default C/ZSCAL to the non-asm implementation 3 months ago
  Martin Kroeker b3c90564d7 resync with the generic arm version for inf/nan handling 3 months ago
  Martin Kroeker 6bdc7f9eb7 Merge pull request #5300 from martin-frbg/fixup5296 3 months ago
  Martin Kroeker 73af02b89f use dummy2 as Inf/NAN handling flag 3 months ago
  Martin Kroeker 549a9f1dbb Disable the default SSE kernels for CSCAL/ZSCAL for now 3 months ago
  Martin Kroeker 58eeb9041c fix handling of dummy2 3 months ago
  Martin Kroeker 7c77537b25 Merge pull request #5297 from martin-frbg/zscal_x86_sparc 3 months ago
  Martin Kroeker 63287e1855 Merge pull request #5296 from martin-frbg/zscal_riscv 3 months ago
  Martin Kroeker d2855d3dab Merge pull request #5285 from martin-frbg/zscal_zarch 3 months ago
  Martin Kroeker 1408be5fe0 Merge pull request #5282 from martin-frbg/zscal_power 3 months ago
  Martin Kroeker 1589d0b21e Merge pull request #5281 from martin-frbg/zscal_arm64 3 months ago
  Martin Kroeker a86419fb66 Merge pull request #5280 from martin-frbg/zscal_x86_64 3 months ago
  Martin Kroeker 11ff18bb0f Merge pull request #5081 from XiWeiGu/kernel_generic_fixed_cscal_zscal 3 months ago
  Martin Kroeker f4194fc65f Merge branch 'develop' into la64_fixed_cscal_zscal 3 months ago
  Martin Kroeker e12132abd4 Use generic C/ZSCAL kernels to address inf/nan handling for now 3 months ago
  Martin Kroeker 1cefbea7ea Use generic SCAL kernels to address inf/nan handling for now 3 months ago
  Martin Kroeker f18b7a46bf add dummy2 flag handling for inf/nan agnostic zeroing 3 months ago
  Martin Kroeker fe220a0d7d Merge pull request #5291 from guoyuanplct/develop 3 months ago
  Arne Juul 5442aff218 Accumulate results in output register explicitly 3 months ago
  guoyuanplct 2ae019161a fixed the performance problem in RISCV64_ZVL256 when OPENBLAS_K is small 3 months ago
  Srangrang fb89820f20 Merge branch 'develop' of https://github.com/Srangrang/OpenBLAS into develop 4 months ago
  Srangrang 4e1a381e5b fix: resolve the compilation failure without zfh instruction 4 months ago
  gkdddd 670ec6f757 Added shgemm_kernel_8x8 for RISCV64_ZVL128B and shgemm_kernel_16x8 for RISCV64_ZVL256B 4 months ago
  guoyuanplct d2003dc886 del lines 4 months ago
  guoyuanplct 45fd2d9b07 Optimized the axpby function. 4 months ago
  Martin Kroeker fb8dc8ff5c Add dummy2 flag handling 4 months ago
  Srangrang 2996c25c94 add shgemm for RISCV_ZVL128B 4 months ago
  Martin Kroeker cf06250d36 add handling of dummy2 flag 4 months ago
  Martin Kroeker 28f8fdaf0f support flag for NaN/Inf handling and fix scaling of NaN/Inf values 4 months ago
  Martin Kroeker 669c847ceb support extra flag for NaN handling 4 months ago
  Martin Kroeker 0b0bb9951d Merge pull request #5265 from guoyuanplct/develop 4 months ago
  guoyuanplct be9f7550b5 Format Code 4 months ago
  guoyuanplct 4d213653d8 kernel/riscv64:Added support for omatcopy on riscv64. 4 months ago
  Martin Kroeker 8afddc1a81 Merge pull request #5262 from guoyuanplct/develop 4 months ago
  guoyuanplct 9a7e3f102b kernel/riscv64:Fixed the bug of openblas_utest_ext failing in c/zgemv and some c/zgbmv tests: 4 months ago
  pengxu a978ad3180 Loongarch64: add C functions of zgemm_ncopy_16 4 months ago
  pengxu 0ccb050583 Loongarch64: fixed cgemm_ncopy_16_lasx 4 months ago
  Martin Kroeker 5141a90993 Fix ARMV9SME target in DYNAMIC_ARCH and add SME query code for MacOS (#5222) 4 months ago
  Martin Kroeker 151b74284e Merge pull request #5203 from quic/fix-sgemmdirect-sme1 4 months ago