9506 Commits (5e43ba948c3cb35864c7b0953b8dd02374dd3967)
 

Author SHA1 Message Date
  Martin Kroeker 5e43ba948c Merge pull request #5419 from Mousius/bgemm-optimisation 1 month ago
  Chris Sidebottom 5f47b872f1 Remove older kernels for BGEMM on NEOVERSEV1 1 month ago
  Chris Sidebottom 114316f361 Optimize SBGEMM / BGEMM for NEOVERSEV1 further 1 month ago
  Martin Kroeker 75c6ab4036 CI: Update WoA job to use LLVM 20.1.8 and avoid stray preinstalled LLVM19 (#5411) 1 month ago
  Martin Kroeker 5c5f852ee3 Merge pull request #5415 from martin-frbg/Fixum-5399 1 month ago
  Martin Kroeker f1ee61ea30 Include NEON header for the bfloat conversion functions 1 month ago
  Martin Kroeker b3ffd5524a Include NEON header for the bfloat conversion functions 1 month ago
  Martin Kroeker d23680b81d Merge pull request #5407 from nakagawa-fj/feature/gemm_divide_rate_for_neoversev1 2 months ago
  Martin Kroeker b4cc4be2ce Merge pull request #5410 from martin-frbg/issue5404 2 months ago
  Martin Kroeker 0968dddf1a Merge pull request #5409 from martin-frbg/issue5372 2 months ago
  Martin Kroeker eddfe1e6b3 Merge pull request #5408 from ChipKerchner/fixRISCV64GEMVInitializationAndWarnings 2 months ago
  Martin Kroeker 30d11bc92c Adjust multithreading threshold and add an intermediate step 2 months ago
  Martin Kroeker a3b9c933c5 mark xbuffer as volatile to work around gcc15.1 optimizer bug 2 months ago
  Chip Kerchner 72f082f31d Fix bad vector zero initializer and other compiler warnings for RISC-V. 2 months ago
  Masato Nakagawa 7e29f11396 Multi-thread GEMM Performance Improvement on NeoverseV1 (DIVIDE_RATE=1) 2 months ago
  Martin Kroeker 9a64b32b44 Merge pull request #5406 from martin-frbg/fixbgemmtest 2 months ago
  Martin Kroeker b66a01f909 Fix building of bgemm tests on GEMM3M-capable (x86) targets 2 months ago
  Martin Kroeker a5e7c0e3e0 Merge pull request #5396 from abhishek-iitmadras/abhishekk_bfloat16 2 months ago
  abhishek-fujitsu 6356190d06 fix gfortran link path in dynamic_arch.yml 2 months ago
  abhishek-fujitsu 4c8dcb3a8f Darwin/arm64: disable SVE/SME and fix gfortran link path 2 months ago
  Martin Kroeker 33b50548eb Merge pull request #5403 from martin-frbg/issue5402 2 months ago
  Martin Kroeker c504aedca1 Merge pull request #5400 from Mousius/neoversev2-target 2 months ago
  Martin Kroeker b9e107932a add NeoverseV2 2 months ago
  Martin Kroeker 2f89a5970e fix NeoverseV2 typo 2 months ago
  Martin Kroeker a9e8fa06bf Introduce a (crude) threshold to multithreading 2 months ago
  Martin Kroeker b4c2b34a45 Merge pull request #5401 from martin-frbg/followup-5397 2 months ago
  Martin Kroeker c9204f7b6f Merge pull request #5399 from Mousius/bgemm-8x4 2 months ago
  Martin Kroeker a55e65dba9 Merge pull request #5391 from martin-frbg/issue5387 2 months ago
  abhishek-fujitsu 0bc79da587 add neon header 2 months ago
  abhishek-fujitsu 720a4743b9 update contribution list 2 months ago
  abhishek-fujitsu 05fc88180c ARM64: Enable bfloat16 kernels by default 4 months ago
  Martin Kroeker 965463f177 Include float-bfloat conversion functions in ONLY_CBLAS builds as well 2 months ago
  Martin Kroeker 4272cf8c7f Merge pull request #5398 from martin-frbg/fixup-5394 2 months ago
  Chris Sidebottom 87247daadc Add NEOVERSEV2 target support 2 months ago
  Chris Sidebottom ea2faf0c9a Add optimized BGEMM for NEOVERSEN2 target 2 months ago
  Martin Kroeker a5b55f6fe3 remove CBLAS restriction on GEMM_GEMV forwarding 2 months ago
  Martin Kroeker a4f4662459 Merge pull request #5397 from omegacoleman/fix-cblas-bgemm 2 months ago
  Martin Kroeker 82954ba4ca Update ?GEMM-to-?GEMV forwarding settings 2 months ago
  Martin Kroeker 392d38168e Merge pull request #5394 from Mousius/optimize-bgemv 2 months ago
  youcai 41f9701ebc Fix cmake building with cblas_bgemm 2 months ago
  Martin Kroeker f4caa61e47 Merge pull request #5395 from martin-frbg/fixloongsonCI 2 months ago
  Martin Kroeker 444d03db9c switch to another site that still has libffi6 (for now) 2 months ago
  Chris Sidebottom 2c3cdaf74e Optimized BGEMV for NEOVERSEV1 target 2 months ago
  Martin Kroeker 7d908564fe Use OpenBLAS_ROOT_DIR in CMake config file generation only if set 2 months ago
  Martin Kroeker 2f81d6e60c Merge pull request #5390 from martin-frbg/issue5388-2 2 months ago
  Martin Kroeker e2d941e9af Declare the "small" kernel static in addition to inline 2 months ago
  Martin Kroeker 8214700930 Declare the "small" kernel static in addition to inline 2 months ago
  Martin Kroeker 4ae8707b54 Merge pull request #5389 from martin-frbg/issue5388 2 months ago
  Martin Kroeker b24212f5df fix numbers 2 months ago
  Martin Kroeker 6ff06f5483 Add cross-compilation data for RISCV64 targets 2 months ago