Martin Kroeker
0e11537cab
Merge pull request #5357 from Mousius/bgemm-init
Add infrastructure for BGEMM
2 months ago
Martin Kroeker
fd37406817
Merge branch 'develop' into optimized_gemv_n_1x3
2 months ago
Chris Sidebottom
f95e7b0e32
Add infrastructure for BGEMM
Setting up all the infrastructure for BGEMM support in OpenBLAS. Hopefully I found all the right places.
Derived mostly from the previous work done in https://github.com/OpenMathLib/OpenBLAS/pull/5287
Co-authored-by: Ye Tao <ye.tao@arm.com>
3 months ago
guoyuanplct
4ff549a450
Update CONTRIBUTORS.md
2 months ago
guoyuanplct
309c48e327
Update CONTRIBUTORS.md
2 months ago
Sharif Inamdar
8279e68805
Optimize gemv_n_sve_v1x3 kernel
- Calculate predicate outside the loop
- Divide matrix in blocks of 3
3 months ago
abhishek-fujitsu
0c239c9d48
update contribution list
5 months ago
Annop Wongwathanarat
ec146157d3
Use SVE kernel for S/DGEMVT for SVE machines
6 months ago
Annop Wongwathanarat
9807f56580
Optimize aarch64 sgemm_ncopy
6 months ago
Annop Wongwathanarat
a085b6c9ec
Fix aarch64 sbgemv_t compilation error for GCC < 13
6 months ago
Martin Kroeker
2b941c44b5
Merge branch 'develop' into sbgemv_n_neon
7 months ago
Ye Tao
35bdbca153
Add sbgemv_n_neon kernel for arm64.
7 months ago
Annop Wongwathanarat
edaf51dd99
Add sbgemv_t_bfdot kernel for ARM64
This improves performance for sbgemv_t by up to 100x on NEOVERSEV1.
The geometric mean speedup is ~61x for M=N=[2,512].
7 months ago
Marek Michalowski
650a062e19
Add thread throttling profile for SGEMV on `NEOVERSEV2`
7 months ago
Marek Michalowski
b723c1b7b7
Add thread throttling profile for SGEMM on `NEOVERSEV2`
7 months ago
Ye Tao
c748e6a338
optimized sbgemm kernel for neoverse-v1 (sve-256)
Signed-off-by: Ye Tao <ye.tao@arm.com>
10 months ago
Martin Kroeker
6e393a5599
Merge branch 'develop' into gemv_t
8 months ago
Marek Michalowski
838bb57e27
Merge branch 'develop' into develop
8 months ago
Marek Michalowski
4d5b13f765
Add thread throttling profile for SGEMV on `NEOVERSEV1`
8 months ago
Annop Wongwathanarat
c0318cea6e
Simplify gemv_t_sve_v1x3 kernel
8 months ago
Annop Wongwathanarat
c8cd8da496
Add thread throttling profile for SGEMM on NEOVERSEV1
8 months ago
CDAC-SSDG
41912f9c22
Update CONTRIBUTORS.md
9 months ago
CDAC-SSDG
2718b37fed
Update CONTRIBUTORS.md
11 months ago
Chris Daley
cb48505251
optimize gemv forwarding on ARM64 systems
11 months ago
Jake Arkinstall
44004178aa
Updated CONTRIBUTORS.md
As requested on X (https://x.com/KroekerMartin/status/1755218919290278185)
1 year ago
Mark Seminatore
b29fd48998
Merge branch 'develop' into win_tidy
1 year ago
Mark Seminatore
10548a0460
update contributors
1 year ago
Dirreke
ec89466e14
Add CSKY support
1 year ago
Mark Seminatore
5f51811728
try at new threading model
1 year ago
Martin Kroeker
616fdea82a
Revert "Improve Windows threading performance scaling"
2 years ago
Mark Seminatore
427f9f2428
update contributors
2 years ago
Chris Sidebottom
bfc20c2e97
Add Chris Sidebottom to CONTRIBUTORS.md
2 years ago
Pablo Romero
1b1f781cf9
Added name and details to contributors' list.
3 years ago
Xianyi Zhang
f9715605ac
Add PLCT to contributors.
3 years ago
Martin Kroeker
5d24f3d210
Update CONTRIBUTORS.md
3 years ago
Martin Kroeker
66a15e15a8
Update CONTRIBUTORS.md
3 years ago
Bine Brank
19d435b1b3
update armv8sve + contributors
3 years ago
Bine Brank
cbcea149f0
update contributors
3 years ago
Bine Brank
ca65a4e91d
update CONTRIBUTORS.md
3 years ago
River Dillon
ddb6cee0d5
Contribution note
4 years ago
Xianyi Zhang
7834c10e2f
Add PingTouGe contribution credit.
4 years ago
Marius Hillenbrand
f7731a358a
Update CONTRIBUTORS.md - clang build fixes for IBM z
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
5 years ago
张丹枫
2a3aa91354
update CONTRIBUTORS.md, adding myself
5 years ago
Marius Hillenbrand
cb9dc36dd5
Update CONTRIBUTORS.md
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
5 years ago
Marius Hillenbrand
d7c1677c20
Update CONTRIBUTORS.md, adding myself
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
5 years ago
Martin Kroeker
3e28db7f38
Update CONTRIBUTORS.md
5 years ago
wjc404
9f5cdc49d4
Update CONTRIBUTORS.md
5 years ago
wjc404
bb2729c855
Update CONTRIBUTORS.md
5 years ago
wjc404
aae44d040d
Update CONTRIBUTORS.md
5 years ago
wjc404
312060d0d6
Update CONTRIBUTORS.md
5 years ago