OSchip/OpenBLAS
For GEMM multi-threading, simply split along M.
E.g. layer 1: A is (1600k, 576) and B is (576, 64). B is very small, so we split M.
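The idea in the commit message can be sketched as follows. This is a minimal illustration, not OpenBLAS code: `split_m` and `gemm_rows` are hypothetical helpers, the matrices are assumed to be row-major `float` arrays, and the inner loops are deliberately naive. Each thread takes a contiguous block of rows of A and multiplies it by the whole of B, so the small B is shared and no two threads write the same rows of C.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of OpenBLAS): compute the half-open row
   range [*begin, *end) of an M-row matrix handled by thread `tid` out of
   `nthreads`. The first (m % nthreads) threads receive one extra row. */
void split_m(size_t m, size_t nthreads, size_t tid,
             size_t *begin, size_t *end)
{
    assert(nthreads > 0 && tid < nthreads);
    size_t base  = m / nthreads;   /* rows every thread gets */
    size_t extra = m % nthreads;   /* leftover rows to spread out */
    size_t lo    = tid * base + (tid < extra ? tid : extra);
    size_t len   = base + (tid < extra ? 1 : 0);
    *begin = lo;
    *end   = lo + len;
}

/* Hypothetical per-thread kernel: C[rows, :] = A[rows, :] * B for the rows
   in [row_begin, row_end). A is (m, k), B is (k, n), C is (m, n), all
   row-major. Every thread reads all of B but writes only its own rows of C. */
void gemm_rows(const float *a, const float *b, float *c,
               size_t row_begin, size_t row_end, size_t k, size_t n)
{
    for (size_t i = row_begin; i < row_end; i++) {
        for (size_t j = 0; j < n; j++) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; p++)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = acc;
        }
    }
}
```

Taking 1600k to mean M = 1,600,000 and assuming, say, 8 threads, `split_m` gives each thread 200,000 rows of A. Splitting N = 64 eight ways instead would leave each thread only 8 columns of B.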
optimized_for_deeplearning
Zhang Xianyi
9 years ago
parent da7f69e8f4
commit 92058a75e2
2 changed files with 3 additions and 1 deletion
Makefile.rule (+1, -1)
common_param.h (+2, -0)
Makefile.rule (+1, -1)
@@ -80,7 +80,7 @@ VERSION = 0.2.16.dev
 # NO_LAPACKE = 1

 # If you want to use legacy threaded Level 3 implementation.
-# USE_SIMPLE_THREADED_LEVEL3 = 1
+USE_SIMPLE_THREADED_LEVEL3 = 1

 # If you want to drive whole 64bit region by BLAS. Not all Fortran
 # compiler supports this. It's safe to keep comment it out if you
common_param.h (+2, -0)
@@ -1194,6 +1194,8 @@ extern gotoblas_t *gotoblas;
 #define XGEMM_DEFAULT_UNROLL_N 2
 #endif

+#define GEMM_THREAD gemm_thread_m
+
 #ifndef GEMM_THREAD
 #define GEMM_THREAD gemm_thread_n
 #endif
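The change to common_param.h relies on the `#ifndef` guard. Because `GEMM_THREAD` is already defined when the guarded block is reached, the `gemm_thread_n` default is skipped. The sketch below reproduces only that preprocessor behaviour. The two functions are stand-ins that return a tag, and their signatures are assumptions made for illustration, not the real OpenBLAS prototypes.

```c
/* Tags identifying which partitioning strategy was selected. */
enum split_kind { SPLIT_M = 0, SPLIT_N = 1 };

/* Stand-in functions (assumed signatures, NOT the OpenBLAS prototypes). */
enum split_kind gemm_thread_m(void) { return SPLIT_M; }
enum split_kind gemm_thread_n(void) { return SPLIT_N; }

/* Unconditional definition, as added by the commit. */
#define GEMM_THREAD gemm_thread_m

/* Pre-existing guarded default: skipped because GEMM_THREAD is defined. */
#ifndef GEMM_THREAD
#define GEMM_THREAD gemm_thread_n
#endif

/* Expands to gemm_thread_m() after preprocessing. */
enum split_kind selected_strategy(void) { return GEMM_THREAD(); }
```

Removing the unconditional `#define` would make `selected_strategy()` fall back to the `gemm_thread_n` stand-in.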