Martin Kroeker
a3e02742f2
Add USE_PERL fallback option for create script used with FUNCTION_PROFILE
3 years ago
Martin Kroeker
f1c570a5f1
Add back original PERL-based script under new name
3 years ago
Owen Rafferty
42c7a27e6b
rewrite perl scripts in universal shell
3 years ago
Martin Kroeker
7656aba00e
Merge pull request #3493 from martin-frbg/casts+cleanup
WIP casts and cleanups
3 years ago
Martin Kroeker
d2b5fbf80f
Exclude some complex (LAPACK) functions when NO_LAPACK is set
3 years ago
Martin Kroeker
64365c919e
fix function typecasts
3 years ago
gxw
25f99fa9f8
Add cblas_{c/z}srot cblas_{c/z}rotg support
4 years ago
Martin Kroeker
4b3769823a
Revert #3252
4 years ago
Martin Kroeker
2845f54eb8
Remove dangerous optimization from previous #3252 - buffer is never unused here
4 years ago
Martin Kroeker
c35739db5e
Add separate entries for BFLOAT16 functions and fix missing cblas_xerbla
4 years ago
Martin Kroeker
1085775bc6
really remove the unused variable
4 years ago
Martin Kroeker
20581bf303
Remove unused variable
4 years ago
Wangyang Guo
4289cf048d
sbgemm: avoid falling into SGEMM_KERNEL_DIRECT
4 years ago
Wangyang Guo
2e44ca0136
sbgemm: add missing cblas_sbgemm definition
4 years ago
Wangyang Guo
1d83ca4bca
Small Matrix: support BFLOAT16 data type
4 years ago
Wangyang Guo
c17d6dacb2
Small Matrix: skip compile in unimplemented data type
4 years ago
Wangyang Guo
aa50185647
Small Matrix: better handle with GEMM3M macro
4 years ago
Wangyang Guo
478d1086c1
Small Matrix: support DYNAMIC_ARCH build
4 years ago
Wangyang Guo
5dc7c3c8e5
Small Matrix: add GEMM_SMALL_MATRIX_PERMIT to tune small matrices case
4 years ago
Xianyi Zhang
6022e5629c
Refs #2587 fix small matrix c/zgemm bug.
5 years ago
Xianyi Zhang
57ed58cefe
Refs #2587 Add small matrix optimization reference kernel for c/zgemm.
5 years ago
Xianyi Zhang
17d32a4a82
Change a1b0 gemm to b0 gemm.
5 years ago
Xianyi Zhang
4271cfcc6f
Fix gemm interface bug for small matrix.
5 years ago
Xianyi Zhang
be3349405d
Add alpha=1.0 beta=0.0 for small gemm.
5 years ago
Xianyi Zhang
0a2077901c
Add small matrix optimization kernel interface.
make SMALL_MATRIX_OPT=1
5 years ago
Martin Kroeker
1dea57ab25
Revert PR #3250 (shortcut without buffer allocation) as it is unsafe on some x86_64
4 years ago
Martin Kroeker
7bb59fceb7
Clean up some warnings
4 years ago
Martin Kroeker
4ed99c2ce3
Merge pull request #3292 from martin-frbg/syrk_limit
Add lower limit for multithreading in xSYRK
4 years ago
Martin Kroeker
8186963d8c
Add lower limit for multithreading
4 years ago
Martin Kroeker
726c44242b
Add lower threshold for multithreading
4 years ago
Martin Kroeker
1b5620b66e
Add lower threshold for multithreading in ?potrf and ?potri
4 years ago
Martin Kroeker
baf03a0937
Merge pull request #3252 from martin-frbg/more_shortcuts
Further shortcuts for (small) cases that do not need buffer allocation
4 years ago
Martin Kroeker
7aab5e826c
Merge pull request #3250 from martin-frbg/gemv-shortcut
Add shortcut for small-size S/D GEMV_N with increments of one
4 years ago
Martin Kroeker
f84197c1a7
Add shortcuts for (small) cases that do not need expensive buffer allocation
4 years ago
Martin Kroeker
734bd265a8
revert symv changes for now
4 years ago
Martin Kroeker
1217eb910d
Fix copy-paste errors in variables used
4 years ago
Martin Kroeker
d6d7a6685d
Add shortcuts for (small) cases that do not need expensive buffer allocation
4 years ago
Martin Kroeker
f0e7345fb8
Add shortcut for small-size gemv_n with increments of one
4 years ago
Martin Kroeker
03297ff9f0
Add fast path for small xSYR with INCX==1
4 years ago
Gordon Fossum
8b599836db
Add error message token for SBGEMM in gemm.c
4 years ago
Martin Kroeker
904b221f03
Add cast to prevent overflow of intermediate result
4 years ago
Martin Kroeker
c5fb91f1bc
Fix division by zero in the non-x86 codepath
4 years ago
Harmen Stoppels
ec6b354c32
use /usr/bin/env perl
4 years ago
Martin Kroeker
bd906e3410
fix copy-paste error in build rules for cblas_crotg and cblas_zrotg
4 years ago
Alex Henrie
f1bf2603e6
Remove dead assignment to dflag in rotmg functions
4 years ago
Alex Henrie
6f32991eae
Don't define the mode variable when not needed in gemm functions
4 years ago
Martin Kroeker
a8f249458d
Build CBLAS interfaces for CROTG and ZROTG as well
4 years ago
Martin Kroeker
ac3e2a3fdd
Add CBLAS interfaces for csrot and zdrot
4 years ago
Martin Kroeker
857afcc41d
Use ifeq instead of ifdef for user-definable build options
5 years ago
Chen, Guobing
a7b1f9b1bb
Implementation of BF16 based gemv
1. Add a new API -- sbgemv to support bfloat16 based gemv
2. Implement a generic kernel for sbgemv
3. Implement an avx512-bf16 based kernel for sbgemv
Signed-off-by: Chen, Guobing <guobing.chen@intel.com>
5 years ago