This website works better with JavaScript.
Home
Issues
Pull Requests
Milestones
AI流水线
Repositories
Datasets
Forum
实训
竞赛
大数据
Register
Sign In
OSchip
/
OpenBLAS
Not watched
Unwatch
Watch all
Watch but not notify
1
Star
0
Fork
0
Code
Releases
66
Wiki
evaluate
Activity
Issues
0
Pull Requests
0
Datasets
Model
Cloudbrain
HPC
Browse Source
optimized daxpy kernel for increments != 1
tags/v0.2.15^2
Werner Saar
10 years ago
parent
3a67daa954
commit
0273966abb
1 changed files
with
21 additions
and
0 deletions
Split View
Diff Options
Show Stats
Download Patch File
Download Diff File
+21
-0
kernel/x86_64/daxpy.c
+ 21
- 0
kernel/x86_64/daxpy.c
View File
@@ -101,6 +101,27 @@ int CNAME(BLASLONG n, BLASLONG dummy0, BLASLONG dummy1, FLOAT da, FLOAT *x, BLAS
}
BLASLONG n1 = n & -4;
while(i < n1)
{
FLOAT m1 = da * x[ix] ;
FLOAT m2 = da * x[ix+inc_x] ;
FLOAT m3 = da * x[ix+2*inc_x] ;
FLOAT m4 = da * x[ix+3*inc_x] ;
y[iy] += m1 ;
y[iy+inc_y] += m2 ;
y[iy+2*inc_y] += m3 ;
y[iy+3*inc_y] += m4 ;
ix += inc_x*4 ;
iy += inc_y*4 ;
i+=4 ;
}
while(i < n)
{
Write
Preview
Loading…
Cancel
Save