kaustubh
3eaff85191
Updated data prefetch in TRSM, ASUM, DOT functions
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
9 years ago
kaustubh
00abce3b93
Add data prefetch in DOT and ASUM functions
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
9 years ago
Andrew
becf8bc7a0
remove dead code
9 years ago
kaustubh
f3419e634c
SGEMM, DGEMM, CGEMM, ZGEMM functions data prefetch
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
9 years ago
Zhang Xianyi
7472c79ea6
Merge pull request #984 from ksraste/develop
STRSM, DTRSM functions data prefetch
9 years ago
kaustubh
90e2321ac3
STRSM, DTRSM functions data prefetch
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
9 years ago
Martin Kroeker
4998e19869
Change file comments to work around clang 3.9 assembler bug
9 years ago
Martin Kroeker
91610f3835
Update zdot_msa.c
9 years ago
Martin Kroeker
6e22ecf102
Update zdot.c
9 years ago
Martin Kroeker
6221d6df5f
Update zdot.c
9 years ago
Martin Kroeker
16446d1d23
Remove explicit include of complex.h
9 years ago
Martin Kroeker
a6e9e0b94b
Remove explicit include of complex.h
9 years ago
Martin Kroeker
3178e4fea0
Remove explicit include of complex.h
9 years ago
Martin Kroeker
95c245ddb0
Remove explicit include of complex.h
9 years ago
Martin Kroeker
4b1b27347f
Remove explicit include of complex.h
9 years ago
Shivraj Patil
54747fe24a
DGEMM function split and data prefetch
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
515bc56ea9
Refs #946. Use nrm2 reference implementation for Power8.
9 years ago
Zhang Xianyi
ae70b916f4
Refs #929. Deal with zero and NaNs for scale.
9 years ago
Shivraj Patil
9687437928
MIPS n32 ABI and build time mips simd support check
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Shivraj Patil
d1c6469283
MIPS n32 ABI support, MSA support detection and rename ARCH, ARCHFLAGS
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Ashwin Sekhar T K
c54a29bb48
Cortex A57: Improvements to DGEMM 8x4 kernel
9 years ago
Shivraj Patil
beb1d076a4
Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
8a592ee386
Merge pull request #924 from ashwinyes/develop_aarch64_improvements_20160714
Improvements to Aarch64 kernels
9 years ago
Ashwin Sekhar T K
0a5ff9f9f9
Improvements to TRMM and GEMM kernels
9 years ago
Ashwin Sekhar T K
8a40f1355e
Improvements to GEMV kernels
9 years ago
Ashwin Sekhar T K
78782485b6
Improvements to COPY and IAMAX kernels
9 years ago
Shivraj Patil
57df7956ee
Added CGEMM, ZGEMM, STRMM, DTRMM, CTRMM, ZTRMM. Updated macros in SGEMM, DGEMM, STRMM.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
4a30a2584a
Merge pull request #897 from ksraste/develop
STRSM optimized for MSA
9 years ago
Werner Saar
f04af36ad0
Merge pull request #898 from wernsaar/develop
added experimental support for optimized lapack fortran functions
9 years ago
Kaustubh Raste
011431b9d7
STRSM optimized for MSA
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
9 years ago
Kaustubh Raste
c8a7860eb3
STRSM optimized
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
9 years ago
Zhang Xianyi
2daad2bcb5
Merge pull request #893 from biddisco/develop
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PRO…
9 years ago
John Biddiscombe
053044ae4d
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR
If OpenBLAS is built using add_subdirectory(OpenBlas) as part of another project
then the paths set by CMAKE_XXX_DIR are relative to the parent project
and not the OpenBLAS project.
9 years ago
Aleksey Kuleshov
fca66262c4
mips64/axpy: fix error when INCY == 0
9 years ago
Werner Saar
412bcd187a
optimized dtrsm_logic_LT_16x4_power8.S and dtrsm_macros_LT_16x4_power8.S
9 years ago
Werner Saar
bd06b246cc
Merge pull request #890 from wernsaar/develop
optimized dtrsm_kernel_LT for POWER8
9 years ago
Werner Saar
8b140220c8
optimized dtrsm_kernel_LT for POWER8
9 years ago
Werner Saar
8fb5a1aaff
added optimized dtrsm_LT kernel for POWER8
9 years ago
Kaustubh Raste
ad9f317870
STRSM optimization for MIPS P5600 and I6400 using MSA
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
9 years ago
Shivraj Patil
c4ba40e308
SGEMM optimization for MIPS P5600 and I6400 using MSA. Unrolled k loop in DGEMM kernel function
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
7a19065369
Merge pull request #878 from ksraste/develop
DTRSM bug fix for MIPS P5600 and I6400
9 years ago
Werner Saar
6a2bde7a2d
optimized dgemm and dgetrf for POWER8
9 years ago
Kaustubh Raste
d7cbc7ac13
DTRSM bug fix for MIPS P5600 and I6400
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
9 years ago
Werner Saar
88011f625d
Merge pull request #876 from wernsaar/develop
optimized dgemm on power8 for 20 threads
9 years ago
Werner Saar
8310d4d3f7
optimized dgemm for 20 threads
9 years ago
Kaustubh Raste
edb5980c13
DTRSM optimization for MIPS P5600 and I6400 using MSA
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
9 years ago
Shivraj Patil
085cf236c2
conflict resolved by syncing with 'xianyi:develop'
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Shivraj Patil
b7b3d8ec8e
DGEMM optimization for MIPS P5600 and I6400 using MSA
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
cd7af5260a
Merge pull request #847 from sva-img/develop
MIPS P5600 (32-bit) and I6400 (64-bit) cores support added.
9 years ago
Werner Saar
56948dbf0f
optimized dgemm for POWER8
9 years ago