Chen, Guobing a7b1f9b1bb Implementation of BF16 based gemv 5 years ago
KERNEL Implementation of BF16 based gemv 4 years ago
KERNEL.ATOM Remove all trailing whitespace except lapack-netlib 11 years ago
KERNEL.BARCELONA Bugfix for ztrmv 9 years ago
KERNEL.BOBCAT Fixed #395. Enable optimized cgemm for Sandybridge. Added optimized sdot kernel. 11 years ago
KERNEL.BULLDOZER Add trivially optimized dsdot based on sdot 7 years ago
KERNEL.COOPERLAKE Enable COOPERLAKE build target 5 years ago
KERNEL.CORE2 Remove all trailing whitespace except lapack-netlib 11 years ago
KERNEL.DUNNINGTON Remove all trailing whitespace except lapack-netlib 11 years ago
KERNEL.EXCAVATOR Add trivially optimized dsdot based on sdot 7 years ago
KERNEL.HASWELL Implementation of dasum, sasum with AVX2 & AVX512 intrinsic 5 years ago
KERNEL.NANO Remove all trailing whitespace except lapack-netlib 11 years ago
KERNEL.NEHALEM Add trivially optimized dsdot based on sdot 7 years ago
KERNEL.OPTERON Remove all trailing whitespace except lapack-netlib 11 years ago
KERNEL.OPTERON_SSE3 Fixed #395. Enable optimized cgemm for Sandybridge. Added optimized sdot kernel. 11 years ago
KERNEL.PENRYN Remove all trailing whitespace except lapack-netlib 11 years ago
KERNEL.PILEDRIVER Add trivially optimized dsdot based on sdot 7 years ago
KERNEL.PRESCOTT fallback to zgemm_kernel_4x2_sse.S 11 years ago
KERNEL.SANDYBRIDGE Add trivially optimized dsdot based on sdot 7 years ago
KERNEL.SKYLAKEX AVX512 dgemm tcopy_16 function 5 years ago
KERNEL.STEAMROLLER Add trivially optimized dsdot based on sdot 7 years ago
KERNEL.ZEN Update KERNEL.ZEN 5 years ago
KERNEL.generic Add ?sum definitions for generic kernel 6 years ago
Makefile Import GotoBLAS2 1.13 BSD version codes. 14 years ago
amax.S use emms instead, add WIN guards 5 years ago
amax_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
amax_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
amax_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
asum.S use emms instead, add WIN guards 5 years ago
asum_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
asum_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
asum_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
axpy.S Remove all trailing whitespace except lapack-netlib 11 years ago
axpy_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
axpy_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
axpy_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
bf16_common_macros.h Implementation of BF16 based gemv 4 years ago
bf16to.c Add bfloat16 based dot and conversion with single/double 5 years ago
builtin_stinit.S Remove all trailing whitespace except lapack-netlib 11 years ago
cabs.S Remove all trailing whitespace except lapack-netlib 11 years ago
caxpy.c Enable COOPERLAKE build target 5 years ago
caxpy_microk_bulldozer-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
caxpy_microk_haswell-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
caxpy_microk_sandy-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
caxpy_microk_steamroller-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
cdot.c Enable COOPERLAKE build target 5 years ago
cdot_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
cdot_microk_haswell-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
cdot_microk_sandy-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
cdot_microk_steamroller-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
cgemm3m_kernel_8x4_haswell.c Update cgemm3m_kernel_8x4_haswell.c 5 years ago
cgemm_kernel_4x2_bulldozer.S bugfix for bulldozer cgemm-, zgemm- and zgemv-kernel 11 years ago
cgemm_kernel_4x2_piledriver.S bugfix for piledriver cgemm-, zgemm- and zgemv-kernel 11 years ago
cgemm_kernel_4x8_sandy.S Update organization info. 11 years ago
cgemm_kernel_8x2_haswell.S modification for clang compiler 11 years ago
cgemm_kernel_8x2_haswell.c Update cgemm_kernel_8x2_haswell.c 5 years ago
cgemm_kernel_8x2_sandy.S optimization of sandybridge cgemm-kernel 11 years ago
cgemm_kernel_8x2_skylakex.c AVX512 CGEMM & ZGEMM kernels 5 years ago
cgemv_n.S Remove all trailing whitespace except lapack-netlib 11 years ago
cgemv_n_4.c Enable COOPERLAKE build target 5 years ago
cgemv_n_microk_bulldozer-4.c Tag %1 and %2 as both input and output operands 7 years ago
cgemv_n_microk_haswell-4.c Tag %1 and %2 as both input and output 7 years ago
cgemv_t.S Remove all trailing whitespace except lapack-netlib 11 years ago
cgemv_t_4.c Enable COOPERLAKE build target 5 years ago
cgemv_t_microk_bulldozer-4.c Tag %1 and %2 as both input and output operands 7 years ago
cgemv_t_microk_haswell-4.c Tag %1 and %2 as both input and output 7 years ago
copy.S Remove all trailing whitespace except lapack-netlib 11 years ago
copy_sse.S Import GotoBLAS2 1.13 BSD version codes. 14 years ago
copy_sse2.S Convert aligned moves to unaligned 5 years ago
cscal.c Enable COOPERLAKE build target 5 years ago
cscal_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 6 years ago
cscal_microk_haswell-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 6 years ago
cscal_microk_steamroller-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 6 years ago
ctrsm_kernel_LN_bulldozer.c added optimized trsm_kernels 9 years ago
ctrsm_kernel_LT_bulldozer.c added optimized trsm_kernels 9 years ago
ctrsm_kernel_RN_bulldozer.c added optimized trsm_kernels 9 years ago
ctrsm_kernel_RT_bulldozer.c added optimized trsm_kernels 9 years ago
dasum.c align to 64, using SSE when input size is small 5 years ago
dasum_microk_haswell-2.c align to 64, using SSE when input size is small 5 years ago
dasum_microk_skylakex-2.c align to 64, using SSE when input size is small 5 years ago
daxpy.c Add double precision universal intrinsics for X86/ARM 5 years ago
daxpy_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
daxpy_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
daxpy_microk_haswell-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
daxpy_microk_nehalem-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
daxpy_microk_piledriver-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
daxpy_microk_sandy-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
daxpy_microk_skylakex-2.c Add a AVX512 enabled SAXPY/DAXPY functions 7 years ago
daxpy_microk_steamroller-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
dcopy_bulldozer.S added dcopy_bulldozer.S 12 years ago
ddot.c Enable COOPERLAKE build target 5 years ago
ddot_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
ddot_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
ddot_microk_haswell-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
ddot_microk_nehalem-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
ddot_microk_piledriver-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
ddot_microk_sandy-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
ddot_microk_skylakex-2.c Add an AVX512 enabled DDOT function 7 years ago
ddot_microk_steamroller-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
dgemm_beta_skylakex.c Fix thinko in skylake beta handling 6 years ago
dgemm_kernel_4x4_haswell.S small optimization on dgemm_kernel for N=1 10 years ago
dgemm_kernel_4x8_haswell.S Add files via upload 6 years ago
dgemm_kernel_4x8_sandy.S Change file comments to work around clang 3.9 assembler bug 9 years ago
dgemm_kernel_4x8_skylakex.c Use p2align instead of align for OSX compatibility 6 years ago
dgemm_kernel_4x8_skylakex_2.c Update dgemm_kernel_4x8_skylakex_2.c 5 years ago
dgemm_kernel_6x4_piledriver.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemm_kernel_8x2_bulldozer.S Ref #380: lowered stack usage for piledriver and bulldozer kernels 11 years ago
dgemm_kernel_8x2_piledriver.S Ref #380: lowered stack usage for piledriver and bulldozer kernels 11 years ago
dgemm_kernel_8x8_skylakex.c Update dgemm_kernel_8x8_skylakex.c 6 years ago
dgemm_kernel_16x2_haswell.S Refs #330. Fixed the compatible issue with clang on Mac OSX. 11 years ago
dgemm_kernel_16x2_skylakex.S Use AVX512 also for DGEMM 7 years ago
dgemm_kernel_16x2_skylakex.c Add files via upload 5 years ago
dgemm_ncopy_2.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemm_ncopy_4.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemm_ncopy_8.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemm_ncopy_8_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemm_ncopy_8_skylakex.c Add vector optimizations for ncopy as well for dgemm/skylakex 7 years ago
dgemm_tcopy_2.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemm_tcopy_4.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemm_tcopy_8.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemm_tcopy_8_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemm_tcopy_8_skylakex.c Add optimized *copy versions for skylakex 7 years ago
dgemm_tcopy_16_skylakex.c Fix build with -Werror=return-type 5 years ago
dgemv_n.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemv_n_4.c Enable COOPERLAKE build target 5 years ago
dgemv_n_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemv_n_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemv_n_microk_haswell-4.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
dgemv_n_microk_nehalem-4.c Replace .align with .p2align in the Nehalem microkernels 7 years ago
dgemv_n_microk_piledriver-4.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
dgemv_n_microk_skylakex-4.c Add an AVX512 enabled DGEMV (n) function 7 years ago
dgemv_t.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemv_t_4.c Enable COOPERLAKE build target 5 years ago
dgemv_t_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemv_t_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemv_t_microk_haswell-4.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
dger.c optimized dger kernel for sandybridge 10 years ago
dger_microk_sandy-2.c Fix declaration of input arguments in the Sandybridge GER microkernels (#1967) 6 years ago
dot.S use emms instead, add WIN guards 5 years ago
dot_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
dot_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
dot_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
dscal.c Enable COOPERLAKE build target 5 years ago
dscal_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 6 years ago
dscal_microk_haswell-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 6 years ago
dscal_microk_sandy-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 6 years ago
dscal_microk_skylakex-2.c Add an AVX512 enabled DSCAL function 7 years ago
dsymv_L.c Enable COOPERLAKE build target 5 years ago
dsymv_L_microk_bulldozer-2.c Fix declaration of arguments in inline assembly 6 years ago
dsymv_L_microk_haswell-2.c Fix declaration of arguments in inline assembly 6 years ago
dsymv_L_microk_nehalem-2.c Fix declaration of arguments in inline assembly 6 years ago
dsymv_L_microk_sandy-2.c Fix declaration of arguments in inline assembly 6 years ago
dsymv_L_microk_skylakex-2.c Duplicate earlier Clang 9.0.0 workaround for corresponding Apple Clang version 5 years ago
dsymv_U.c Enable COOPERLAKE build target 5 years ago
dsymv_U_microk_bulldozer-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 6 years ago
dsymv_U_microk_haswell-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 6 years ago
dsymv_U_microk_nehalem-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 6 years ago
dsymv_U_microk_sandy-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 6 years ago
dtobf16_microk_cooperlake.c Add bfloat16 based dot and conversion with single/double 5 years ago
dtrmm_kernel_4x8_haswell.c Replace vpermpd with vpermilpd in the Haswell DTRMM kernel 6 years ago
dtrsm_kernel_LN_bulldozer.c Remove unused variables from Haswell dtrmm and Bulldozer dtrsm 7 years ago
dtrsm_kernel_LT_8x2_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
dtrsm_kernel_RN_8x2_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
dtrsm_kernel_RN_haswell.c Replace most vpermpd calls in the Haswell DTRSM_RN kernel 6 years ago
dtrsm_kernel_RT_bulldozer.c Fix inline assembly constraints in Bulldozer TRSM kernels 6 years ago
gemm_beta.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_2x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_4x2_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_4x4_barcelona.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_4x4_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_4x4_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_4x4_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_4x4_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_4x8_nano.S Fix crash in sgemm SSE/nano kernel on x86_64 6 years ago
gemm_kernel_4x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_8x4_barcelona.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_8x4_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_8x4_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_8x4_sse.S Fix crash in sgemm SSE/nano kernel on x86_64 6 years ago
gemm_kernel_8x4_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_ncopy_2.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_ncopy_2_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_ncopy_4.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_ncopy_4_opteron.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_tcopy_2.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_tcopy_2_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_tcopy_4.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_tcopy_4_opteron.S Remove all trailing whitespace except lapack-netlib 11 years ago
iamax.S use emms instead, add WIN guards 5 years ago
iamax_sse.S Silence a redefinition warning 5 years ago
iamax_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
izamax.S use emms instead, add WIN guards 5 years ago
izamax_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
izamax_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
lsame.S Import GotoBLAS2 1.13 BSD version codes. 14 years ago
mcount.S Import GotoBLAS2 1.13 BSD version codes. 14 years ago
nrm2.S use emms instead, add WIN guards 5 years ago
nrm2_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
qconjg.S use emms instead, add WIN guards 5 years ago
qdot.S use emms instead, add WIN guards 5 years ago
qgemm_kernel_2x2.S use emms instead, add WIN guards 5 years ago
qgemv_n.S use emms instead, add WIN guards 5 years ago
qgemv_t.S use emms instead, add WIN guards 5 years ago
qtrsm_kernel_LN_2x2.S use emms instead, add WIN guards 5 years ago
qtrsm_kernel_LT_2x2.S use emms instead, add WIN guards 5 years ago
qtrsm_kernel_RT_2x2.S use emms instead, add WIN guards 5 years ago
rot.S Remove all trailing whitespace except lapack-netlib 11 years ago
rot_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
rot_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
sasum.c align to 64, using SSE when input size is small 5 years ago
sasum_microk_haswell-2.c align to 64, using SSE when input size is small 5 years ago
sasum_microk_skylakex-2.c align to 64, using SSE when input size is small 5 years ago
saxpy.c Enable COOPERLAKE build target 5 years ago
saxpy_microk_haswell-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
saxpy_microk_nehalem-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
saxpy_microk_piledriver-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
saxpy_microk_sandy-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
saxpy_microk_skylakex-2.c Add a AVX512 enabled SAXPY/DAXPY functions 7 years ago
sbdot.c Rename "HALF" and "sh" to "BFLOAT16" and "sb" 5 years ago
sbdot_microk_cooperlake.c Rename "HALF" and "sh" to "BFLOAT16" and "sb" 5 years ago
sbgemv_n.c Implementation of BF16 based gemv 4 years ago
sbgemv_n_microk_cooperlake.c Implementation of BF16 based gemv 4 years ago
sbgemv_n_microk_cooperlake_template.c Implementation of BF16 based gemv 4 years ago
sbgemv_t.c Implementation of BF16 based gemv 4 years ago
sbgemv_t_microk_cooperlake.c Implementation of BF16 based gemv 4 years ago
sbgemv_t_microk_cooperlake_template.c Implementation of BF16 based gemv 4 years ago
scal.S Import GotoBLAS2 1.13 BSD version codes. 14 years ago
scal_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
scal_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
scal_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
sdot.c Enable COOPERLAKE build target 5 years ago
sdot_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
sdot_microk_haswell-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
sdot_microk_nehalem-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
sdot_microk_sandy-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
sdot_microk_skylakex-2.c Fix typo in sdot function 7 years ago
sdot_microk_steamroller-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
sgemm_beta_skylakex.c Fix thinko in skylake beta handling 6 years ago
sgemm_direct_performant.c [WIP] Refactor the driver code for direct SGEMM (#2782) 5 years ago
sgemm_direct_skylakex.c sgemm_direct_skylakex: fix 75eeb26 regression. 5 years ago
sgemm_kernel_8x4_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
sgemm_kernel_8x4_haswell.c Update sgemm_kernel_8x4_haswell.c 5 years ago
sgemm_kernel_8x4_haswell_2.c Strip UTF8 byte order marker from source 5 years ago
sgemm_kernel_8x8_sandy.S Update organization info. 11 years ago
sgemm_kernel_16x2_bulldozer.S Ref #380: lowered stack usage for piledriver and bulldozer kernels 11 years ago
sgemm_kernel_16x2_piledriver.S Ref #380: lowered stack usage for piledriver and bulldozer kernels 11 years ago
sgemm_kernel_16x4_haswell.S modification for clang compiler 11 years ago
sgemm_kernel_16x4_sandy.S Refs #535. Fix the wrong vector instruction in sgemm sandy bridge kernel. 10 years ago
sgemm_kernel_16x4_skylakex.S Use AVX512 also for DGEMM 7 years ago
sgemm_kernel_16x4_skylakex.c make skylakex sgemm code more friendly for readers 5 years ago
sgemm_kernel_16x4_skylakex_2.c AVX512 STRMM kernel 5 years ago
sgemm_kernel_16x4_skylakex_3.c [WIP] Refactor the driver code for direct SGEMM (#2782) 5 years ago
sgemm_ncopy_4_skylakex.c Use sgemm_ncopy_4_skylakex.c also for Haswell 6 years ago
sgemm_tcopy_16_skylakex.c Add a C+intrinsics version of the SGEMM/skylakex kernel 7 years ago
sgemv_n.S Remove all trailing whitespace except lapack-netlib 11 years ago
sgemv_n.c removed obsolete gemv kernel files 11 years ago
sgemv_n_4.c Enable COOPERLAKE build target 5 years ago
sgemv_n_microk_bulldozer-4.c Fix inline assembly constraints 6 years ago
sgemv_n_microk_haswell-4.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
sgemv_n_microk_nehalem-4.c Fix inline assembly constraints 6 years ago
sgemv_n_microk_sandy-4.c Fix inline assembly constraints 6 years ago
sgemv_t.S Remove all trailing whitespace except lapack-netlib 11 years ago
sgemv_t.c removed obsolete gemv kernel files 11 years ago
sgemv_t_4.c Enable COOPERLAKE build target 5 years ago
sgemv_t_microk_bulldozer-4.c Tag %1 and %2 as both input and output operands 7 years ago
sgemv_t_microk_haswell-4.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
sgemv_t_microk_nehalem-4.c Replace .align with .p2align in the Nehalem microkernels 7 years ago
sgemv_t_microk_sandy-4.c Use .p2align instead of .align for compatibility on Sandybridge as well 7 years ago
sger.c added optimized sger kernel for sandybridge 10 years ago
sger_microk_sandy-2.c Fix declaration of input arguments in the Sandybridge GER microkernels (#1967) 6 years ago
ssymv_L.c Enable COOPERLAKE build target 5 years ago
ssymv_L_microk_bulldozer-2.c Fix declaration of arguments in inline assembly 6 years ago
ssymv_L_microk_haswell-2.c Fix declaration of arguments in inline assembly 6 years ago
ssymv_L_microk_nehalem-2.c Fix declaration of arguments in inline assembly 6 years ago
ssymv_L_microk_sandy-2.c Fix declaration of arguments in inline assembly 6 years ago
ssymv_U.c Enable COOPERLAKE build target 5 years ago
ssymv_U_microk_bulldozer-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 6 years ago
ssymv_U_microk_haswell-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 6 years ago
ssymv_U_microk_nehalem-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 6 years ago
ssymv_U_microk_sandy-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 6 years ago
staticbuffer.S Import GotoBLAS2 1.13 BSD version codes. 14 years ago
stobf16_microk_cooperlake.c Add bfloat16 based dot and conversion with single/double 5 years ago
strsm_kernel_8x4_haswell_LN.c Strip UTF8 byte order marker from source 5 years ago
strsm_kernel_8x4_haswell_LT.c AVX2 STRSM kernel 5 years ago
strsm_kernel_8x4_haswell_L_common.h Strip UTF8 byte order marker from source 5 years ago
strsm_kernel_8x4_haswell_RN.c AVX2 STRSM kernel 5 years ago
strsm_kernel_8x4_haswell_RT.c AVX2 STRSM kernel 5 years ago
strsm_kernel_8x4_haswell_R_common.h AVX2 STRSM kernel 5 years ago
strsm_kernel_LN_bulldozer.c Fix inline assembly constraints in Bulldozer TRSM kernels 6 years ago
strsm_kernel_LT_bulldozer.c Fix inline assembly constraints in Bulldozer TRSM kernels 6 years ago
strsm_kernel_RN_bulldozer.c Fix inline assembly constraints in Bulldozer TRSM kernels 6 years ago
strsm_kernel_RT_bulldozer.c Fix inline assembly constraints in Bulldozer TRSM kernels 6 years ago
sum.S use emms instead, add WIN guards 5 years ago
swap.S Remove all trailing whitespace except lapack-netlib 11 years ago
swap_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
swap_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
symv_L_sse.S Enable COOPERLAKE build target 5 years ago
symv_L_sse2.S Enable COOPERLAKE build target 5 years ago
symv_U_sse.S Enable COOPERLAKE build target 5 years ago
symv_U_sse2.S Enable COOPERLAKE build target 5 years ago
tobf16.c Add bfloat16 based dot and conversion with single/double 5 years ago
trsm_kernel_LN_2x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LN_4x2_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LN_4x4_barcelona.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LN_4x4_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LN_4x4_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LN_4x4_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LN_4x4_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LN_4x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LN_8x4_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LT_2x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LT_4x2_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LT_4x4_barcelona.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LT_4x4_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LT_4x4_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LT_4x4_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LT_4x4_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LT_4x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LT_8x4_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_RT_2x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_RT_4x2_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_RT_4x4_barcelona.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_RT_4x4_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_RT_4x4_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_RT_4x4_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_RT_4x4_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_RT_4x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_RT_8x4_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
xdot.S use emms instead, add WIN guards 5 years ago
xgemm3m_kernel_2x2.S use emms instead, add WIN guards 5 years ago
xgemm_kernel_1x1.S use emms instead, add WIN guards 5 years ago
xgemv_n.S use emms instead, add WIN guards 5 years ago
xgemv_t.S use emms instead, add WIN guards 5 years ago
xtrsm_kernel_LT_1x1.S use emms instead, add WIN guards 5 years ago
zamax.S use emms instead, add WIN guards 5 years ago
zamax_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
zamax_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zamax_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zasum.S use emms instead, add WIN guards 5 years ago
zasum_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
zasum_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zasum_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zaxpy.S Remove all trailing whitespace except lapack-netlib 11 years ago
zaxpy.c Enable COOPERLAKE build target 5 years ago
zaxpy_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
zaxpy_microk_bulldozer-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
zaxpy_microk_haswell-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
zaxpy_microk_sandy-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
zaxpy_microk_steamroller-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
zaxpy_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zaxpy_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zcopy.S Remove all trailing whitespace except lapack-netlib 11 years ago
zcopy_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zcopy_sse2.S Import GotoBLAS2 1.13 BSD version codes. 14 years ago
zdot.S use emms instead, add WIN guards 5 years ago
zdot.c Fix missing dummy parameter (imag part of alpha) of zdot_thread_function 5 years ago
zdot_atom.S Import GotoBLAS2 1.13 BSD version codes. 14 years ago
zdot_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
zdot_microk_haswell-2.c Replace vpermpd with vpermilpd 6 years ago
zdot_microk_sandy-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
zdot_microk_steamroller-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 6 years ago
zdot_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zdot_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_2x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_4x2_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_4x4_barcelona.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_4x4_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_4x4_haswell.c Update zgemm3m_kernel_4x4_haswell.c 5 years ago
zgemm3m_kernel_4x4_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_4x4_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_4x4_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_4x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_8x4_barcelona.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_8x4_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_8x4_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_8x4_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_8x4_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_beta.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_1x4_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_2x1_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_2x2_barcelona.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_2x2_bulldozer.S bugfix for bulldozer cgemm-, zgemm- and zgemv-kernel 11 years ago
zgemm_kernel_2x2_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_2x2_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_2x2_piledriver.S bugfix for piledriver cgemm-, zgemm- and zgemv-kernel 11 years ago
zgemm_kernel_2x2_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_2x2_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_2x4_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_4x2_barcelona.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_4x2_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_4x2_haswell.S modification for clang compiler 11 years ago
zgemm_kernel_4x2_haswell.c Update zgemm_kernel_4x2_haswell.c 5 years ago
zgemm_kernel_4x2_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_4x2_skylakex.c AVX512 CGEMM & ZGEMM kernels 5 years ago
zgemm_kernel_4x2_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_4x2_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_4x4_sandy.S Update organization info. 11 years ago
zgemm_ncopy_1.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_ncopy_2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_tcopy_1.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_tcopy_2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemv_n.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemv_n_4.c Enable COOPERLAKE build target 5 years ago
zgemv_n_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemv_n_dup.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemv_n_microk_bulldozer-4.c Tag %1 and %2 as both input and output operands 7 years ago
zgemv_n_microk_haswell-4.c Tag %1 and %2 as both input and output 7 years ago
zgemv_n_microk_sandy-4.c Use .p2align instead of .align for compatibility on Sandybridge as well 7 years ago
zgemv_t.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemv_t_4.c Enable COOPERLAKE build target 5 years ago
zgemv_t_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemv_t_dup.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemv_t_microk_bulldozer-4.c Tag %1 and %2 as both input and output operands 7 years ago
zgemv_t_microk_haswell-4.c Tag %1 and %2 as both input and output 7 years ago
znrm2.S use emms instead, add WIN guards 5 years ago
znrm2_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zrot.S Remove all trailing whitespace except lapack-netlib 11 years ago
zrot_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zrot_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zscal.S use emms instead, add WIN guards 5 years ago
zscal.c Enable COOPERLAKE build target 5 years ago
zscal_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
zscal_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 6 years ago
zscal_microk_haswell-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 6 years ago
zscal_microk_steamroller-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 6 years ago
zscal_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zscal_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zsum.S use emms instead, add WIN guards 5 years ago
zswap.S Remove all trailing whitespace except lapack-netlib 11 years ago
zswap_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zswap_sse2.S Import GotoBLAS2 1.13 BSD version codes. 14 years ago
zsymv_L_sse.S Enable COOPERLAKE build target 5 years ago
zsymv_L_sse2.S Enable COOPERLAKE build target 5 years ago
zsymv_U_sse.S Enable COOPERLAKE build target 5 years ago
zsymv_U_sse2.S Enable COOPERLAKE build target 5 years ago
ztrsm_kernel_LN_2x1_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LN_2x2_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LN_2x2_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LN_2x2_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LN_2x2_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LN_2x4_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LN_4x2_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LN_bulldozer.c added optimized trsm_kernels 9 years ago
ztrsm_kernel_LT_1x4_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LT_2x1_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LT_2x2_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LT_2x2_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LT_2x2_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LT_2x2_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LT_2x4_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LT_4x2_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LT_bulldozer.c added optimized trsm_kernels 9 years ago
ztrsm_kernel_RN_bulldozer.c added optimized trsm_kernels 9 years ago
ztrsm_kernel_RT_1x4_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_RT_2x2_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_RT_2x2_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_RT_2x2_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_RT_2x2_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_RT_2x4_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_RT_4x2_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_RT_bulldozer.c added optimized trsm_kernels 9 years ago