@@ -1,4 +1,77 @@
OpenBLAS ChangeLog
====================================================================
Version 0.3.10
14-Jun-2020

common:
* Improved thread locking behaviour in blas_server and parallel getrf
* Imported bugfix 394 from LAPACK (spurious reference to "XERBL"
  due to overlong lines)
* Imported bugfix 403 from LAPACK (compile option "recursive" required
  for correctness with Intel and PGI)
* Imported bugfix 408 from LAPACK (wrong scaling in ZHEEQUB)
* Imported bugfix 411 from LAPACK (infinite loop in LARGV/LARTG/LARTGP)
* Fixed mismatches between BUFFERSIZE and GEMM_UNROLL parameters that
  could lead to crashes at large matrix sizes
* Restored internal soname in dynamic libraries on FreeBSD and Dragonfly
* Added API (openblas_setaffinity) to set the thread affinity on Linux
* Added initial infrastructure for half-precision floating point
  (bfloat16) support with a generic implementation of SHGEMM
* Added CMAKE build system support for building the cblas_Xgemm3m
  functions
* Fixed CMAKE support for building in a path with embedded spaces
* Fixed CMAKE (non)handling of NO_EXPRECISION and MAX_STACK_ALLOC
* Fixed GCC version detection in the Makefiles
* Allowed overriding the names of AR, AS and LD in Makefile builds
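As a rough illustration of the new openblas_setaffinity entry point listed above, the sketch below pins one OpenBLAS worker thread to a single core. It assumes the Linux-only prototype `int openblas_setaffinity(int thread_idx, size_t cpusetsize, cpu_set_t *cpu_set)` from cblas.h and assumes a return value of 0 on success. The helper names `single_cpu_set` and `pin_blas_thread` and the macro `USE_OPENBLAS_AFFINITY` are hypothetical and exist only for this example.

```c
#define _GNU_SOURCE          /* needed for cpu_set_t and the CPU_* macros */
#include <sched.h>
#include <stddef.h>

#ifdef USE_OPENBLAS_AFFINITY /* hypothetical macro: define it when linking against OpenBLAS */
#include <cblas.h>           /* assumed to declare openblas_setaffinity on Linux */
#endif

/* Build a CPU set that contains exactly one core. */
static cpu_set_t single_cpu_set(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return set;
}

/* Pin OpenBLAS worker thread `thread_idx` to core `cpu`.
 * Returns 0 on success (assumed), nonzero on failure,
 * and -1 when built without USE_OPENBLAS_AFFINITY. */
static int pin_blas_thread(int thread_idx, int cpu)
{
    cpu_set_t set = single_cpu_set(cpu);
#ifdef USE_OPENBLAS_AFFINITY
    return openblas_setaffinity(thread_idx, sizeof(set), &set);
#else
    (void)thread_idx;
    (void)set;
    return -1;
#endif
}
```

To exercise the real call, build with -DUSE_OPENBLAS_AFFINITY and link against OpenBLAS (typically -lopenblas). The thread index is assumed to range from 0 to openblas_get_num_threads()-1.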

POWER:
* Fixed big-endian POWER8 ELFv2 builds on FreeBSD
* Fixed GCC version checks and DYNAMIC_ARCH builds on POWER9
* Fixed CMAKE build support for POWER9
* Fixed a potential race condition in the thread buffer allocation
* Worked around LAPACK test failures on PPC G4

MIPS:
* Fixed a potential race condition in the thread buffer allocation
* Added support for MIPS 24K/24KE family based on P5600 kernels

MIPS64:
* Fixed a potential race condition in the thread buffer allocation
* Added TARGET=GENERIC

ARMV7:
* Fixed a race condition in the thread buffer allocation

ARMV8:
* Fixed a race condition in the thread buffer allocation
* Fixed zero initialisation in the assembly for SGEMM and DGEMM BETA
* Improved performance of the ThunderX2 DAXPY kernel
* Added an optimized SGEMM kernel for Cortex A53
* Fixed Makefile support for INTERFACE64 (8-byte integer)

x86_64:
* Fixed a syntax error in the CMAKE setup for SkylakeX
* Improved performance of STRSM on Haswell, SkylakeX and Ryzen
* Improved SGEMM performance for workloads with ldc a
  multiple of 1024
* Improved DGEMM performance on SkylakeX
* Fixed unwanted AVX512-dependency of SGEMM in DYNAMIC_ARCH
  builds created on SkylakeX
* Removed data alignment requirement in the SSE2 copy kernels
  that could cause spurious crashes
* Added a workaround for an optimizer bug in AppleClang 11.0.3
* Fixed LAPACK test failures due to wrong options for Intel Fortran
* Fixed compilation and LAPACK test results with recent Flang
  and AMD AOCC
* Fixed DYNAMIC_ARCH builds with CMAKE on OS X
* Fixed missing exports of cblas_i?amin, cblas_i?min, cblas_i?max,
  cblas_?sum, cblas_?gemm3m in the shared library on OS X
* Fixed reporting of CPU name in DYNAMIC_ARCH builds (would sometimes
  show the name of an older generation chip supported by the same kernels)

IBM Z:
* Improved performance of SGEMM/STRMM and DGEMM/DTRMM on Z14

====================================================================
Version 0.3.9
1-Mar-2020