Update Changelog for 0.3.22 (#3964)

2 years ago · c05da5960d
--- a/Changelog.txt
+++ b/Changelog.txt
@@ -1,4 +1,80 @@
 OpenBLAS ChangeLog
 ====================================================================
 Version 0.3.22
 26-Mar-2023

 general:
 - Updated the included LAPACK to Reference-LAPACK release 3.11.0
   plus post-release corrections and improvements
 - Added initial support for processing with the EMSCRIPTEN JavaScript
   converter (yielding a single-threaded build only)
 - Added a threshold for multithreading in SYMM, SYMV and SYR2K
 - Increased the threshold for multithreading in SYRK
 - OpenBLAS no longer decreases the global OMP_NUM_THREADS when that
   setting exceeds the maximum thread count the library was compiled for
 - fixed ?GETF2 potentially returning NaN with tiny matrix elements
 - fixed openblas_set_num_threads to work in USE_OPENMP builds
 - fixed cpu core counting in USE_OPENMP builds returning the number
   of OMP "places" rather than cores
 - fixed interpretation of USE_PERL=0 in build scripts
 - fixed linking of the library with libm in CMAKE builds
 - fixed startup delays resulting from a wrong default setting of
   NO_WARMUP in CMAKE builds
 - fixed inconsistent defaults for overriding of LAPACK SPMV, SPR,
   SYMV, SYR functions in gmake and CMAKE builds
 - fixed stride calculation in the optimized small-matrix path of
   complex SYR
 - fixed compilation of ReLAPACK with CMAKE
 - fixed pkgconfig file contents for INTERFACE64 builds
 - fixed building of Reference-LAPACK with recent gfortran
 - fixed building with only a subset of precision types on Windows
 - added new environment variable OPENBLAS_DEFAULT_NUM_THREADS
 - added a GEMV-based implementation of GEMMT
 - added support for building under QNX
 - updated support for (cross-)building for ALPHA targets

 x86_64:
 - added autodetection of Intel Raptor Lake cpu models
 - added SSCAL microkernels for Haswell and newer targets
 - improved the performance of the Haswell DSCAL microkernel
 - added CSCAL and ZSCAL microkernels for SkylakeX targets
 - fixed detection of gfortran and Cray CCE compilers
 - fixed detection of recent versions of the Intel Fortran compiler
 - fixed compilation with LLVM to no longer run out of AVX512 registers
 - fixed cpu type option setting with recent NVIDIA HPC compiler versions
 - fixed compilation for/on AMD Ryzen 4 cpus
 - fixed compilation of AVX2-capable targets with Apple Clang
 - fixed runtime selection of COOPERLAKE in DYNAMIC_ARCH builds
 - worked around gcc/llvm using risky FMA operations in CSCAL/ZSCAL
 - worked around miscompilations of GEMV, SYMV and ZDOT kernels
   by gcc12's tree-vectorizer on OSX and Windows

 ARM:
 - fixed cross-compilation to ARMV5 and ARMV6 targets with CMAKE

 ARMV8:
 - fixed cross-compilation to CortexA53 with CMAKE
 - fixed compilation with CMAKE and "Arm Compiler for Linux 22.1"
 - added cpu autodetection for Cortex X3 and A715
 - fixed conditional compilation of SVE-capable targets in DYNAMIC_ARCH
 - sped up SVE kernels by removing unnecessary prefetches
 - improved the GEMM performance of Neoverse V1
 - added SVE kernels for SDOT and DDOT
 - added an SBGEMM kernel for Neoverse N2
 - improved cpu-specific compiler option selection for Neoverse cpus
 - added support for setting CONSISTENT_FPCSR

 MIPS64:
 - improved MSA capability detection and handling
 - added a MIPS64_GENERIC build target
 - fixed corner cases in DNRM2

 LOONGARCH64:
 - fixed handling of the INTERFACE64 option

 RISCV:
 - fixed handling of the INTERFACE64 option

 ====================================================================
 Version 0.3.21
 07-Aug-2022