# Guidance for redistributing OpenBLAS |
|
|
|
|
|
|
|
*This document contains recommendations only. Packagers and other
redistributors are in charge of how OpenBLAS is built and distributed in their
systems, and may have good reasons to deviate from the guidance given on this
page. These recommendations are aimed at general packaging systems whose user
base is typically large, open source (or at least freely available), does not
behave uniformly, and is not directly connected with the packager.*
|
|
|
|
|
|
|
OpenBLAS has a large number of build-time options which can be used to change
how it behaves at runtime, how artifacts or symbols are named, etc. Variation
in build configuration can be necessary to achieve a given end goal within a
distribution or as an end user. However, such variation can also make it more
difficult to build on top of OpenBLAS and ship code or other packages in a way
that works across many different distros. Here we provide guidance about the
most important build options, what effects they may have when changed, and
which ones to default to.
|
|
|
|
|
|
|
The Make and CMake build systems provide equivalent options and yield more or
less the same artifacts, but not exactly (the CMake builds are still
experimental). You can choose either one and the options will function in the
same way; however, the CMake outputs may require some renaming. To review
available build options, see `Makefile.rule` or `CMakeLists.txt` in the root of
the repository.
|
|
|
|
|
|
|
Build options typically fall into two categories: (a) options that affect the |
|
|
|
user interface, such as library and symbol names or APIs that are made |
|
|
|
available, and (b) options that affect performance and runtime behavior, such |
|
|
|
as threading behavior or CPU architecture-specific code paths. The user |
|
|
|
interface options are more important to keep aligned between distributions, |
|
|
|
while for the performance-related options there are typically more reasons to |
|
|
|
make choices that deviate from the defaults. |
|
|
|
|
|
|
|
Here are recommendations for user interface related packaging choices where it |
|
|
|
is not likely to be a good idea to deviate (typically these are the default |
|
|
|
settings): |
|
|
|
|
|
|
|
1. Include CBLAS. The CBLAS interface is widely used and it doesn't affect
   binary size much, so don't turn it off.
2. Include LAPACK and LAPACKE. The LAPACK interface is also widely used, and
   while it does make up a significant part of the binary size of the installed
   library, that does not outweigh the regression in usability when deviating
   from the default here.[^1]
3. Always distribute the pkg-config (`.pc`) and CMake (`.cmake`) dependency
   detection files. These files are used by build systems when users want to
   link against OpenBLAS, and there is no benefit to leaving them out.
4. Provide the LP64 interface by default. If you also choose to provide an
   ILP64 interface build, use a symbol suffix to avoid symbol name clashes
   (see the next section).
|
|
|
|
|
|
|
[^1]: All major distributions do include LAPACK as of mid 2023 as far as we
    know. Older versions of Arch Linux did not, and that was known to cause
    problems.
|
|
|
|
|
|
|
|
|
|
|
## ILP64 interface builds |
|
|
|
|
|
|
|
The LP64 (32-bit integer) interface is the default build, and has |
|
|
|
well-established C and Fortran APIs as determined by the reference (Netlib) |
|
|
|
BLAS and LAPACK libraries. The ILP64 (64-bit integer) interface however does |
|
|
|
not have a standard API: symbol names and shared/static library names can be |
|
|
|
produced in multiple ways, and this tends to make it difficult to use. |
|
|
|
As of today there is an agreed-upon way of choosing names for OpenBLAS between |
|
|
|
a number of key users/redistributors, which is the closest thing to a standard |
|
|
|
that there is now. However, there is an ongoing standardization effort in the |
|
|
|
reference BLAS and LAPACK libraries, which differs from the current OpenBLAS |
|
|
|
agreed-upon convention. In this section we'll aim to explain both. |
|
|
|
|
|
|
|
Those two methods are fairly similar, and have a key thing in common: *using a
symbol suffix*. This is good practice; if you distribute an ILP64 build, it is
recommended to have it use a symbol suffix containing `64` in the name.
This avoids potential symbol clashes when different packages which depend on
OpenBLAS load both an LP64 and an ILP64 library into memory at the same time.
|
|
|
|
|
|
|
### The current OpenBLAS agreed-upon ILP64 convention |
|
|
|
|
|
|
|
This convention comprises the shared library name and the symbol suffix in the |
|
|
|
shared library. The symbol suffix to use is `64_`, implying that the library |
|
|
|
name will be `libopenblas64_.so` and the symbols in that library end in `64_`. |
|
|
|
The central issue where this was discussed is |
|
|
|
[openblas#646](https://github.com/xianyi/OpenBLAS/issues/646), and adopters |
|
|
|
include Fedora, Julia, NumPy and SciPy - SuiteSparse already used it as well. |
|
|
|
|
|
|
|
To build shared and static libraries with the currently recommended ILP64 |
|
|
|
conventions with Make: |
|
|
|
```bash
$ make INTERFACE64=1 SYMBOLSUFFIX=64_
```
|
|
|
|
|
|
|
This will produce libraries named `libopenblas64_.so` and `libopenblas64_.a`,
a pkg-config file named `openblas64.pc`, and CMake and header files.
|
|
|
|
|
|
|
Installing locally and inspecting the output will show a few more details: |
|
|
|
```bash
$ make install PREFIX=$PWD/../openblas/make64 INTERFACE64=1 SYMBOLSUFFIX=64_
$ tree . # output slightly edited down
.
├── include
│   ├── cblas.h
│   ├── f77blas.h
│   ├── lapacke_config.h
│   ├── lapacke.h
│   ├── lapacke_mangling.h
│   ├── lapacke_utils.h
│   ├── lapack.h
│   └── openblas_config.h
└── lib
    ├── cmake
    │   └── openblas
    │       ├── OpenBLASConfig.cmake
    │       └── OpenBLASConfigVersion.cmake
    ├── libopenblas64_.a
    ├── libopenblas64_.so
    └── pkgconfig
        └── openblas64.pc
```
|
|
|
|
|
|
|
A key point is the symbol names. These consist of the LP64 symbol name, then
(for Fortran only) the compiler mangling, and then the `64_` symbol suffix.
Hence to obtain the final symbol names, we need to take into account which
Fortran compiler we are using. For the most common cases (e.g., gfortran, Intel
Fortran, or Flang), that means appending a single underscore. In that case, the
result is:
|
|
|
|
|
|
|
| base API name | binary symbol name | call from Fortran code | call from C code      |
|---------------|--------------------|------------------------|-----------------------|
| `dgemm`       | `dgemm_64_`        | `dgemm_64(...)`        | `dgemm_64_(...)`      |
| `cblas_dgemm` | `cblas_dgemm64_`   | n/a                    | `cblas_dgemm64_(...)` |
|
|
|
|
|
|
|
It is quite useful to have these symbol names be as uniform as possible across |
|
|
|
different packaging systems. |
|
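One way to confirm that an installed library follows this naming is to look at its exported symbols. The sketch below assumes GNU `nm`, whose output puts the symbol name in the last column, and the single-underscore Fortran mangling from the table above. The function name `check_ilp64_symbols` is a hypothetical helper, not something OpenBLAS ships.

```bash
# Hypothetical helper: read `nm` output on stdin and verify that the
# two representative `64_`-suffixed symbols from the table are present.
check_ilp64_symbols() {
    local syms name missing=0
    # The symbol name is the last whitespace-separated field of each line.
    syms="$(awk '{print $NF}')"
    for name in dgemm_64_ cblas_dgemm64_; do
        if ! printf '%s\n' "$syms" | grep -qx "$name"; then
            echo "missing symbol: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Typical use against the Make install shown earlier (path is an assumption):
#   nm -D --defined-only lib/libopenblas64_.so | check_ilp64_symbols
```

A non-zero exit status means at least one expected symbol was not found, which usually points to a build without `SYMBOLSUFFIX=64_` or to a different Fortran mangling scheme.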
|
|
|
|
|
|
The equivalent build options with CMake are: |
|
|
|
```bash
$ mkdir build && cd build
$ cmake .. -DINTERFACE64=1 -DSYMBOLSUFFIX=64_ -DBUILD_SHARED_LIBS=ON -DBUILD_STATIC_LIBS=ON
$ cmake --build . -j
```
|
|
|
|
|
|
|
Note that the result is not identical to the Make result. For example, the
library name ends in `_64` rather than `64_`. It is recommended to rename the
libraries to match the Make library names, and to update the `libsuffix` entry
in `openblas64.pc` to match that rename.
|
|
|
```bash
$ cmake --install . --prefix $PWD/../../openblas/cmake64
$ tree .
.
├── include
│   └── openblas64
│       ├── cblas.h
│       ├── f77blas.h
│       ├── lapacke_config.h
│       ├── lapacke_example_aux.h
│       ├── lapacke.h
│       ├── lapacke_mangling.h
│       ├── lapacke_utils.h
│       ├── lapack.h
│       ├── openblas64
│       │   └── lapacke_mangling.h
│       └── openblas_config.h
└── lib
    ├── cmake
    │   └── OpenBLAS64
    │       ├── OpenBLAS64Config.cmake
    │       ├── OpenBLAS64ConfigVersion.cmake
    │       ├── OpenBLAS64Targets.cmake
    │       └── OpenBLAS64Targets-noconfig.cmake
    ├── libopenblas_64.a
    ├── libopenblas_64.so -> libopenblas_64.so.0
    └── pkgconfig
        └── openblas64.pc
```
|
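The rename recommended for the CMake outputs could be scripted roughly as follows. This is a minimal sketch under stated assumptions: the install prefix has the layout shown in the tree above, the `.pc` file contains a line starting with `libsuffix=`, and the desired value of that entry is the new suffix `64_`. The function name `rename_openblas64` is hypothetical.

```bash
# Hypothetical helper: rename libopenblas_64.* to libopenblas64_.* under
# <prefix>/lib and update the libsuffix entry in openblas64.pc.
rename_openblas64() {
    local prefix="$1" old="_64" new="64_"
    local libdir="$prefix/lib" f dest target pc
    for f in "$libdir"/libopenblas"$old".*; do
        # Skip the unexpanded glob when nothing matches.
        [ -e "$f" ] || [ -L "$f" ] || continue
        dest="$libdir/$(basename "$f" | sed "s/^libopenblas$old/libopenblas$new/")"
        if [ -L "$f" ]; then
            # Recreate symlinks so they point at the renamed target.
            target="$(readlink "$f" | sed "s/libopenblas$old/libopenblas$new/")"
            ln -s "$target" "$dest" && rm "$f"
        else
            mv "$f" "$dest"
        fi
    done
    pc="$libdir/pkgconfig/openblas64.pc"
    if [ -f "$pc" ]; then
        # Assumption: the entry should carry the new library suffix.
        sed "s/^libsuffix=.*/libsuffix=$new/" "$pc" > "$pc.tmp" && mv "$pc.tmp" "$pc"
    fi
}

# Example (prefix taken from the install command above):
#   rename_openblas64 "$PWD/../../openblas/cmake64"
```

Renaming files on disk does not change the SONAME recorded inside the shared library, and this sketch leaves the generated CMake config files untouched, so both may still refer to the old name and need separate handling.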
|
|
|
|
|
|
|
|
|
|
### The upcoming standardized ILP64 convention |
|
|
|
|
|
|
|
While the `64_` convention above got some adoption, it's slightly hacky and is |
|
|
|
implemented through the use of `objcopy`. An effort is ongoing for a more |
|
|
|
broadly adopted convention in the reference BLAS and LAPACK libraries, using |
|
|
|
(a) the `_64` suffix, and (b) applying that suffix _before_ rather than after |
|
|
|
Fortran compiler mangling. The central issue for this is |
|
|
|
[lapack#666](https://github.com/Reference-LAPACK/lapack/issues/666). |
|
|
|
|
|
|
|
For the most common cases of compiler mangling (a single `_` appended), the end |
|
|
|
result will be: |
|
|
|
|
|
|
|
| base API name | binary symbol name | call from Fortran code | call from C code      |
|---------------|--------------------|------------------------|-----------------------|
| `dgemm`       | `dgemm_64_`        | `dgemm_64(...)`        | `dgemm_64_(...)`      |
| `cblas_dgemm` | `cblas_dgemm_64`   | n/a                    | `cblas_dgemm_64(...)` |
|
|
|
|
|
|
|
For other compiler mangling schemes, replace the trailing `_` by the scheme in use. |
|
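The difference between the two conventions comes down to where the suffix lands relative to the Fortran mangling. The shell sketch below makes that explicit. The function `ilp64_symbol` and its arguments are hypothetical and exist only for illustration; the fourth argument is the mangling suffix, assumed to be a single underscore unless given.

```bash
# Hypothetical helper: compose an ILP64 symbol name.
#   $1 base API name          $2 "fortran" or "c"
#   $3 convention: "64_" (current OpenBLAS) or "_64" (upcoming standard)
#   $4 Fortran mangling suffix (defaults to a single underscore)
ilp64_symbol() {
    local base="$1" kind="$2" convention="$3" mangle="${4-_}"
    [ "$kind" = "fortran" ] || mangle=""   # C symbols are not mangled
    case "$convention" in
        64_) printf '%s%s64_\n' "$base" "$mangle" ;;   # suffix after mangling
        _64) printf '%s_64%s\n' "$base" "$mangle" ;;   # suffix before mangling
        *)   echo "unknown convention: $convention" >&2; return 1 ;;
    esac
}

ilp64_symbol dgemm fortran 64_     # dgemm_64_
ilp64_symbol cblas_dgemm c 64_     # cblas_dgemm64_
ilp64_symbol dgemm fortran _64     # dgemm_64_
ilp64_symbol cblas_dgemm c _64     # cblas_dgemm_64
```

With single-underscore mangling the Fortran symbols coincide under both conventions, and only the C symbols such as `cblas_dgemm` differ.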
|
|
|
|
|
|
The shared library name for this `_64` convention should be `libopenblas_64.so`. |
|
|
|
|
|
|
|
Note: it is not yet possible to produce an OpenBLAS build which employs this |
|
|
|
convention! Once reference BLAS and LAPACK with support for `_64` have been |
|
|
|
released, a future OpenBLAS release will support it. For now, please use the |
|
|
|
older `64_` scheme and avoid using the name `libopenblas_64.so`; it should be |
|
|
|
considered reserved for future use of the `_64` standard as prescribed by |
|
|
|
reference BLAS/LAPACK. |
|
|
|
|
|
|
|
|
|
|
|
## Performance and runtime behavior related build options |
|
|
|
|
|
|
|
For these options there are multiple reasonable or common choices. |
|
|
|
|
|
|
|
### Threading related options |
|
|
|
|
|
|
|
OpenBLAS can be built as a multi-threaded or single-threaded library, with the |
|
|
|
default being multi-threaded. It's expected that the default `libopenblas` |
|
|
|
library is multi-threaded; if you'd like to also distribute single-threaded |
|
|
|
builds, consider naming them `libopenblas_sequential`. |
|
|
|
|
|
|
|
OpenBLAS can be built with pthreads or OpenMP as the threading model, with the
default being pthreads. Both options are commonly used, and the choice here
should not influence the shared library name. The choice will be captured by
the `.pc` file, for example:
|
|
|
```bash
$ pkg-config --libs openblas
-fopenmp -lopenblas

$ cat openblas.pc
...
openblas_config= ... USE_OPENMP=0 MAX_THREADS=24
```
|
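Downstream build scripts can read these settings back from the `openblas_config` variable in the `.pc` file. The sketch below shows one way to do that; the function name `openblas_config_value` is hypothetical, and it assumes the variable is a whitespace-separated list of `KEY=VALUE` tokens as in the output above.

```bash
# Hypothetical helper: print the value of KEY from a string of
# whitespace-separated KEY=VALUE tokens; return 1 if KEY is absent.
openblas_config_value() {
    local key="$1" config="$2" token
    for token in $config; do          # intentional word splitting
        case "$token" in
            "$key"=*) printf '%s\n' "${token#*=}"; return 0 ;;
        esac
    done
    return 1
}

# Self-contained demonstration with a literal string:
openblas_config_value MAX_THREADS "USE_OPENMP=0 MAX_THREADS=24"

# Against an installed package (requires pkg-config and openblas.pc):
#   config="$(pkg-config --variable=openblas_config openblas)"
#   openblas_config_value USE_OPENMP "$config"
```

A consumer could use the `USE_OPENMP` value to decide whether its own OpenMP runtime needs to match the one OpenBLAS was built with.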
|
|
|
|
|
|
The maximum number of threads users will be able to use is determined at build |
|
|
|
time by the `NUM_THREADS` build option. It defaults to 24, and there's a wide |
|
|
|
range of values that are reasonable to use (up to 256). 64 is a typical choice |
|
|
|
here; there is a memory footprint penalty that is linear in `NUM_THREADS`. |
|
|
|
Please see `Makefile.rule` for more details. |
|
|
|
|
|
|
|
### CPU architecture related options |
|
|
|
|
|
|
|
OpenBLAS contains a lot of CPU architecture-specific optimizations, hence when |
|
|
|
distributing to a user base with a variety of hardware, it is recommended to |
|
|
|
enable CPU architecture runtime detection. This will dynamically select |
|
|
|
optimized kernels for individual APIs. To do this, use the `DYNAMIC_ARCH=1` |
|
|
|
build option. This is usually done on all common CPU families, except when |
|
|
|
there are known issues. |
|
|
|
|
|
|
|
In case the CPU architecture is known (e.g. you're building binaries for macOS |
|
|
|
M1 users), it is possible to specify the target architecture directly with the |
|
|
|
`TARGET=` build option. |
|
|
|
|
|
|
|
`DYNAMIC_ARCH` and `TARGET` are covered in more detail in the main `README.md` |
|
|
|
in this repository. |
|
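As a rough illustration of how the choices in this section translate into a Make invocation, the sketch below assembles the relevant flags. The function name `openblas_arch_flags` is hypothetical, the default of 64 threads simply follows the typical value mentioned above, and `HASWELL` is only an example target name; consult `TargetList.txt` in the repository for the names that apply to your hardware.

```bash
# Hypothetical helper: print Make variables for a distribution build.
#   $1 optional TARGET name; when empty, runtime detection is enabled
#   $2 optional NUM_THREADS value (default 64)
openblas_arch_flags() {
    local target="${1:-}" num_threads="${2:-64}"
    if [ -n "$target" ]; then
        printf 'TARGET=%s NUM_THREADS=%s\n' "$target" "$num_threads"
    else
        printf 'DYNAMIC_ARCH=1 NUM_THREADS=%s\n' "$num_threads"
    fi
}

openblas_arch_flags              # DYNAMIC_ARCH=1 NUM_THREADS=64
openblas_arch_flags HASWELL 32   # TARGET=HASWELL NUM_THREADS=32

# Usage in a build script:
#   make $(openblas_arch_flags)
```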
|
|
|
|
|
|
|
|
|
|
## Real-world examples |
|
|
|
|
|
|
|
OpenBLAS is likely to be distributed in one of these distribution models: |
|
|
|
|
|
|
|
1. As a standalone package, or multiple packages, in a packaging ecosystem like
   a Linux distro, Homebrew, conda-forge or MSYS2.
2. Vendored as part of a larger package, e.g. in Julia, NumPy, SciPy, or R.
3. Locally, e.g. as a build made available on a single HPC cluster.
|
|
|
|
|
|
|
The guidance on this page is most important for models (1) and (2). These links |
|
|
|
to build recipes for a representative selection of packaging systems may be |
|
|
|
helpful as a reference: |
|
|
|
|
|
|
|
- [Fedora](https://src.fedoraproject.org/rpms/openblas/blob/rawhide/f/openblas.spec)
- [Debian](https://salsa.debian.org/science-team/openblas/-/blob/master/debian/rules)
- [Homebrew](https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/openblas.rb)
- [MSYS2](https://github.com/msys2/MINGW-packages/blob/master/mingw-w64-openblas/PKGBUILD)
- [conda-forge](https://github.com/conda-forge/openblas-feedstock/blob/main/recipe/build.sh)
- [NumPy/SciPy](https://github.com/MacPython/openblas-libs/blob/main/tools/build_openblas.sh)
- [Nixpkgs](https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/libraries/science/math/openblas/default.nix)