|
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687 |
- RELAPACK Configuration
- ======================
-
- ReLAPACK has two configuration files: `make.inc`, which is included by the
- Makefile, and `config.h` which is included in the source files.
-
-
- Build and Testing Environment
- -----------------------------
- The build environment (compiler and flags) and the test configuration (linker
- flags for BLAS and LAPACK) are specified in `make.inc`. The test matrix size
- and error bounds are defined in `test/config.h`.
-
- The library `librelapack.a` is compiled by invoking `make`. The tests are
- performed by either `make test` or calling `make` in the test folder.
-
-
- BLAS/LAPACK complex function interfaces
- ---------------------------------------
- For BLAS and LAPACK functions that return a complex number, there exist two
- conflicting (FORTRAN compiler dependent) calling conventions: either the result
- is returned as a `struct` of two floating point numbers or an additional first
- argument with a pointer to such a `struct` is used. By default ReLAPACK uses
- the former (which is what gfortran uses), but it can switch to the latter by
- setting `COMPLEX_FUNCTIONS_AS_ROUTINES` (or explicitly the BLAS and LAPACK
- specific counterparts) to `1` in `config.h`.
-
- **For MKL, `COMPLEX_FUNCTIONS_AS_ROUTINES` must be set to `1`.**
-
- (Using the wrong convention will break `ctrsyl` and `ztrsyl` and the test cases
- will segfault or return errors on the order of 1 or larger.)
-
-
- BLAS extension `xgemmt`
- -----------------------
- The LDL decompositions require a general matrix-matrix product that updates only
- a triangular matrix called `xgemmt`. If the BLAS implementation linked against
- provides such a routine, set the flag `HAVE_XGEMMT` to `1` in `config.h`;
- otherwise, ReLAPACK uses its own recursive implementation of these kernels.
-
- `xgemmt` is provided by MKL.
-
-
- Routine Selection
- -----------------
- ReLAPACK's routines are named `RELAPACK_X` (e.g., `RELAPACK_dgetrf`). If the
- corresponding `INCLUDE_X` flag in `config.h` (e.g., `INCLUDE_DGETRF`) is set to
- `1`, ReLAPACK additionally provides a wrapper under the LAPACK name (e.g.,
- `dgetrf_`). By default, wrappers for all routines are enabled.
-
-
- Crossover Size
- --------------
- The crossover size determines below which matrix sizes ReLAPACK's recursive
- algorithms switch to LAPACK's unblocked routines to avoid tiny BLAS Level 3
- routines. The crossover size is set in `config.h` and can be chosen either
- globally for the entire library, by operation, or individually by routine.
-
-
- Allowing Temporary Buffers
- --------------------------
- Two of ReLAPACK's routines make use of temporary buffers, which are allocated
- and freed within ReLAPACK. Setting `ALLOW_MALLOC` (or one of the routine
- specific counterparts) to 0 in `config.h` will disable these buffers. The
- affected routines are:
-
- * `xsytrf`: The LDL decomposition requires a buffer of size n^2 / 2. As in
- LAPACK, this size can be queried by setting `lWork = -1` and the passed
- buffer will be used if it is large enough; only if it is not, a local buffer
- will be allocated.
-
- The advantage of this mechanism is that ReLAPACK will seamlessly work even
- with codes that statically provide too little memory instead of breaking
- them.
-
- * `xsygst`: The reduction of a real symmetric-definite generalized eigenproblem
- to standard form can use an auxiliary buffer of size n^2 / 2 to avoid
- redundant computations. It thereby performs about 30% less FLOPs than
- LAPACK.
-
-
- FORTRAN symbol names
- --------------------
- ReLAPACK is commonly linked to BLAS and LAPACK with standard FORTRAN interfaces.
- Since these libraries usually have an underscore to their symbol names, ReLAPACK
- has configuration switches in `config.h` to adjust the corresponding routine
- names.
|