
getarch.c 68 kB

Simplifying ARMv8 build parameters

ARMv8 builds were a bit mixed up, with ThunderX2 code in ARMv8 mode (which is not right, because TX2 is ARMv8.1), and they required a few redundancies in the defines, making the code harder to maintain and making it harder to understand which core has what. A few other minor issues were also fixed.

Tests were made on the following cores: A53, A57, A72, Falkor, ThunderX, ThunderX2, and XGene.

Tests were: OpenBLAS/test, OpenBLAS/benchmark, BLAS-Tester.

A summary:

 * Removed TX2 code from the ARMv8 build, to make sure it is compatible with all ARMv8 cores, not just v8.1. Also, the TX2 code has actually harmed performance on big cores.
 * Consolidated the ARMv8 architectures' defines in params.h, to make sure that all cores benefit from the ARMv8 settings in addition to their own.
 * Added a few more cores, using ARMv8's include strategy, to benefit from compiler optimisations using mtune. Also updated cache information from the manuals, making sure we set good conservative values by default. Removed Vulcan, as it's an alias to TX2.
 * Auto-detecting most of those cores, but also updating the forced compilation in getarch.c, to make sure the parameters are the same whether compiled natively or with a forced arch.

Benefits:

 * The ARMv8 build is now guaranteed to work on all ARMv8 cores.
 * Improved performance for ARMv8 builds on some cores (A72, Falkor, ThunderX1 and 2: up to 11%) over current develop.
 * Improved performance for *all* cores compared to the develop branch before TX2's patch (9% ~ 36%).
 * ThunderX1 builds are 14% faster than ARMv8 on TX1, 9% faster than the current develop branch and 8% faster than develop before the TX2 patches.

Issues:

 * Regression from the current develop branch for A53 (-12%) and A57 (-3%) with ARMv8 builds, but still faster than before TX2's commit (+15% and +24% respectively). This can be improved with a simplification of TX2's code, to be done in future patches. At least the code is guaranteed to be ARMv8.0 now.

Comments:

 * CortexA57 builds are unchanged on A57 hardware from the develop branch, which makes sense, as it's untouched.
 * CortexA72 builds improve over A57 on A72 hardware, even though they use the same includes, due to new compiler tuning in the makefile.
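The last summary item, keeping the parameters identical whether the core is auto-detected or forced, might be approached with a single lookup table consulted by both paths. The following is only a minimal sketch of that idea, not the code in getarch.c: the struct, the function names (`params_for`, `params_forced`, `params_native`) and all cache values are hypothetical placeholders chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of per-core build parameters (illustrative fields only). */
struct core_params {
    const char *name;
    int l1d_size;   /* L1 data cache size in bytes */
    int l1d_line;   /* L1 data cache line size in bytes */
    int l2_size;    /* L2 cache size in bytes */
};

/* One table shared by both paths. The values are placeholders, not taken
 * from getarch.c. The first entry is the generic ARMv8 fallback. */
static const struct core_params core_table[] = {
    { "ARMV8",     32768,  64,   262144 },
    { "CORTEXA57", 32768,  64,  2097152 },
    { "THUNDERX",  32768, 128, 16777216 },
};

/* Look up a core by name; unknown names fall back to the generic entry. */
const struct core_params *params_for(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof core_table / sizeof core_table[0]; i++) {
        if (strcmp(core_table[i].name, name) == 0)
            return &core_table[i];
    }
    return &core_table[0];
}

/* Forced path: the core name is supplied by the user at build time. */
const struct core_params *params_forced(const char *forced_name)
{
    return params_for(forced_name);
}

/* Native path: the name would come from hardware detection, which is
 * not modelled here; detected_name stands in for its result. */
const struct core_params *params_native(const char *detected_name)
{
    return params_for(detected_name);
}
```

Because both entry points resolve through the same table, a forced build and a native build of the same core cannot drift apart, and a core that is not listed still receives the conservative generic settings.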
/*****************************************************************************
Copyright (c) 2011-2014, The OpenBLAS Project
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in
the documentation and/or other materials provided with the
distribution.
3. Neither the name of the OpenBLAS project nor the names of
its contributors may be used to endorse or promote products
derived from this software without specific prior written
permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
**********************************************************************************/

/*********************************************************************/
/* Copyright 2009, 2010 The University of Texas at Austin. */
/* All rights reserved. */
/* */
/* Redistribution and use in source and binary forms, with or */
/* without modification, are permitted provided that the following */
/* conditions are met: */
/* */
/* 1. Redistributions of source code must retain the above */
/* copyright notice, this list of conditions and the following */
/* disclaimer. */
/* */
/* 2. Redistributions in binary form must reproduce the above */
/* copyright notice, this list of conditions and the following */
/* disclaimer in the documentation and/or other materials */
/* provided with the distribution. */
/* */
/* THIS SOFTWARE IS PROVIDED BY THE UNIVERSITY OF TEXAS AT */
/* AUSTIN ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, */
/* INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF */
/* MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE */
/* DISCLAIMED. IN NO EVENT SHALL THE UNIVERSITY OF TEXAS AT */
/* AUSTIN OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, */
/* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES */
/* (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE */
/* GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR */
/* BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF */
/* LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT */
/* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT */
/* OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE */
/* POSSIBILITY OF SUCH DAMAGE. */
/* */
/* The views and conclusions contained in the software and */
/* documentation are those of the authors and should not be */
/* interpreted as representing official policies, either expressed */
/* or implied, of The University of Texas at Austin. */
/*********************************************************************/
#if defined(__WIN32__) || defined(__WIN64__) || defined(__CYGWIN32__) || defined(__CYGWIN64__) || defined(_WIN32) || defined(_WIN64)
#define OS_WINDOWS
#endif

#if defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64)
#define INTEL_AMD
#endif

#include <stdio.h>
#include <string.h>
#ifdef OS_WINDOWS
#include <windows.h>
#endif
#if defined(__FreeBSD__) || defined(__OpenBSD__) || defined(__NetBSD__) || defined(__DragonFly__) || defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif
#if defined(linux) || defined(__sun__)
#include <sys/sysinfo.h>
#include <unistd.h>
#endif
#if defined(_AIX)
#include <unistd.h>
#include <sys/systemcfg.h>
#include <sys/sysinfo.h>
#endif

/* Define one of the FORCE_* macros listed below to select the matching
   hard-coded target configuration further down in this file. */
/* #define FORCE_P2 */
/* #define FORCE_KATMAI */
/* #define FORCE_COPPERMINE */
/* #define FORCE_NORTHWOOD */
/* #define FORCE_PRESCOTT */
/* #define FORCE_BANIAS */
/* #define FORCE_YONAH */
/* #define FORCE_CORE2 */
/* #define FORCE_PENRYN */
/* #define FORCE_DUNNINGTON */
/* #define FORCE_NEHALEM */
/* #define FORCE_SANDYBRIDGE */
/* #define FORCE_ATOM */
/* #define FORCE_ATHLON */
/* #define FORCE_OPTERON */
/* #define FORCE_OPTERON_SSE3 */
/* #define FORCE_BARCELONA */
/* #define FORCE_SHANGHAI */
/* #define FORCE_ISTANBUL */
/* #define FORCE_BOBCAT */
/* #define FORCE_BULLDOZER */
/* #define FORCE_PILEDRIVER */
/* #define FORCE_SSE_GENERIC */
/* #define FORCE_VIAC3 */
/* #define FORCE_NANO */
/* #define FORCE_POWER3 */
/* #define FORCE_POWER4 */
/* #define FORCE_POWER5 */
/* #define FORCE_POWER6 */
/* #define FORCE_POWER7 */
/* #define FORCE_POWER8 */
/* #define FORCE_PPCG4 */
/* #define FORCE_PPC970 */
/* #define FORCE_PPC970MP */
/* #define FORCE_PPC440 */
/* #define FORCE_PPC440FP2 */
/* #define FORCE_CELL */
/* #define FORCE_MIPS64_GENERIC */
/* #define FORCE_SICORTEX */
/* #define FORCE_LOONGSON3R3 */
/* #define FORCE_LOONGSON3R4 */
/* #define FORCE_LOONGSON3R5 */
/* #define FORCE_LOONGSON2K1000 */
/* #define FORCE_LOONGSONGENERIC */
/* #define FORCE_I6400 */
/* #define FORCE_P6600 */
/* #define FORCE_P5600 */
/* #define FORCE_I6500 */
/* #define FORCE_ITANIUM2 */
/* #define FORCE_SPARC */
/* #define FORCE_SPARCV7 */
/* #define FORCE_ZARCH_GENERIC */
/* #define FORCE_Z13 */
/* #define FORCE_EV4 */
/* #define FORCE_EV5 */
/* #define FORCE_EV6 */
/* #define FORCE_CSKY */
/* #define FORCE_CK860FV */
/* #define FORCE_GENERIC */

#ifdef FORCE_P2
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "PENTIUM2"
/* L2_SIZE is 512 KiB (524288 bytes), matching the KATMAI block below. */
#define ARCHCONFIG "-DPENTIUM2 " \
    "-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=524288 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX"
#define LIBNAME "p2"
#define CORENAME "P5"
#endif

#ifdef FORCE_KATMAI
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "PENTIUM3"
#define ARCHCONFIG "-DPENTIUM3 " \
    "-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=524288 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE "
#define LIBNAME "katmai"
#define CORENAME "KATMAI"
#endif

#ifdef FORCE_COPPERMINE
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "PENTIUM3"
#define ARCHCONFIG "-DPENTIUM3 " \
    "-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE "
#define LIBNAME "coppermine"
#define CORENAME "COPPERMINE"
#endif

#ifdef FORCE_NORTHWOOD
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "PENTIUM4"
#define ARCHCONFIG "-DPENTIUM4 " \
    "-DL1_DATA_SIZE=8192 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=524288 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 "
#define LIBNAME "northwood"
#define CORENAME "NORTHWOOD"
#endif

#ifdef FORCE_PRESCOTT
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "PENTIUM4"
#define ARCHCONFIG "-DPENTIUM4 " \
    "-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3"
#define LIBNAME "prescott"
#define CORENAME "PRESCOTT"
#endif

#ifdef FORCE_BANIAS
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "BANIAS"
#define ARCHCONFIG "-DPENTIUMM " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 "
#define LIBNAME "banias"
#define CORENAME "BANIAS"
#endif

#ifdef FORCE_YONAH
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "YONAH"
#define ARCHCONFIG "-DPENTIUMM " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 "
#define LIBNAME "yonah"
#define CORENAME "YONAH"
#endif

#ifdef FORCE_CORE2
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "CONROE"
#define ARCHCONFIG "-DCORE2 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=256 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3"
#define LIBNAME "core2"
#define CORENAME "CORE2"
#endif

#ifdef FORCE_PENRYN
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "PENRYN"
#define ARCHCONFIG "-DPENRYN " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=256 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1"
#define LIBNAME "penryn"
#define CORENAME "PENRYN"
#endif

#ifdef FORCE_DUNNINGTON
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "DUNNINGTON"
#define ARCHCONFIG "-DDUNNINGTON " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
    "-DL3_SIZE=16777216 -DL3_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=256 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1"
#define LIBNAME "dunnington"
#define CORENAME "DUNNINGTON"
#endif

#ifdef FORCE_NEHALEM
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "NEHALEM"
#define ARCHCONFIG "-DNEHALEM " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2"
#define LIBNAME "nehalem"
#define CORENAME "NEHALEM"
#endif

/* The AVX-capable Intel targets below fall back to an older parameter set
   (HASWELL, SANDYBRIDGE or NEHALEM) when NO_AVX512, NO_AVX2 or NO_AVX is
   defined. */
#ifdef FORCE_SANDYBRIDGE
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#ifdef NO_AVX
#define SUBARCHITECTURE "NEHALEM"
#define ARCHCONFIG "-DNEHALEM " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2"
#define LIBNAME "nehalem"
#define CORENAME "NEHALEM"
#else /* !NO_AVX */
#define SUBARCHITECTURE "SANDYBRIDGE"
#define ARCHCONFIG "-DSANDYBRIDGE " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX"
#define LIBNAME "sandybridge"
#define CORENAME "SANDYBRIDGE"
#endif /* NO_AVX */
#endif /* FORCE_SANDYBRIDGE */

#ifdef FORCE_HASWELL
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#ifdef NO_AVX2
#ifdef NO_AVX
#define SUBARCHITECTURE "NEHALEM"
#define ARCHCONFIG "-DNEHALEM " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2"
#define LIBNAME "nehalem"
#define CORENAME "NEHALEM"
#else /* !NO_AVX */
#define SUBARCHITECTURE "SANDYBRIDGE"
#define ARCHCONFIG "-DSANDYBRIDGE " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX"
#define LIBNAME "sandybridge"
#define CORENAME "SANDYBRIDGE"
#endif /* NO_AVX */
#else /* !NO_AVX2 */
#define SUBARCHITECTURE "HASWELL"
#define ARCHCONFIG "-DHASWELL " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX " \
    "-DHAVE_AVX2 -DHAVE_FMA3 -DFMA3"
#define LIBNAME "haswell"
#define CORENAME "HASWELL"
#endif /* NO_AVX2 */
#endif /* FORCE_HASWELL */

#ifdef FORCE_SKYLAKEX
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#ifdef NO_AVX512
#ifdef NO_AVX2
#ifdef NO_AVX
#define SUBARCHITECTURE "NEHALEM"
#define ARCHCONFIG "-DNEHALEM " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2"
#define LIBNAME "nehalem"
#define CORENAME "NEHALEM"
#else /* !NO_AVX */
#define SUBARCHITECTURE "SANDYBRIDGE"
#define ARCHCONFIG "-DSANDYBRIDGE " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX"
#define LIBNAME "sandybridge"
#define CORENAME "SANDYBRIDGE"
#endif /* NO_AVX */
#else /* !NO_AVX2 */
#define SUBARCHITECTURE "HASWELL"
#define ARCHCONFIG "-DHASWELL " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX " \
    "-DHAVE_AVX2 -DHAVE_FMA3 -DFMA3"
#define LIBNAME "haswell"
#define CORENAME "HASWELL"
#endif /* NO_AVX2 */
#else /* !NO_AVX512 */
#define SUBARCHITECTURE "SKYLAKEX"
#define ARCHCONFIG "-DSKYLAKEX " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX " \
    "-DHAVE_AVX2 -DHAVE_FMA3 -DFMA3 -DHAVE_AVX512VL -march=skylake-avx512"
#define LIBNAME "skylakex"
#define CORENAME "SKYLAKEX"
#endif /* NO_AVX512 */
#endif /* FORCE_SKYLAKEX */

#ifdef FORCE_COOPERLAKE
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#ifdef NO_AVX512
#ifdef NO_AVX2
#ifdef NO_AVX
#define SUBARCHITECTURE "NEHALEM"
#define ARCHCONFIG "-DNEHALEM " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2"
#define LIBNAME "nehalem"
#define CORENAME "NEHALEM"
#else /* !NO_AVX */
#define SUBARCHITECTURE "SANDYBRIDGE"
#define ARCHCONFIG "-DSANDYBRIDGE " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX"
#define LIBNAME "sandybridge"
#define CORENAME "SANDYBRIDGE"
#endif /* NO_AVX */
#else /* !NO_AVX2 */
#define SUBARCHITECTURE "HASWELL"
#define ARCHCONFIG "-DHASWELL " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX " \
    "-DHAVE_AVX2 -DHAVE_FMA3 -DFMA3"
#define LIBNAME "haswell"
#define CORENAME "HASWELL"
#endif /* NO_AVX2 */
#else /* !NO_AVX512 */
#define SUBARCHITECTURE "COOPERLAKE"
#define ARCHCONFIG "-DCOOPERLAKE " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX " \
    "-DHAVE_AVX2 -DHAVE_FMA3 -DFMA3 -DHAVE_AVX512VL -DHAVE_AVX512BF16 -march=cooperlake"
#define LIBNAME "cooperlake"
#define CORENAME "COOPERLAKE"
#endif /* NO_AVX512 */
#endif /* FORCE_COOPERLAKE */

#ifdef FORCE_SAPPHIRERAPIDS
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#ifdef NO_AVX512
#ifdef NO_AVX2
#ifdef NO_AVX
#define SUBARCHITECTURE "NEHALEM"
#define ARCHCONFIG "-DNEHALEM " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2"
#define LIBNAME "nehalem"
#define CORENAME "NEHALEM"
#else /* !NO_AVX */
#define SUBARCHITECTURE "SANDYBRIDGE"
#define ARCHCONFIG "-DSANDYBRIDGE " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX"
#define LIBNAME "sandybridge"
#define CORENAME "SANDYBRIDGE"
#endif /* NO_AVX */
#else /* !NO_AVX2 */
#define SUBARCHITECTURE "HASWELL"
#define ARCHCONFIG "-DHASWELL " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX " \
    "-DHAVE_AVX2 -DHAVE_FMA3 -DFMA3"
#define LIBNAME "haswell"
#define CORENAME "HASWELL"
#endif /* NO_AVX2 */
#else /* !NO_AVX512 */
#define SUBARCHITECTURE "SAPPHIRERAPIDS"
#define ARCHCONFIG "-DSAPPHIRERAPIDS " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX " \
    "-DHAVE_AVX2 -DHAVE_FMA3 -DFMA3 -DHAVE_AVX512VL -DHAVE_AVX512BF16 -march=sapphirerapids"
#define LIBNAME "sapphirerapids"
#define CORENAME "SAPPHIRERAPIDS"
#endif /* NO_AVX512 */
#endif /* FORCE_SAPPHIRERAPIDS */

#ifdef FORCE_ATOM
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "ATOM"
#define ARCHCONFIG "-DATOM " \
    "-DL1_DATA_SIZE=24576 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=524288 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3"
#define LIBNAME "atom"
#define CORENAME "ATOM"
#endif

#ifdef FORCE_ATHLON
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "ATHLON"
#define ARCHCONFIG "-DATHLON " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=32 -DDTB_SIZE=4096 -DHAVE_3DNOW " \
    "-DHAVE_3DNOWEX -DHAVE_MMX -DHAVE_SSE "
#define LIBNAME "athlon"
#define CORENAME "ATHLON"
#endif

#ifdef FORCE_OPTERON
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "OPTERON"
#define ARCHCONFIG "-DOPTERON " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=32 -DDTB_SIZE=4096 -DHAVE_3DNOW " \
    "-DHAVE_3DNOWEX -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 "
#define LIBNAME "opteron"
#define CORENAME "OPTERON"
#endif

#ifdef FORCE_OPTERON_SSE3
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "OPTERON"
#define ARCHCONFIG "-DOPTERON " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=32 -DDTB_SIZE=4096 -DHAVE_3DNOW " \
    "-DHAVE_3DNOWEX -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3"
#define LIBNAME "opteron"
#define CORENAME "OPTERON"
#endif

#if defined(FORCE_BARCELONA) || defined(FORCE_SHANGHAI) || defined(FORCE_ISTANBUL)
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "BARCELONA"
#define ARCHCONFIG "-DBARCELONA " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=524288 -DL2_LINESIZE=64 -DL3_SIZE=2097152 " \
    "-DDTB_DEFAULT_ENTRIES=48 -DDTB_SIZE=4096 " \
    "-DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 " \
    "-DHAVE_SSE4A -DHAVE_MISALIGNSSE -DHAVE_128BITFPU -DHAVE_FASTMOVU"
#define LIBNAME "barcelona"
#define CORENAME "BARCELONA"
#endif

#if defined(FORCE_BOBCAT)
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "BOBCAT"
#define ARCHCONFIG "-DBOBCAT " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=524288 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=40 -DDTB_SIZE=4096 " \
    "-DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 " \
    "-DHAVE_SSE4A -DHAVE_MISALIGNSSE -DHAVE_CFLUSH -DHAVE_CMOV"
#define LIBNAME "bobcat"
#define CORENAME "BOBCAT"
#endif

#if defined (FORCE_BULLDOZER)
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "BULLDOZER"
#define ARCHCONFIG "-DBULLDOZER " \
    "-DL1_DATA_SIZE=49152 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=1024000 -DL2_LINESIZE=64 -DL3_SIZE=16777216 " \
    "-DDTB_DEFAULT_ENTRIES=32 -DDTB_SIZE=4096 " \
    "-DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 " \
    "-DHAVE_SSE4A -DHAVE_MISALIGNSSE -DHAVE_128BITFPU -DHAVE_FASTMOVU " \
    "-DHAVE_AVX"
#define LIBNAME "bulldozer"
#define CORENAME "BULLDOZER"
#endif

#if defined (FORCE_PILEDRIVER)
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "PILEDRIVER"
#define ARCHCONFIG "-DPILEDRIVER " \
    "-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=2097152 -DL2_LINESIZE=64 -DL3_SIZE=12582912 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 " \
    "-DHAVE_SSE4A -DHAVE_MISALIGNSSE -DHAVE_128BITFPU -DHAVE_FASTMOVU -DHAVE_CFLUSH " \
    "-DHAVE_AVX -DHAVE_FMA3"
#define LIBNAME "piledriver"
#define CORENAME "PILEDRIVER"
#endif

#if defined (FORCE_STEAMROLLER)
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "STEAMROLLER"
#define ARCHCONFIG "-DSTEAMROLLER " \
    "-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=2097152 -DL2_LINESIZE=64 -DL3_SIZE=12582912 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 " \
    "-DHAVE_SSE4A -DHAVE_MISALIGNSSE -DHAVE_128BITFPU -DHAVE_FASTMOVU -DHAVE_CFLUSH " \
    "-DHAVE_AVX -DHAVE_FMA3"
#define LIBNAME "steamroller"
#define CORENAME "STEAMROLLER"
#endif

#if defined (FORCE_EXCAVATOR)
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "EXCAVATOR"
#define ARCHCONFIG "-DEXCAVATOR " \
    "-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=2097152 -DL2_LINESIZE=64 -DL3_SIZE=12582912 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 " \
    "-DHAVE_SSE4A -DHAVE_MISALIGNSSE -DHAVE_128BITFPU -DHAVE_FASTMOVU -DHAVE_CFLUSH " \
    "-DHAVE_AVX -DHAVE_FMA3"
#define LIBNAME "excavator"
#define CORENAME "EXCAVATOR"
#endif

#if defined (FORCE_ZEN)
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#ifdef NO_AVX2
#ifdef NO_AVX
#define SUBARCHITECTURE "NEHALEM"
#define ARCHCONFIG "-DNEHALEM " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2"
#define LIBNAME "nehalem"
#define CORENAME "NEHALEM"
#else /* !NO_AVX */
#define SUBARCHITECTURE "SANDYBRIDGE"
#define ARCHCONFIG "-DSANDYBRIDGE " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX"
#define LIBNAME "sandybridge"
#define CORENAME "SANDYBRIDGE"
#endif /* NO_AVX */
#else /* !NO_AVX2 */
#define SUBARCHITECTURE "ZEN"
#define ARCHCONFIG "-DZEN " \
    "-DL1_CODE_SIZE=32768 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=8 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL2_CODE_ASSOCIATIVE=8 " \
    "-DL2_SIZE=524288 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=8 " \
    "-DL3_SIZE=16777216 -DL3_LINESIZE=64 -DL3_ASSOCIATIVE=8 " \
    "-DITB_DEFAULT_ENTRIES=64 -DITB_SIZE=4096 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 " \
    "-DHAVE_SSE4A -DHAVE_MISALIGNSSE -DHAVE_128BITFPU -DHAVE_FASTMOVU -DHAVE_CFLUSH " \
    "-DHAVE_AVX -DHAVE_AVX2 -DHAVE_FMA3 -DFMA3"
#define LIBNAME "zen"
#define CORENAME "ZEN"
#endif /* NO_AVX2 */
#endif /* FORCE_ZEN */

#ifdef FORCE_SSE_GENERIC
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "GENERIC"
#define ARCHCONFIG "-DGENERIC " \
    "-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=524288 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2"
#define LIBNAME "generic"
#define CORENAME "GENERIC"
#endif

#ifdef FORCE_VIAC3
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "VIAC3"
#define ARCHCONFIG "-DVIAC3 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=65536 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 " \
    "-DHAVE_MMX -DHAVE_SSE "
#define LIBNAME "viac3"
#define CORENAME "VIAC3"
#endif

#ifdef FORCE_NANO
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "NANO"
#define ARCHCONFIG "-DNANO " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 " \
    "-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3"
#define LIBNAME "nano"
#define CORENAME "NANO"
#endif
#ifdef FORCE_POWER3
#define FORCE
#define ARCHITECTURE "POWER"
#define SUBARCHITECTURE "POWER3"
#define SUBDIRNAME "power"
#define ARCHCONFIG "-DPOWER3 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=128 " \
    "-DL2_SIZE=2097152 -DL2_LINESIZE=128 " \
    "-DDTB_DEFAULT_ENTRIES=256 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "power3"
#define CORENAME "POWER3"
#endif
#ifdef FORCE_POWER4
#define FORCE
#define ARCHITECTURE "POWER"
#define SUBARCHITECTURE "POWER4"
#define SUBDIRNAME "power"
#define ARCHCONFIG "-DPOWER4 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=128 " \
    "-DL2_SIZE=1509949 -DL2_LINESIZE=128 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=6 "
#define LIBNAME "power4"
#define CORENAME "POWER4"
#endif
#ifdef FORCE_POWER5
#define FORCE
#define ARCHITECTURE "POWER"
#define SUBARCHITECTURE "POWER5"
#define SUBDIRNAME "power"
#define ARCHCONFIG "-DPOWER5 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=128 " \
    "-DL2_SIZE=1509949 -DL2_LINESIZE=128 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=6 "
#define LIBNAME "power5"
#define CORENAME "POWER5"
#endif
#if defined(FORCE_POWER6) || defined(FORCE_POWER7)
#define FORCE
#define ARCHITECTURE "POWER"
#define SUBARCHITECTURE "POWER6"
#define SUBDIRNAME "power"
#define ARCHCONFIG "-DPOWER6 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=128 " \
    "-DL2_SIZE=4194304 -DL2_LINESIZE=128 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "power6"
#define CORENAME "POWER6"
#endif
#if defined(FORCE_POWER8)
#define FORCE
#define ARCHITECTURE "POWER"
#define SUBARCHITECTURE "POWER8"
#define SUBDIRNAME "power"
#define ARCHCONFIG "-DPOWER8 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=128 " \
    "-DL2_SIZE=4194304 -DL2_LINESIZE=128 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "power8"
#define CORENAME "POWER8"
#endif
#if defined(FORCE_POWER9)
#define FORCE
#define ARCHITECTURE "POWER"
#define SUBARCHITECTURE "POWER9"
#define SUBDIRNAME "power"
#define ARCHCONFIG "-DPOWER9 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=128 " \
    "-DL2_SIZE=4194304 -DL2_LINESIZE=128 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "power9"
#define CORENAME "POWER9"
#endif
#if defined(FORCE_POWER10)
#define FORCE
#define ARCHITECTURE "POWER"
#define SUBARCHITECTURE "POWER10"
#define SUBDIRNAME "power"
#define ARCHCONFIG "-DPOWER10 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=128 " \
    "-DL2_SIZE=4194304 -DL2_LINESIZE=128 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "power10"
#define CORENAME "POWER10"
#endif
#ifdef FORCE_PPCG4
#define FORCE
#define ARCHITECTURE "POWER"
#define SUBARCHITECTURE "PPCG4"
#define SUBDIRNAME "power"
#define ARCHCONFIG "-DPPCG4 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "ppcg4"
#define CORENAME "PPCG4"
#endif
#ifdef FORCE_PPC970
#define FORCE
#define ARCHITECTURE "POWER"
#define SUBARCHITECTURE "PPC970"
#define SUBDIRNAME "power"
#define ARCHCONFIG "-DPPC970 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=128 " \
    "-DL2_SIZE=512488 -DL2_LINESIZE=128 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "ppc970"
#define CORENAME "PPC970"
#endif
#ifdef FORCE_PPC970MP
#define FORCE
#define ARCHITECTURE "POWER"
#define SUBARCHITECTURE "PPC970"
#define SUBDIRNAME "power"
#define ARCHCONFIG "-DPPC970 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=128 " \
    "-DL2_SIZE=1024976 -DL2_LINESIZE=128 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "ppc970mp"
#define CORENAME "PPC970"
#endif
#ifdef FORCE_PPC440
#define FORCE
#define ARCHITECTURE "POWER"
#define SUBARCHITECTURE "PPC440"
#define SUBDIRNAME "power"
#define ARCHCONFIG "-DPPC440 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=16384 -DL2_LINESIZE=128 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=16 "
#define LIBNAME "ppc440"
#define CORENAME "PPC440"
#endif
#ifdef FORCE_PPC440FP2
#define FORCE
#define ARCHITECTURE "POWER"
#define SUBARCHITECTURE "PPC440FP2"
#define SUBDIRNAME "power"
#define ARCHCONFIG "-DPPC440FP2 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=16384 -DL2_LINESIZE=128 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=16 "
#define LIBNAME "ppc440FP2"
#define CORENAME "PPC440FP2"
#endif
#ifdef FORCE_CELL
#define FORCE
#define ARCHITECTURE "POWER"
#define SUBARCHITECTURE "CELL"
#define SUBDIRNAME "power"
#define ARCHCONFIG "-DCELL " \
    "-DL1_DATA_SIZE=262144 -DL1_DATA_LINESIZE=128 " \
    "-DL2_SIZE=512488 -DL2_LINESIZE=128 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "cell"
#define CORENAME "CELL"
#endif
#ifdef FORCE_MIPS64_GENERIC
#define FORCE
#define ARCHITECTURE "MIPS"
#define SUBARCHITECTURE "MIPS64_GENERIC"
#define SUBDIRNAME "mips64"
#define ARCHCONFIG "-DMIPS64_GENERIC " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "mips64_generic"
#define CORENAME "MIPS64_GENERIC"
#else
#endif
#ifdef FORCE_SICORTEX
#define FORCE
#define ARCHITECTURE "MIPS"
#define SUBARCHITECTURE "SICORTEX"
#define SUBDIRNAME "mips"
#define ARCHCONFIG "-DSICORTEX " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=512488 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=32 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "mips"
#define CORENAME "sicortex"
#endif
#if defined FORCE_LOONGSON3R3 || defined FORCE_LOONGSON3A || defined FORCE_LOONGSON3B
#define FORCE
#define ARCHITECTURE "MIPS"
#define SUBARCHITECTURE "LOONGSON3R3"
#define SUBDIRNAME "mips64"
#define ARCHCONFIG "-DLOONGSON3R3 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=512488 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 "
#define LIBNAME "loongson3r3"
#define CORENAME "LOONGSON3R3"
#else
#endif
#ifdef FORCE_LOONGSON3R4
#define FORCE
#define ARCHITECTURE "MIPS"
#define SUBARCHITECTURE "LOONGSON3R4"
#define SUBDIRNAME "mips64"
#define ARCHCONFIG "-DLOONGSON3R4 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=512488 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 -DHAVE_MSA"
#define LIBNAME "loongson3r4"
#define CORENAME "LOONGSON3R4"
#else
#endif
#ifdef FORCE_LOONGSON3R5
#define FORCE
#define ARCHITECTURE "LOONGARCH"
#define SUBARCHITECTURE "LOONGSON3R5"
#define SUBDIRNAME "loongarch64"
#define ARCHCONFIG "-DLOONGSON3R5 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=16 -DHAVE_MSA"
#define LIBNAME "loongson3r5"
#define CORENAME "LOONGSON3R5"
#else
#endif
#ifdef FORCE_LOONGSON2K1000
#define FORCE
#define ARCHITECTURE "LOONGARCH"
#define SUBARCHITECTURE "LOONGSON2K1000"
#define SUBDIRNAME "loongarch64"
#define ARCHCONFIG "-DLOONGSON2K1000 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=16 -DHAVE_MSA"
#define LIBNAME "loongson2k1000"
#define CORENAME "LOONGSON2K1000"
#else
#endif
#ifdef FORCE_LOONGSONGENERIC
#define FORCE
#define ARCHITECTURE "LOONGARCH"
#define SUBARCHITECTURE "LOONGSONGENERIC"
#define SUBDIRNAME "loongarch64"
#define ARCHCONFIG "-DLOONGSONGENERIC " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=16 -DHAVE_MSA"
#define LIBNAME "loongsongeneric"
#define CORENAME "LOONGSONGENERIC"
#else
#endif
#ifdef FORCE_I6400
#define FORCE
#define ARCHITECTURE "MIPS"
#define SUBARCHITECTURE "I6400"
#define SUBDIRNAME "mips64"
#define ARCHCONFIG "-DI6400 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 -DHAVE_MSA "
#define LIBNAME "i6400"
#define CORENAME "I6400"
#else
#endif
#ifdef FORCE_P6600
#define FORCE
#define ARCHITECTURE "MIPS"
#define SUBARCHITECTURE "P6600"
#define SUBDIRNAME "mips64"
#define ARCHCONFIG "-DP6600 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "p6600"
#define CORENAME "P6600"
#else
#endif
#ifdef FORCE_P5600
#define FORCE
#define ARCHITECTURE "MIPS"
#define SUBARCHITECTURE "P5600"
#define SUBDIRNAME "mips"
#define ARCHCONFIG "-DP5600 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8"
#define LIBNAME "p5600"
#define CORENAME "P5600"
#else
#endif
#ifdef FORCE_MIPS1004K
#define FORCE
#define ARCHITECTURE "MIPS"
#define SUBARCHITECTURE "MIPS1004K"
#define SUBDIRNAME "mips"
#define ARCHCONFIG "-DMIPS1004K " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8"
#define LIBNAME "mips1004K"
#define CORENAME "MIPS1004K"
#else
#endif
#ifdef FORCE_MIPS24K
#define FORCE
#define ARCHITECTURE "MIPS"
#define SUBARCHITECTURE "MIPS24K"
#define SUBDIRNAME "mips"
#define ARCHCONFIG "-DMIPS24K " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=32768 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8"
#define LIBNAME "mips24K"
#define CORENAME "MIPS24K"
#else
#endif
#ifdef FORCE_I6500
#define FORCE
#define ARCHITECTURE "MIPS"
#define SUBARCHITECTURE "I6500"
#define SUBDIRNAME "mips64"
#define ARCHCONFIG "-DI6500 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 -DHAVE_MSA"
#define LIBNAME "i6500"
#define CORENAME "I6500"
#else
#endif
#ifdef FORCE_ITANIUM2
#define FORCE
#define ARCHITECTURE "IA64"
#define SUBARCHITECTURE "ITANIUM2"
#define SUBDIRNAME "ia64"
#define ARCHCONFIG "-DITANIUM2 " \
    "-DL1_DATA_SIZE=262144 -DL1_DATA_LINESIZE=128 " \
    "-DL2_SIZE=1572864 -DL2_LINESIZE=128 -DDTB_SIZE=16384 -DDTB_DEFAULT_ENTRIES=128 "
#define LIBNAME "itanium2"
#define CORENAME "itanium2"
#endif
#ifdef FORCE_SPARC
#define FORCE
#define ARCHITECTURE "SPARC"
#define SUBARCHITECTURE "SPARC"
#define SUBDIRNAME "sparc"
#define ARCHCONFIG "-DSPARC -DV9 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=1572864 -DL2_LINESIZE=64 -DDTB_SIZE=8192 -DDTB_DEFAULT_ENTRIES=64 "
#define LIBNAME "sparc"
#define CORENAME "sparc"
#endif
#ifdef FORCE_SPARCV7
#define FORCE
#define ARCHITECTURE "SPARC"
#define SUBARCHITECTURE "SPARC"
#define SUBDIRNAME "sparc"
#define ARCHCONFIG "-DSPARC -DV7 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=1572864 -DL2_LINESIZE=64 -DDTB_SIZE=8192 -DDTB_DEFAULT_ENTRIES=64 "
#define LIBNAME "sparcv7"
#define CORENAME "sparcv7"
#endif
#ifdef FORCE_GENERIC
#define FORCE
#define ARCHITECTURE "GENERIC"
#define SUBARCHITECTURE "GENERIC"
#define SUBDIRNAME "generic"
#define ARCHCONFIG "-DGENERIC " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=128 " \
    "-DL2_SIZE=512488 -DL2_LINESIZE=128 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "generic"
#define CORENAME "generic"
#endif
#ifdef FORCE_ARMV7
#define FORCE
#define ARCHITECTURE "ARM"
#define SUBARCHITECTURE "ARMV7"
#define SUBDIRNAME "arm"
#define ARCHCONFIG "-DARMV7 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=512488 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 " \
    "-DHAVE_VFPV3 -DHAVE_VFP"
#define LIBNAME "armv7"
#define CORENAME "ARMV7"
#else
#endif
#ifdef FORCE_CORTEXA9
#define FORCE
#define ARCHITECTURE "ARM"
#define SUBARCHITECTURE "CORTEXA9"
#define SUBDIRNAME "arm"
#define ARCHCONFIG "-DCORTEXA9 -DARMV7 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 " \
    "-DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON"
#define LIBNAME "cortexa9"
#define CORENAME "CORTEXA9"
#else
#endif
#ifdef FORCE_RISCV64_GENERIC
#define FORCE
#define ARCHITECTURE "RISCV64"
#define SUBARCHITECTURE "RISCV64_GENERIC"
#define SUBDIRNAME "riscv64"
#define ARCHCONFIG "-DRISCV64_GENERIC " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 "
#define LIBNAME "riscv64_generic"
#define CORENAME "RISCV64_GENERIC"
#else
#endif
#ifdef FORCE_CORTEXA15
#define FORCE
#define ARCHITECTURE "ARM"
#define SUBARCHITECTURE "CORTEXA15"
#define SUBDIRNAME "arm"
#define ARCHCONFIG "-DCORTEXA15 -DARMV7 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 " \
    "-DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON"
#define LIBNAME "cortexa15"
#define CORENAME "CORTEXA15"
#else
#endif
#ifdef FORCE_ARMV6
#define FORCE
#define ARCHITECTURE "ARM"
#define SUBARCHITECTURE "ARMV6"
#define SUBDIRNAME "arm"
#define ARCHCONFIG "-DARMV6 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=512488 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 " \
    "-DHAVE_VFP"
#define LIBNAME "armv6"
#define CORENAME "ARMV6"
#else
#endif
#ifdef FORCE_ARMV5
#define FORCE
#define ARCHITECTURE "ARM"
#define SUBARCHITECTURE "ARMV5"
#define SUBDIRNAME "arm"
#define ARCHCONFIG "-DARMV5 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=512488 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 "
#define LIBNAME "armv5"
#define CORENAME "ARMV5"
#else
#endif
#ifdef FORCE_ARMV8SVE
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "ARMV8SVE"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DARMV8SVE " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=32 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DHAVE_SVE -DARMV8"
#define LIBNAME "armv8sve"
#define CORENAME "ARMV8SVE"
#endif
#ifdef FORCE_ARMV8
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "ARMV8"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DARMV8 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=32 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
#define LIBNAME "armv8"
#define CORENAME "ARMV8"
#endif
#ifdef FORCE_CORTEXA53
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "CORTEXA53"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DCORTEXA53 " \
    "-DL1_CODE_SIZE=32768 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=3 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=2 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
#define LIBNAME "cortexa53"
#define CORENAME "CORTEXA53"
#endif
#ifdef FORCE_CORTEXA57
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "CORTEXA57"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DCORTEXA57 " \
    "-DL1_CODE_SIZE=49152 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=3 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=2 " \
    "-DL2_SIZE=2097152 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
#define LIBNAME "cortexa57"
#define CORENAME "CORTEXA57"
#endif
#ifdef FORCE_CORTEXA72
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "CORTEXA72"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DCORTEXA72 " \
    "-DL1_CODE_SIZE=49152 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=3 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=2 " \
    "-DL2_SIZE=2097152 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
#define LIBNAME "cortexa72"
#define CORENAME "CORTEXA72"
#endif
#ifdef FORCE_CORTEXA73
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "CORTEXA73"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DCORTEXA73 " \
    "-DL1_CODE_SIZE=49152 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=3 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=2 " \
    "-DL2_SIZE=2097152 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
#define LIBNAME "cortexa73"
#define CORENAME "CORTEXA73"
#endif
#ifdef FORCE_CORTEXX1
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "CORTEXX1"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DCORTEXX1 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=32 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
#define LIBNAME "cortexx1"
#define CORENAME "CORTEXX1"
#endif
#ifdef FORCE_CORTEXX2
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "CORTEXX2"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DCORTEXX2 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=32 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DHAVE_SVE -DARMV8 -DARMV9"
#define LIBNAME "cortexx2"
#define CORENAME "CORTEXX2"
#endif
#ifdef FORCE_CORTEXA510
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "CORTEXA510"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DCORTEXA510 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=32 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DHAVE_SVE -DARMV8 -DARMV9"
#define LIBNAME "cortexa510"
#define CORENAME "CORTEXA510"
#endif
#ifdef FORCE_CORTEXA710
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "CORTEXA710"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DCORTEXA710 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=32 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DHAVE_SVE -DARMV8 -DARMV9"
#define LIBNAME "cortexa710"
#define CORENAME "CORTEXA710"
#endif
#ifdef FORCE_NEOVERSEN1
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "NEOVERSEN1"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DNEOVERSEN1 " \
    "-DL1_CODE_SIZE=65536 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=4 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=4 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8 " \
    "-march=armv8.2-a -mtune=neoverse-n1"
#define LIBNAME "neoversen1"
#define CORENAME "NEOVERSEN1"
#endif
#ifdef FORCE_NEOVERSEV1
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "NEOVERSEV1"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DNEOVERSEV1 " \
    "-DL1_CODE_SIZE=65536 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=4 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=4 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DHAVE_SVE -DARMV8 " \
    "-march=armv8.4-a+sve -mtune=neoverse-v1"
#define LIBNAME "neoversev1"
#define CORENAME "NEOVERSEV1"
#endif
#ifdef FORCE_NEOVERSEN2
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "NEOVERSEN2"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DNEOVERSEN2 " \
    "-DL1_CODE_SIZE=65536 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=4 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=4 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DHAVE_SVE -DARMV8 " \
    "-march=armv8.5-a -mtune=neoverse-n2"
#define LIBNAME "neoversen2"
#define CORENAME "NEOVERSEN2"
#endif
#ifdef FORCE_CORTEXA55
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "CORTEXA55"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DCORTEXA55 " \
    "-DL1_CODE_SIZE=16384 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=3 " \
    "-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=2 " \
    "-DL2_SIZE=65536 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
#define LIBNAME "cortexa55"
#define CORENAME "CORTEXA55"
#endif
#ifdef FORCE_FALKOR
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "FALKOR"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DFALKOR " \
    "-DL1_CODE_SIZE=49152 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=3 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=2 " \
    "-DL2_SIZE=2097152 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
#define LIBNAME "falkor"
#define CORENAME "FALKOR"
#endif
#ifdef FORCE_THUNDERX
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "THUNDERX"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DTHUNDERX " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=128 " \
    "-DL2_SIZE=16777216 -DL2_LINESIZE=128 -DL2_ASSOCIATIVE=16 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
#define LIBNAME "thunderx"
#define CORENAME "THUNDERX"
#endif
#ifdef FORCE_THUNDERX2T99
#define ARMV8
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "THUNDERX2T99"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DTHUNDERX2T99 " \
    "-DL1_CODE_SIZE=32768 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=8 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=8 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=8 " \
    "-DL3_SIZE=33554432 -DL3_LINESIZE=64 -DL3_ASSOCIATIVE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
#define LIBNAME "thunderx2t99"
#define CORENAME "THUNDERX2T99"
#endif
#ifdef FORCE_TSV110
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "TSV110"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DTSV110 " \
    "-DL1_CODE_SIZE=65536 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=4 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=4 " \
    "-DL2_SIZE=524288 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=8 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
#define LIBNAME "tsv110"
#define CORENAME "TSV110"
#endif
#ifdef FORCE_EMAG8180
#define ARMV8
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "EMAG8180"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DEMAG8180 " \
    "-DL1_CODE_SIZE=32768 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=8 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=8 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=8 " \
    "-DL3_SIZE=33554432 -DL3_LINESIZE=64 -DL3_ASSOCIATIVE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
#define LIBNAME "emag8180"
#define CORENAME "EMAG8180"
#endif
#ifdef FORCE_THUNDERX3T110
#define ARMV8
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "THUNDERX3T110"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DTHUNDERX3T110 " \
    "-DL1_CODE_SIZE=65536 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=8 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=8 " \
    "-DL2_SIZE=524288 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=8 " \
    "-DL3_SIZE=94371840 -DL3_LINESIZE=64 -DL3_ASSOCIATIVE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
#define LIBNAME "thunderx3t110"
#define CORENAME "THUNDERX3T110"
#endif
#ifdef FORCE_VORTEX
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "VORTEX"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DVORTEX " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=32 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
#define LIBNAME "vortex"
#define CORENAME "VORTEX"
#endif
#ifdef FORCE_A64FX
#define ARMV8
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "A64FX"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DA64FX " \
    "-DL1_CODE_SIZE=65536 -DL1_CODE_LINESIZE=256 -DL1_CODE_ASSOCIATIVE=8 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=256 -DL1_DATA_ASSOCIATIVE=8 " \
    "-DL2_SIZE=8388608 -DL2_LINESIZE=256 -DL2_ASSOCIATIVE=8 " \
    "-DL3_SIZE=0 -DL3_LINESIZE=0 -DL3_ASSOCIATIVE=0 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DHAVE_SVE -DARMV8"
#define LIBNAME "a64fx"
#define CORENAME "A64FX"
#endif
#ifdef FORCE_FT2000
#define ARMV8
#define FORCE
#define ARCHITECTURE "ARM64"
#define SUBARCHITECTURE "FT2000"
#define SUBDIRNAME "arm64"
#define ARCHCONFIG "-DFT2000 " \
    "-DL1_CODE_SIZE=32768 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=8 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=8 " \
    "-DL2_SIZE=33554426 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=8 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
    "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
#define LIBNAME "ft2000"
#define CORENAME "FT2000"
#endif
#ifdef FORCE_ZARCH_GENERIC
#define FORCE
#define ARCHITECTURE "ZARCH"
#define SUBARCHITECTURE "ZARCH_GENERIC"
#define ARCHCONFIG "-DZARCH_GENERIC " \
    "-DDTB_DEFAULT_ENTRIES=64"
#define LIBNAME "zarch_generic"
#define CORENAME "ZARCH_GENERIC"
#endif
#ifdef FORCE_Z13
#define FORCE
#define ARCHITECTURE "ZARCH"
#define SUBARCHITECTURE "Z13"
#define ARCHCONFIG "-DZ13 " \
    "-DDTB_DEFAULT_ENTRIES=64"
#define LIBNAME "z13"
#define CORENAME "Z13"
#endif
#ifdef FORCE_Z14
#define FORCE
#define ARCHITECTURE "ZARCH"
#define SUBARCHITECTURE "Z14"
#define ARCHCONFIG "-DZ14 " \
    "-DDTB_DEFAULT_ENTRIES=64"
#define LIBNAME "z14"
#define CORENAME "Z14"
#endif
#ifdef FORCE_EV4
#define FORCE
#define ARCHITECTURE "ALPHA"
#define SUBARCHITECTURE "ev4"
#define ARCHCONFIG "-DEV4 " \
    "-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=2097152 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=32 -DDTB_SIZE=8192 "
#define LIBNAME "ev4"
#define CORENAME "EV4"
#endif
#ifdef FORCE_EV5
#define FORCE
#define ARCHITECTURE "ALPHA"
#define SUBARCHITECTURE "ev5"
#define ARCHCONFIG "-DEV5 " \
    "-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=2097152 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=8192 "
#define LIBNAME "ev5"
#define CORENAME "EV5"
#endif
#ifdef FORCE_EV6
#define FORCE
#define ARCHITECTURE "ALPHA"
#define SUBARCHITECTURE "ev6"
#define ARCHCONFIG "-DEV6 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=4194304 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=8192 "
#define LIBNAME "ev6"
#define CORENAME "EV6"
#endif
  1552. #ifdef FORCE_C910V
  1553. #define FORCE
  1554. #define ARCHITECTURE "RISCV64"
  1555. #ifdef NO_RV64GV
  1556. #define SUBARCHITECTURE "RISCV64_GENERIC"
  1557. #define SUBDIRNAME "riscv64"
  1558. #define ARCHCONFIG "-DRISCV64_GENERIC " \
  1559. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
  1560. "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
  1561. "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 "
  1562. #define LIBNAME "riscv64_generic"
  1563. #define CORENAME "RISCV64_GENERIC"
  1564. #else
  1565. #define SUBARCHITECTURE "C910V"
  1566. #define SUBDIRNAME "riscv64"
  1567. #define ARCHCONFIG "-DC910V " \
  1568. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
  1569. "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
  1570. "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 "
  1571. #define LIBNAME "c910v"
  1572. #define CORENAME "C910V"
  1573. #endif
  1574. #endif
  1575. #ifdef FORCE_x280
  1576. #define FORCE
  1577. #define ARCHITECTURE "RISCV64"
  1578. #define SUBARCHITECTURE "x280"
  1579. #define SUBDIRNAME "riscv64"
  1580. #define ARCHCONFIG "-Dx280 " \
  1581. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
  1582. "-DL2_SIZE=262144 -DL2_LINESIZE=32 " \
  1583. "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 "
  1584. #define LIBNAME "x280"
  1585. #define CORENAME "x280"
  1587. #endif
  1588. #ifdef FORCE_RISCV64_ZVL256B
  1589. #define FORCE
  1590. #define ARCHITECTURE "RISCV64"
  1591. #define SUBARCHITECTURE "RISCV64_ZVL256B"
  1592. #define SUBDIRNAME "riscv64"
  1593. #define ARCHCONFIG "-DRISCV64_ZVL256B " \
  1594. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
  1595. "-DL2_SIZE=262144 -DL2_LINESIZE=32 " \
  1596. "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 "
  1597. #define LIBNAME "riscv64_zvl256b"
  1598. #define CORENAME "RISCV64_ZVL256B"
  1599. #endif
  1600. #ifdef FORCE_RISCV64_ZVL128B
  1601. #define FORCE
  1602. #define ARCHITECTURE "RISCV64"
  1603. #define SUBARCHITECTURE "RISCV64_ZVL128B"
  1604. #define SUBDIRNAME "riscv64"
  1605. #define ARCHCONFIG "-DRISCV64_ZVL128B " \
  1606. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
  1607. "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
  1608. "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 "
  1609. #define LIBNAME "riscv64_zvl128b"
  1610. #define CORENAME "RISCV64_ZVL128B"
  1611. #endif
  1612. #if defined(FORCE_E2K) || defined(__e2k__)
  1613. #define FORCE
  1614. #define ARCHITECTURE "E2K"
  1615. #define ARCHCONFIG "-DGENERIC " \
  1616. "-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=64 " \
  1617. "-DL2_SIZE=524288 -DL2_LINESIZE=64 " \
  1618. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
  1619. #define LIBNAME "generic"
  1620. #define CORENAME "generic"
  1621. #endif
  1622. #ifdef FORCE_CSKY
  1623. #define FORCE
  1624. #define ARCHITECTURE "CSKY"
  1625. #define SUBARCHITECTURE "CSKY"
  1626. #define SUBDIRNAME "csky"
  1627. #define ARCHCONFIG "-DCSKY " \
  1628. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
  1629. "-DL2_SIZE=524288 -DL2_LINESIZE=32 " \
  1630. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
  1631. #define LIBNAME "csky"
  1632. #define CORENAME "CSKY"
  1633. #endif
  1634. #ifdef FORCE_CK860FV
  1635. #define FORCE
  1636. #define ARCHITECTURE "CSKY"
  1637. #define SUBARCHITECTURE "CK860FV"
  1638. #define SUBDIRNAME "csky"
  1639. #define ARCHCONFIG "-DCK860FV " \
  1640. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
  1641. "-DL2_SIZE=524288 -DL2_LINESIZE=32 " \
  1642. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
  1643. #define LIBNAME "ck860fv"
  1644. #define CORENAME "CK860FV"
  1645. #endif
  1646. #ifndef FORCE
  1647. #ifdef USER_TARGET
  1648. #error "The TARGET specified on the command line or in Makefile.rule is not supported. Please choose a target from TargetList.txt"
  1649. #endif
  1650. #if defined(__powerpc__) || defined(__powerpc) || defined(powerpc) || \
  1651. defined(__PPC__) || defined(PPC) || defined(_POWER) || defined(__POWERPC__)
  1652. #ifndef POWER
  1653. #define POWER
  1654. #endif
  1655. #define OPENBLAS_SUPPORTED
  1656. #endif
  1657. #if defined(__zarch__) || defined(__s390x__)
  1658. #define ZARCH
  1659. #include "cpuid_zarch.c"
  1660. #define OPENBLAS_SUPPORTED
  1661. #endif
  1662. #ifdef INTEL_AMD
  1663. #include "cpuid_x86.c"
  1664. #define OPENBLAS_SUPPORTED
  1665. #endif
  1666. #ifdef __ia64__
  1667. #include "cpuid_ia64.c"
  1668. #define OPENBLAS_SUPPORTED
  1669. #endif
  1670. #ifdef __alpha
  1671. #include "cpuid_alpha.c"
  1672. #define OPENBLAS_SUPPORTED
  1673. #endif
  1674. #ifdef POWER
  1675. #include "cpuid_power.c"
  1676. #define OPENBLAS_SUPPORTED
  1677. #endif
  1678. #ifdef sparc
  1679. #include "cpuid_sparc.c"
  1680. #define OPENBLAS_SUPPORTED
  1681. #endif
  1682. #ifdef __mips__
  1683. #ifdef __mips64
  1684. #include "cpuid_mips64.c"
  1685. #else
  1686. #include "cpuid_mips.c"
  1687. #endif
  1688. #define OPENBLAS_SUPPORTED
  1689. #endif
  1690. #ifdef __loongarch64
  1691. #include "cpuid_loongarch64.c"
  1692. #define OPENBLAS_SUPPORTED
  1693. #endif
  1694. #ifdef __riscv
  1695. #include "cpuid_riscv64.c"
  1696. #define OPENBLAS_SUPPORTED
  1697. #endif
  1698. #ifdef __arm__
  1699. #include "cpuid_arm.c"
  1700. #define OPENBLAS_SUPPORTED
  1701. #endif
  1702. #ifdef __aarch64__
  1703. #include "cpuid_arm64.c"
  1704. #define OPENBLAS_SUPPORTED
  1705. #endif
  1706. #ifndef OPENBLAS_SUPPORTED
  1707. #error "This arch/CPU is not supported by OpenBLAS."
  1708. #endif
  1710. #endif
  1711. static int get_num_cores(void) {
  1712. int count;
  1713. #ifdef OS_WINDOWS
  1714. SYSTEM_INFO sysinfo;
  1715. #elif defined(__FreeBSD__) || defined(__OpenBSD__) || defined(__NetBSD__) || defined(__DragonFly__) || defined(__APPLE__)
  1716. int m[2];
  1717. size_t len;
  1718. #endif
  1719. #if defined(linux) || defined(__sun__)
  1720. //returns the number of configured processors (which may include offline ones)
  1721. count = sysconf(_SC_NPROCESSORS_CONF);
  1722. if (count <= 0) count = 2;
  1723. return count;
  1724. #elif defined(OS_WINDOWS)
  1725. GetSystemInfo(&sysinfo);
  1726. return sysinfo.dwNumberOfProcessors;
  1727. #elif defined(__FreeBSD__) || defined(__OpenBSD__) || defined(__NetBSD__) || defined(__DragonFly__) || defined(__APPLE__)
  1728. m[0] = CTL_HW;
  1729. m[1] = HW_NCPU;
  1730. len = sizeof(int);
  1731. sysctl(m, 2, &count, &len, NULL, 0);
  1732. if (count <= 0) count = 2;
  1733. return count;
  1734. #elif defined(_AIX)
  1735. //returns the number of processors which are currently online
  1736. count = sysconf(_SC_NPROCESSORS_ONLN);
  1737. if (count <= 0) count = 2;
  1738. return count;
  1739. #else
  1740. return 2;
  1741. #endif
  1742. }
  1743. int main(int argc, char *argv[]){
  1744. #ifdef FORCE
  1745. char buffer[8192], *p, *q;
  1746. int length;
  1747. #endif
  1748. if (argc == 1) return 0;
  1749. switch (argv[1][0]) {
  1750. case '0' : /* for Makefile */
  1751. #ifdef FORCE
  1752. printf("CORE=%s\n", CORENAME);
  1753. #else
  1754. #if defined(INTEL_AMD) || defined(POWER) || defined(__mips__) || defined(__arm__) || defined(__aarch64__) || defined(ZARCH) || defined(sparc) || defined(__loongarch__) || defined(__riscv) || defined(__alpha__) || defined(__csky__)
  1755. printf("CORE=%s\n", get_corename());
  1756. #endif
  1757. #endif
  1758. #ifdef FORCE
  1759. printf("LIBCORE=%s\n", LIBNAME);
  1760. #else
  1761. printf("LIBCORE=");
  1762. get_libname();
  1763. printf("\n");
  1764. #endif
  1765. printf("NUM_CORES=%d\n", get_num_cores());
  1766. #if defined(__arm__)
  1767. #if !defined(FORCE)
  1768. fprintf(stderr,"get features!\n");
  1769. get_features();
  1770. #else
  1771. fprintf(stderr,"split archconfig!\n");
  1772. sprintf(buffer, "%s", ARCHCONFIG);
  1773. p = &buffer[0];
  1774. while (*p) {
  1775. if ((*p == '-') && (*(p + 1) == 'D')) {
  1776. p += 2;
  1777. if (*p != 'H') {
  1778. while( (*p != ' ') && (*p != '-') && (*p != '\0') && (*p != '\n')) {p++; }
  1779. if (*p == '-') continue;
  1780. }
  1781. while ((*p != ' ') && (*p != '\0')) {
  1782. if (*p == '=') {
  1783. printf("=");
  1784. p ++;
  1785. while ((*p != ' ') && (*p != '\0')) {
  1786. printf("%c", *p);
  1787. p ++;
  1788. }
  1789. } else {
  1790. printf("%c", *p);
  1791. p ++;
  1792. if ((*p == ' ') || (*p =='\0')) printf("=1\n");
  1793. }
  1794. }
  1795. } else p ++;
  1796. }
  1797. #endif
  1798. #endif
  1799. #ifdef INTEL_AMD
  1800. #ifndef FORCE
  1801. get_sse();
  1802. #else
  1803. sprintf(buffer, "%s", ARCHCONFIG);
  1804. p = &buffer[0];
  1805. while (*p) {
  1806. if ((*p == '-') && (*(p + 1) == 'D')) {
  1807. p += 2;
  1808. while ((*p != ' ') && (*p != '\0')) {
  1809. if (*p == '=') {
  1810. printf("=");
  1811. p ++;
  1812. while ((*p != ' ') && (*p != '\0')) {
  1813. printf("%c", *p);
  1814. p ++;
  1815. }
  1816. } else {
  1817. printf("%c", *p);
  1818. p ++;
  1819. if ((*p == ' ') || (*p =='\0')) printf("=1");
  1820. }
  1821. }
  1822. printf("\n");
  1823. } else p ++;
  1824. }
  1825. #endif
  1826. #endif
  1827. #if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  1828. printf("__BYTE_ORDER__=__ORDER_BIG_ENDIAN__\n");
  1829. #elif defined(__BIG_ENDIAN__) && __BIG_ENDIAN__ > 0
  1830. printf("__BYTE_ORDER__=__ORDER_BIG_ENDIAN__\n");
  1831. #endif
  1832. #if defined(_CALL_ELF) && (_CALL_ELF == 2)
  1833. printf("ELF_VERSION=2\n");
  1834. #endif
  1835. #ifdef MAKE_NB_JOBS
  1836. #if MAKE_NB_JOBS > 0
  1837. printf("MAKEFLAGS += -j %d\n", MAKE_NB_JOBS);
  1838. #else
  1839. // Let make inherit the parent's -j argument, or fall back to -j1
  1840. // when there is no parent make
  1841. #endif
  1842. #elif NO_PARALLEL_MAKE==1
  1843. printf("MAKEFLAGS += -j 1\n");
  1844. #else
  1845. printf("MAKEFLAGS += -j %d\n", get_num_cores());
  1846. #endif
  1847. break;
  1848. case '1' : /* For config.h */
  1849. #ifdef FORCE
  1850. sprintf(buffer, "%s -DCORE_%s\n", ARCHCONFIG, CORENAME);
  1851. p = &buffer[0];
  1852. while (*p) {
  1853. if ((*p == '-') && (*(p + 1) == 'D')) {
  1854. p += 2;
  1855. printf("#define ");
  1856. while ((*p != ' ') && (*p != '\0')) {
  1857. if (*p == '=') {
  1858. printf(" ");
  1859. p ++;
  1860. while ((*p != ' ') && (*p != '\0')) {
  1861. printf("%c", *p);
  1862. p ++;
  1863. }
  1864. } else {
  1865. if (*p != '\n')
  1866. printf("%c", *p);
  1867. p ++;
  1868. }
  1869. }
  1870. printf("\n");
  1871. } else p ++;
  1872. }
  1873. #else
  1874. get_cpuconfig();
  1875. #endif
  1876. #ifdef FORCE
  1877. printf("#define CHAR_CORENAME \"%s\"\n", CORENAME);
  1878. #else
  1879. #if defined(INTEL_AMD) || defined(POWER) || defined(__mips__) || defined(__arm__) || defined(__aarch64__) || defined(ZARCH) || defined(sparc) || defined(__loongarch__) || defined(__riscv) || defined(__csky__)
  1880. printf("#define CHAR_CORENAME \"%s\"\n", get_corename());
  1881. #endif
  1882. #endif
  1883. break;
  1884. case '2' : /* SMP */
  1885. if (get_num_cores() > 1) printf("SMP=1\n");
  1886. break;
  1887. }
  1888. fflush(stdout);
  1889. return 0;
  1890. }
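
The `case '1'` branch above turns the `-D` flags in `ARCHCONFIG` into `#define` lines for config.h. The following is a minimal standalone sketch of that conversion, not the code OpenBLAS ships: the function name `archconfig_to_defines` and the two `append_*` helpers are hypothetical, and the sketch writes into a caller-supplied buffer instead of printing to stdout so the result can be inspected. It also shows why every fragment of a concatenated `ARCHCONFIG` string needs a trailing space: adjacent string literals are joined with nothing in between, so two flags without a separating space are parsed as a single macro name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append one character, keeping out NUL-terminated.
 * Returns 0 when the buffer has no room left, 1 otherwise. */
static int append_char(char *out, size_t outsz, size_t *len, char c) {
    if (*len + 1 >= outsz) return 0;
    out[(*len)++] = c;
    out[*len] = '\0';
    return 1;
}

/* Hypothetical helper: append a NUL-terminated string via append_char. */
static int append_str(char *out, size_t outsz, size_t *len, const char *s) {
    while (*s) {
        if (!append_char(out, outsz, len, *s++)) return 0;
    }
    return 1;
}

/* Sketch of the config.h branch: every "-DNAME" or "-DNAME=VALUE" token in
 * cfg becomes "#define NAME" or "#define NAME VALUE" on its own line.
 * Returns the number of macros written, or -1 if out is too small. */
int archconfig_to_defines(const char *cfg, char *out, size_t outsz) {
    size_t len = 0;
    int count = 0;
    const char *p = cfg;

    assert(cfg != NULL && out != NULL && outsz > 0);
    out[0] = '\0';

    while (*p) {
        if (p[0] == '-' && p[1] == 'D') {
            int in_value = 0;
            p += 2;
            if (!append_str(out, outsz, &len, "#define ")) return -1;
            /* A token ends only at a space or at the end of the string. */
            while (*p != ' ' && *p != '\0') {
                char c = *p++;
                if (c == '=' && !in_value) {
                    c = ' ';        /* first '=' separates name and value */
                    in_value = 1;
                }
                if (c == '\n') continue;   /* drop stray newlines */
                if (!append_char(out, outsz, &len, c)) return -1;
            }
            if (!append_char(out, outsz, &len, '\n')) return -1;
            count++;
        } else {
            p++;
        }
    }
    return count;
}
```

Calling `archconfig_to_defines("-DA64FX -DL2_SIZE=8388608 ", buf, sizeof buf)` fills `buf` with two `#define` lines. Passing `"-DCSKY" "-DL1_DATA_SIZE=65536 "`, where the first fragment lacks its trailing space, produces a single line defining `CSKY-DL1_DATA_SIZE`, so neither macro is defined as intended.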