getarch.c 68 kB

Simplifying ARMv8 build parameters

ARMv8 builds were a bit mixed up, with ThunderX2 code in ARMv8 mode (which is not right because TX2 is ARMv8.1), and they required a few redundancies in the defines, making it harder to maintain and to understand which core has what. A few other minor issues were also fixed.

Tests were made on the following cores: A53, A57, A72, Falkor, ThunderX, ThunderX2, and XGene.

Tests were: OpenBLAS/test, OpenBLAS/benchmark, BLAS-Tester.

A summary:

 * Removed TX2 code from the ARMv8 build, to make sure it is compatible with all ARMv8 cores, not just v8.1. Also, the TX2 code has actually harmed performance on big cores.
 * Commoned up the ARMv8 architectures' defines in params.h, to make sure that all cores benefit from the ARMv8 settings, in addition to their own.
 * Added a few more cores, using ARMv8's include strategy, to benefit from compiler optimisations using mtune. Also updated cache information from the manuals, making sure we set good conservative values by default. Removed Vulcan, as it's an alias to TX2.
 * Auto-detecting most of those cores, but also updating the forced compilation in getarch.c, to make sure the parameters are the same whether compiled natively or with a forced arch.

Benefits:

 * The ARMv8 build is now guaranteed to work on all ARMv8 cores.
 * Improved performance for ARMv8 builds on some cores (A72, Falkor, ThunderX1 and 2: up to 11%) over current develop.
 * Improved performance for *all* cores compared to the develop branch before TX2's patch (9% ~ 36%).
 * ThunderX1 builds are 14% faster than ARMv8 on TX1, 9% faster than the current develop branch, and 8% faster than develop before the TX2 patches.

Issues:

 * Regression from the current develop branch for A53 (-12%) and A57 (-3%) with ARMv8 builds, but still faster than before TX2's commit (+15% and +24% respectively). This can be improved with a simplification of TX2's code, to be done in future patches. At least the code is guaranteed to be ARMv8.0 now.

Comments:

 * CortexA57 builds are unchanged on A57 hardware from the develop branch, which makes sense, as it's untouched.
 * CortexA72 builds improve over A57 on A72 hardware, even though they use the same includes, due to new compiler tuning in the makefile.
6 years ago
/*****************************************************************************
Copyright (c) 2011-2014, The OpenBLAS Project
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in
the documentation and/or other materials provided with the
distribution.
3. Neither the name of the OpenBLAS project nor the names of
its contributors may be used to endorse or promote products
derived from this software without specific prior written
permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
**********************************************************************************/
/*********************************************************************/
/* Copyright 2009, 2010 The University of Texas at Austin. */
/* All rights reserved. */
/* */
/* Redistribution and use in source and binary forms, with or */
/* without modification, are permitted provided that the following */
/* conditions are met: */
/* */
/* 1. Redistributions of source code must retain the above */
/* copyright notice, this list of conditions and the following */
/* disclaimer. */
/* */
/* 2. Redistributions in binary form must reproduce the above */
/* copyright notice, this list of conditions and the following */
/* disclaimer in the documentation and/or other materials */
/* provided with the distribution. */
/* */
/* THIS SOFTWARE IS PROVIDED BY THE UNIVERSITY OF TEXAS AT */
/* AUSTIN ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, */
/* INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF */
/* MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE */
/* DISCLAIMED. IN NO EVENT SHALL THE UNIVERSITY OF TEXAS AT */
/* AUSTIN OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, */
/* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES */
/* (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE */
/* GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR */
/* BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF */
/* LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT */
/* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT */
/* OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE */
/* POSSIBILITY OF SUCH DAMAGE. */
/* */
/* The views and conclusions contained in the software and */
/* documentation are those of the authors and should not be */
/* interpreted as representing official policies, either expressed */
/* or implied, of The University of Texas at Austin. */
/*********************************************************************/
#if defined(__WIN32__) || defined(__WIN64__) || defined(__CYGWIN32__) || defined(__CYGWIN64__) || defined(_WIN32) || defined(_WIN64)
#define OS_WINDOWS
#endif
#if defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64)
#define INTEL_AMD
#endif
#include <stdio.h>
#include <string.h>
#ifdef OS_WINDOWS
#include <windows.h>
#endif
#if defined(__FreeBSD__) || defined(__OpenBSD__) || defined(__NetBSD__) || defined(__DragonFly__) || defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif
#if defined(linux) || defined(__sun__)
#include <sys/sysinfo.h>
#include <unistd.h>
#endif
#if defined(_AIX)
#include <unistd.h>
#include <sys/systemcfg.h>
#include <sys/sysinfo.h>
#endif
/* #define FORCE_P2 */
/* #define FORCE_KATMAI */
/* #define FORCE_COPPERMINE */
/* #define FORCE_NORTHWOOD */
/* #define FORCE_PRESCOTT */
/* #define FORCE_BANIAS */
/* #define FORCE_YONAH */
/* #define FORCE_CORE2 */
/* #define FORCE_PENRYN */
/* #define FORCE_DUNNINGTON */
/* #define FORCE_NEHALEM */
/* #define FORCE_SANDYBRIDGE */
/* #define FORCE_ATOM */
/* #define FORCE_ATHLON */
/* #define FORCE_OPTERON */
/* #define FORCE_OPTERON_SSE3 */
/* #define FORCE_BARCELONA */
/* #define FORCE_SHANGHAI */
/* #define FORCE_ISTANBUL */
/* #define FORCE_BOBCAT */
/* #define FORCE_BULLDOZER */
/* #define FORCE_PILEDRIVER */
/* #define FORCE_SSE_GENERIC */
/* #define FORCE_VIAC3 */
/* #define FORCE_NANO */
/* #define FORCE_POWER3 */
/* #define FORCE_POWER4 */
/* #define FORCE_POWER5 */
/* #define FORCE_POWER6 */
/* #define FORCE_POWER7 */
/* #define FORCE_POWER8 */
/* #define FORCE_PPCG4 */
/* #define FORCE_PPC970 */
/* #define FORCE_PPC970MP */
/* #define FORCE_PPC440 */
/* #define FORCE_PPC440FP2 */
/* #define FORCE_CELL */
/* #define FORCE_MIPS64_GENERIC */
/* #define FORCE_SICORTEX */
/* #define FORCE_LOONGSON3R3 */
/* #define FORCE_LOONGSON3R4 */
/* #define FORCE_LOONGSON3R5 */
/* #define FORCE_LOONGSON2K1000 */
/* #define FORCE_LOONGSONGENERIC */
/* #define FORCE_I6400 */
/* #define FORCE_P6600 */
/* #define FORCE_P5600 */
/* #define FORCE_I6500 */
/* #define FORCE_ITANIUM2 */
/* #define FORCE_SPARC */
/* #define FORCE_SPARCV7 */
/* #define FORCE_ZARCH_GENERIC */
/* #define FORCE_Z13 */
/* #define FORCE_EV4 */
/* #define FORCE_EV5 */
/* #define FORCE_EV6 */
/* #define FORCE_CSKY */
/* #define FORCE_CK860FV */
/* #define FORCE_GENERIC */
#ifdef FORCE_P2
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "PENTIUM2"
#define ARCHCONFIG "-DPENTIUM2 " \
"-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=32 " \
"-DL2_SIZE=524288 -DL2_LINESIZE=32 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX"
#define LIBNAME "p2"
#define CORENAME "P5"
#endif
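Each `ARCHCONFIG` string is a space-separated list of `-D` compiler flags. As an illustration of how such a string maps onto C macros, here is a minimal sketch that turns it into `#define` lines. The function name `archconfig_to_defines` and its buffer interface are assumptions made for this example; they are not taken from getarch.c, and the real tool may process the string differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of getarch.c): convert an ARCHCONFIG-style
 * string such as "-DHAVE_MMX -DL2_SIZE=524288" into "#define" lines.
 * Writes at most outsz-1 characters into out and returns the number of
 * macros emitted. Tokens that do not start with "-D" (for example
 * "-march=...") are skipped. */
int archconfig_to_defines(const char *cfg, char *out, size_t outsz)
{
    int count = 0;
    size_t used = 0;

    assert(cfg != NULL && out != NULL);
    if (outsz == 0) return 0;
    out[0] = '\0';

    while (*cfg != '\0') {
        const char *end;

        /* Skip the spaces that separate flags. */
        while (*cfg == ' ') cfg++;
        if (*cfg == '\0') break;

        /* Find the end of the current token. */
        end = cfg;
        while (*end != '\0' && *end != ' ') end++;

        if (end - cfg > 2 && cfg[0] == '-' && cfg[1] == 'D') {
            const char *name = cfg + 2;
            const char *eq = (const char *)memchr(name, '=', (size_t)(end - name));
            int n;

            if (eq != NULL) {
                /* "-DNAME=VALUE" becomes "#define NAME VALUE". */
                n = snprintf(out + used, outsz - used, "#define %.*s %.*s\n",
                             (int)(eq - name), name,
                             (int)(end - eq - 1), eq + 1);
            } else {
                /* "-DNAME" becomes "#define NAME". */
                n = snprintf(out + used, outsz - used, "#define %.*s\n",
                             (int)(end - name), name);
            }
            if (n < 0 || (size_t)n >= outsz - used) {
                out[used] = '\0'; /* buffer full: drop the partial line */
                break;
            }
            used += (size_t)n;
            count++;
        }
        cfg = end;
    }
    return count;
}
```

Calling the helper with a buffer and one of the strings above, for instance the `FORCE_P2` value, would yield one `#define` line per `-D` flag, which makes it easier to see which cache and instruction-set parameters a forced target sets.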
#ifdef FORCE_KATMAI
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "PENTIUM3"
#define ARCHCONFIG "-DPENTIUM3 " \
"-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=32 " \
"-DL2_SIZE=524288 -DL2_LINESIZE=32 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE "
#define LIBNAME "katmai"
#define CORENAME "KATMAI"
#endif
#ifdef FORCE_COPPERMINE
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "PENTIUM3"
#define ARCHCONFIG "-DPENTIUM3 " \
"-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=32 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=32 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE "
#define LIBNAME "coppermine"
#define CORENAME "COPPERMINE"
#endif
#ifdef FORCE_NORTHWOOD
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "PENTIUM4"
#define ARCHCONFIG "-DPENTIUM4 " \
"-DL1_DATA_SIZE=8192 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=524288 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 "
#define LIBNAME "northwood"
#define CORENAME "NORTHWOOD"
#endif
#ifdef FORCE_PRESCOTT
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "PENTIUM4"
#define ARCHCONFIG "-DPENTIUM4 " \
"-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3"
#define LIBNAME "prescott"
#define CORENAME "PRESCOTT"
#endif
#ifdef FORCE_BANIAS
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "BANIAS"
#define ARCHCONFIG "-DPENTIUMM " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 "
#define LIBNAME "banias"
#define CORENAME "BANIAS"
#endif
#ifdef FORCE_YONAH
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "YONAH"
#define ARCHCONFIG "-DPENTIUMM " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 "
#define LIBNAME "yonah"
#define CORENAME "YONAH"
#endif
#ifdef FORCE_CORE2
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "CONROE"
#define ARCHCONFIG "-DCORE2 " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=256 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3"
#define LIBNAME "core2"
#define CORENAME "CORE2"
#endif
#ifdef FORCE_PENRYN
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "PENRYN"
#define ARCHCONFIG "-DPENRYN " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=256 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1"
#define LIBNAME "penryn"
#define CORENAME "PENRYN"
#endif
#ifdef FORCE_DUNNINGTON
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "DUNNINGTON"
#define ARCHCONFIG "-DDUNNINGTON " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
"-DL3_SIZE=16777216 -DL3_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=256 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1"
#define LIBNAME "dunnington"
#define CORENAME "DUNNINGTON"
#endif
#ifdef FORCE_NEHALEM
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "NEHALEM"
#define ARCHCONFIG "-DNEHALEM " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2"
#define LIBNAME "nehalem"
#define CORENAME "NEHALEM"
#endif
#ifdef FORCE_SANDYBRIDGE
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#ifdef NO_AVX
#define SUBARCHITECTURE "NEHALEM"
#define ARCHCONFIG "-DNEHALEM " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2"
#define LIBNAME "nehalem"
#define CORENAME "NEHALEM"
#else
#define SUBARCHITECTURE "SANDYBRIDGE"
#define ARCHCONFIG "-DSANDYBRIDGE " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX"
#define LIBNAME "sandybridge"
#define CORENAME "SANDYBRIDGE"
#endif
#endif
#ifdef FORCE_HASWELL
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#ifdef NO_AVX2
#ifdef NO_AVX
#define SUBARCHITECTURE "NEHALEM"
#define ARCHCONFIG "-DNEHALEM " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2"
#define LIBNAME "nehalem"
#define CORENAME "NEHALEM"
#else
#define SUBARCHITECTURE "SANDYBRIDGE"
#define ARCHCONFIG "-DSANDYBRIDGE " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX"
#define LIBNAME "sandybridge"
#define CORENAME "SANDYBRIDGE"
#endif
#else
#define SUBARCHITECTURE "HASWELL"
#define ARCHCONFIG "-DHASWELL " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX " \
"-DHAVE_AVX2 -DHAVE_FMA3 -DFMA3"
#define LIBNAME "haswell"
#define CORENAME "HASWELL"
#endif
#endif
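The nested `NO_AVX2` / `NO_AVX` conditionals in the `FORCE_HASWELL` block can be read as a small decision table. Note that, as written, `NO_AVX` is only consulted when `NO_AVX2` is also defined. The sketch below mirrors that nesting at run time purely for illustration; the function names and integer flags are hypothetical and do not exist in getarch.c, where the choice is made by the preprocessor.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical run-time mirror of the FORCE_HASWELL preprocessor nesting.
 * A non-zero argument stands for "the corresponding NO_* macro is defined". */
const char *haswell_fallback(int no_avx2, int no_avx)
{
    if (no_avx2) {
        if (no_avx)
            return "NEHALEM";     /* neither AVX2 nor AVX allowed */
        return "SANDYBRIDGE";     /* AVX allowed, AVX2 disabled */
    }
    return "HASWELL";             /* NO_AVX is not checked on this path */
}

/* Returns 1 when the selected target's flag list includes -DHAVE_AVX. */
int fallback_has_avx(int no_avx2, int no_avx)
{
    const char *target = haswell_fallback(no_avx2, no_avx);
    assert(target != NULL);
    return strcmp(target, "NEHALEM") != 0;
}
```

The later `FORCE_SKYLAKEX`, `FORCE_COOPERLAKE` and `FORCE_SAPPHIRERAPIDS` blocks follow the same shape with one extra outer level for `NO_AVX512`.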
#ifdef FORCE_SKYLAKEX
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#ifdef NO_AVX512
#ifdef NO_AVX2
#ifdef NO_AVX
#define SUBARCHITECTURE "NEHALEM"
#define ARCHCONFIG "-DNEHALEM " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2"
#define LIBNAME "nehalem"
#define CORENAME "NEHALEM"
#else
#define SUBARCHITECTURE "SANDYBRIDGE"
#define ARCHCONFIG "-DSANDYBRIDGE " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX"
#define LIBNAME "sandybridge"
#define CORENAME "SANDYBRIDGE"
#endif
#else
#define SUBARCHITECTURE "HASWELL"
#define ARCHCONFIG "-DHASWELL " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX " \
"-DHAVE_AVX2 -DHAVE_FMA3 -DFMA3"
#define LIBNAME "haswell"
#define CORENAME "HASWELL"
#endif
#else
#define SUBARCHITECTURE "SKYLAKEX"
#define ARCHCONFIG "-DSKYLAKEX " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX " \
"-DHAVE_AVX2 -DHAVE_FMA3 -DFMA3 -DHAVE_AVX512VL -march=skylake-avx512"
#define LIBNAME "skylakex"
#define CORENAME "SKYLAKEX"
#endif
#endif
#ifdef FORCE_COOPERLAKE
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#ifdef NO_AVX512
#ifdef NO_AVX2
#ifdef NO_AVX
#define SUBARCHITECTURE "NEHALEM"
#define ARCHCONFIG "-DNEHALEM " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2"
#define LIBNAME "nehalem"
#define CORENAME "NEHALEM"
#else
#define SUBARCHITECTURE "SANDYBRIDGE"
#define ARCHCONFIG "-DSANDYBRIDGE " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX"
#define LIBNAME "sandybridge"
#define CORENAME "SANDYBRIDGE"
#endif
#else
#define SUBARCHITECTURE "HASWELL"
#define ARCHCONFIG "-DHASWELL " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX " \
"-DHAVE_AVX2 -DHAVE_FMA3 -DFMA3"
#define LIBNAME "haswell"
#define CORENAME "HASWELL"
#endif
#else
#define SUBARCHITECTURE "COOPERLAKE"
#define ARCHCONFIG "-DCOOPERLAKE " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX " \
"-DHAVE_AVX2 -DHAVE_FMA3 -DFMA3 -DHAVE_AVX512VL -DHAVE_AVX512BF16 -march=cooperlake"
#define LIBNAME "cooperlake"
#define CORENAME "COOPERLAKE"
#endif
#endif
#ifdef FORCE_SAPPHIRERAPIDS
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#ifdef NO_AVX512
#ifdef NO_AVX2
#ifdef NO_AVX
#define SUBARCHITECTURE "NEHALEM"
#define ARCHCONFIG "-DNEHALEM " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2"
#define LIBNAME "nehalem"
#define CORENAME "NEHALEM"
#else
#define SUBARCHITECTURE "SANDYBRIDGE"
#define ARCHCONFIG "-DSANDYBRIDGE " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX"
#define LIBNAME "sandybridge"
#define CORENAME "SANDYBRIDGE"
#endif
#else
#define SUBARCHITECTURE "HASWELL"
#define ARCHCONFIG "-DHASWELL " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX " \
"-DHAVE_AVX2 -DHAVE_FMA3 -DFMA3"
#define LIBNAME "haswell"
#define CORENAME "HASWELL"
#endif
#else
#define SUBARCHITECTURE "SAPPHIRERAPIDS"
#define ARCHCONFIG "-DSAPPHIRERAPIDS " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX " \
"-DHAVE_AVX2 -DHAVE_FMA3 -DFMA3 -DHAVE_AVX512VL -DHAVE_AVX512BF16 -march=sapphirerapids"
#define LIBNAME "sapphirerapids"
#define CORENAME "SAPPHIRERAPIDS"
#endif
#endif
#ifdef FORCE_ATOM
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "ATOM"
#define ARCHCONFIG "-DATOM " \
"-DL1_DATA_SIZE=24576 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=524288 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3"
#define LIBNAME "atom"
#define CORENAME "ATOM"
#endif
#ifdef FORCE_ATHLON
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "ATHLON"
#define ARCHCONFIG "-DATHLON " \
"-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=32 -DDTB_SIZE=4096 -DHAVE_3DNOW " \
"-DHAVE_3DNOWEX -DHAVE_MMX -DHAVE_SSE "
#define LIBNAME "athlon"
#define CORENAME "ATHLON"
#endif
#ifdef FORCE_OPTERON
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "OPTERON"
#define ARCHCONFIG "-DOPTERON " \
"-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=32 -DDTB_SIZE=4096 -DHAVE_3DNOW " \
"-DHAVE_3DNOWEX -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 "
#define LIBNAME "opteron"
#define CORENAME "OPTERON"
#endif
#ifdef FORCE_OPTERON_SSE3
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "OPTERON"
#define ARCHCONFIG "-DOPTERON " \
"-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=32 -DDTB_SIZE=4096 -DHAVE_3DNOW " \
"-DHAVE_3DNOWEX -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3"
#define LIBNAME "opteron"
#define CORENAME "OPTERON"
#endif
#if defined(FORCE_BARCELONA) || defined(FORCE_SHANGHAI) || defined(FORCE_ISTANBUL)
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "BARCELONA"
#define ARCHCONFIG "-DBARCELONA " \
"-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=524288 -DL2_LINESIZE=64 -DL3_SIZE=2097152 " \
"-DDTB_DEFAULT_ENTRIES=48 -DDTB_SIZE=4096 " \
"-DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 " \
"-DHAVE_SSE4A -DHAVE_MISALIGNSSE -DHAVE_128BITFPU -DHAVE_FASTMOVU"
#define LIBNAME "barcelona"
#define CORENAME "BARCELONA"
#endif
#if defined(FORCE_BOBCAT)
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "BOBCAT"
#define ARCHCONFIG "-DBOBCAT " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=524288 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=40 -DDTB_SIZE=4096 " \
"-DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 " \
"-DHAVE_SSE4A -DHAVE_MISALIGNSSE -DHAVE_CFLUSH -DHAVE_CMOV"
#define LIBNAME "bobcat"
#define CORENAME "BOBCAT"
#endif
#if defined (FORCE_BULLDOZER)
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "BULLDOZER"
#define ARCHCONFIG "-DBULLDOZER " \
"-DL1_DATA_SIZE=49152 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=1024000 -DL2_LINESIZE=64 -DL3_SIZE=16777216 " \
"-DDTB_DEFAULT_ENTRIES=32 -DDTB_SIZE=4096 " \
"-DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 " \
"-DHAVE_SSE4A -DHAVE_MISALIGNSSE -DHAVE_128BITFPU -DHAVE_FASTMOVU " \
"-DHAVE_AVX"
#define LIBNAME "bulldozer"
#define CORENAME "BULLDOZER"
#endif
#if defined (FORCE_PILEDRIVER)
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "PILEDRIVER"
#define ARCHCONFIG "-DPILEDRIVER " \
"-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=2097152 -DL2_LINESIZE=64 -DL3_SIZE=12582912 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 " \
"-DHAVE_SSE4A -DHAVE_MISALIGNSSE -DHAVE_128BITFPU -DHAVE_FASTMOVU -DHAVE_CFLUSH " \
"-DHAVE_AVX -DHAVE_FMA3"
#define LIBNAME "piledriver"
#define CORENAME "PILEDRIVER"
#endif
#if defined (FORCE_STEAMROLLER)
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "STEAMROLLER"
#define ARCHCONFIG "-DSTEAMROLLER " \
"-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=2097152 -DL2_LINESIZE=64 -DL3_SIZE=12582912 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 " \
"-DHAVE_SSE4A -DHAVE_MISALIGNSSE -DHAVE_128BITFPU -DHAVE_FASTMOVU -DHAVE_CFLUSH " \
"-DHAVE_AVX -DHAVE_FMA3"
#define LIBNAME "steamroller"
#define CORENAME "STEAMROLLER"
#endif
#if defined (FORCE_EXCAVATOR)
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "EXCAVATOR"
#define ARCHCONFIG "-DEXCAVATOR " \
"-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=2097152 -DL2_LINESIZE=64 -DL3_SIZE=12582912 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 " \
"-DHAVE_SSE4A -DHAVE_MISALIGNSSE -DHAVE_128BITFPU -DHAVE_FASTMOVU -DHAVE_CFLUSH " \
"-DHAVE_AVX -DHAVE_FMA3"
#define LIBNAME "excavator"
#define CORENAME "EXCAVATOR"
#endif
#if defined (FORCE_ZEN)
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#ifdef NO_AVX2
#ifdef NO_AVX
#define SUBARCHITECTURE "NEHALEM"
#define ARCHCONFIG "-DNEHALEM " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2"
#define LIBNAME "nehalem"
#define CORENAME "NEHALEM"
#else
#define SUBARCHITECTURE "SANDYBRIDGE"
#define ARCHCONFIG "-DSANDYBRIDGE " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 -DHAVE_AVX"
#define LIBNAME "sandybridge"
#define CORENAME "SANDYBRIDGE"
#endif
#else
#define SUBARCHITECTURE "ZEN"
#define ARCHCONFIG "-DZEN " \
"-DL1_CODE_SIZE=32768 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=8 " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL2_CODE_ASSOCIATIVE=8 " \
"-DL2_SIZE=524288 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=8 " \
"-DL3_SIZE=16777216 -DL3_LINESIZE=64 -DL3_ASSOCIATIVE=8 " \
"-DITB_DEFAULT_ENTRIES=64 -DITB_SIZE=4096 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
"-DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 " \
"-DHAVE_SSE4A -DHAVE_MISALIGNSSE -DHAVE_128BITFPU -DHAVE_FASTMOVU -DHAVE_CFLUSH " \
"-DHAVE_AVX -DHAVE_AVX2 -DHAVE_FMA3 -DFMA3"
#define LIBNAME "zen"
#define CORENAME "ZEN"
#endif
#endif
#ifdef FORCE_SSE_GENERIC
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "GENERIC"
#define ARCHCONFIG "-DGENERIC " \
"-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=524288 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2"
#define LIBNAME "generic"
#define CORENAME "GENERIC"
#endif
#ifdef FORCE_VIAC3
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "VIAC3"
#define ARCHCONFIG "-DVIAC3 " \
"-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
"-DL2_SIZE=65536 -DL2_LINESIZE=32 " \
"-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 " \
"-DHAVE_MMX -DHAVE_SSE "
#define LIBNAME "viac3"
#define CORENAME "VIAC3"
#endif
#ifdef FORCE_NANO
#define FORCE
#define FORCE_INTEL
#define ARCHITECTURE "X86"
#define SUBARCHITECTURE "NANO"
#define ARCHCONFIG "-DNANO " \
"-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
"-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
"-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 " \
"-DHAVE_CMOV -DHAVE_MMX -DHAVE_SSE -DHAVE_SSE2 -DHAVE_SSE3 -DHAVE_SSSE3"
#define LIBNAME "nano"
#define CORENAME "NANO"
#endif
#ifdef FORCE_POWER3
#define FORCE
#define ARCHITECTURE "POWER"
#define SUBARCHITECTURE "POWER3"
#define SUBDIRNAME "power"
#define ARCHCONFIG "-DPOWER3 " \
"-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=128 " \
"-DL2_SIZE=2097152 -DL2_LINESIZE=128 " \
"-DDTB_DEFAULT_ENTRIES=256 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "power3"
#define CORENAME "POWER3"
#endif
#ifdef FORCE_POWER4
#define FORCE
#define ARCHITECTURE "POWER"
#define SUBARCHITECTURE "POWER4"
#define SUBDIRNAME "power"
#define ARCHCONFIG "-DPOWER4 " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=128 " \
"-DL2_SIZE=1509949 -DL2_LINESIZE=128 " \
"-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=6 "
#define LIBNAME "power4"
#define CORENAME "POWER4"
#endif
#ifdef FORCE_POWER5
#define FORCE
#define ARCHITECTURE "POWER"
#define SUBARCHITECTURE "POWER5"
#define SUBDIRNAME "power"
#define ARCHCONFIG "-DPOWER5 " \
"-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=128 " \
"-DL2_SIZE=1509949 -DL2_LINESIZE=128 " \
"-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=6 "
#define LIBNAME "power5"
#define CORENAME "POWER5"
#endif
#if defined(FORCE_POWER6) || defined(FORCE_POWER7)
#define FORCE
#define ARCHITECTURE "POWER"
#define SUBARCHITECTURE "POWER6"
#define SUBDIRNAME "power"
#define ARCHCONFIG "-DPOWER6 " \
"-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=128 " \
"-DL2_SIZE=4194304 -DL2_LINESIZE=128 " \
"-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "power6"
#define CORENAME "POWER6"
#endif
#if defined(FORCE_POWER8)
#define FORCE
#define ARCHITECTURE "POWER"
#define SUBARCHITECTURE "POWER8"
#define SUBDIRNAME "power"
#define ARCHCONFIG "-DPOWER8 " \
"-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=128 " \
"-DL2_SIZE=4194304 -DL2_LINESIZE=128 " \
"-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "power8"
#define CORENAME "POWER8"
#endif
#if defined(FORCE_POWER9)
#define FORCE
#define ARCHITECTURE "POWER"
  778. #define SUBARCHITECTURE "POWER9"
  779. #define SUBDIRNAME "power"
  780. #define ARCHCONFIG "-DPOWER9 " \
  781. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=128 " \
  782. "-DL2_SIZE=4194304 -DL2_LINESIZE=128 " \
  783. "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
  784. #define LIBNAME "power9"
  785. #define CORENAME "POWER9"
  786. #endif
  787. #if defined(FORCE_POWER10)
  788. #define FORCE
  789. #define ARCHITECTURE "POWER"
  790. #define SUBARCHITECTURE "POWER10"
  791. #define SUBDIRNAME "power"
  792. #define ARCHCONFIG "-DPOWER10 " \
  793. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=128 " \
  794. "-DL2_SIZE=4194304 -DL2_LINESIZE=128 " \
  795. "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
  796. #define LIBNAME "power10"
  797. #define CORENAME "POWER10"
  798. #endif
  799. #ifdef FORCE_PPCG4
  800. #define FORCE
  801. #define ARCHITECTURE "POWER"
  802. #define SUBARCHITECTURE "PPCG4"
  803. #define SUBDIRNAME "power"
  804. #define ARCHCONFIG "-DPPCG4 " \
  805. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
  806. "-DL2_SIZE=262144 -DL2_LINESIZE=32 " \
  807. "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
  808. #define LIBNAME "ppcg4"
  809. #define CORENAME "PPCG4"
  810. #endif
  811. #ifdef FORCE_PPC970
  812. #define FORCE
  813. #define ARCHITECTURE "POWER"
  814. #define SUBARCHITECTURE "PPC970"
  815. #define SUBDIRNAME "power"
  816. #define ARCHCONFIG "-DPPC970 " \
  817. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=128 " \
  818. "-DL2_SIZE=524288 -DL2_LINESIZE=128 " \
  819. "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
  820. #define LIBNAME "ppc970"
  821. #define CORENAME "PPC970"
  822. #endif
  823. #ifdef FORCE_PPC970MP
  824. #define FORCE
  825. #define ARCHITECTURE "POWER"
  826. #define SUBARCHITECTURE "PPC970"
  827. #define SUBDIRNAME "power"
  828. #define ARCHCONFIG "-DPPC970 " \
  829. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=128 " \
  830. "-DL2_SIZE=1048576 -DL2_LINESIZE=128 " \
  831. "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
  832. #define LIBNAME "ppc970mp"
  833. #define CORENAME "PPC970"
  834. #endif
  835. #ifdef FORCE_PPC440
  836. #define FORCE
  837. #define ARCHITECTURE "POWER"
  838. #define SUBARCHITECTURE "PPC440"
  839. #define SUBDIRNAME "power"
  840. #define ARCHCONFIG "-DPPC440 " \
  841. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
  842. "-DL2_SIZE=16384 -DL2_LINESIZE=128 " \
  843. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=16 "
  844. #define LIBNAME "ppc440"
  845. #define CORENAME "PPC440"
  846. #endif
  847. #ifdef FORCE_PPC440FP2
  848. #define FORCE
  849. #define ARCHITECTURE "POWER"
  850. #define SUBARCHITECTURE "PPC440FP2"
  851. #define SUBDIRNAME "power"
  852. #define ARCHCONFIG "-DPPC440FP2 " \
  853. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
  854. "-DL2_SIZE=16384 -DL2_LINESIZE=128 " \
  855. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=16 "
  856. #define LIBNAME "ppc440FP2"
  857. #define CORENAME "PPC440FP2"
  858. #endif
  859. #ifdef FORCE_CELL
  860. #define FORCE
  861. #define ARCHITECTURE "POWER"
  862. #define SUBARCHITECTURE "CELL"
  863. #define SUBDIRNAME "power"
  864. #define ARCHCONFIG "-DCELL " \
  865. "-DL1_DATA_SIZE=262144 -DL1_DATA_LINESIZE=128 " \
  866. "-DL2_SIZE=524288 -DL2_LINESIZE=128 " \
  867. "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
  868. #define LIBNAME "cell"
  869. #define CORENAME "CELL"
  870. #endif
  871. #ifdef FORCE_MIPS64_GENERIC
  872. #define FORCE
  873. #define ARCHITECTURE "MIPS"
  874. #define SUBARCHITECTURE "MIPS64_GENERIC"
  875. #define SUBDIRNAME "mips64"
  876. #define ARCHCONFIG "-DMIPS64_GENERIC " \
  877. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
  878. "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
  879. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
  880. #define LIBNAME "mips64_generic"
  881. #define CORENAME "MIPS64_GENERIC"
  882. #else
  883. #endif
  884. #ifdef FORCE_SICORTEX
  885. #define FORCE
  886. #define ARCHITECTURE "MIPS"
  887. #define SUBARCHITECTURE "SICORTEX"
  888. #define SUBDIRNAME "mips"
  889. #define ARCHCONFIG "-DSICORTEX " \
  890. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
  891. "-DL2_SIZE=524288 -DL2_LINESIZE=32 " \
  892. "-DDTB_DEFAULT_ENTRIES=32 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
  893. #define LIBNAME "mips"
  894. #define CORENAME "sicortex"
  895. #endif
  896. #if defined FORCE_LOONGSON3R3 || defined FORCE_LOONGSON3A || defined FORCE_LOONGSON3B
  897. #define FORCE
  898. #define ARCHITECTURE "MIPS"
  899. #define SUBARCHITECTURE "LOONGSON3R3"
  900. #define SUBDIRNAME "mips64"
  901. #define ARCHCONFIG "-DLOONGSON3R3 " \
  902. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
  903. "-DL2_SIZE=524288 -DL2_LINESIZE=32 " \
  904. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 "
  905. #define LIBNAME "loongson3r3"
  906. #define CORENAME "LOONGSON3R3"
  907. #else
  908. #endif
  909. #ifdef FORCE_LOONGSON3R4
  910. #define FORCE
  911. #define ARCHITECTURE "MIPS"
  912. #define SUBARCHITECTURE "LOONGSON3R4"
  913. #define SUBDIRNAME "mips64"
  914. #define ARCHCONFIG "-DLOONGSON3R4 " \
  915. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
  916. "-DL2_SIZE=524288 -DL2_LINESIZE=32 " \
  917. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 -DHAVE_MSA"
  918. #define LIBNAME "loongson3r4"
  919. #define CORENAME "LOONGSON3R4"
  920. #else
  921. #endif
  922. #ifdef FORCE_LOONGSON3R5
  923. #define FORCE
  924. #define ARCHITECTURE "LOONGARCH"
  925. #define SUBARCHITECTURE "LOONGSON3R5"
  926. #define SUBDIRNAME "loongarch64"
  927. #define ARCHCONFIG "-DLOONGSON3R5 " \
  928. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
  929. "-DL2_SIZE=1048576 -DL2_LINESIZE=64 " \
  930. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=16 -DHAVE_MSA"
  931. #define LIBNAME "loongson3r5"
  932. #define CORENAME "LOONGSON3R5"
  933. #else
  934. #endif
  935. #ifdef FORCE_LOONGSON2K1000
  936. #define FORCE
  937. #define ARCHITECTURE "LOONGARCH"
  938. #define SUBARCHITECTURE "LOONGSON2K1000"
  939. #define SUBDIRNAME "loongarch64"
  940. #define ARCHCONFIG "-DLOONGSON2K1000 " \
  941. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
  942. "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
  943. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=16 -DHAVE_MSA"
  944. #define LIBNAME "loongson2k1000"
  945. #define CORENAME "LOONGSON2K1000"
  946. #else
  947. #endif
  948. #ifdef FORCE_LOONGSONGENERIC
  949. #define FORCE
  950. #define ARCHITECTURE "LOONGARCH"
  951. #define SUBARCHITECTURE "LOONGSONGENERIC"
  952. #define SUBDIRNAME "loongarch64"
  953. #define ARCHCONFIG "-DLOONGSONGENERIC " \
  954. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
  955. "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
  956. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=16 -DHAVE_MSA"
  957. #define LIBNAME "loongsongeneric"
  958. #define CORENAME "LOONGSONGENERIC"
  959. #else
  960. #endif
  961. #ifdef FORCE_I6400
  962. #define FORCE
  963. #define ARCHITECTURE "MIPS"
  964. #define SUBARCHITECTURE "I6400"
  965. #define SUBDIRNAME "mips64"
  966. #define ARCHCONFIG "-DI6400 " \
  967. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
  968. "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
  969. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 -DHAVE_MSA "
  970. #define LIBNAME "i6400"
  971. #define CORENAME "I6400"
  972. #else
  973. #endif
  974. #ifdef FORCE_P6600
  975. #define FORCE
  976. #define ARCHITECTURE "MIPS"
  977. #define SUBARCHITECTURE "P6600"
  978. #define SUBDIRNAME "mips64"
  979. #define ARCHCONFIG "-DP6600 " \
  980. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
  981. "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
  982. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
  983. #define LIBNAME "p6600"
  984. #define CORENAME "P6600"
  985. #else
  986. #endif
  987. #ifdef FORCE_P5600
  988. #define FORCE
  989. #define ARCHITECTURE "MIPS"
  990. #define SUBARCHITECTURE "P5600"
  991. #define SUBDIRNAME "mips"
  992. #define ARCHCONFIG "-DP5600 " \
  993. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
  994. "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
  995. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8"
  996. #define LIBNAME "p5600"
  997. #define CORENAME "P5600"
  998. #else
  999. #endif
  1000. #ifdef FORCE_MIPS1004K
  1001. #define FORCE
  1002. #define ARCHITECTURE "MIPS"
  1003. #define SUBARCHITECTURE "MIPS1004K"
  1004. #define SUBDIRNAME "mips"
  1005. #define ARCHCONFIG "-DMIPS1004K " \
  1006. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
  1007. "-DL2_SIZE=262144 -DL2_LINESIZE=32 " \
  1008. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8"
  1009. #define LIBNAME "mips1004K"
  1010. #define CORENAME "MIPS1004K"
  1011. #else
  1012. #endif
  1013. #ifdef FORCE_MIPS24K
  1014. #define FORCE
  1015. #define ARCHITECTURE "MIPS"
  1016. #define SUBARCHITECTURE "MIPS24K"
  1017. #define SUBDIRNAME "mips"
  1018. #define ARCHCONFIG "-DMIPS24K " \
  1019. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
  1020. "-DL2_SIZE=32768 -DL2_LINESIZE=32 " \
  1021. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8"
  1022. #define LIBNAME "mips24K"
  1023. #define CORENAME "MIPS24K"
  1024. #else
  1025. #endif
  1026. #ifdef FORCE_I6500
  1027. #define FORCE
  1028. #define ARCHITECTURE "MIPS"
  1029. #define SUBARCHITECTURE "I6500"
  1030. #define SUBDIRNAME "mips64"
  1031. #define ARCHCONFIG "-DI6500 " \
  1032. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
  1033. "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
  1034. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 -DHAVE_MSA"
  1035. #define LIBNAME "i6500"
  1036. #define CORENAME "I6500"
  1037. #else
  1038. #endif
  1039. #ifdef FORCE_ITANIUM2
  1040. #define FORCE
  1041. #define ARCHITECTURE "IA64"
  1042. #define SUBARCHITECTURE "ITANIUM2"
  1043. #define SUBDIRNAME "ia64"
  1044. #define ARCHCONFIG "-DITANIUM2 " \
  1045. "-DL1_DATA_SIZE=262144 -DL1_DATA_LINESIZE=128 " \
  1046. "-DL2_SIZE=1572864 -DL2_LINESIZE=128 -DDTB_SIZE=16384 -DDTB_DEFAULT_ENTRIES=128 "
  1047. #define LIBNAME "itanium2"
  1048. #define CORENAME "itanium2"
  1049. #endif
  1050. #ifdef FORCE_SPARC
  1051. #define FORCE
  1052. #define ARCHITECTURE "SPARC"
  1053. #define SUBARCHITECTURE "SPARC"
  1054. #define SUBDIRNAME "sparc"
  1055. #define ARCHCONFIG "-DSPARC -DV9 " \
  1056. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
  1057. "-DL2_SIZE=1572864 -DL2_LINESIZE=64 -DDTB_SIZE=8192 -DDTB_DEFAULT_ENTRIES=64 "
  1058. #define LIBNAME "sparc"
  1059. #define CORENAME "sparc"
  1060. #endif
  1061. #ifdef FORCE_SPARCV7
  1062. #define FORCE
  1063. #define ARCHITECTURE "SPARC"
  1064. #define SUBARCHITECTURE "SPARC"
  1065. #define SUBDIRNAME "sparc"
  1066. #define ARCHCONFIG "-DSPARC -DV7 " \
  1067. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 " \
  1068. "-DL2_SIZE=1572864 -DL2_LINESIZE=64 -DDTB_SIZE=8192 -DDTB_DEFAULT_ENTRIES=64 "
  1069. #define LIBNAME "sparcv7"
  1070. #define CORENAME "sparcv7"
  1071. #endif
  1072. #ifdef FORCE_GENERIC
  1073. #define FORCE
  1074. #define ARCHITECTURE "GENERIC"
  1075. #define SUBARCHITECTURE "GENERIC"
  1076. #define SUBDIRNAME "generic"
  1077. #define ARCHCONFIG "-DGENERIC " \
  1078. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=128 " \
  1079. "-DL2_SIZE=524288 -DL2_LINESIZE=128 " \
  1080. "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
  1081. #define LIBNAME "generic"
  1082. #define CORENAME "generic"
  1083. #endif
  1084. #ifdef FORCE_ARMV7
  1085. #define FORCE
  1086. #define ARCHITECTURE "ARM"
  1087. #define SUBARCHITECTURE "ARMV7"
  1088. #define SUBDIRNAME "arm"
  1089. #define ARCHCONFIG "-DARMV7 " \
  1090. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
  1091. "-DL2_SIZE=524288 -DL2_LINESIZE=32 " \
  1092. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 " \
  1093. "-DHAVE_VFPV3 -DHAVE_VFP"
  1094. #define LIBNAME "armv7"
  1095. #define CORENAME "ARMV7"
  1096. #else
  1097. #endif
  1098. #ifdef FORCE_CORTEXA9
  1099. #define FORCE
  1100. #define ARCHITECTURE "ARM"
  1101. #define SUBARCHITECTURE "CORTEXA9"
  1102. #define SUBDIRNAME "arm"
  1103. #define ARCHCONFIG "-DCORTEXA9 -DARMV7 " \
  1104. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
  1105. "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
  1106. "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 " \
  1107. "-DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON"
  1108. #define LIBNAME "cortexa9"
  1109. #define CORENAME "CORTEXA9"
  1110. #else
  1111. #endif
  1112. #ifdef FORCE_RISCV64_GENERIC
  1113. #define FORCE
  1114. #define ARCHITECTURE "RISCV64"
  1115. #define SUBARCHITECTURE "RISCV64_GENERIC"
  1116. #define SUBDIRNAME "riscv64"
  1117. #define ARCHCONFIG "-DRISCV64_GENERIC " \
  1118. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
  1119. "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
  1120. "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 "
  1121. #define LIBNAME "riscv64_generic"
  1122. #define CORENAME "RISCV64_GENERIC"
  1123. #else
  1124. #endif
  1125. #ifdef FORCE_CORTEXA15
  1126. #define FORCE
  1127. #define ARCHITECTURE "ARM"
  1128. #define SUBARCHITECTURE "CORTEXA15"
  1129. #define SUBDIRNAME "arm"
  1130. #define ARCHCONFIG "-DCORTEXA15 -DARMV7 " \
  1131. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
  1132. "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
  1133. "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 " \
  1134. "-DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON"
  1135. #define LIBNAME "cortexa15"
  1136. #define CORENAME "CORTEXA15"
  1137. #else
  1138. #endif
  1139. #ifdef FORCE_ARMV6
  1140. #define FORCE
  1141. #define ARCHITECTURE "ARM"
  1142. #define SUBARCHITECTURE "ARMV6"
  1143. #define SUBDIRNAME "arm"
  1144. #define ARCHCONFIG "-DARMV6 " \
  1145. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
  1146. "-DL2_SIZE=524288 -DL2_LINESIZE=32 " \
  1147. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 " \
  1148. "-DHAVE_VFP"
  1149. #define LIBNAME "armv6"
  1150. #define CORENAME "ARMV6"
  1151. #else
  1152. #endif
  1153. #ifdef FORCE_ARMV5
  1154. #define FORCE
  1155. #define ARCHITECTURE "ARM"
  1156. #define SUBARCHITECTURE "ARMV5"
  1157. #define SUBDIRNAME "arm"
  1158. #define ARCHCONFIG "-DARMV5 " \
  1159. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
  1160. "-DL2_SIZE=524288 -DL2_LINESIZE=32 " \
  1161. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 "
  1162. #define LIBNAME "armv5"
  1163. #define CORENAME "ARMV5"
  1164. #else
  1165. #endif
  1166. #ifdef FORCE_ARMV8SVE
  1167. #define FORCE
  1168. #define ARCHITECTURE "ARM64"
  1169. #define SUBARCHITECTURE "ARMV8SVE"
  1170. #define SUBDIRNAME "arm64"
  1171. #define ARCHCONFIG "-DARMV8SVE " \
  1172. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
  1173. "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
  1174. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=32 " \
  1175. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DHAVE_SVE -DARMV8"
  1176. #define LIBNAME "armv8sve"
  1177. #define CORENAME "ARMV8SVE"
  1178. #endif
  1179. #ifdef FORCE_ARMV8
  1180. #define FORCE
  1181. #define ARCHITECTURE "ARM64"
  1182. #define SUBARCHITECTURE "ARMV8"
  1183. #define SUBDIRNAME "arm64"
  1184. #define ARCHCONFIG "-DARMV8 " \
  1185. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
  1186. "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
  1187. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=32 " \
  1188. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
  1189. #define LIBNAME "armv8"
  1190. #define CORENAME "ARMV8"
  1191. #endif
  1192. #ifdef FORCE_CORTEXA53
  1193. #define FORCE
  1194. #define ARCHITECTURE "ARM64"
  1195. #define SUBARCHITECTURE "CORTEXA53"
  1196. #define SUBDIRNAME "arm64"
  1197. #define ARCHCONFIG "-DCORTEXA53 " \
  1198. "-DL1_CODE_SIZE=32768 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=3 " \
  1199. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=2 " \
  1200. "-DL2_SIZE=262144 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
  1201. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
  1202. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
  1203. #define LIBNAME "cortexa53"
  1204. #define CORENAME "CORTEXA53"
  1205. #endif
  1206. #ifdef FORCE_CORTEXA57
  1207. #define FORCE
  1208. #define ARCHITECTURE "ARM64"
  1209. #define SUBARCHITECTURE "CORTEXA57"
  1210. #define SUBDIRNAME "arm64"
  1211. #define ARCHCONFIG "-DCORTEXA57 " \
  1212. "-DL1_CODE_SIZE=49152 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=3 " \
  1213. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=2 " \
  1214. "-DL2_SIZE=2097152 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
  1215. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
  1216. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
  1217. #define LIBNAME "cortexa57"
  1218. #define CORENAME "CORTEXA57"
  1219. #endif
  1220. #ifdef FORCE_CORTEXA72
  1221. #define FORCE
  1222. #define ARCHITECTURE "ARM64"
  1223. #define SUBARCHITECTURE "CORTEXA72"
  1224. #define SUBDIRNAME "arm64"
  1225. #define ARCHCONFIG "-DCORTEXA72 " \
  1226. "-DL1_CODE_SIZE=49152 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=3 " \
  1227. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=2 " \
  1228. "-DL2_SIZE=2097152 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
  1229. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
  1230. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
  1231. #define LIBNAME "cortexa72"
  1232. #define CORENAME "CORTEXA72"
  1233. #endif
  1234. #ifdef FORCE_CORTEXA73
  1235. #define FORCE
  1236. #define ARCHITECTURE "ARM64"
  1237. #define SUBARCHITECTURE "CORTEXA73"
  1238. #define SUBDIRNAME "arm64"
  1239. #define ARCHCONFIG "-DCORTEXA73 " \
  1240. "-DL1_CODE_SIZE=49152 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=3 " \
  1241. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=2 " \
  1242. "-DL2_SIZE=2097152 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
  1243. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
  1244. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
  1245. #define LIBNAME "cortexa73"
  1246. #define CORENAME "CORTEXA73"
  1247. #endif
  1248. #ifdef FORCE_CORTEXA76
  1249. #define FORCE
  1250. #define ARCHITECTURE "ARM64"
  1251. #define SUBARCHITECTURE "CORTEXA76"
  1252. #define SUBDIRNAME "arm64"
  1253. #define ARCHCONFIG "-DCORTEXA76 " \
  1254. "-DL1_CODE_SIZE=49152 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=3 " \
  1255. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=2 " \
  1256. "-DL2_SIZE=2097152 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
  1257. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
  1258. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
  1259. #define LIBNAME "cortexa76"
  1260. #define CORENAME "CORTEXA76"
  1261. #endif
  1262. #ifdef FORCE_CORTEXX1
  1263. #define FORCE
  1264. #define ARCHITECTURE "ARM64"
  1265. #define SUBARCHITECTURE "CORTEXX1"
  1266. #define SUBDIRNAME "arm64"
  1267. #define ARCHCONFIG "-DCORTEXX1 " \
  1268. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
  1269. "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
  1270. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=32 " \
  1271. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
  1272. #define LIBNAME "cortexx1"
  1273. #define CORENAME "CORTEXX1"
  1274. #endif
  1275. #ifdef FORCE_CORTEXX2
  1276. #define FORCE
  1277. #define ARCHITECTURE "ARM64"
  1278. #define SUBARCHITECTURE "CORTEXX2"
  1279. #define SUBDIRNAME "arm64"
  1280. #define ARCHCONFIG "-DCORTEXX2 " \
  1281. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
  1282. "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
  1283. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=32 " \
  1284. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DHAVE_SVE -DARMV8 -DARMV9"
  1285. #define LIBNAME "cortexx2"
  1286. #define CORENAME "CORTEXX2"
  1287. #endif
  1288. #ifdef FORCE_CORTEXA510
  1289. #define FORCE
  1290. #define ARCHITECTURE "ARM64"
  1291. #define SUBARCHITECTURE "CORTEXA510"
  1292. #define SUBDIRNAME "arm64"
  1293. #define ARCHCONFIG "-DCORTEXA510 " \
  1294. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
  1295. "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
  1296. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=32 " \
  1297. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DHAVE_SVE -DARMV8 -DARMV9"
  1298. #define LIBNAME "cortexa510"
  1299. #define CORENAME "CORTEXA510"
  1300. #endif
  1301. #ifdef FORCE_CORTEXA710
  1302. #define FORCE
  1303. #define ARCHITECTURE "ARM64"
  1304. #define SUBARCHITECTURE "CORTEXA710"
  1305. #define SUBDIRNAME "arm64"
  1306. #define ARCHCONFIG "-DCORTEXA710 " \
  1307. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
  1308. "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
  1309. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=32 " \
  1310. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DHAVE_SVE -DARMV8 -DARMV9"
  1311. #define LIBNAME "cortexa710"
  1312. #define CORENAME "CORTEXA710"
  1313. #endif
  1314. #ifdef FORCE_NEOVERSEN1
  1315. #define FORCE
  1316. #define ARCHITECTURE "ARM64"
  1317. #define SUBARCHITECTURE "NEOVERSEN1"
  1318. #define SUBDIRNAME "arm64"
  1319. #define ARCHCONFIG "-DNEOVERSEN1 " \
  1320. "-DL1_CODE_SIZE=65536 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=4 " \
  1321. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=4 " \
  1322. "-DL2_SIZE=1048576 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
  1323. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
  1324. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8 " \
  1325. "-march=armv8.2-a -mtune=neoverse-n1"
  1326. #define LIBNAME "neoversen1"
  1327. #define CORENAME "NEOVERSEN1"
  1328. #endif
  1329. #ifdef FORCE_NEOVERSEV1
  1330. #define FORCE
  1331. #define ARCHITECTURE "ARM64"
  1332. #define SUBARCHITECTURE "NEOVERSEV1"
  1333. #define SUBDIRNAME "arm64"
  1334. #define ARCHCONFIG "-DNEOVERSEV1 " \
  1335. "-DL1_CODE_SIZE=65536 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=4 " \
  1336. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=4 " \
  1337. "-DL2_SIZE=1048576 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
  1338. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
  1339. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DHAVE_SVE -DARMV8 " \
  1340. "-march=armv8.4-a+sve -mtune=neoverse-v1"
  1341. #define LIBNAME "neoversev1"
  1342. #define CORENAME "NEOVERSEV1"
  1343. #endif
  1344. #ifdef FORCE_NEOVERSEN2
  1345. #define FORCE
  1346. #define ARCHITECTURE "ARM64"
  1347. #define SUBARCHITECTURE "NEOVERSEN2"
  1348. #define SUBDIRNAME "arm64"
  1349. #define ARCHCONFIG "-DNEOVERSEN2 " \
  1350. "-DL1_CODE_SIZE=65536 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=4 " \
  1351. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=4 " \
  1352. "-DL2_SIZE=1048576 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
  1353. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
  1354. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DHAVE_SVE -DARMV8 " \
  1355. "-march=armv8.5-a -mtune=neoverse-n2"
  1356. #define LIBNAME "neoversen2"
  1357. #define CORENAME "NEOVERSEN2"
  1358. #endif
  1359. #ifdef FORCE_CORTEXA55
  1360. #define FORCE
  1361. #define ARCHITECTURE "ARM64"
  1362. #define SUBARCHITECTURE "CORTEXA55"
  1363. #define SUBDIRNAME "arm64"
  1364. #define ARCHCONFIG "-DCORTEXA55 " \
  1365. "-DL1_CODE_SIZE=16384 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=3 " \
  1366. "-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=2 " \
  1367. "-DL2_SIZE=65536 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
  1368. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
  1369. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
  1370. #define LIBNAME "cortexa55"
  1371. #define CORENAME "CORTEXA55"
  1372. #endif
  1373. #ifdef FORCE_FALKOR
  1374. #define FORCE
  1375. #define ARCHITECTURE "ARM64"
  1376. #define SUBARCHITECTURE "FALKOR"
  1377. #define SUBDIRNAME "arm64"
  1378. #define ARCHCONFIG "-DFALKOR " \
  1379. "-DL1_CODE_SIZE=49152 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=3 " \
  1380. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=2 " \
  1381. "-DL2_SIZE=2097152 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=16 " \
  1382. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
  1383. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
  1384. #define LIBNAME "falkor"
  1385. #define CORENAME "FALKOR"
  1386. #endif
  1387. #ifdef FORCE_THUNDERX
  1388. #define FORCE
  1389. #define ARCHITECTURE "ARM64"
  1390. #define SUBARCHITECTURE "THUNDERX"
  1391. #define SUBDIRNAME "arm64"
  1392. #define ARCHCONFIG "-DTHUNDERX " \
  1393. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=128 " \
  1394. "-DL2_SIZE=16777216 -DL2_LINESIZE=128 -DL2_ASSOCIATIVE=16 " \
  1395. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
  1396. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
  1397. #define LIBNAME "thunderx"
  1398. #define CORENAME "THUNDERX"
  1399. #endif
  1400. #ifdef FORCE_THUNDERX2T99
  1401. #define ARMV8
  1402. #define FORCE
  1403. #define ARCHITECTURE "ARM64"
  1404. #define SUBARCHITECTURE "THUNDERX2T99"
  1405. #define SUBDIRNAME "arm64"
  1406. #define ARCHCONFIG "-DTHUNDERX2T99 " \
  1407. "-DL1_CODE_SIZE=32768 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=8 " \
  1408. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=8 " \
  1409. "-DL2_SIZE=262144 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=8 " \
  1410. "-DL3_SIZE=33554432 -DL3_LINESIZE=64 -DL3_ASSOCIATIVE=32 " \
  1411. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
  1412. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
  1413. #define LIBNAME "thunderx2t99"
  1414. #define CORENAME "THUNDERX2T99"
  1415. #endif
  1416. #ifdef FORCE_TSV110
  1417. #define FORCE
  1418. #define ARCHITECTURE "ARM64"
  1419. #define SUBARCHITECTURE "TSV110"
  1420. #define SUBDIRNAME "arm64"
  1421. #define ARCHCONFIG "-DTSV110 " \
  1422. "-DL1_CODE_SIZE=65536 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=4 " \
  1423. "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=4 " \
  1424. "-DL2_SIZE=524288 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=8 " \
  1425. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
  1426. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
  1427. #define LIBNAME "tsv110"
  1428. #define CORENAME "TSV110"
  1429. #endif
  1430. #ifdef FORCE_EMAG8180
  1431. #define ARMV8
  1432. #define FORCE
  1433. #define ARCHITECTURE "ARM64"
  1434. #define SUBARCHITECTURE "EMAG8180"
  1435. #define SUBDIRNAME "arm64"
  1436. #define ARCHCONFIG "-DEMAG8180 " \
  1437. "-DL1_CODE_SIZE=32768 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=8 " \
  1438. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=8 " \
  1439. "-DL2_SIZE=262144 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=8 " \
  1440. "-DL3_SIZE=33554432 -DL3_LINESIZE=64 -DL3_ASSOCIATIVE=32 " \
  1441. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
  1442. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
  1443. #define LIBNAME "emag8180"
  1444. #define CORENAME "EMAG8180"
  1445. #endif
  1446. #ifdef FORCE_THUNDERX3T110
  1447. #define ARMV8
  1448. #define FORCE
  1449. #define ARCHITECTURE "ARM64"
  1450. #define SUBARCHITECTURE "THUNDERX3T110"
  1451. #define SUBDIRNAME "arm64"
  1452. #define ARCHCONFIG "-DTHUNDERX3T110 " \
  1453. "-DL1_CODE_SIZE=65536 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=8 " \
  1454. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=8 " \
  1455. "-DL2_SIZE=524288 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=8 " \
  1456. "-DL3_SIZE=94371840 -DL3_LINESIZE=64 -DL3_ASSOCIATIVE=32 " \
  1457. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
  1458. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
  1459. #define LIBNAME "thunderx3t110"
  1460. #define CORENAME "THUNDERX3T110"
  1461. #endif
  1462. #ifdef FORCE_VORTEX
  1463. #define FORCE
  1464. #define ARCHITECTURE "ARM64"
  1465. #define SUBARCHITECTURE "VORTEX"
  1466. #define SUBDIRNAME "arm64"
  1467. #define ARCHCONFIG "-DVORTEX " \
  1468. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
  1469. "-DL2_SIZE=262144 -DL2_LINESIZE=64 " \
  1470. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=32 " \
  1471. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
  1472. #define LIBNAME "vortex"
  1473. #define CORENAME "VORTEX"
  1474. #endif
  1475. #ifdef FORCE_A64FX
  1476. #define ARMV8
  1477. #define FORCE
  1478. #define ARCHITECTURE "ARM64"
  1479. #define SUBARCHITECTURE "A64FX"
  1480. #define SUBDIRNAME "arm64"
  1481. #define ARCHCONFIG "-DA64FX " \
  1482. "-DL1_CODE_SIZE=65536 -DL1_CODE_LINESIZE=256 -DL1_CODE_ASSOCIATIVE=8 " \
  1483. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=256 -DL1_DATA_ASSOCIATIVE=8 " \
  1484. "-DL2_SIZE=8388608 -DL2_LINESIZE=256 -DL2_ASSOCIATIVE=8 " \
  1485. "-DL3_SIZE=0 -DL3_LINESIZE=0 -DL3_ASSOCIATIVE=0 " \
  1486. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
  1487. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DHAVE_SVE -DARMV8"
  1488. #define LIBNAME "a64fx"
  1489. #define CORENAME "A64FX"
  1490. #endif
  1491. #ifdef FORCE_FT2000
  1492. #define ARMV8
  1493. #define FORCE
  1494. #define ARCHITECTURE "ARM64"
  1495. #define SUBARCHITECTURE "FT2000"
  1496. #define SUBDIRNAME "arm64"
  1497. #define ARCHCONFIG "-DFT2000 " \
  1498. "-DL1_CODE_SIZE=32768 -DL1_CODE_LINESIZE=64 -DL1_CODE_ASSOCIATIVE=8 " \
  1499. "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 -DL1_DATA_ASSOCIATIVE=8 " \
  1500. "-DL2_SIZE=33554426 -DL2_LINESIZE=64 -DL2_ASSOCIATIVE=8 " \
  1501. "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 " \
  1502. "-DHAVE_VFPV4 -DHAVE_VFPV3 -DHAVE_VFP -DHAVE_NEON -DARMV8"
  1503. #define LIBNAME "ft2000"
  1504. #define CORENAME "FT2000"
  1505. #endif
#ifdef FORCE_ZARCH_GENERIC
#define FORCE
#define ARCHITECTURE "ZARCH"
#define SUBARCHITECTURE "ZARCH_GENERIC"
#define ARCHCONFIG "-DZARCH_GENERIC " \
    "-DDTB_DEFAULT_ENTRIES=64"
#define LIBNAME "zarch_generic"
#define CORENAME "ZARCH_GENERIC"
#endif
#ifdef FORCE_Z13
#define FORCE
#define ARCHITECTURE "ZARCH"
#define SUBARCHITECTURE "Z13"
#define ARCHCONFIG "-DZ13 " \
    "-DDTB_DEFAULT_ENTRIES=64"
#define LIBNAME "z13"
#define CORENAME "Z13"
#endif
#ifdef FORCE_Z14
#define FORCE
#define ARCHITECTURE "ZARCH"
#define SUBARCHITECTURE "Z14"
#define ARCHCONFIG "-DZ14 " \
    "-DDTB_DEFAULT_ENTRIES=64"
#define LIBNAME "z14"
#define CORENAME "Z14"
#endif
#ifdef FORCE_EV4
#define FORCE
#define ARCHITECTURE "ALPHA"
#define SUBARCHITECTURE "ev4"
#define ARCHCONFIG "-DEV4 " \
    "-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=2097152 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=32 -DDTB_SIZE=8192 "
#define LIBNAME "ev4"
#define CORENAME "EV4"
#endif
#ifdef FORCE_EV5
#define FORCE
#define ARCHITECTURE "ALPHA"
#define SUBARCHITECTURE "ev5"
#define ARCHCONFIG "-DEV5 " \
    "-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=2097152 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=8192 "
#define LIBNAME "ev5"
#define CORENAME "EV5"
#endif
#ifdef FORCE_EV6
#define FORCE
#define ARCHITECTURE "ALPHA"
#define SUBARCHITECTURE "ev6"
#define ARCHCONFIG "-DEV6 " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=4194304 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=8192 "
#define LIBNAME "ev6"
#define CORENAME "EV6"
#endif
#ifdef FORCE_C910V
#define FORCE
#define ARCHITECTURE "RISCV64"
#ifdef NO_RV64GV
#define SUBARCHITECTURE "RISCV64_GENERIC"
#define SUBDIRNAME "riscv64"
#define ARCHCONFIG "-DRISCV64_GENERIC " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 "
#define LIBNAME "riscv64_generic"
#define CORENAME "RISCV64_GENERIC"
#else
#define SUBARCHITECTURE "C910V"
#define SUBDIRNAME "riscv64"
#define ARCHCONFIG "-DC910V " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 "
#define LIBNAME "c910v"
#define CORENAME "C910V"
#endif
#endif
#ifdef FORCE_x280
#define FORCE
#define ARCHITECTURE "RISCV64"
#define SUBARCHITECTURE "x280"
#define SUBDIRNAME "riscv64"
#define ARCHCONFIG "-Dx280 " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 "
#define LIBNAME "x280"
#define CORENAME "x280"
#endif
#ifdef FORCE_RISCV64_ZVL256B
#define FORCE
#define ARCHITECTURE "RISCV64"
#define SUBARCHITECTURE "RISCV64_ZVL256B"
#define SUBDIRNAME "riscv64"
#define ARCHCONFIG "-DRISCV64_ZVL256B " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=262144 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 "
#define LIBNAME "riscv64_zvl256b"
#define CORENAME "RISCV64_ZVL256B"
#endif
#ifdef FORCE_RISCV64_ZVL128B
#define FORCE
#define ARCHITECTURE "RISCV64"
#define SUBARCHITECTURE "RISCV64_ZVL128B"
#define SUBDIRNAME "riscv64"
#define ARCHCONFIG "-DRISCV64_ZVL128B " \
    "-DL1_DATA_SIZE=32768 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=1048576 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=128 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=4 "
#define LIBNAME "riscv64_zvl128b"
#define CORENAME "RISCV64_ZVL128B"
#endif
#if defined(FORCE_E2K) || defined(__e2k__)
#define FORCE
#define ARCHITECTURE "E2K"
#define ARCHCONFIG "-DGENERIC " \
    "-DL1_DATA_SIZE=16384 -DL1_DATA_LINESIZE=64 " \
    "-DL2_SIZE=524288 -DL2_LINESIZE=64 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "generic"
#define CORENAME "generic"
#endif
#ifdef FORCE_CSKY
#define FORCE
#define ARCHITECTURE "CSKY"
#define SUBARCHITECTURE "CSKY"
#define SUBDIRNAME "csky"
/* The trailing space after -DCSKY keeps it separate from the next flag
   once the string literals are concatenated. */
#define ARCHCONFIG "-DCSKY " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=524288 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "csky"
#define CORENAME "CSKY"
#endif
#ifdef FORCE_CK860FV
#define FORCE
#define ARCHITECTURE "CSKY"
#define SUBARCHITECTURE "CK860FV"
#define SUBDIRNAME "csky"
#define ARCHCONFIG "-DCK860FV " \
    "-DL1_DATA_SIZE=65536 -DL1_DATA_LINESIZE=32 " \
    "-DL2_SIZE=524288 -DL2_LINESIZE=32 " \
    "-DDTB_DEFAULT_ENTRIES=64 -DDTB_SIZE=4096 -DL2_ASSOCIATIVE=8 "
#define LIBNAME "ck860fv"
#define CORENAME "CK860FV"
#endif
/* No FORCE_* target matched: fall back to run-time CPU detection. */
#ifndef FORCE
#ifdef USER_TARGET
#error "The TARGET specified on the command line or in Makefile.rule is not supported. Please choose a target from TargetList.txt"
#endif
#if defined(__powerpc__) || defined(__powerpc) || defined(powerpc) || \
    defined(__PPC__) || defined(PPC) || defined(_POWER) || defined(__POWERPC__)
#ifndef POWER
#define POWER
#endif
#define OPENBLAS_SUPPORTED
#endif
#if defined(__zarch__) || defined(__s390x__)
#define ZARCH
#include "cpuid_zarch.c"
#define OPENBLAS_SUPPORTED
#endif
#ifdef INTEL_AMD
#include "cpuid_x86.c"
#define OPENBLAS_SUPPORTED
#endif
#ifdef __ia64__
#include "cpuid_ia64.c"
#define OPENBLAS_SUPPORTED
#endif
#ifdef __alpha
#include "cpuid_alpha.c"
#define OPENBLAS_SUPPORTED
#endif
#ifdef POWER
#include "cpuid_power.c"
#define OPENBLAS_SUPPORTED
#endif
#ifdef sparc
#include "cpuid_sparc.c"
#define OPENBLAS_SUPPORTED
#endif
#ifdef __mips__
#ifdef __mips64
#include "cpuid_mips64.c"
#else
#include "cpuid_mips.c"
#endif
#define OPENBLAS_SUPPORTED
#endif
#ifdef __loongarch64
#include "cpuid_loongarch64.c"
#define OPENBLAS_SUPPORTED
#endif
#ifdef __riscv
#include "cpuid_riscv64.c"
#define OPENBLAS_SUPPORTED
#endif
#ifdef __arm__
#include "cpuid_arm.c"
#define OPENBLAS_SUPPORTED
#endif
#ifdef __aarch64__
#include "cpuid_arm64.c"
#define OPENBLAS_SUPPORTED
#endif
#ifndef OPENBLAS_SUPPORTED
#error "This arch/CPU is not supported by OpenBLAS."
#endif
#endif /* !FORCE */
/* Returns the number of CPU cores, or 2 when it cannot be determined. */
static int get_num_cores(void) {
  int count;
#ifdef OS_WINDOWS
  SYSTEM_INFO sysinfo;
#elif defined(__FreeBSD__) || defined(__OpenBSD__) || defined(__NetBSD__) || defined(__DragonFly__) || defined(__APPLE__)
  int m[2];
  size_t len;
#endif
#if defined(linux) || defined(__sun__)
  /* Number of configured processors; this may exceed the number currently online. */
  count = sysconf(_SC_NPROCESSORS_CONF);
  if (count <= 0) count = 2;
  return count;
#elif defined(OS_WINDOWS)
  GetSystemInfo(&sysinfo);
  return sysinfo.dwNumberOfProcessors;
#elif defined(__FreeBSD__) || defined(__OpenBSD__) || defined(__NetBSD__) || defined(__DragonFly__) || defined(__APPLE__)
  m[0] = CTL_HW;
  m[1] = HW_NCPU;
  len = sizeof(int);
  /* Fall back to 2 if the query fails, so count is never read uninitialized. */
  if (sysctl(m, 2, &count, &len, NULL, 0) != 0 || count <= 0) count = 2;
  return count;
#elif defined(_AIX)
  /* Number of processors currently online. */
  count = sysconf(_SC_NPROCESSORS_ONLN);
  if (count <= 0) count = 2;
  return count;
#else
  return 2;
#endif
}
int main(int argc, char *argv[]){
#ifdef FORCE
  char buffer[8192], *p, *q;
  int length;
#endif
  if (argc == 1) return 0;
  switch (argv[1][0]) {
  case '0' : /* for Makefile */
#ifdef FORCE
    printf("CORE=%s\n", CORENAME);
#else
#if defined(INTEL_AMD) || defined(POWER) || defined(__mips__) || defined(__arm__) || defined(__aarch64__) || defined(ZARCH) || defined(sparc) || defined(__loongarch__) || defined(__riscv) || defined(__alpha__) || defined(__csky__)
    printf("CORE=%s\n", get_corename());
#endif
#endif
#ifdef FORCE
    printf("LIBCORE=%s\n", LIBNAME);
#else
    printf("LIBCORE=");
    get_libname();
    printf("\n");
#endif
    printf("NUM_CORES=%d\n", get_num_cores());
#if defined(__arm__)
#if !defined(FORCE)
    fprintf(stderr,"get features!\n");
    get_features();
#else
    /* Forced target: emit NAME=VALUE (or NAME=1) for each -DH... flag
       (the HAVE_* features) in ARCHCONFIG; all other -D flags are skipped. */
    fprintf(stderr,"split archconfig!\n");
    sprintf(buffer, "%s", ARCHCONFIG);
    p = &buffer[0];
    while (*p) {
      if ((*p == '-') && (*(p + 1) == 'D')) {
        p += 2;
        if (*p != 'H') {
          /* Not a HAVE_* flag: skip ahead to the end of this token. */
          while( (*p != ' ') && (*p != '-') && (*p != '\0') && (*p != '\n')) {p++; }
          if (*p == '-') continue;
        }
        while ((*p != ' ') && (*p != '\0')) {
          if (*p == '=') {
            printf("=");
            p ++;
            while ((*p != ' ') && (*p != '\0')) {
              printf("%c", *p);
              p ++;
            }
          } else {
            printf("%c", *p);
            p ++;
            if ((*p == ' ') || (*p =='\0')) printf("=1\n");
          }
        }
      } else p ++;
    }
#endif
#endif
#ifdef INTEL_AMD
#ifndef FORCE
    get_sse();
#else
    /* Forced target: emit every -D flag in ARCHCONFIG as NAME=VALUE,
       or NAME=1 when the flag carries no value. */
    sprintf(buffer, "%s", ARCHCONFIG);
    p = &buffer[0];
    while (*p) {
      if ((*p == '-') && (*(p + 1) == 'D')) {
        p += 2;
        while ((*p != ' ') && (*p != '\0')) {
          if (*p == '=') {
            printf("=");
            p ++;
            while ((*p != ' ') && (*p != '\0')) {
              printf("%c", *p);
              p ++;
            }
          } else {
            printf("%c", *p);
            p ++;
            if ((*p == ' ') || (*p =='\0')) printf("=1");
          }
        }
        printf("\n");
      } else p ++;
    }
#endif
#endif
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    printf("__BYTE_ORDER__=__ORDER_BIG_ENDIAN__\n");
#elif defined(__BIG_ENDIAN__) && __BIG_ENDIAN__ > 0
    printf("__BYTE_ORDER__=__ORDER_BIG_ENDIAN__\n");
#endif
#if defined(_CALL_ELF) && (_CALL_ELF == 2)
    printf("ELF_VERSION=2\n");
#endif
#ifdef MAKE_NB_JOBS
#if MAKE_NB_JOBS > 0
    printf("MAKEFLAGS += -j %d\n", MAKE_NB_JOBS);
#else
    // Let make use the parent's -j argument, or -j1 if there
    // is no parent make
#endif
#elif NO_PARALLEL_MAKE==1
    printf("MAKEFLAGS += -j 1\n");
#else
    printf("MAKEFLAGS += -j %d\n", get_num_cores());
#endif
    break;
  case '1' : /* For config.h */
#ifdef FORCE
    /* Turn each -DNAME[=VALUE] in ARCHCONFIG into "#define NAME [VALUE]". */
    sprintf(buffer, "%s -DCORE_%s\n", ARCHCONFIG, CORENAME);
    p = &buffer[0];
    while (*p) {
      if ((*p == '-') && (*(p + 1) == 'D')) {
        p += 2;
        printf("#define ");
        while ((*p != ' ') && (*p != '\0')) {
          if (*p == '=') {
            printf(" ");
            p ++;
            while ((*p != ' ') && (*p != '\0')) {
              printf("%c", *p);
              p ++;
            }
          } else {
            if (*p != '\n')
              printf("%c", *p);
            p ++;
          }
        }
        printf("\n");
      } else p ++;
    }
#else
    get_cpuconfig();
#endif
#ifdef FORCE
    printf("#define CHAR_CORENAME \"%s\"\n", CORENAME);
#else
#if defined(INTEL_AMD) || defined(POWER) || defined(__mips__) || defined(__arm__) || defined(__aarch64__) || defined(ZARCH) || defined(sparc) || defined(__loongarch__) || defined(__riscv) || defined(__csky__)
    printf("#define CHAR_CORENAME \"%s\"\n", get_corename());
#endif
#endif
    break;
  case '2' : /* SMP */
    if (get_num_cores() > 1) printf("SMP=1\n");
    break;
  }
  fflush(stdout);
  return 0;
}