TargetList.txt 1.7 kB

Simplifying ARMv8 build parameters

ARMv8 builds were a bit mixed up, with ThunderX2 code in ARMv8 mode (which is not right, because TX2 is ARMv8.1), and they required a few redundancies in the defines, making it harder to maintain and to understand which core has what. A few other minor issues were also fixed.

Tests were made on the following cores: A53, A57, A72, Falkor, ThunderX, ThunderX2, and XGene. Tests were: OpenBLAS/test, OpenBLAS/benchmark, BLAS-Tester.

A summary:
 * Removed TX2 code from the ARMv8 build, to make sure it is compatible with all ARMv8 cores, not just v8.1. Also, the TX2 code has actually harmed performance on big cores.
 * Commoned up the ARMv8 architectures' defines in params.h, to make sure that all cores benefit from the ARMv8 settings in addition to their own.
 * Added a few more cores, using ARMv8's include strategy, to benefit from compiler optimisations using mtune. Also updated cache information from the manuals, making sure we set good conservative values by default. Removed Vulcan, as it's an alias to TX2.
 * Auto-detecting most of those cores, but also updating the forced compilation in getarch.c, to make sure the parameters are the same whether compiled natively or with a forced arch.

Benefits:
 * The ARMv8 build is now guaranteed to work on all ARMv8 cores.
 * Improved performance for ARMv8 builds on some cores (A72, Falkor, ThunderX1 and 2: up to 11%) over current develop.
 * Improved performance for *all* cores compared to the develop branch before TX2's patch (9% ~ 36%).
 * ThunderX1 builds are 14% faster than ARMv8 builds on TX1, 9% faster than the current develop branch and 8% faster than develop before the TX2 patches.

Issues:
 * Regression from the current develop branch for A53 (-12%) and A57 (-3%) with ARMv8 builds, but still faster than before TX2's commit (+15% and +24% respectively). This can be improved with a simplification of TX2's code, to be done in future patches. At least the code is guaranteed to be ARMv8.0 now.

Comments:
 * CortexA57 builds are unchanged on A57 hardware from the develop branch, which makes sense, as the code is untouched.
 * CortexA72 builds improve over A57 on A72 hardware, even though they use the same includes, due to new compiler tuning in the makefile.
Force Target Examples:

make TARGET=NEHALEM
make TARGET=LOONGSON3A BINARY=64
make TARGET=ISTANBUL
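
The example commands above all follow one pattern: `make`, a `TARGET=<name>` variable and, optionally, a `BINARY=<bits>` variable. As a minimal sketch for scripted builds, the helper below assembles that command line. `make_command` is a hypothetical name introduced only for this illustration, and the restriction of `BINARY` to 32 or 64 is an assumption (only `BINARY=64` appears in the examples).

```python
import shlex


def make_command(target, binary=None):
    """Return the argv for a forced-target build, e.g. ['make', 'TARGET=NEHALEM'].

    Hypothetical helper for illustration; it only builds the command line
    and does not run make.
    """
    # Target names in the list use letters, digits and underscores only.
    if not target or not target.replace("_", "").isalnum():
        raise ValueError(f"suspicious target name: {target!r}")
    argv = ["make", f"TARGET={target}"]
    if binary is not None:
        # Assumption: BINARY selects a 32-bit or 64-bit build.
        if binary not in (32, 64):
            raise ValueError("BINARY must be 32 or 64")
        argv.append(f"BINARY={binary}")
    return argv


if __name__ == "__main__":
    # Reproduce the three example invocations as printable command lines.
    for target, binary in [("NEHALEM", None), ("LOONGSON3A", 64), ("ISTANBUL", None)]:
        print(shlex.join(make_command(target, binary)))
```

The returned list can be passed to `subprocess.run` from the top of the source tree; keeping it as a list rather than a string avoids shell quoting issues.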

Supported List:

1.X86/X86_64
a)Intel CPU:
P2
KATMAI
COPPERMINE
NORTHWOOD
PRESCOTT
BANIAS
YONAH
CORE2
PENRYN
DUNNINGTON
NEHALEM
SANDYBRIDGE
HASWELL
SKYLAKEX
ATOM
COOPERLAKE
SAPPHIRERAPIDS

b)AMD CPU:
ATHLON
OPTERON
OPTERON_SSE3
BARCELONA
SHANGHAI
ISTANBUL
BOBCAT
BULLDOZER
PILEDRIVER
STEAMROLLER
EXCAVATOR
ZEN

c)VIA CPU:
SSE_GENERIC
VIAC3
NANO

2.Power CPU:
POWER4
POWER5
POWER6
POWER7
POWER8
POWER9
POWER10
POWER11
PPCG4
PPC970
PPC970MP
PPC440
PPC440FP2
CELL

3.MIPS CPU:
P5600
MIPS1004K
MIPS24K

4.MIPS64 CPU:
MIPS64_GENERIC
SICORTEX
LOONGSON3A
LOONGSON3B
I6400
P6600
I6500

5.IA64 CPU:
ITANIUM2

6.SPARC CPU:
SPARC
SPARCV7

7.ARM CPU:
CORTEXA15
CORTEXA9
ARMV7
ARMV6
ARMV5

8.ARM 64-bit CPU:
ARMV8
CORTEXA53
CORTEXA57
CORTEXA72
CORTEXA73
CORTEXA76
CORTEXA510
CORTEXA710
CORTEXX1
CORTEXX2
NEOVERSEN1
NEOVERSEV1
NEOVERSEN2
NEOVERSEV2
CORTEXA55
EMAG8180
FALKOR
THUNDERX
THUNDERX2T99
TSV110
THUNDERX3T110
VORTEX
A64FX
ARMV8SVE
ARMV9SME
FT2000

9.System Z:
ZARCH_GENERIC
Z13
Z14

10.RISC-V 64:
RISCV64_GENERIC (e.g. PolarFire SoC/SiFive U54)
RISCV64_ZVL128B
C910V
x280
RISCV64_ZVL256B

11.LOONGARCH64:
// LOONGSONGENERIC/LOONGSON2K1000/LOONGSON3R5 are legacy names,
// and it is recommended to use the more standardized naming conventions
// LA64_GENERIC/LA264/LA464. You can still specify TARGET as
// LOONGSONGENERIC/LOONGSON2K1000/LOONGSON3R5 during compilation or runtime,
// and they will be internally relocated to LA64_GENERIC/LA264/LA464.
LOONGSONGENERIC
LOONGSON2K1000
LOONGSON3R5
LA64_GENERIC
LA264
LA464

12. Elbrus E2000:
E2K

13. Alpha
EV4
EV5
EV6

14.CSKY
CSKY
CK860FV
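
The note in the LOONGARCH64 section says the legacy names are relocated internally to the standardized ones. Reading the two slash-separated lists pairwise, that relocation can be sketched as a simple lookup. This is an illustration only, not the library's own code: `LEGACY_LOONGARCH_NAMES` and `canonical_target` are hypothetical names, and the pairing assumes both lists are written in the same order.

```python
# Legacy LoongArch target names and their standardized replacements,
# paired in the order given in the note (an assumption of this sketch).
LEGACY_LOONGARCH_NAMES = {
    "LOONGSONGENERIC": "LA64_GENERIC",
    "LOONGSON2K1000": "LA264",
    "LOONGSON3R5": "LA464",
}


def canonical_target(name):
    """Map a legacy LoongArch name to its standardized form.

    Any other target name is returned unchanged.
    """
    return LEGACY_LOONGARCH_NAMES.get(name, name)


if __name__ == "__main__":
    for name in ("LOONGSON3R5", "LA464", "NEHALEM"):
        print(name, "->", canonical_target(name))
```

Normalizing names this way before comparing or logging targets keeps build scripts consistent whichever spelling the user passes.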