[llvm] [AArch64] Improve host feature detection. (PR #160410)

via llvm-commits llvm-commits at lists.llvm.org
Sun Oct 19 21:03:21 PDT 2025


StDymphna wrote:

@efriedma-quic  Sorry for the delay.

So, it did actually fix the ORC JIT example I gave when originally reporting the issue! However, clang itself is still improperly enabling SVE. As before, either of
`./bin/clang++ -mcpu=native -print-enabled-extensions`
or
`printf '' | clang -march=native -x c -E -### -`
still shows SVE as enabled (as shown previously).

llvm-project-21.1.3.src/llvm/lib/TargetParser/Host.cpp:1505

<details>
<summary>GDB CPU_INFO Content dump</summary>

```
1505      StringRef Content = P ? P->getBuffer() : "";
(gdb) n
1506      return detail::getHostCPUNameForARM(Content);

(gdb) info locals

P = {__ptr_ = {<std::__ndk1::__compressed_pair_elem<llvm::MemoryBuffer*, 0, false>> = {             __value_ = 0xb400007fded82c00}, <std::__ndk1::__compressed_pair_elem<std::__ndk1::default_delete<llvm::MemoryBuffer>, 1, true>> = {<std::__ndk1::default_delete<llvm::MemoryBuffer>> = {<No data fields>}, <No data fields>}, <No data fields>}}
 Content = {static npos = 18446744073709551615,
  Data = 0xb400007fded82c30 "processor\t: 0\nBogoMIPS\t: 38.40\nFeatures\t: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp flagm2 frint i8mm bf16 bti\nCPU implementer\t: 0x41\nCPU architecture: 8\nCPU variant\t: 0x1\nCPU part\t: 0xd46\nCPU revision\t: 1\n\nprocessor\t: 1\nBogoMIPS\t: 38.40\nFeatures\t: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp flagm2 frint i8mm bf16 bti\nCPU implementer\t: 0x41\nCPU architecture: 8\nCPU variant\t: 0x1\nCPU part\t: 0xd46\nCPU revision\t: 1\n\nprocessor\t: 2\nBogoMIPS\t: 38.40\nFeatures\t: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp flagm2 frint i8mm bf16 bti\nCPU implementer\t: 0x41\nCPU architecture: 8\nCPU variant\t: 0x1\nCPU part\t: 0xd46\nCPU revision\t: 1\n\nprocessor\t: 3\nBogoMIPS\t: 38.40\nFeatures\t: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp flagm2 frint i8mm bf16 bti\nCPU implementer\t: 0x41\nCPU architecture: 8\nCPU variant\t: 0x1\nCPU part\t: 0xd4d\nCPU revision\t: 0\n\nprocessor\t: 4\nBogoMIPS\t: 38.40\nFeatures\t: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp flagm2 frint i8mm bf16 bti\nCPU implementer\t: 0x41\nCPU architecture: 8\nCPU variant\t: 0x1\nCPU part\t: 0xd4d\nCPU revision\t: 0\n\nprocessor\t: 5\nBogoMIPS\t: 38.40\nFeatures\t: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 
sm3 sm4 asimddp sha512 asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp flagm2 frint i8mm bf16 bti\nCPU implementer\t: 0x41\nCPU architecture: 8\nCPU variant\t: 0x2\nCPU part\t: 0xd47\nCPU revision\t: 0\n\nprocessor\t: 6\nBogoMIPS\t: 38.40\nFeatures\t: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp flagm2 frint i8mm bf16 bti\nCPU implementer\t: 0x41\nCPU architecture: 8\nCPU variant\t: 0x2\nCPU part\t: 0xd47\nCPU revision\t: 0\n\nprocessor\t: 7\nBogoMIPS\t: 38.40\nFeatures\t: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp flagm2 frint i8mm bf16 bti\nCPU implementer\t: 0x41\nCPU architecture: 8\nCPU variant\t: 0x1\nCPU part\t: 0xd4e\nCPU revision\t: 0\n\n", Length = 2776}
```

</details>

As far as I can tell, it never even calls getHostCPUFeatures. I set line and function based breakpoints, and getProcCpuinfoContent was the only one that ever triggered. I stepped through a fair amount of the code as well.
<details>
<summary>GDB getProcCpuinfoContent Breakpoint and getHostCPUFeatures Traceback</summary>

```
Breakpoint 2, getProcCpuinfoContent ()
 at .../llvm-project-21.1.3.src/llvm/lib/TargetParser/Host.cpp:73
 73        if (const char *CpuinfoIntercept = std::getenv("LLVM_CPUINFO"))
 (gdb) bt
 #0 getProcCpuinfoContent()
 at .../llvm-project-21.1.3.src/llvm/lib/TargetParser/Host.cpp:73
 #1 0x0000007ff4b83d18 in llvm::sys::getHostCPUName()
 at .../llvm-project-21.1.3.src/llvm/lib/TargetParser/Host.cpp:1504
 #2 0x0000007feb2d8b08 in getAArch64ArchFeaturesFromMarch (D=..., March=..., Args=..., Extensions=...)
 at .../llvm-project-21.1.3.src/clang/lib/Driver/ToolChains/Arch/AArch64.cpp:124
 #3 0x0000007feb2d5898 in clang::driver::tools::aarch64::getAArch64TargetFeatures (D=..., Triple=..., Args=..., Features=..., ForAS=124, ForMultilib=108)
 at .../llvm-project-21.1.3.src/clang/lib/Driver/ToolChains/Arch/AArch64.cpp:195
 #4 0x0000007feb34b36c in clang::driver::tools::getTargetFeatures (D=..., Triple=..., Args=..., CmdArgs=..., ForAS=<optimized out>, IsAux=false)
 at .../llvm-project-21.1.3.src/clang/lib/Driver/ToolChains/CommonArgs.cpp:850
 #5 0x0000007feb3134ac in clang::driver::tools::Clang::RenderTargetOptions (this=this@entry=0xb400007fe18e86e0, EffectiveTriple=..., Args=..., KernelOrKext=<optimized out>, CmdArgs=...)
 at .../llvm-project-21.1.3.src/clang/lib/Driver/ToolChains/Clang.cpp:1548
 #6 0x0000007feb31e984 in clang::driver::tools::Clang::ConstructJob (this=0xb400007fe18e86e0, C=..., JA=..., Output=..., Inputs=..., Args=..., LinkingOutput=<optimized out>)
 at .../llvm-project-21.1.3.src/clang/lib/Driver/ToolChains/Clang.cpp:6070
 #7 0x0000007feb28fa90 in clang::driver::Driver::BuildJobsForActionNoCache (this=<optimized out>, C=..., A=<optimized out>, TC=<optimized out>, BoundArch=..., AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0, CachedResults=..., TargetDeviceOffloadKind=clang::driver::Action::OFK_None)
 at .../llvm-project-21.1.3.src/clang/lib/Driver/Driver.cpp:6102
 #8 0x0000007feb28e1e0 in clang::driver::Driver::BuildJobsForAction (this=this@entry=0x7fffff59c0, C=..., A=0xb400007fe189a180, TC=0xb400007fe180bc00, BoundArch=..., AtTopLevel=<optimized out>, MultipleArchs=false, LinkingOutput=0x0, CachedResults=..., TargetDeviceOffloadKind=clang::driver::Action::OFK_None)
 at .../llvm-project-21.1.3.src/clang/lib/Driver/Driver.cpp:5789
 #9 0x0000007feb282114 in clang::driver::Driver::BuildJobs (this=this@entry=0x7fffff59c0, C=...)
 at .../llvm-project-21.1.3.src/clang/lib/Driver/Driver.cpp:5314
 #10 0x0000007feb27bab8 in clang::driver::Driver::BuildCompilation (this=0x7fffff59c0, ArgList=...)
 at .../llvm-project-21.1.3.src/clang/lib/Driver/Driver.cpp:1846
 #11 0x00000055555633f4 in clang_main (Argc=<optimized out>, Argv=<optimized out>,
 ToolContext=...)
 at .../llvm-project-21.1.3.src/clang/tools/driver/driver.cpp:376
 #12 0x0000005555570308 in main (argc=3, argv=0x7fffff9be8)
 at .../llvm-project-21.1.3.src/build/tools/clang/tools/driver/clang-driver.cpp:17
```
</details>

From a quick look through the code, my guess is that the relevant functions in src/clang/lib/Driver/ToolChains/Arch/AArch64.cpp do no checking of their own and ignore most of the information they get back. addDefaultExts looks only at the CPU name and adds the default extensions that the detected architecture claims to support; it does not seem to check against /proc/cpuinfo or override with negative options when the two disagree.

For example, from llvm/lib/Target/AArch64/AArch64Features.td:
```
def HasV9_0aOps : Architecture64<9, 0, "a", "v9a",
  [HasV8_5aOps],
  !listconcat(HasV8_5aOps.DefaultExts, [FeatureFullFP16, FeatureSVE,
    FeatureSVE2])>;
```
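
To make the suspected gap concrete, here is a minimal standalone sketch (not LLVM code) of the kind of reconciliation that appears to be missing: take the extra defaults that `HasV9_0aOps` adds on top of v8.5-A and turn each one into a `+ext` or `-ext` depending on whether the `Features` line in /proc/cpuinfo actually lists it. The names `parseCpuinfoFeatures`, `ExtMapping`, `V9aExtraDefaults` and `reconcileWithHost` are hypothetical, and the pairing of LLVM extension names with cpuinfo flags (for instance `fullfp16` with `fphp`) is an assumption made for illustration.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Collect the flags from the first "Features" line of /proc/cpuinfo-style
// text. Only the first core is inspected; a real fix might need to look at
// every core on a heterogeneous system.
std::set<std::string> parseCpuinfoFeatures(const std::string &Content) {
  std::set<std::string> Flags;
  std::istringstream Lines(Content);
  std::string Line;
  while (std::getline(Lines, Line)) {
    if (Line.rfind("Features", 0) != 0) // line does not start with "Features"
      continue;
    std::size_t Colon = Line.find(':');
    if (Colon == std::string::npos)
      continue;
    std::istringstream Tokens(Line.substr(Colon + 1));
    std::string Flag;
    while (Tokens >> Flag)
      Flags.insert(Flag);
    break;
  }
  return Flags;
}

// Hypothetical pairing of an LLVM extension name with a cpuinfo flag.
struct ExtMapping {
  const char *LLVMName;
  const char *CpuinfoFlag;
};

// The defaults that HasV9_0aOps adds on top of HasV8_5aOps in the snippet
// above. The cpuinfo flag chosen for each one is an assumption.
const ExtMapping V9aExtraDefaults[] = {
    {"fullfp16", "fphp"}, {"sve", "sve"}, {"sve2", "sve2"}};

// Emit "+ext" when the host reports the flag and "-ext" when it does not,
// so an architecture default can be overridden by what the host reports.
std::vector<std::string>
reconcileWithHost(const std::set<std::string> &HostFlags) {
  std::vector<std::string> Features;
  for (const ExtMapping &M : V9aExtraDefaults) {
    bool Present = HostFlags.count(M.CpuinfoFlag) != 0;
    Features.push_back(std::string(Present ? "+" : "-") + M.LLVMName);
  }
  return Features;
}
```

Fed the `Features` line from the dump above, which lists `fphp` but neither `sve` nor `sve2`, this sketch would keep `fullfp16` and negate the two SVE defaults. Whether the driver should do this in addDefaultExts or rely on getHostCPUFeatures instead is a separate design question.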

I have not looked any further there, but I would not be surprised if other related issues exist.


https://github.com/llvm/llvm-project/pull/160410

