[llvm] Fixes inlining issue in armv7 (PR #169337)

via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 24 07:17:33 PST 2025


llvmbot wrote:



@llvm/pr-subscribers-backend-arm

Author: Croose (CrooseGit)

<details>
<summary>Changes</summary>

There is an issue on armv7 where a function won't be inlined due to mismatching target features between the caller and the callee.
The caller has `HasV8Ops` and `FeatureDotProd` and the callee does not, but as far as I know this should not block inlining, because the caller only has features the callee lacks, not the other way round.
https://godbolt.org/z/f19h3zT66 is an example showing how the call is not inlined on armv7.
The expected asm output would be something like:
```asm
.fnstart
	vsdot.s8	q0, q1, d4[0]
	bx	lr
.Lfunc_end0:

```
Thanks to @Amichaxx we managed to narrow it down: adding `ARM::FeatureDotProd` and `ARM::HasV8Ops` to `InlineFeaturesAllowed` in llvm/lib/Target/ARM/ARMTargetTransformInfo.h resolves the problem, after which the inlining occurs successfully.
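
To illustrate why adding features to the allow-list changes the outcome, here is a minimal standalone sketch of the check using `std::bitset`. It is not the LLVM code: the feature indices, `makeBits`, and the free function `areInlineCompatible` are hypothetical stand-ins, and the `MatchExact` expression is my understanding of the part of the function that the diff context does not show (features outside the allow-list must match exactly).

```cpp
#include <bitset>
#include <cstddef>
#include <initializer_list>

// Hypothetical feature indices, for illustration only. The real indices are
// generated by TableGen and are different.
enum Feature : std::size_t { NEON, DotProd, V8Ops, ThumbMode, NumFeatures };

using FeatureBits = std::bitset<NumFeatures>;

// Build a bitset with the given features set.
FeatureBits makeBits(std::initializer_list<Feature> Features) {
  FeatureBits Bits;
  for (Feature F : Features)
    Bits.set(F);
  return Bits;
}

// Sketch of the compatibility rule, assuming:
//  - features outside Allowed must be identical in caller and callee;
//  - features inside Allowed only need the callee's set to be a subset of
//    the caller's.
bool areInlineCompatible(const FeatureBits &Caller, const FeatureBits &Callee,
                         const FeatureBits &Allowed) {
  bool MatchExact = (Caller & ~Allowed) == (Callee & ~Allowed);
  bool MatchSubset = ((Caller & Callee) & Allowed) == (Callee & Allowed);
  return MatchExact && MatchSubset;
}
```

With a caller of `{NEON, DotProd, V8Ops}` and a callee of `{NEON}`, this sketch rejects the pair while `DotProd` and `V8Ops` are outside the allow-list, and accepts it once they are added. Swapping caller and callee is still rejected, since the callee would then need features the caller lacks.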

Whilst we're at it, we have also added some debug output to make it easier to tell why a function is or is not being inlined for ARM, and added a couple of other features that seem to be missing from the list (`ARM::FeatureBF16` and `ARM::FeatureSB`).
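
The debug output prints the indices of the feature bits that differ between caller and callee. As a rough standalone sketch of that formatting step, here is a helper that could serve both the caller-only and callee-only lists. The name `formatSetBits` is hypothetical, and it uses `std::bitset` and a string stream rather than LLVM's `FeatureBitset` and `dbgs()`.

```cpp
#include <bitset>
#include <cstddef>
#include <sstream>
#include <string>

// Return the indices of all set bits as a bracketed, comma-separated list,
// e.g. "[1, 2, 5]". An empty bitset yields "[]".
template <std::size_t N>
std::string formatSetBits(const std::bitset<N> &Bits) {
  std::ostringstream OS;
  OS << '[';
  bool First = true;
  for (std::size_t I = 0; I < N; ++I) {
    if (!Bits.test(I))
      continue;
    if (!First)
      OS << ", ";
    OS << I;
    First = false;
  }
  OS << ']';
  return OS.str();
}
```

Calling a helper like this once per bitset would avoid writing the same loop twice, at the cost of building a temporary string.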

This patch was motivated by an issue experienced with Rust that was traced back to LLVM, and it was designed to address that issue.


---
Full diff: https://github.com/llvm/llvm-project/pull/169337.diff


2 Files Affected:

- (modified) llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp (+44) 
- (modified) llvm/lib/Target/ARM/ARMTargetTransformInfo.h (+1) 


``````````diff
diff --git a/llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp b/llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
index d12b802fe234f..89ebc3e715930 100644
--- a/llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
+++ b/llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
@@ -102,6 +102,50 @@ bool ARMTTIImpl::areInlineCompatible(const Function *Caller,
   // the callers'.
   bool MatchSubset = ((CallerBits & CalleeBits) & InlineFeaturesAllowed) ==
                      (CalleeBits & InlineFeaturesAllowed);
+
+  LLVM_DEBUG({
+    dbgs() << "=== Inline compatibility debug ===\n";
+    dbgs() << "Caller: " << Caller->getName() << "\n";
+    dbgs() << "Callee: " << Callee->getName() << "\n";
+
+    // Bit diffs
+    FeatureBitset MissingInCaller = CalleeBits & ~CallerBits; // callee-only
+    FeatureBitset ExtraInCaller   = CallerBits & ~CalleeBits; // caller-only
+
+    // Counts
+    dbgs() << "Only-in-caller bit count: " << ExtraInCaller.count() << "\n";
+    dbgs() << "Only-in-callee bit count: " << MissingInCaller.count() << "\n";
+ 
+    dbgs() << "Only-in-caller feature indices [";
+    {
+      bool First = true;
+      for (size_t I = 0, E = ExtraInCaller.size(); I < E; ++I) {
+        if (ExtraInCaller.test(I)) {
+          if (!First) dbgs() << ", ";
+          dbgs() << I;
+          First = false;
+        }
+      }
+    }
+    dbgs() << "]\n";
+
+    dbgs() << "Only-in-callee feature indices [";
+    {
+      bool First = true;
+      for (size_t I = 0, E = MissingInCaller.size(); I < E; ++I) {
+        if (MissingInCaller.test(I)) {
+          if (!First) dbgs() << ", ";
+          dbgs() << I;
+          First = false;
+        }
+      }
+    }
+    dbgs() << "]\n";
+
+    // Indicies map to features as found in llvm-project/(your_build)/lib/Target/ARM/ARMGenSubtargetInfo.inc
+    dbgs() << "MatchExact="  << (MatchExact  ? "true" : "false")
+           << " MatchSubset=" << (MatchSubset ? "true" : "false") << "\n";
+  }); 
   return MatchExact && MatchSubset;
 }
 
diff --git a/llvm/lib/Target/ARM/ARMTargetTransformInfo.h b/llvm/lib/Target/ARM/ARMTargetTransformInfo.h
index 919a6fc9fd0b0..2ecfce0de9f55 100644
--- a/llvm/lib/Target/ARM/ARMTargetTransformInfo.h
+++ b/llvm/lib/Target/ARM/ARMTargetTransformInfo.h
@@ -70,6 +70,7 @@ class ARMTTIImpl final : public BasicTTIImplBase<ARMTTIImpl> {
   // -thumb-mode in a caller with +thumb-mode, may cause the assembler to
   // fail if the callee uses ARM only instructions, e.g. in inline asm.
   const FeatureBitset InlineFeaturesAllowed = {
+      ARM::FeatureDotProd, ARM::HasV8Ops, ARM::FeatureBF16, ARM::FeatureSB,
       ARM::FeatureVFP2, ARM::FeatureVFP3, ARM::FeatureNEON, ARM::FeatureThumb2,
       ARM::FeatureFP16, ARM::FeatureVFP4, ARM::FeatureFPARMv8,
       ARM::FeatureFullFP16, ARM::FeatureFP16FML, ARM::FeatureHWDivThumb,

``````````

</details>


https://github.com/llvm/llvm-project/pull/169337

