[llvm] Only guard loop metadata that has non-debug info in it (PR #118825)

via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 5 08:00:08 PST 2024


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-llvm-ir

Author: Dominik Steenken (dominik-steenken)

<details>
<summary>Changes</summary>

This PR is motivated by a mismatch we discovered between compilation results with vs. without `-g3`. We noticed this when compiling SPEC2017 testcases. The specific instance we saw is fixed in this PR by modifying a guard (see below), but it is likely similar instances exist elsewhere in the codebase.

The specific case fixed in this PR manifests itself in the `SimplifyCFG` pass doing different things depending on whether DebugInfo is generated or not. At the end of this comment, there is reduced example code that shows the behavior in question.

The differing behavior has two root causes:
1. Commit https://github.com/llvm/llvm-project/commit/c07e19b adds loop metadata including debug locations to loops that otherwise would not have loop metadata
2. Commit https://github.com/llvm/llvm-project/commit/ac28efa6c100 adds a guard to a simplification action in `SImplifyCFG` that prevents it from simplifying away loop metadata

So, the change in 2. does not consider that when compiling with debug symbols, loops that otherwise would not have metadata that needs preserving, now have debug locations in their loop metadata. Thus, with `-g3`, `SimplifyCFG` behaves differently than without it.

The larger issue is that while debug info is not supposed to influence the final compilation result, commits like 1. blur the line between what is and is not debug info, and not all optimization passes account for this.

This PR does not address that and rather just modifies this particular guard in order to restore equivalent behavior between debug and non-debug builds in this one instance.

---

Here is a reduced version of a file from `f526.blender_r` that showcases the behavior in question:
```C
struct LinkNode;
typedef struct LinkNode {
 struct LinkNode *next;
 void *link;
} LinkNode;

void do_projectpaint_thread_ph_v_state() {
  int *ps = do_projectpaint_thread_ph_v_state;
  LinkNode *node;
  while (do_projectpaint_thread_ph_v_state)
    for (node = ps; node; node = node->next)
      ;
}
```
Compiling this with and without DebugInfo, and then disassembling the results, leads to different outcomes (tested on SystemZ and X86). The reason for this is that the `SimplifyCFG` pass does different things in either case.


---
Full diff: https://github.com/llvm/llvm-project/pull/118825.diff


3 Files Affected:

- (modified) llvm/include/llvm/IR/Instruction.h (+4) 
- (modified) llvm/lib/IR/Instruction.cpp (+25) 
- (modified) llvm/lib/Transforms/Utils/Local.cpp (+1-1) 


``````````diff
diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index 730baa8cc00520..ea3d93d856efb7 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -367,6 +367,10 @@ class Instruction : public User,
   /// Return true if this instruction has any metadata attached to it.
   bool hasMetadata() const { return DbgLoc || Value::hasMetadata(); }
 
+  // Return true if this instruction contains loop metadata other than
+  // a debug location
+  bool hasLoopMetadataOtherThanDebugLoc() const;
+
   /// Return true if this instruction has metadata attached to it other than a
   /// debug location.
   bool hasMetadataOtherThanDebugLoc() const { return Value::hasMetadata(); }
diff --git a/llvm/lib/IR/Instruction.cpp b/llvm/lib/IR/Instruction.cpp
index 0137bb281e7ec5..6c19712481ca9a 100644
--- a/llvm/lib/IR/Instruction.cpp
+++ b/llvm/lib/IR/Instruction.cpp
@@ -12,6 +12,7 @@
 
 #include "llvm/IR/Instruction.h"
 #include "llvm/ADT/DenseSet.h"
+#include "llvm/ADT/STLExtras.h"
 #include "llvm/IR/AttributeMask.h"
 #include "llvm/IR/Attributes.h"
 #include "llvm/IR/Constants.h"
@@ -19,6 +20,7 @@
 #include "llvm/IR/Instructions.h"
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/Intrinsics.h"
+#include "llvm/IR/LLVMContext.h"
 #include "llvm/IR/MemoryModelRelaxationAnnotations.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IR/Operator.h"
@@ -461,6 +463,29 @@ bool Instruction::hasPoisonGeneratingMetadata() const {
          hasMetadata(LLVMContext::MD_align);
 }
 
+bool Instruction::hasLoopMetadataOtherThanDebugLoc() const {
+  // If there is no loop metadata at all, we also don't have
+  // non-debug loop metadata, obviously.
+  if  (!hasMetadata(LLVMContext::MD_loop))
+    return false;
+
+  // If we do have loop metadata, retrieve it.
+  MDNode * LoopMD = getMetadata(LLVMContext::MD_loop);
+
+  // Check if the existing operands are debug locations. This loop
+  // should terminate after at most three iterations. Skip
+  // the first item because it is a self-reference.
+  for (const MDOperand& Op : llvm::drop_begin(LoopMD->operands())) {
+    // check for debug location type by attempting a cast.
+    if (!dyn_cast<DILocation>(Op)) {
+      return true;
+    }
+  }
+
+  // If we get here, then all we have is debug locations in the loop metadata.
+  return false;
+}
+
 void Instruction::dropPoisonGeneratingMetadata() {
   eraseMetadata(LLVMContext::MD_range);
   eraseMetadata(LLVMContext::MD_nonnull);
diff --git a/llvm/lib/Transforms/Utils/Local.cpp b/llvm/lib/Transforms/Utils/Local.cpp
index cdc3f0308fe59c..876017e9051067 100644
--- a/llvm/lib/Transforms/Utils/Local.cpp
+++ b/llvm/lib/Transforms/Utils/Local.cpp
@@ -1279,7 +1279,7 @@ bool llvm::TryToSimplifyUncondBranchFromEmptyBlock(BasicBlock *BB,
   // |    for.body <---- (md2)
   // |_______|  |______|
   if (Instruction *TI = BB->getTerminator())
-    if (TI->hasMetadata(LLVMContext::MD_loop))
+    if (TI->hasLoopMetadataOtherThanDebugLoc())
       for (BasicBlock *Pred : predecessors(BB))
         if (Instruction *PredTI = Pred->getTerminator())
           if (PredTI->hasMetadata(LLVMContext::MD_loop))

``````````

</details>


https://github.com/llvm/llvm-project/pull/118825


More information about the llvm-commits mailing list