[PATCH] D145168: Move DBG_VALUE's that depend on loads to after a load if the load is moved due to the pre register allocation ld/st optimization pass

Adrian Prantl via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 3 10:15:41 PST 2023


aprantl added a comment.

Nice! I appreciate the detailed explanation of the algorithm. I have some suggestions for how to streamline it a bit and potentially make it easier to read inside.



================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2590
 
+  // Reschedule DBG_VALs to match any loads that were moved. The issue here is
+  // that there can be a scenario where debug information is lost because of the
----------------
typo: DBG_VALUEs (makes grepping easier)


================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2594
+  // result describes the debug infomation for a local variable gets moved below
+  // the load and that causes the debug information for that load to get lost.
+
----------------
Maybe the entire sentence could be compressed to:

When a load is sunk beyond a DBG_VALUE that is referring to it, the reference to the load in the DBG_VALUE becomes undef, resulting in a loss of debug info.
Example:


================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2607
+  // inst_b
+  // DBG_VALUE %2, "x", ...
+  // %2 = ld ...
----------------
// DBG_VALUE undef, "x"


================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2610
+  // %3 = ld ...
+
+  // The result of this pass would move the load for virtual register %2 that is
----------------
The code below addresses this by moving the DBG_VALUE to the position immediately after the load. Example:


================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2624
+
+  // The way this code works is it defines three maps, the RegisterMap,
+  // LocalVarMap, and InstrMap.
----------------
I would move the block detailing the maps until after introducing the general idea of the algorithm, otherwise the reader won't know what problem they're solving. It might be good to very briefly outline the idea first before diving into the implementation. I.e., what are the condition the algorithm is checking for, what is it trying to avoid and why.


================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2631
+
+  // LocalVarMap: This map is used to track the relationship between a Debug
+  // Variable and the DBG_VAL MachineInstr pointer that describes the debug
----------------
maybe put these comments next to the variable declarations?


================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2638
+
+  // The algorithm to solve this issue first uses the RescheduleOps function to
+  // populate the RegisterMap with the registers that were moved. The keys are
----------------
There's a bunch of filler words that can be dropped without loosing any contents:

The algorithm works in two phases: First RescheduleOps() populates RegisterMap with the registers that were moved.

(Q: The wording is curious because it's loads that define registers that are being moved, maybe we can integrate this)


================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2643
+  // instruction is a DBG_VAL and it describes a valid debug variable we can
+  // move forward. If the first operand of the DBG_VAL is a register and that
+  // register exists in the RegisterMap, we know this is a DBG_VAL that depends
----------------
Can you clarify "we can move forward"?


================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2643
+  // instruction is a DBG_VAL and it describes a valid debug variable we can
+  // move forward. If the first operand of the DBG_VAL is a register and that
+  // register exists in the RegisterMap, we know this is a DBG_VAL that depends
----------------
aprantl wrote:
> Can you clarify "we can move forward"?
If the first operand of the DBG_VAL ...
Could we also handle the variadic DBG_VALUE_LIST instructions?


================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2760
+  E = MBB->end();
+  DenseMap<const DILocalVariable *, MachineInstr *> LocalVarMap;
+  DenseMap<MachineInstr *, Register> InstrMap;
----------------
Might consider a `SmallDenseMap<8>` for better performance in the average case.
If you want to be really scientific about it you could compile a large file with optimizations and print the LocalVarMap.size() at the end of the function and then see if there is a sweet spot of small element size (<16) based on that data. But using 8 is fine, too!


================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2760
+  E = MBB->end();
+  DenseMap<const DILocalVariable *, MachineInstr *> LocalVarMap;
+  DenseMap<MachineInstr *, Register> InstrMap;
----------------
aprantl wrote:
> Might consider a `SmallDenseMap<8>` for better performance in the average case.
> If you want to be really scientific about it you could compile a large file with optimizations and print the LocalVarMap.size() at the end of the function and then see if there is a sweet spot of small element size (<16) based on that data. But using 8 is fine, too!
Would a name like DbgValueSinkCandidates be more descriptive? (Not sure if that captures the semantics perfectly).


================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2761
+  DenseMap<const DILocalVariable *, MachineInstr *> LocalVarMap;
+  DenseMap<MachineInstr *, Register> InstrMap;
+  for (; MBBI != E; ++MBBI) {
----------------
Same here.


================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2764
+    MachineInstr &MI = *MBBI;
+    if (MI.isDebugInstr() && MI.isDebugValueLike()) {
+      auto Op = MI.getOperand(0);
----------------
Are there isDebugInstr() that are not isDebugValueLike()?




================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2775
+      // RegisterMap and InstrMap.
+      if (Op.isReg() && RegisterMap.find(Op.getReg()) != RegisterMap.end()) {
+        RegisterMap[Op.getReg()] = &MI;
----------------
micro optimization: you're looking up Op.getReg twice here.

auto RegMapEntry = RegisterMap.find(Op.getReg())
if (Op.isReg() && RegMapEntry != RegisterMap.end()) {
  RegMapEntry = &MI;


================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2778
+        InstrMap[&MI] = Op.getReg();
+      }
+      // Check if DBG_VAL describes a variable which is also described by a
----------------
Maybe focus on the high level goal in the documentation here:

If the current DBG_VALUE describes the same variable as one of the in-flight DBG_VALUEs, remove the candidate from the list and set it to undef. Moving one DBG_VALUE past another would result in the variable's value going back in time when stepping through the block in the debugger.


================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2785
+      // to be moved.
+      if (LocalVarMap.find(DbgVar) != LocalVarMap.end()) {
+        auto *Instr = LocalVarMap[DbgVar]; // TODO: Switch to iterator usage
----------------
double lookup


================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2787
+        auto *Instr = LocalVarMap[DbgVar]; // TODO: Switch to iterator usage
+        if (InstrMap.find(Instr) != InstrMap.end()) {
+          RegisterMap[InstrMap[Instr]] = nullptr;
----------------
as you already commented: double lookup


================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2801
+        if (RegisterMap.find(Reg) == RegisterMap.end() ||
+            RegisterMap[Reg] == nullptr)
+          continue;
----------------
!RegisterMap[Reg]
but also, double lookup


================
Comment at: llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp:2807
+        MBB->splice(InsertPos, MBB, DbgVal);
+        MBBI++;
+      }
----------------
Is this safe? Aren't we going to visit the instruction we just inserted next? I'm not sure how the semantics of the iterator are defined if inserting on the fly.


================
Comment at: llvm/test/DebugInfo/ARM/move-dbg-values.mir:4
+# CHECK-NEXT: DBG_VALUE $noreg, $noreg, !60, !DIExpression(), debug-location !95
+# CHECK-NEXT: DBG_VALUE 0, $noreg, !60, !DIExpression(), debug-location !95
+# CHECK-NEXT: %0:rgpr = t2LDRi12 %5, 0, 14 /* CC::al */, $noreg, debug-location !95 :: (load (s32) from %ir..backtrace_user.ctl_default, align 8)
----------------
Nice. Looks like this covers all of the interesting cases?


================
Comment at: llvm/test/DebugInfo/ARM/move-dbg-values.mir:69
+  !7 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !8, size: 32)
+  !8 = !DIDerivedType(tag: DW_TAG_typedef, line: 5, baseType: !9)
+  !9 = !DIBasicType(name: "unsigned long", encoding: DW_ATE_unsigned)
----------------
You might be able to simplify the IR here, by just making up new variables with basic types and dropping most of the unnecessary debug info.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145168/new/

https://reviews.llvm.org/D145168



More information about the llvm-commits mailing list