[PATCH] D82545: [Debugify] Make the debugify aware of the original (-g) Debug Info

Fri Sep 25 06:57:44 PDT 2020

StephenTozer added inline comments.

================
Comment at: llvm/include/llvm/Transforms/Utils/VerifyDIPreserve.h:165
+    bool WrapWithVerifyDIPreservation =
+        (Mode == VerifyDIPreserveMode::Debugify ||
+         Mode == VerifyDIPreserveMode::OriginalDebugInfo) &&
----------------
Minor, this first condition would look better as `Mode != VerifyDIPreserveMode::NoVerify` in my opinion.

================
Comment at: llvm/lib/Transforms/Utils/Debugify.cpp:355
+
+      auto InstrIt = DILocsBefore.find(Instr);
+      if (InstrIt == DILocsBefore.end()) {
----------------
djtodoro wrote:
> vsk wrote:
> > djtodoro wrote:
> > > vsk wrote:
> > > > Is `Instr` an `llvm::Instruction *`? I wonder whether it is sound to assume that pointer equality for `llvm::Instruction` implies actual equality, if we're comparing pointers before/after a pass runs. It's possible for a pass to delete and then create instructions: that could result in pointer reuse/recycling.
> > > It is an `llvm::Instruction *`. I see... any idea how to address that?
> > > (I was trying to avoid having to make a temp object of the instr...)
> > I dug into this a bit and unfortunately it looks like a thorny issue. It looks like Instructions are (de)allocated via `User::operator new` and `User::operator delete`, and these essentially call into the system allocator. Relying on pointer equality to identify a unique instruction before/after a pass modifies the IR might necessitate e.g. changing the `delete` implementation to never free memory. That might not be practical -- it's certainly a big hammer.
> > 
> > It's possible that there's some alternative approach, but I'm not sure whether they'd be effective. One idea might be to use metadata to uniquely identify instructions (not !dbg metadata, of course :), and to teach the `dropUnknownNonDebugMetadata` utility to preserve that "inst ID" metadata. (I'd be curious to get some other opinions on whether that's workable.)
> Thanks! I have also taken a look, and I agree that the gymnastic with the `new/delete` operators won't be practical choice.
> 
> Using some special kind of metadata can help us for this purpose. The "inst ID" metadata sounds good to me, since we are going to use it for this purpose **only**. We can call it `!debug-instr-number`, similar to the LiveDebugValues/InstrRef (D85741) work.
> 
> Let me try to implement a prototype for it, so we can see what are the pros && cons.
The metadata approach should work, although as discussed it may require changes to `dropUnknownNonDebugMetadata`. It might be a little messy though; I think it would also be viable to take the following approach: In the "before" step, populate a map of `Instruction* Instr -> WeakVH(Instr)`. With this map, you can check `Map[Instr] != nullptr` to determine whether `Instr` was deleted at any point in the pass; if it wasn't deleted, then you can simply use raw pointer comparison. This avoids the need to fiddle with the internals of `Instruction` that could potentially affect performance or the behaviour of other code.

I think it worth adding that there will be some noise with this approach; the DebugLoc metadata for any new generated instructions will be dependent on the DebugLoc metadata for the instruction they're based on. For example, cloning an instruction , replacing a conditional branch with an unconditional one, reassociating arithmetic, etc. In each of these cases, we will get a warning from the preservation pass that a new instruction did not receive a DILocation, even though the pass that generated it had no responsibility for it. It's technically correct, but it may result in a continuously increasing amount of noise from pass-to-pass, with the worst case being that there are no DebugLocs left at the start of a pass so that we get an error for every instruction created even if the pass itself is handling DebugLocs correctly.

Unfortunately I don't think there's a way around this issue; if we had a way of programmatically determining which Instruction(s) that a new instruction //should// derive its DebugLoc from, then we wouldn't have as much need for debugify to begin with.

================
Comment at: llvm/lib/Transforms/Utils/VerifyDIPreserve.cpp:326
+        continue;
+      // If the function had the SP attacched before the pass, consider it as
+      // a debug info bug.
----------------
Minor, attacched -> attached.

================
Comment at: llvm/lib/Transforms/Utils/VerifyDIPreserve.cpp:360
+    } else {
+      auto hadLoc = InstrIt->second;
+      if (!hadLoc)
----------------
Minor, `hadLoc` -> `HadLoc`, although I think this could also just be inlined into the if condition.

================
Comment at: llvm/lib/Transforms/Utils/VerifyDIPreserve.cpp:363
+        continue;
+      // If the instr had the !dbg attacched before the pass, consider it as
+      // a debug info issue.
----------------
As above, attacched -> attached.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82545/new/

https://reviews.llvm.org/D82545