[llvm] Avoid dbg.value between phi nodes when converting from DbgRecords (PR #83620)

Thu Mar 7 15:46:08 PST 2024

dsandersllvm wrote:

> That's the intention / hope. With debug-intrinsics, not mixing PHIs and intrinsics is a guaranteed property because inserting at getFirstNonPHI or getFirstInsertionPt always points after the leading PHIs and before any debug intrinsics....
> ...
> > Am I right in understanding this as "phi's should never have debug info"?
> Correct -- it was a property of LLVM using debug-intrinsics (because you can't put calls in between PHIs), and we're trying to preserve that property to have confidence that we're only changing how debug-info is represented, not what it means / how it behaves. Possibly we'll have to weaken that if this problem is too difficult to solve,

That all makes sense to me

> > If the new debug info attaches to previous instructions as I currently think it does
> It's actually the other way around -- debug-info records get attached to the following instruction. It's a convenient property because all well-formed blocks should have a terminator, thus almost all the time the debug-info records can be attached to an instruction. There's some temporary storage and hacks for scenarios where terminators get erased then recreated / re-inserted to a block, which is happily rare.

Ah yep, I can see it in the new code to print the debug records.

> And code that calls getFirstNonPHI or getFirstInsertionPt on the block, gets an instruction pointer to the add instruction, and inserts a PHI there making this code:

I've tracked down the origin of the debug info now and it was indeed something like this but it was also quite hard to spot. Rather than being the cloned phi, it was an instruction built just after it whose insertion point was selected with:
```
Builder.SetInsertPoint(FI->getFirstNonPHI());
```
At first glance that doesn't look wrong but getFirstNonPHI() and getFirstNonPHIIt() return different positions despite the names implying they only differ by return type. Additionally, SetInsertPoint has overloads for both Instruction* and BasicBlock::iterator so it's not an error to use the wrong one. I'm inclined to say that getFirstNonPHI() should be marked deprecated (and things like getFirstNonPHIIt() which use it internally updated to use a name that isn't marked deprecated) with the message suggesting getFirstNonPHIIt() or getFirstNonPHIOrDbg() (this one skips PseudoProbeInst's too so it's not 100% the same)

> Thinking about it, we can probably put an assertion in the behind-the-scenes debug-info maintenance code to detect scenarios where this occurs early, rather than during the verifier.

That would be great. Also, it's a lot easier to see what's going now that we have the record printing but one hack I still had to do to help me track it down was to stop the pass manager from converting back to the intrinsics so I could see the new records in -print-before-all/-print-after-all.

> I'll also try to fast-forward everything to do with PHIs in our update-patch to happen immediately. Will report back!

Thanks

https://github.com/llvm/llvm-project/pull/83620