[llvm] Avoid dbg.value between phi nodes when converting from DbgRecords (PR #83620)

Tue Mar 5 06:35:27 PST 2024

jmorse wrote:

> unless it somehow ensures that phis never have debug info so that adding a new phi doesn't leave the second-to-last phi with debug info.

That's the intention / hope. With debug-intrinsics, not mixing PHIs and intrinsics is a guaranteed property because inserting at `getFirstNonPHI` or `getFirstInsertionPt` always points after the leading PHIs and before any debug intrinsics. The RemoveDIs work has created an ambiguity there -- an instruction position might be before or after any attached debug-info records which will be resolved by the extra bit of information that's being placed in `BasicBlock::iterator`. Essentially it tells any instruction inserter that because you called `getFirstNonPHI` / `getFirstInsertionPt`, anything inserted there (such as a new PHI) should be placed in front of any debug-info records.

I hate to use a video, but I've got an illustrated example of the problem here: https://www.youtube.com/watch?v=fZgFTOuhEzc&t=475s from 4 minutes to 9:30ish.

> Am I right in understanding this as "phi's should never have debug info"?

Correct -- it was a property of LLVM using debug-intrinsics (because you can't put calls in between PHIs), and we're trying to preserve that property to have confidence that we're only changing how debug-info is represented, not what it means / how it behaves. Possibly we'll have to weaken that if this problem is too difficult to solve,

> If the new debug info attaches to previous instructions as I currently think it does

It's actually the other way around -- debug-info records get attached to the following instruction. It's a convenient property because all well-formed blocks should have a terminator, thus almost all the time the debug-info records can be attached to an instruction. There's some temporary storage and hacks for scenarios where terminators get erased then recreated / re-inserted to a block, which is happily rare.

> The second is how we had a phi with debug info attached in the input to our pass to begin with.

See the above paragraph on insertion and the video link -- something somewhere in LLVM has a scenario like this:

    %foo = PHI ...
    #dbg_value %foo, !123, !DIExpression....
    %bar = add i32 ...

And code that calls `getFirstNonPHI` or `getFirstInsertionPt` on the block, gets an instruction pointer to the `add` instruction, and inserts a PHI there making this code:

    %foo = PHI ...
    #dbg_value %foo, !123, !DIExpression....
    %baz = PHI ...
    %bar = add i32 ...

Wheras if it inserted with a `BasicBlock::iterator`, the intention of that code to insert at the start of the block would be communicated to the inserter, and it would produce:

    %foo = PHI ...
    %baz = PHI ...
    #dbg_value %foo, !123, !DIExpression....
    %bar = add i32 ...

Thinking about it, we can probably put an assertion in the behind-the-scenes debug-info maintenance code to detect scenarios where this occurs early, rather than during the verifier. I'll also try to fast-forward everything to do with PHIs in our update-patch to happen immediately. Will report back!

https://github.com/llvm/llvm-project/pull/83620