[PATCH] D91595: [Debugify] Support checking Machine IR debug info

Fri Nov 20 01:03:52 PST 2020

xiangzhangllvm added inline comments.

================
Comment at: llvm/lib/CodeGen/MachineCheckDebug.cpp:58
+        for (MachineInstr &MI : MBB) {
+          if (!MI.isDebugValue())
+            continue;
----------------
djtodoro wrote:
> xiangzhangllvm wrote:
> > jmorse wrote:
> > > djtodoro wrote:
> > > > xiangzhangllvm wrote:
> > > > > xiangzhangllvm wrote:
> > > > > > djtodoro wrote:
> > > > > > > I think we should do this for finding missing variables; but for the checking of missing lines we should check more machine instructions (especially the instrs other than debug ones), right? Any reason we shouldn't do that?
> > > > > > > 
> > > > > > > Please check the `checkDebugifyMetadata()` (from Debugify.cpp) how it was done.
> > > > > > Because the MirDebugify will generate DBG_VALUE for every MIRs, all these MIRs' debug location will be saved in these generated DBG_VALUEs. So here we just need to check the line info in these DBG_VALUEs. It is also ok the check the non_debug MIR again, but it is duplicated.
> > > > > Hi, @djtodoro 
> > > > > Say it is "duplicated" may not accurate, here we only check the debug location in DBG_VALUE MIRs is more strict than check both of them. Because the optimizations may change/update the value of DBG_VALUE (its 1st operand), but they shouldn't change/remove the location info of it. And directly remove a DBG_VALUE should always report warning.
> > > > On the IR level, debugify utility pass can tell us: //there is an instruction `I` with empty DebugLoc// after a pass. How it works: before a pass, it attaches synthetic dbg loc onto each IR instruction, and after the pass it checks if it preserved the dbg loc.
> > > > 
> > > > Can we do the same for MIR Debugify? For example: the mir-debugify attaches a dbgLoc on an MI (non debug one), and then the mir-check-debigify checks if a pass preserved that dbg location? Is it even possible at MIR level?
> > > I agree with Djordje, the line numbers attached to DBG_VALUE instructions are of little interest and never reach an output file, it's the non-meta instructions that are the subject of optimisations, and thus should be the target of line number checkers.
> > Hello @jmorse  @djtodoro, first thanks for your comments, I was thinking about it before, and it is nice discuss with you now:
> > **What we really want (MIR/IR)check-debug doing for us** ?
> > Take (IR)check-debug for e.g. 
> > In my eyes, we use check-debug to check if the optimizations **incorrectly** optimized the debug info of IRs. Right?
> > Some optimizations may **correctly** optimized the IRs by removing them with carefully handling its debug info. (e.g. --instsimplify), for these cases, I think, check-debug should not report WARNING messages out.
> > 
> > Take llvm lit test llvm/test/DebugInfo/debugify.ll for e.g.
> > There is unused %sum=... in it, let's remove it in instruction simplify.
> > After debugify, it will be:
> > 
> > ```
> > define i32 @bar() !dbg !12 {
> >   call void @foo(), !dbg !15
> >   %sum = add i32 0, 1, !dbg !16
> >   call void @llvm.dbg.value(metadata i32 %sum, metadata !14, metadata !DIExpression()), !dbg !16
> >   ret i32 0, !dbg !17
> > }
> > ```
> > If I run 
> > 
> > ```
> > opt -debugify --instsimplify -check-debugify -S -o - DebugInfo/debugify.ll
> > ```
> > it will report WARNING in -check-debugify:
> > 
> > ```
> > WARNING: Missing line 3
> > CheckModuleDebugify: PASS
> > ......
> > define i32 @bar() !dbg !12 {
> >   call void @foo(), !dbg !15
> >   call void @llvm.dbg.value(metadata i32 0, metadata !14, metadata !DIExpression(DW_OP_plus_uconst, 1, DW_OP_stack_value)), !dbg !16
> >   ret i32 0, !dbg !17
> > }
> > 
> > ```
> > because the check-debug will check the line info of removed instruction.
> > What is more, removing instruction is very common in optimization passes, and debugify will add different line info for every instructions, so once remove any non-debug instructions, it will give out WARNINGs. Even they **correctly** remove them (by updating its llvm.dbg.value (0/undef, ....).
> > 
> > If the test is big, we may need to check a lot of warnings one by one.
> > 
> > And that was why my current patch didn't check the line info for non-debug MIRs.
> > 
> Yes, but that is why it is being reported as warning...
> Use case: For example, we use debugify and realize that a bunch of `add` instructions don't have a dbg location attached after a pass `X`  - we investigate pass `X`, and find out that there is a creating of the instruction `add`, but we have missed to add `I->setDebugLoc()`. It makes sense when dealing with non-dbg instructions only, since the locations of these are used during debugging sessions, since only these instructions end up in the final code.
> 
> Please consider the code from trunk (IR Debugify.cpp):
>     // Find missing lines.
>     for (Instruction &I : instructions(F)) {
>       if (isa<DbgValueInst>(&I) || isa<PHINode>(&I))
>         continue;
> 
>       auto DL = I.getDebugLoc();
>       if (DL && DL.getLine() != 0) {
>         MissingLines.reset(DL.getLine() - 1);
>         continue;
>       }
> 
>       if (!DL) {
>         dbg() << "WARNING: Instruction with empty DebugLoc in function ";
>         dbg() << F.getName() << " --";
>         I.print(dbg());
>         dbg() << "\n";
>       }
>     }
> 
> This gives us a chance to investigate if a pass drops the location accidentally or it should have done it, but only for non-debug ones, as:
>     if (isa<DbgValueInst>(&I) || isa<PHINode>(&I))
>       continue;
Thanks for the explanation, I'll check the non-debug MIR.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91595/new/

https://reviews.llvm.org/D91595