[PATCH] D91595: [Debugify] Support checking Machine IR debug info

Fri Nov 20 00:53:31 PST 2020

djtodoro added inline comments.

================
Comment at: llvm/lib/CodeGen/MachineCheckDebug.cpp:58
+        for (MachineInstr &MI : MBB) {
+          if (!MI.isDebugValue())
+            continue;
----------------
xiangzhangllvm wrote:
> jmorse wrote:
> > djtodoro wrote:
> > > xiangzhangllvm wrote:
> > > > xiangzhangllvm wrote:
> > > > > djtodoro wrote:
> > > > > > I think we should do this for finding missing variables; but for the checking of missing lines we should check more machine instructions (especially the instrs other than debug ones), right? Any reason we shouldn't do that?
> > > > > > 
> > > > > > Please check the `checkDebugifyMetadata()` (from Debugify.cpp) how it was done.
> > > > > Because the MirDebugify will generate DBG_VALUE for every MIRs, all these MIRs' debug location will be saved in these generated DBG_VALUEs. So here we just need to check the line info in these DBG_VALUEs. It is also ok the check the non_debug MIR again, but it is duplicated.
> > > > Hi, @djtodoro 
> > > > Say it is "duplicated" may not accurate, here we only check the debug location in DBG_VALUE MIRs is more strict than check both of them. Because the optimizations may change/update the value of DBG_VALUE (its 1st operand), but they shouldn't change/remove the location info of it. And directly remove a DBG_VALUE should always report warning.
> > > On the IR level, debugify utility pass can tell us: //there is an instruction `I` with empty DebugLoc// after a pass. How it works: before a pass, it attaches synthetic dbg loc onto each IR instruction, and after the pass it checks if it preserved the dbg loc.
> > > 
> > > Can we do the same for MIR Debugify? For example: the mir-debugify attaches a dbgLoc on an MI (non debug one), and then the mir-check-debigify checks if a pass preserved that dbg location? Is it even possible at MIR level?
> > I agree with Djordje, the line numbers attached to DBG_VALUE instructions are of little interest and never reach an output file, it's the non-meta instructions that are the subject of optimisations, and thus should be the target of line number checkers.
> Hello @jmorse  @djtodoro, first thanks for your comments, I was thinking about it before, and it is nice discuss with you now:
> **What we really want (MIR/IR)check-debug doing for us** ?
> Take (IR)check-debug for e.g. 
> In my eyes, we use check-debug to check if the optimizations **incorrectly** optimized the debug info of IRs. Right?
> Some optimizations may **correctly** optimized the IRs by removing them with carefully handling its debug info. (e.g. --instsimplify), for these cases, I think, check-debug should not report WARNING messages out.
> 
> Take llvm lit test llvm/test/DebugInfo/debugify.ll for e.g.
> There is unused %sum=... in it, let's remove it in instruction simplify.
> After debugify, it will be:
> 
> ```
> define i32 @bar() !dbg !12 {
>   call void @foo(), !dbg !15
>   %sum = add i32 0, 1, !dbg !16
>   call void @llvm.dbg.value(metadata i32 %sum, metadata !14, metadata !DIExpression()), !dbg !16
>   ret i32 0, !dbg !17
> }
> ```
> If I run 
> 
> ```
> opt -debugify --instsimplify -check-debugify -S -o - DebugInfo/debugify.ll
> ```
> it will report WARNING in -check-debugify:
> 
> ```
> WARNING: Missing line 3
> CheckModuleDebugify: PASS
> ......
> define i32 @bar() !dbg !12 {
>   call void @foo(), !dbg !15
>   call void @llvm.dbg.value(metadata i32 0, metadata !14, metadata !DIExpression(DW_OP_plus_uconst, 1, DW_OP_stack_value)), !dbg !16
>   ret i32 0, !dbg !17
> }
> 
> ```
> because the check-debug will check the line info of removed instruction.
> What is more, removing instruction is very common in optimization passes, and debugify will add different line info for every instructions, so once remove any non-debug instructions, it will give out WARNINGs. Even they **correctly** remove them (by updating its llvm.dbg.value (0/undef, ....).
> 
> If the test is big, we may need to check a lot of warnings one by one.
> 
> And that was why my current patch didn't check the line info for non-debug MIRs.
> 
Yes, but that is why it is being reported as warning...
Use case: For example, we use debugify and realize that a bunch of `add` instructions don't have a dbg location attached after a pass `X`  - we investigate pass `X`, and find out that there is a creating of the instruction `add`, but we have missed to add `I->setDebugLoc()`. It makes sense when dealing with non-dbg instructions only, since the locations of these are used during debugging sessions, since only these instructions end up in the final code.

Please consider the code from trunk (IR Debugify.cpp):
    // Find missing lines.
    for (Instruction &I : instructions(F)) {
      if (isa<DbgValueInst>(&I) || isa<PHINode>(&I))
        continue;

      auto DL = I.getDebugLoc();
      if (DL && DL.getLine() != 0) {
        MissingLines.reset(DL.getLine() - 1);
        continue;
      }

      if (!DL) {
        dbg() << "WARNING: Instruction with empty DebugLoc in function ";
        dbg() << F.getName() << " --";
        I.print(dbg());
        dbg() << "\n";
      }
    }

This gives us a chance to investigate if a pass drops the location accidentally or it should have done it, but only for non-debug ones, as:
    if (isa<DbgValueInst>(&I) || isa<PHINode>(&I))
      continue;

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91595/new/

https://reviews.llvm.org/D91595