[llvm-commits] DBG_VALUE instruction format and generation

Alexey Samsonov samsonov at google.com
Thu Sep 6 01:47:32 PDT 2012


On Tue, Sep 4, 2012 at 10:21 PM, Eric Christopher <echristo at apple.com> wrote:

>
> On Sep 3, 2012, at 9:04 AM, Alexey Samsonov <samsonov at google.com> wrote:
>
> > Hi!
> >
> > I've got the following questions about DBG_VALUE instruction:
> > 1) When I dump this instruction I see the line of the form:
> >   DBG_VALUE %EDI, 0, !"a"; line no:47
> > for two completely different cases:
> >   * when value of "a" is actually stored in a register %edi (this
> > should be encoded as "DW_OP_reg5" in DWARF).
> >   * when value of "a" is stored in memory at address stored in %edi
> > ("DW_OP_breg5, 0" in DWARF).
> >
> > Currently LLVM handles this in favor of first case, and produces wrong
> > debug info for the second case. Can these cases
> > actually be distinguished, and if yes, where should I take a look?
> >
>
> Right now it works via the offset mechanism in that if the first operand
> to the address is a DW_OP_plus we'll turn it into a breg by default. I make
> no claims that this will work if the offset is zero or that we do the right
> thing in that case at the moment, but that's how it's currently set up.
> Basically via the DIVariable creation mechanism.
>

This looks weird to me. Either I misunderstand something, or we should
distinguish the cases with zero offset and non-zero offset from the address
(and use arithmetic instructions in the first case).

My original intent was to apply the following patch to
lib/CodeGen/AsmPrinter/AsmPrinter.cpp:
--- AsmPrinter.cpp      (revision 163107)
+++ AsmPrinter.cpp      (working copy)
@@ -810,7 +810,9 @@
   // caller might be in the middle of an dwarf expression. We should
   // probably assert that Reg >= 0 once debug info generation is more mature.

-  if (int Offset =  MLoc.getOffset()) {
+  if (!MLoc.isReg()) {
+    // Value address is calculated as Reg + Offset.
+    int Offset = MLoc.getOffset();
     if (Reg < 32) {
       OutStreamer.AddComment(
         dwarf::OperationEncodingString(dwarf::DW_OP_breg0 + Reg));

Currently AsmPrinter::EmitDwarfRegOp emits DW_OP_reg if the offset is zero
and DW_OP_breg otherwise.
This is wrong: MachineLocation(Reg) is a value in a register, but
MachineLocation(Reg, 0) is a value in memory,
located at offset 0 from the address stored in Reg. If I apply this
patch, it breaks stuff :)
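
To make the difference concrete, here is a small standalone sketch (not LLVM
code). The Loc struct, encodeByOffset and encodeByKind are hypothetical names
made up for illustration; only the opcode values DW_OP_reg0 = 0x50 and
DW_OP_breg0 = 0x70 and the SLEB128 operand encoding come from DWARF. It
assumes register numbers below 32, as in the branch touched by the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a machine location; this is NOT llvm::MachineLocation.
struct Loc {
  bool InRegister;  // true: the value lives in the register itself
  unsigned Reg;     // DWARF register number (assumed < 32 in this sketch)
  int Offset;       // only meaningful when InRegister is false
};

constexpr std::uint8_t DW_OP_reg0 = 0x50;   // DW_OP_reg0..DW_OP_reg31
constexpr std::uint8_t DW_OP_breg0 = 0x70;  // DW_OP_breg0..DW_OP_breg31

// Append Value as a signed LEB128 number, the operand format of DW_OP_bregN.
// Relies on >> being an arithmetic shift for negative values.
void appendSLEB128(std::vector<std::uint8_t> &Out, std::int64_t Value) {
  bool More = true;
  while (More) {
    std::uint8_t Byte = static_cast<std::uint8_t>(Value & 0x7f);
    Value >>= 7;
    bool SignBitSet = (Byte & 0x40) != 0;
    More = !((Value == 0 && !SignBitSet) || (Value == -1 && SignBitSet));
    if (More)
      Byte |= 0x80;
    Out.push_back(Byte);
  }
}

// Choice keyed on "offset is non-zero", as described for EmitDwarfRegOp.
std::vector<std::uint8_t> encodeByOffset(const Loc &L) {
  assert(L.Reg < 32 && "sketch only covers the short register forms");
  std::vector<std::uint8_t> Out;
  if (L.Offset != 0) {
    Out.push_back(static_cast<std::uint8_t>(DW_OP_breg0 + L.Reg));
    appendSLEB128(Out, L.Offset);
  } else {
    Out.push_back(static_cast<std::uint8_t>(DW_OP_reg0 + L.Reg));
  }
  return Out;
}

// Choice keyed on the kind of location, as the patch intends.
std::vector<std::uint8_t> encodeByKind(const Loc &L) {
  assert(L.Reg < 32 && "sketch only covers the short register forms");
  std::vector<std::uint8_t> Out;
  if (!L.InRegister) {
    Out.push_back(static_cast<std::uint8_t>(DW_OP_breg0 + L.Reg));
    appendSLEB128(Out, L.Offset);
  } else {
    Out.push_back(static_cast<std::uint8_t>(DW_OP_reg0 + L.Reg));
  }
  return Out;
}
```

With Loc{true, 5, 0} and Loc{false, 5, 0}, encodeByOffset yields the same
single byte 0x55 (DW_OP_reg5) for both, while encodeByKind yields 0x55 for the
first and 0x75 0x00 (DW_OP_breg5, 0) for the second.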

In most cases an instruction like
DBG_VALUE %EDI, 0, !"a"; line no:47
is handled by the following code:
static DotDebugLocEntry getDebugLocEntry(AsmPrinter *Asm,
<...>
  if (MI->getNumOperands() != 3) {
    MachineLocation MLoc = Asm->getDebugValueLocation(MI);
    return DotDebugLocEntry(FLabel, SLabel, MLoc, Var);
  }
  if (MI->getOperand(0).isReg() && MI->getOperand(1).isImm()) {
    MachineLocation MLoc;
    // <--- MachineLocation is set here
    MLoc.set(MI->getOperand(0).getReg(), MI->getOperand(1).getImm());
    return DotDebugLocEntry(FLabel, SLabel, MLoc, Var);
  }
<...>
In _most_ cases this DBG_VALUE instruction corresponds to a value in a
register, but in some cases I observe that
it corresponds to a value in memory at zero offset. Hm...


> > 2) This one is more general. I'm trying to make "clang -g" work well with
> > AddressSanitizer instrumentation enabled
> > (currently generated debug info for variables is pretty much
> > inconsistent) and need to develop a workflow somehow.
> > ASan works with llvm IR, so there are no machine instructions, just
> > llvm.dbg.declare / llvm.dbg.value  intrinsics, which
> > are ignored by ASan. How can IR transforms (inserting function calls,
> > basic blocks, etc.) hurt turning llvm.dbg intrinsic into
> > a set of DBG_VALUE instructions? What is the best way I can actually see
> > what's going on when we generate machine instructions from IR?
>
> I'm not sure what you mean by hurt? I assume you mean that the pass
> doesn't look at the debug information right now. As a guess it'll probably
> want to keep track of the users of an alloca in case it rewrites the access
> in some way and then update the variable information accordingly. Otherwise
> I can't think of much that'll hurt the dbg_value/dbg_declare intrinsics if
> you're leaving the variables alone. Passing -debug will get you
> information, take a look for dbg_declare and dbg_value in
> SelectionDAGBuilder.cpp to get an idea of the kind of debug information
> we're going to drop and the debug output we're going to spew when you do
> that.
>

Oh, I see the point now: the AddressSanitizer pass merges all alloca
instructions into one. So we'll likely need to patch the dbg_declare
intrinsics accordingly (get the pointer inside the one large allocated
chunk that points to the memory corresponding to the original stack
variable).
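
A rough standalone sketch of the idea, to show what "pointer inside the one
large chunk" means: each original variable ends up at some offset from the
base of the merged allocation, and that base + offset pair is what the
patched debug info would have to describe. StackVar, Slot, alignTo and
layoutMergedFrame are hypothetical names for this illustration, and the
back-to-back layout is an assumption; the real ASan frame layout is different
(it is not reproduced here).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one original alloca.
struct StackVar {
  std::string Name;
  std::size_t Size;
  std::size_t Align;  // must be a power of two
};

// Where a variable ended up inside the merged allocation.
struct Slot {
  std::string Name;
  std::size_t Offset;  // byte offset from the base of the merged chunk
};

// Round Value up to the next multiple of Align (a power of two).
std::size_t alignTo(std::size_t Value, std::size_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "Align must be a power of two");
  return (Value + Align - 1) & ~(Align - 1);
}

// Place the variables back to back in one chunk and record each offset.
// TotalSize receives the size of the merged chunk.
std::vector<Slot> layoutMergedFrame(const std::vector<StackVar> &Vars,
                                    std::size_t &TotalSize) {
  std::vector<Slot> Slots;
  std::size_t Cursor = 0;
  for (const StackVar &V : Vars) {
    Cursor = alignTo(Cursor, V.Align);
    Slots.push_back({V.Name, Cursor});
    Cursor += V.Size;
  }
  TotalSize = Cursor;
  return Slots;
}
```

The address of a variable is then (base of merged chunk) + Slot.Offset, so a
dbg_declare that used to point at the variable's own alloca would have to
point at that computed address instead.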


>
> Mostly a vague set of answers but I hope they help. If you've got some
> specific code I'm more than happy to look at it.
>

Thanks!


>
> -eric
>


-- 
Alexey Samsonov, MSK