[PATCH] LowerDbgDeclare - get rid of redundant dbg.values when an alloca survives optimization

Mon Oct 21 17:06:06 PDT 2013

Yes, I definitely need to dig deeper here. I’m really wondering why those dbg.values show up in the assembly, but not in the DWARF output.

I’ll keep you updated once I understand this better.

-- adrian

On Oct 21, 2013, at 16:54, Eric Christopher <echristo at gmail.com> wrote:

> If the sections of text are contiguous we should have a single
> location entry for the variable, else we should have multiple in
> location lists. Any duplicate or overlapping ranges should be
> coalesced down to a single range.
> 
> For the testcase with current ToT you've got a single location of
> DW_OP_fbreg + offset which seems to have absolutely no correlation
> with the assembly. I'm somewhat confused as to what's going on here.
> We've definitely got some problem here.
> 
> -eric
> 
> 
> On Fri, Oct 18, 2013 at 1:46 PM, Adrian Prantl <aprantl at apple.com> wrote:
>> 
>> On Oct 18, 2013, at 13:27, David Blaikie <dblaikie at gmail.com> wrote:
>> 
>>> 
>>> 
>>> 
>>> On Fri, Oct 18, 2013 at 1:15 PM, Adrian Prantl <aprantl at apple.com> wrote:
>>> 
>>> On Oct 18, 2013, at 11:26 AM, David Blaikie <dblaikie at gmail.com> wrote:
>>> 
>>>> 
>>>> 
>>>> 
>>>> On Fri, Oct 18, 2013 at 11:22 AM, Adrian Prantl <aprantl at apple.com> wrote:
>>>> 
>>>> On Oct 18, 2013, at 10:35, Adrian Prantl <aprantl at apple.com> wrote:
>>>> 
>>>>> 
>>>>> On Oct 18, 2013, at 10:24, David Blaikie <dblaikie at gmail.com> wrote:
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Fri, Oct 18, 2013 at 10:12 AM, Adrian Prantl <aprantl at apple.com> wrote:
>>>>>> Any final verdict for this change?
>>>>>> 
>>>>>> I'm not really familiar with this stuff to provide a 'final verdict' as it were, but I'll ask some questions anyway...
>>>>>> 
>>>>>> Is this the correct test? Is it possible that the variable might be reconstituted around the call, rather than kept live for the entire range?
>>>>>> If there are things that prevent the alloca from being elided, are those things enshrined in a test somewhere that we could reuse, rather than reimplementing one part of them?
>>>>> 
>>>>> That’s a very good point indeed: I just discovered
>>>>>> bool llvm::isAllocaPromotable(const AllocaInst *AI)
>>>>> in mem2reg.
>>>> 
>>>> Yes that seems to work fine and is a lot cleaner.
>>>> 
>>>> Great. So I'm reading your comment in the code and I'm curious - you say it reduces the number of location DIEs... I guess you mean entries in the location list? (in the debug_loc section) which aren't really DIEs?
>>> 
>>> No, they are. The problem is that each dbg.value ends up in a different lexical scope and is therefore emitted as a separate location (vs. just another entry in one location list). See the testcase in the patch.
>>>> 
>>>> Why is that? What kind of locations were we emitting? Should the DWARF emission code have been doing a better job coalescing these locations if the variable was in the alloca all along anyway?
>>>> 
>>> I thought about this, too. Because they are in different scopes I think that would be wrong.
>>> 
>>> If the locations are contiguous and refer to the same location, they should be coalesced - are they not? (I assume not, and that it's the alloca that stays live over the BB boundary, but in each block the variable is temporarily stored in a register - which may be a different register on each block (and, yes, at the start of the loop we might see the variable in the same register in the loop entry, then the loop body but we can't coalesce those because before the first instruction of the loop we might've entered the loop from some other block where the variable wasn't stored in the same register))
>>> 
>>> 
>>>> Or are we dealing with a situation where the alloca exists, but the variable is temporarily in registers and without your change we described each of those register locations, but now we just describe the location as the alloca?
>>>> 
>>>> Is this going to improve or hurt the debugging experience much? What happens if the variable is in a register when the user goes to print it out and instead they get the stale value that's still in the alloca?
>>>> 
>>> At least with the example that I was looking at, it does't hurt. Remember that this is an alloca that cannot be lowered, so the location will have the correct value at least before and after each function call, which is incidentally also the use of the alloca.
>>> 
>>> Sure, but what about simple code like:
>>> 
>>> int i = 3;
>>> i = 7;
>>> i = f1();
>>> f2(&i);
>>> 
>>> Stepping over those lines it would seem problematic if the user can't print out 'i' and observe 3, 7, etc.
>> 
>> At least in this particular example. I don’t see a problem. At -O0, all the values will be stored into the alloca anyway.
>> 
>>        .loc    1 5 0 prologue_end      ## test1.c:5:0
>> Ltmp5:
>>        subq    $16, %rsp
>>        movl    $3, -8(%rbp)
>>        .loc    1 6 0                   ## test1.c:6:0
>>        movl    $7, -8(%rbp)
>>        .loc    1 7 0                   ## test1.c:7:0
>>        movb    $0, %al
>>        callq   _f1
>>        leaq    -8(%rbp), %rdi
>>        movl    %eax, -8(%rbp)
>>        .loc    1 8 0                   ## test1.c:8:0
>>        callq   _f2
>>        .loc    1 9 0                   ## test1.c:9:0
>>        movl    -4(%rbp), %eax
>>        addq    $16, %rsp
>>        popq    %rbp
>>        ret
>> 
>> at -O2, we are missing out on the 3 and the 7, but we would anyway, because this is dead code.
>> 
>>        .loc    1 7 0 prologue_end      ## test1.c:7:0
>> Ltmp5:
>>        callq   _f1
>>        movl    %eax, -4(%rbp)
>>        leaq    -4(%rbp), %rdi
>>        .loc    1 8 0                   ## test1.c:8:0
>>        callq   _f2
>>        .loc    1 9 0                   ## test1.c:9:0
>>        addq    $16, %rsp
>>        popq    %rbp
>>        ret
>> 
>> Aha. But if we extend the program like this:
>> int foo() {
>>  int i = 3;
>>  f3(i);
>>  i = 7;
>>  i = f1();
>>  f2(&i);
>> }
>> 
>> We are actually missing the 3, too.
>> 
>>        .cfi_def_cfa_register %rbp
>>        subq    $16, %rsp
>>        movl    $3, %edi
>>        .loc    1 7 0 prologue_end      ## test1.c:7:0
>> Ltmp5:
>>        callq   _f3
>>        xorl    %eax, %eax
>>        .loc    1 9 0                   ## test1.c:9:0
>>        callq   _f1
>>        movl    %eax, -4(%rbp)
>>        leaq    -4(%rbp), %rdi
>>        .loc    1 10 0                  ## test1.c:10:0
>>        callq   _f2
>>        .loc    1 11 0                  ## test1.c:11:0
>>        addq    $16, %rsp
>>        popq    %rbp
>>        ret
>> 
>> 
>> And now it gets really strange: Without my patch, the dbg.values show up in the asm output,
>> 
>>        subq    $16, %rsp
>> Ltmp5:
>>        ##DEBUG_VALUE: foo:i <- 3+0
>>        movl    $3, %edi
>>        .loc    1 7 0 prologue_end      ## test1.c:7:0
>> Ltmp6:
>>        callq   _f3
>> Ltmp7:
>>        ##DEBUG_VALUE: foo:i <- 7+0
>>        xorb    %al, %al
>>        .loc    1 9 0                   ## test1.c:9:0
>>        callq   _f1
>> Ltmp8:
>>        ##DEBUG_VALUE: foo:i <- EAX+0
>>        movl    %eax, -4(%rbp)
>>        leaq    -4(%rbp), %rdi
>>        .loc    1 10 0                  ## test1.c:10:0
>>        callq   _f2
>>        .loc    1 11 0                  ## test1.c:11:0
>>                                        ## implicit-def: EAX
>> Ltmp9:
>>        addq    $16, %rsp
>>        popq    %rbp
>>        ret
>> 
>> But not in the DWARF!?
>> 
>> Looks like we have two problems now.
>> 
>> -- adrian