[PATCH] D35994: Debug info for variables whos type is shrinked to bool

Tue Aug 1 13:17:24 PDT 2017

On Tue, Aug 1, 2017 at 1:09 PM Adrian Prantl <aprantl at apple.com> wrote:

> On Aug 1, 2017, at 10:53 AM, David Blaikie <dblaikie at gmail.com> wrote:
>
> DW_OP_bra looks like it'd be a bit of a pain to work with since it's
> number of bytes (so you'll have to figure out the actual length of the
> expression sequence - not just the number of operations, etc). The LEB
> encoding in particular worries me a bit as being a pain to have to do that
> first, then compute the encoded length to figure out how many bytes to
> skip, etc. But that's not impossible/that arcane or anything...  If that
> works out OK, great. :)
>
>
> In DIExpression we could use the convention that the argument to DW_OP_bra
> encodes the number of operations (=DIExpression::ExprOperand iterators) to
> skip.
>

Oh, right, the IR representation doesn't even known the number of bytes...
blessing & a curse, I guess.

> In DwarfExpression.cpp we could get AsmPrinter to report back the length
> of the everything emitted (including the LEBs) — what is tricky is how to
> encode a forward reference such as the number of bytes to skip until the
> end of the expression.
>

Not quite sure I follow - I would picture it as: encode the OP_bra into the
buffer but not its argument yet - encode aurgement-many operands into a
temp buffer, measure its size, write that to the main buffer, then flush
the temp buffer to the main buffer.

But I guess that's the two pass you mention below? (& this could be N pass
if an OP_bra jumped over another OP_bra)

> We would have to make DwarfExpression run two passes: first emit the
> expression into a buffer and then fix up the relative jump targets.
>
> (alternatively we could do some arithmetic and take the 42/87 example: (87
> - 42) * val + 42. Admittedly it works easily when the 1 boolean value is
> associated with the larger of the two numbers... - but may be easier to do
> the more verbose & fussy (in terms of the byte computatiotns) version
> that's a bit more explicit about the values being used, etc)
>
>
> Another reason to prefer purely arithmetic solutions is that they might be
> (though I don't know enough about that) easier to compile into CodeView
> that arbitrary programs with control flow in them.
>

Maybe? I probably wouldn't speculate/bother about that without evidence,
though. The conditional form might be fine to reformulate in limited ways
(if we know we'll only see certain kindsn of conditionals, etc - or at
least only support certain kinds and punt on the rest, etc)

>
> Here's another attempt that doesn't assume anything about the two values:
>
> // val * 42 | ((~val)&1) * 87
> DW_OP_dup DW_OP_constu 42 DW_OP_mul DW_OP_swap
> DW_OP_not DW_OP_lit1 DW_OP_and DW_OP_constu 87 DW_OP_mul
> DW_OP_or DW_OP_stack_value
>

Oh, not bad at all. (not sure which of these are longer and by how much -
but I'm not sure minimizing complex locations is really a problem/priority
- though I haven't done a complex breakdown of optimized code debug size,
etc)

- Dave

>
> -- adrian
>
>
> *shrug* :)
>
> On Tue, Aug 1, 2017 at 10:47 AM Adrian Prantl <aprantl at apple.com> wrote:
>
>> On Aug 1, 2017, at 10:35 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>>
>>
>> On 08/01/2017 12:28 PM, Adrian Prantl wrote:
>>
>>
>> On Aug 1, 2017, at 10:20 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>>
>>
>> On 08/01/2017 11:54 AM, David Blaikie wrote:
>>
>>
>>
>> On Tue, Aug 1, 2017 at 9:43 AM Hal Finkel <hfinkel at anl.gov> wrote:
>>
>>>
>>> On 08/01/2017 11:28 AM, Adrian Prantl via llvm-commits wrote:
>>>
>>>
>>> On Aug 1, 2017, at 9:24 AM, David Blaikie <dblaikie at gmail.com> wrote:
>>>
>>>
>>>
>>> On Tue, Aug 1, 2017 at 9:04 AM Adrian Prantl via Phabricator <
>>> reviews at reviews.llvm.org> wrote:
>>>
>>>> aprantl added inline comments.
>>>>
>>>>
>>>> ================
>>>> Comment at: lib/Transforms/IPO/GlobalOpt.cpp:1577
>>>> +  for(auto *GV : GVs)
>>>> +    NewGV->addDebugInfo(GV);
>>>> +
>>>> ----------------
>>>> NikolaPrica wrote:
>>>> > aprantl wrote:
>>>> > > I'm not familiar with this transformation: Do we need to add a
>>>> DIExpression to mask out all but the first bit (i.e. can multiple bools be
>>>> packed into the same uint32_t such that it could confuse debuggers)?
>>>> > The debug info which is provided here with addDebugInfo later
>>>> generates address of the variable (DW_OP_addr: xxxx) in DW_AT_location. If
>>>> we provide here metadata which is for 1byte variable  the debugger would
>>>> get confused because the enum type is written as 4-byte and he would try to
>>>> read 4 bytes. This is just temporary fix until proper way to handle this is
>>>> found.
>>>> If I understood you correctly then the best way to represent this would
>>>> be a `DW_OP_LLVM_fragment /*offset*/0 /*bitsize*/1 (or 8?)`
>>>> expression. This will get lowered into a DW_OP_bit_piece to tell the
>>>> debugger that this location is describing of the first n bits of the
>>>> variable.
>>>>
>>>
>>> A slight problem with this is that at least GDB won't print a single
>>> value if it's partially unavailable (eg: if it's a struct and the fragment
>>> describes one member but not the othe,r I think that's OK - but if it's a
>>> single int and only one out of 4 bytes are described - well, the value is
>>> known/unknowable).
>>>
>>>
>>> A debugger would have to print the value as something like 0x??????01 to
>>> indicate that pieces are missing. But no debugger I'm aware of does that.
>>>
>>> If this optimization is really based on the proof that the other bytes
>>> are unused, never written to or read, then a fragment describing the other
>>> bytes as constant zero would be good to have as well.
>>>
>>>
>>> No, it is more complicated than that. See TryToShrinkGlobalToBoolean in
>>> lib/Transforms/IPO/GlobalOpt.cpp. It is looking for cases where the global
>>> is only set to one initialized value and one other value. In that case, it
>>> can make a Boolean global and replace uses of that Global with selects over
>>> that value.
>>>
>>
>> Oh, wait - you mean a variable with value 42 and 87 gets collapsed to a
>> boolean where true/1 is used to represent 42 and false/0 is used to
>> represent 87?
>>
>>
>> Yes. The code does this:
>>
>>       // Change the load into a load of bool then a select.
>>       LoadInst *LI = cast<LoadInst>(UI);
>>       LoadInst *NLI = new LoadInst(NewGV, LI->getName()+".b", false, 0,
>>                                    LI->getOrdering(),
>> LI->getSyncScopeID(), LI);
>>       Value *NSI;
>>       if (IsOneZero)
>>         NSI = new ZExtInst(NLI, LI->getType(), "", LI);
>>       else
>>         NSI = SelectInst::Create(NLI, OtherVal, InitVal, "", LI);
>>       NSI->takeName(LI);
>>       LI->replaceAllUsesWith(NSI);
>>
>>
>> Whoa — this is going to be fun! A DWARF expression for that example might
>> look like:
>>
>> DW_OP_constu 42 DW_OP_mul DW_OP_dup DW_OP_bra +2 DW_OP_skip +<end-1>
>> DW_OP_constu 87 DW_OP_constu DW_OP_stack_value
>>
>> roughly: if (Val * 42) return Val * 42; else return 87;
>>
>>
>> Why can't you just encode: 'if (Val) return 42; else return 87;'?
>>
>>
>> Yes that would be even shorter.
>> DW_OP_bra +4? DW_OP_constu 42 DW_OP_skip +<end-1> DW_OP_constu 87
>> DW_OP_constu DW_OP_stack_value
>>  I started out with trying to craft an expression that wouldn't need any
>> control flow and then got carried away :-)
>>
>> -- adrian
>>
>>
>>  -Hal
>>
>>
>> -- adrian
>>
>>
>>  -Hal
>>
>>
>> Oh. Yeah. That's a bit more involved. (but nice that it means we can
>> represent the values exactly - hopefully)
>>
>> Thanks!
>>
>>
>>> I think you actually want to generate a DWARF expression for these.
>>>
>>>  -Hal
>>>
>>>
>>> Agreed, that would be the most practical solution — if we can prove that
>>> it is correct.
>>>
>>> -- adrian
>>>
>>>
>>> But I doubt that's the case - no doubt if we can see all the reads and
>>> writes, we optimize away the writes if we know they won't be read, so we
>>> may end up with only the one byte. C'est la vie.
>>>
>>> Just mention it to check/understand what the optimization is/isn't
>>> doing, etc.
>>>
>>>
>>>>
>>>>
>>>> https://reviews.llvm.org/D35994
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing listllvm-commits at lists.llvm.orghttp://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>
>>>
>>> --
>>> Hal Finkel
>>> Lead, Compiler Technology and Programming Languages
>>> Leadership Computing Facility
>>> Argonne National Laboratory
>>>
>>>
>> --
>> Hal Finkel
>> Lead, Compiler Technology and Programming Languages
>> Leadership Computing Facility
>> Argonne National Laboratory
>>
>>
>>
>> --
>> Hal Finkel
>> Lead, Compiler Technology and Programming Languages
>> Leadership Computing Facility
>> Argonne National Laboratory
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170801/be3df3c4/attachment.html>