[llvm-dev] Proposal for multi location debug info support in LLVM IR

Wed Jan 6 14:39:00 PST 2016

I have updated the gist to reflect the following changes:

- Split llvm.dbg.value into three intrinsics. Note that in the process, I
  deleted the !var argument from .add and .delete and the expr argument from
  .delete, since they are now redundant.
- Key calls describing pieces of variables are independent, and extra
  locations may only be added to a subset of the piece described in the
  original key call.
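
For illustration only, the two points above can be modeled with a small Python
sketch. Nothing in it is LLVM API: the DebugLocations class and its methods are
hypothetical names, pieces are written as (bit offset, bit size) pairs, and the
rule that a new key call ends earlier key calls whose piece overlaps it is an
assumption, since the thread only discusses identical and disjoint pieces.

```python
# Toy model of the proposed key-call semantics. Nothing here is LLVM API;
# DebugLocations and its methods are hypothetical names for illustration.

class DebugLocations:
    def __init__(self):
        self._next_token = 0
        # token -> (variable, key piece, [(piece, value), ...])
        self._keys = {}

    @staticmethod
    def _overlaps(a, b):
        # Pieces are (bit_offset, bit_size) pairs.
        return a[0] < b[0] + b[1] and b[0] < a[0] + a[1]

    @staticmethod
    def _within(inner, outer):
        return (outer[0] <= inner[0]
                and inner[0] + inner[1] <= outer[0] + outer[1])

    def new(self, var, value, piece):
        """Key call: start a new value for `piece` of `var`; return a token."""
        # Assumption: earlier key calls covering an overlapping piece end
        # here, while key calls for disjoint pieces stay live (independent).
        stale = [t for t, (v, p, _) in self._keys.items()
                 if v == var and self._overlaps(p, piece)]
        for t in stale:
            del self._keys[t]
        token = self._next_token
        self._next_token += 1
        self._keys[token] = (var, piece, [(piece, value)])
        return token

    def add(self, token, value, piece):
        """Extra location; only allowed inside the key call's piece."""
        _, key_piece, locs = self._keys[token]
        if not self._within(piece, key_piece):
            raise ValueError("piece must lie within the key call's piece")
        locs.append((piece, value))

    def delete(self, token):
        """Drop the key call and every location attached to it."""
        self._keys.pop(token, None)

    def locations(self, var):
        """All live (piece, value) pairs for `var`, sorted by piece."""
        out = []
        for v, _, locs in self._keys.values():
            if v == var:
                out.extend(locs)
        return sorted(out)


# The struct example from the thread: s.i and s.j are separate key calls,
# so assigning s.j = 0 does not require repeating the location of s.i.
locs = DebugLocations()
tok_i = locs.new("s", "%i", (0, 32))
tok_j = locs.new("s", "%j", (32, 32))
locs.new("s", "i32 0", (32, 32))
print(locs.locations("s"))  # only %i and the new constant for .j remain
```

In this model, the variable and piece are recorded with the key call, which is
why add and delete only need the token to identify what they refer to.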

On Wed, Jan 6, 2016 at 11:05 PM, Keno Fischer <kfischer at college.harvard.edu>
wrote:

>> Makes sense. If I understand your comment correctly, the following snippet:
>>
>> %1 = ...
>> %token = call llvm.dbg.value(token %undef, %1, !var, !())
>> %2 = ...
>> call llvm.dbg.value(token %token, %undef, !var, !())
>> call llvm.dbg.value(token %undef, %2, !var, !())
>>
>> is equivalent to
>>
>> %1 = ...
>> call llvm.dbg.value(token %undef, %1, !var, !())
>> %2 = ...
>> call llvm.dbg.value(token %undef, %2, !var, !())
>>
>> and both are legal.
>>
>
> Yes
>
>
>>> > > >
>>> > > >     - To add a location with the same value for the same variable,
>>> > > >       you pass the token of the FIRST llvm.dbg.value, as this
>>> > > >       llvm.dbg.value's first argument.
>>> > > >       E.g. to add another location for the variable above:
>>> > > >
>>> > > >         %second = call token @llvm.dbg.value(token %first,
>>> > > >                                             metadata %val2,
>>> > > >                                             metadata !var,
>>> > > >                                             metadata !expr2)
>>> > >
>>> > > Does this invalidate the first location, or does this add an
>>> > > additional location to the set of locations for var at this point?
>>> > > If I want to add a third location, which token do I pass in? Can you
>>> > > explain a bit more what information the token allows us to express
>>> > > that is currently not possible?
>>> > >
>>> >
>>> > It adds a second location. If you want to add a third location you
>>> > pass in the first token again.
>>> > Thus the first call (key call) indicates a change of values, and all
>>> > locations that have the same value should use the key call's token.
>>> >
>>>
>>> Ok. Looks like this is going to be somewhat verbose for partial updates
>>> of SROA’ed aggregates as in the following example:
>>>
>>> // struct s { int i, j };
>>> // void foo(struct s) { s.j = 0; ... }
>>>
>>> define void @foo(i32 %i, i32 %j) {
>>>   %token = call llvm.dbg.value(token %undef, %i, !Struct,
>>>                                !DIExpression(DW_OP_bit_piece(0, 32)))
>>>            call llvm.dbg.value(token %token, %j, !Struct,
>>>                                !DIExpression(DW_OP_bit_piece(32, 32)))
>>>   ...
>>>
>>>   ; have to repeat %i here:
>>>   %tok2 = call llvm.dbg.value(token %undef, %i, !Struct,
>>>                               !DIExpression(DW_OP_bit_piece(0, 32)))
>>>           call llvm.dbg.value(token %tok2, metadata i32 0, !Struct,
>>>                               !DIExpression(DW_OP_bit_piece(32, 32)))
>>>
>>> On the upside, having all this information explicit could simplify the
>>> code in DwarfDebug::buildLocationList().
>>>
>>
>> Yeah, this is true. We could potentially extend the semantics by allowing
>> separate key calls for pieces, i.e.
>>
>> %token = call llvm.dbg.value(token %undef, %i, !Struct,
>>                              !DIExpression(DW_OP_bit_piece(0, 32)))
>>          call llvm.dbg.value(token %undef, %j, !Struct,
>>                              !DIExpression(DW_OP_bit_piece(32, 32)))
>>
>> ; This now only invalidates the .j part
>> %tok2 = call llvm.dbg.value(token %undef, %j, !Struct,
>>                             !DIExpression(DW_OP_bit_piece(32, 32)))
>>
>> In that case we would probably have to require that all DW_OP_bit_pieces
>> in non-key-call expressions are a subrange of those in the associated key
>> call.
>>
>>
>> This way all non-key-call additional locations are describing alternative
>> locations for (a subset of) the bits described by the key-call location.
>> Makes sense, and again would simplify the backend’s work.
>>
>
> Yes, I think that's a reasonable change to the semantics, so let's make it
> so.
>
>
>> One thing I’m wondering about is whether we couldn’t design a friendlier
>> (assembler) syntax for the three different use-cases:
>>   %tok1 = call llvm.dbg.value(token %undef, %1, !var, !())
>>   %tok2 = call llvm.dbg.value(token %tok1, %2, !var, !())
>>   %tok3 = call llvm.dbg.value(token %tok1, %undef, !var, !())
>>
>> Could be written as e.g.:
>>
>>   %tok1 = call llvm.dbg.value.new(%1, !var, !())
>>   %tok2 = call llvm.dbg.value.add(token %tok1, %2, !var, !())
>>   %tok3 = call llvm.dbg.value.delete(token %tok1, !var, !())
>>
>
> Yeah, I would be ok with that (and think it's a good idea).
>
>
>> -- adrian
>>
>
>