[llvm-dev] Proposal for multi location debug info support in LLVM IR

Fri Jan 15 14:44:31 PST 2016

Adrian had proposed the following staging:

1. Remove offset argument from dbg.value
2. Unify dbg.value and dbg.declare
3. Full implementation

I'm not yet sure what to do about the difference in dbg.declare semantics.
For example, I think the following currently works:

```
top:
%x = alloca
br else

if:
dbg.declare(%x, ...)
unreachable

else:
# dbg.declare still applies here
```

I think it would be reasonable to switch to the proposed dominance
semantics during step 2, but we'll have to see if that negatively affects
any real-world test cases.
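
To make the difference concrete, here is a minimal sketch in Python (a toy model, not LLVM code) of what dominance semantics would mean for the example above. The CFG dictionary, the `dominators` helper and the `declare_applies` rule are all assumptions made for illustration: the rule simply says a declare covers a block only if the declare's own block dominates it.

```python
# Toy CFG for the example above: block name -> successor blocks.
# As written in the snippet, nothing branches to "if".
cfg = {"top": ["else"], "if": [], "else": []}


def dominators(cfg, entry):
    """Iterative dominator sets for the blocks reachable from entry."""
    # Collect reachable blocks first; unreachable ones get no dominator set.
    reachable, stack = set(), [entry]
    while stack:
        block = stack.pop()
        if block not in reachable:
            reachable.add(block)
            stack.extend(cfg[block])

    preds = {b: [p for p in reachable if b in cfg[p]] for b in reachable}
    dom = {b: set(reachable) for b in reachable}
    dom[entry] = {entry}

    # Standard fixed-point iteration: dom(b) = {b} | intersection of dom(preds).
    changed = True
    while changed:
        changed = False
        for b in reachable - {entry}:
            new = {b} | set.intersection(*(dom[p] for p in preds[b]))
            if new != dom[b]:
                dom[b], changed = new, True
    return dom


def declare_applies(cfg, entry, declare_block, use_block):
    """Assumed rule: a declare covers only blocks its own block dominates."""
    dom = dominators(cfg, entry)
    return declare_block in dom.get(use_block, set())


print(declare_applies(cfg, "top", "top", "else"))  # declare placed in top
print(declare_applies(cfg, "top", "if", "else"))   # declare placed in if
```

Under this assumed rule the declare in `if` would no longer describe `%x` in `else`, whereas a declare placed in `top` would. Adding an edge from `top` to `if` does not change that, since `if` still would not dominate `else`.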

On Fri, Jan 15, 2016 at 11:38 PM, David Blaikie via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> I'm reading/following along - discussion so far sounds reasonable to me.
>
> Only minor note: if dbg.value/declare can be narrowed down to one (I think
> you mentioned in your original proposal that it seemed like everything
> could be just dbg.value?) that'd be a good step, regardless - possibly
> ahead of/while this conversation is underway. Or is it the case that the
> proposed enhanced semantics are required before that transition (because,
> if I recall correctly, dbg.value currently only extends to the end of the
> BB, whereas dbg.declare covers the whole function)? In the latter case,
> perhaps it'd be a good first step/goal/transition to do as
> cleanup/generalization anyway.
>
> - Dave
>
>> On Jan 6, 2016, at 3:58 PM, Adrian Prantl via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>> >
>> > On Jan 5, 2016, at 10:37 AM, Keno Fischer <kfischer at college.harvard.edu>
>> wrote:
>> > On Tue, Jan 5, 2016 at 6:59 PM, Adrian Prantl <aprantl at apple.com>
>> wrote:
>> > Thanks for the clarification, Paul!
>> > Keno, just a few more questions for my understanding:
>> >
>> > >     - Indicating that a value changed at source level (e.g. because an
>> > >       assignment occurred)
>> >
>> > This is done by a key call.
>> >
>> > Correct
>> >
>> > >     - Indicating that the same value is now available in a new
>> location
>> >
>> > Additional, alternative locations with identical contents are added by
>> passing in the token from a key call.
>> >
>> > Correct
>> >
>> > >     - Indicating that a value is no longer available in some location
>> >
>> > This is done by another key call (possibly with an %undef location).
>> >
>> > Not quite. Another key call could be used if all locations are now
>> invalid. However, to remove just a single location, I was proposing:
>> >
>> > ; This is the key call
>> > %first = call token @llvm.dbg.value(token undef, %someloc,
>> >                                   metadata !var, metadata !())
>> >
>> > ; This adds a location
>> > %second = call token @llvm.dbg.value(token %first, %someotherloc,
>> >                                   metadata !var, metadata !())
>> >
>> > ; This removes the (%second) location
>> > %third = call token @llvm.dbg.value(token %second, metadata token undef,
>> >                                   metadata !var, metadata !())
>> >
>> > Thus, to remove a location you always pass in the token of the call
>> that added the location. This is also why I'm requiring the second
>> argument to be `token undef`: no valid location can be of type token, and
>> I wanted to avoid the situation in which a location gets replaced by undef
>> everywhere, accidentally turning the call into a removal of the location
>> specified by the key call.
>> >
>> > Makes sense. If I understand your comment correctly, the following
>> snippet:
>> >
>> > %1 = ...
>> > %token = call llvm.dbg.value(token %undef, %1, !var, !())
>> > %2 = ...
>> > call llvm.dbg.value(token %token, %undef, !var, !())
>> > call llvm.dbg.value(token %undef, %2, !var, !())
>> >
>> > is equivalent to
>> >
>> > %1 = ...
>> > call llvm.dbg.value(token %undef, %1, !var, !())
>> > %2 = ...
>> > call llvm.dbg.value(token %undef, %2, !var, !())
>> >
>> > and both are legal.
>> >
>> > > > >
>> > > > >     - To add a location with the same value for the same
>> variable, you
>> > > > pass the
>> > > > >       token of the FIRST llvm.dbg.value, as this llvm.dbg.value's
>> first
>> > > > argument
>> > > > >       E.g. to add another location for the variable above:
>> > > > >
>> > > > >         %second = call token @llvm.dbg.value(token %first,
>> metadata
>> > > > %val2,
>> > > > >                                             metadata !var,
>> metadata
>> > > > !expr2)
>> > > >
>> > > > Does this invalidate the first location, or does this add an
>> additional
>> > > > location
>> > > > to the set of locations for var at this point? If I want to add a
>> third
>> > > > location,
>> > > > which token do I pass in? Can you explain a bit more what
>> information the
>> > > > token
>> > > > allows us to express that is currently not possible?
>> > > >
>> > >
>> > > It adds a second location. If you want to add a third location you
>> pass in
>> > > the first token again.
>> > > Thus the first call (key call) indicates a change of values, and all
>> > > locations that have the same value should use the key call's token.
>> > >
>> >
>> > Ok. Looks like this is going to be somewhat verbose for partial updates
>> of SROA’ed aggregates as in the following example:
>> >
>> > // struct s { int i, j };
>> > // void foo(struct s) { s.j = 0; ... }
>> >
>> > define void @foo(i32 %i, i32 %j) {
>> >   %token = call llvm.dbg.value(token %undef, %i, !Struct,
>> !DIExpression(DW_OP_bit_piece(0, 32)))
>> >            call llvm.dbg.value(token %token, %j, !Struct,
>> !DIExpression(DW_OP_bit_piece(32, 32)))
>> >   ...
>> >
>> >   ; have to repeat %i here:
>> >   %tok2 = call llvm.dbg.value(token %undef, %i, !Struct,
>> !DIExpression(DW_OP_bit_piece(0, 32)))
>> >           call llvm.dbg.value(token %tok2, metadata i32 0, !Struct,
>> !DIExpression(DW_OP_bit_piece(32, 32)))
>> >
>> > On the upside, having all this information explicit could simplify the
>> code in DwarfDebug::buildLocationList().
>> >
>> > Yeah, this is true. We could potentially extend the semantics by
>> allowing separate key calls for pieces, i.e.
>> >
>> > %token = call llvm.dbg.value(token %undef, %i, !Struct,
>> !DIExpression(DW_OP_bit_piece(0, 32)))
>> >            call llvm.dbg.value(token undef, %j, !Struct,
>> !DIExpression(DW_OP_bit_piece(32, 32)))
>> >
>> > ; This now only invalidates the .j part
>> > %tok2 = call llvm.dbg.value(token %undef, %j, !Struct,
>> !DIExpression(DW_OP_bit_piece(32, 32)))
>> >
>> > In that case we would probably have to require that all
>> DW_OP_bit_pieces in non-key-call expressions are a subrange of those in the
>> associated key call.
>> >
>> > This way all non-key-call additional locations are describing
>> alternative locations for (a subset of) the bits described by the key-call
>> location. Makes sense, and again would simplify the backend’s work.
>> >
>> >
>> > Is there any information in the tokens that could not be recovered by a
>> static analysis of the debug intrinsics?
>> > Note that having redundant information available explicitly is not
>> necessarily a bad thing.
>> >
>> > I am not entirely sure what you are proposing. You somehow need to be
>> able to encode which dbg.values invalidate previous locations and which do
>> not. Since we're describing front-end variables this will generally depend
>> on front-end semantics, so I'm not sure what a generic analysis pass can do
>> here without requiring language-specific analysis.
>> >
>> > Right. Determining whether two locations have equivalent contents is
>> not generally decidable.
>> >
>> > The one difference I noticed so far is that alternative locations allow
>> earlier locations to outlive locations that are dominated by them:
>> >   %loc = dbg.value(%undef, var, ...)
>> >   ...
>> >   %alt = dbg.value(%loc, var, ...)
>> >   ...
>> >   ; alt becomes unavailable
>> >   ...
>> >   ; %loc is still available here.
>> >
>> > Any other advantages that I missed?
>> >
>> > -- adrian
>> >
>> >
>> > One thing I’m wondering about is whether we couldn’t design a
>> friendlier (assembler) syntax for the three different use-cases:
>> >   %tok1 = call llvm.dbg.value(token %undef, %1, !var, !())
>> >   %tok2 = call llvm.dbg.value(token %tok1, %2, !var, !())
>> >   %tok3 = call llvm.dbg.value(token %tok1, %undef, !var, !())
>> >
>> > Could be written as e.g.:
>> >
>> >   %tok1 = call llvm.dbg.value.new(%1, !var, !())
>> >   %tok2 = call llvm.dbg.value.add(token %tok1, %2, !var, !())
>> >   %tok3 = call llvm.dbg.value.delete(token %tok1, !var, !())
>> >
>> > -- adrian
>> > _______________________________________________
>> > LLVM Developers mailing list
>> > llvm-dev at lists.llvm.org
>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev