[llvm-dev] Proposal for multi location debug info support in LLVM IR

Fri Jan 15 14:55:20 PST 2016

On Fri, Jan 15, 2016 at 2:44 PM, Keno Fischer <kfischer at college.harvard.edu>
wrote:

> Adrian had proposed the following staging:
>
> 1. Remove offset argument from dbg.value
> 2. Unify dbg.value and dbg.declare
> 3. Full implementation
>
> I'm not yet sure what to do about the difference in dbg.declare semantics.
> For example, I think the following currently works:
>
> ```
> top:
> %x = alloca
> br else
>
> if:
> dbg.declare(%x...
> unreachable
>
> else:
> # dbg.declare still applies here
>

Hmm - I thought we had some (perhaps undocumented) rule that dbg.declares
should all go in the entry with the allocas? I assume Clang follows this
rule at least.

> ```
>
> I think it would be reasonable to switch to the proposed dominance
> semantics during step 2, but we'll have to see if that negatively affects
> any real-world test cases.
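
To make the difference concrete, here is a rough Python sketch (not LLVM
code) of the example CFG above. It assumes "dominance semantics" means a
dbg.declare only applies to blocks dominated by the block that contains it;
the `dominators` helper and the dictionary encoding of the CFG are made up
for illustration.

```python
def dominators(cfg, entry):
    """Iterative dominator sets for the blocks reachable from `entry`."""
    # Collect the blocks reachable from the entry block.
    reachable, stack = set(), [entry]
    while stack:
        block = stack.pop()
        if block not in reachable:
            reachable.add(block)
            stack.extend(cfg[block])

    preds = {b: [p for p in reachable if b in cfg[p]] for b in reachable}
    dom = {b: set(reachable) for b in reachable}
    dom[entry] = {entry}

    # Shrink the sets until a fixed point is reached.
    changed = True
    while changed:
        changed = False
        for b in reachable - {entry}:
            new = {b} | set.intersection(*(dom[p] for p in preds[b]))
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


# The example above: `top` branches to `else`; nothing branches to `if`.
cfg = {"top": ["else"], "if": [], "else": []}
dom = dominators(cfg, "top")

assert dom["else"] == {"top", "else"}
assert "if" not in dom["else"]  # the declare in `if` would not cover `else`
```

Under that assumption the dbg.declare in `if` would stop applying in `else`,
which is the behaviour change in question.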
>
> On Fri, Jan 15, 2016 at 11:38 PM, David Blaikie via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> I'm reading/following along - discussion so far sounds reasonable to me.
>>
>> Only minor note: if dbg.value/declare can be narrowed down to one (I
>> think you mentioned in your original proposal that it seemed like
>> everything could be just dbg.value?) that'd be a good step, regardless -
>> possibly ahead of/while this conversation is underway. Or is it the case
>> that the proposed enhanced semantics are required before that transition
>> (because currently dbg.value only goes to the end of the BB? if I recall
>> correctly, whereas dbg.declare is the whole function)? In the latter case,
>> perhaps it'd be a good first step/goal/transition to do as
>> cleanup/generalization anyway.
>>
>> - Dave
>>
>>> On Jan 6, 2016, at 3:58 PM, Adrian Prantl via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>> >
>>> > On Jan 5, 2016, at 10:37 AM, Keno Fischer <
>>> kfischer at college.harvard.edu> wrote:
>>> > On Tue, Jan 5, 2016 at 6:59 PM, Adrian Prantl <aprantl at apple.com>
>>> wrote:
>>> > Thanks for the clarification, Paul!
>>> > Keno, just a few more questions for my understanding:
>>> >
>>> > >     - Indicating that a value changed at source level (e.g. because an
>>> > >       assignment occurred)
>>> >
>>> > This is done by a key call.
>>> >
>>> > Correct
>>> >
>>> > >     - Indicating that the same value is now available in a new location
>>> >
>>> > Additional, alternative locations with identical contents are added by
>>> > passing in the token from a key call.
>>> >
>>> > Correct
>>> >
>>> > >     - Indicating that a value is no longer available in some location
>>> >
>>> > This is done by another key call (possibly with an %undef location).
>>> >
>>> > Not quite. Another key call could be used if all locations are now
>>> > invalid. However, to remove just a single location, I was proposing:
>>> >
>>> > ; This is the key call
>>> > %first = call token @llvm.dbg.value(token undef, %someloc,
>>> >                                   metadata !var, metadata !())
>>> >
>>> > ; This adds a location
>>> > %second = call token @llvm.dbg.value(token %first, %someotherloc,
>>> >                                   metadata !var, metadata !())
>>> >
>>> > ; This removes the (%second) location
>>> > %third = call token @llvm.dbg.value(token %second, metadata token undef,
>>> >                                   metadata !var, metadata !())
>>> >
>>> > Thus, to remove a location you always pass in the token of the call
>>> > that added the location. This is also the reason why I'm requiring the
>>> > second argument to be `token undef`: no valid location can be of type
>>> > token, and I wanted to avoid the situation in which a location gets
>>> > replaced by undef everywhere, accidentally turning into a removal of the
>>> > location specified by the key call.
>>> >
>>> > Makes sense. If I understand your comment correctly, the following
>>> snippet:
>>> >
>>> > %1 = ...
>>> > %token = call llvm.dbg.value(token %undef, %1, !var, !())
>>> > %2 = ...
>>> > call llvm.dbg.value(token %token, %undef, !var, !())
>>> > call llvm.dbg.value(token %undef, %2, !var, !())
>>> >
>>> > is equivalent to
>>> >
>>> > %1 = ...
>>> > call llvm.dbg.value(token %undef, %1, !var, !())
>>> > %2 = ...
>>> > call llvm.dbg.value(token %undef, %2, !var, !())
>>> >
>>> > and both are legal.
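
One way to sanity-check that equivalence is a small Python model of the
proposed semantics. This is only a sketch of my reading of the proposal: the
`replay` helper and the tuple encoding of a call are made up for
illustration, and nothing here is an LLVM API.

```python
UNDEF = None  # stands in for both `token undef` and an undef location


def replay(events):
    """Return the set of locations live for one variable after `events`.

    Each event is (result_token, token_arg, location):
      - token_arg is UNDEF: key call, drops every earlier location
        (and installs `location` unless that is UNDEF as well)
      - location is UNDEF: removes the location added by the call
        whose result token is token_arg
      - otherwise: adds an alternative location under the given key call
    """
    live = {}  # token of the adding call -> location
    for result, token_arg, location in events:
        if token_arg is UNDEF:
            live = {} if location is UNDEF else {result: location}
        elif location is UNDEF:
            live.pop(token_arg, None)
        else:
            live[result] = location
    return set(live.values())


# First snippet: explicit removal of %1, then a new key call for %2.
with_removal = [
    ("token", UNDEF, "%1"),
    ("t1", "token", UNDEF),
    ("t2", UNDEF, "%2"),
]
# Second snippet: the new key call alone.
without_removal = [
    ("a", UNDEF, "%1"),
    ("b", UNDEF, "%2"),
]

assert replay(with_removal) == replay(without_removal) == {"%2"}
```

In this model the explicit removal is redundant because the following key
call already discards every earlier location.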
>>> >
>>> > > > >
>>> > > > >     - To add a location with the same value for the same
>>> variable, you
>>> > > > pass the
>>> > > > >       token of the FIRST llvm.dbg.value, as this
>>> llvm.dbg.value's first
>>> > > > argument
>>> > > > >       E.g. to add another location for the variable above:
>>> > > > >
>>> > > > >         %second = call token @llvm.dbg.value(token %first, metadata %val2,
>>> > > > >                                             metadata !var, metadata !expr2)
>>> > > >
>>> > > > Does this invalidate the first location, or does this add an
>>> additional
>>> > > > location
>>> > > > to the set of locations for var at this point? If I want to add a
>>> third
>>> > > > location,
>>> > > > which token do I pass in? Can you explain a bit more what
>>> information the
>>> > > > token
>>> > > > allows us to express that is currently not possible?
>>> > > >
>>> > >
>>> > > It adds a second location. If you want to add a third location you
>>> pass in
>>> > > the first token again.
>>> > > Thus the first call (key call) indicates a change of values, and all
>>> > > locations that have the same value should use the key call's token.
>>> > >
>>> >
>>> > Ok. Looks like this is going to be somewhat verbose for partial
>>> updates of SROA’ed aggregates as in the following example:
>>> >
>>> > // struct s { int i, j };
>>> > // void foo(struct s s) { s.j = 0; ... }
>>> >
>>> > define void @foo(i32 %i, i32 %j) {
>>> >   %token = call llvm.dbg.value(token %undef, %i, !Struct,
>>> >                                !DIExpression(DW_OP_bit_piece(0, 32)))
>>> >            call llvm.dbg.value(token %token, %j, !Struct,
>>> >                                !DIExpression(DW_OP_bit_piece(32, 32)))
>>> >   ...
>>> >
>>> >   ; have to repeat %i here:
>>> >   %tok2 = call llvm.dbg.value(token %undef, %i, !Struct,
>>> >                               !DIExpression(DW_OP_bit_piece(0, 32)))
>>> >           call llvm.dbg.value(token %tok2, metadata i32 0, !Struct,
>>> >                               !DIExpression(DW_OP_bit_piece(32, 32)))
>>> >
>>> > On the upside, having all this information explicit could simplify the
>>> code in DwarfDebug::buildLocationList().
>>> >
>>> > Yeah, this is true. We could potentially extend the semantics by
>>> allowing separate key calls for pieces, i.e.
>>> >
>>> > %token = call llvm.dbg.value(token %undef, %i, !Struct,
>>> >                              !DIExpression(DW_OP_bit_piece(0, 32)))
>>> >          call llvm.dbg.value(token undef, %j, !Struct,
>>> >                              !DIExpression(DW_OP_bit_piece(32, 32)))
>>> >
>>> > ; This now only invalidates the .j part
>>> > %tok2 = call llvm.dbg.value(token %undef, %j, !Struct,
>>> >                             !DIExpression(DW_OP_bit_piece(32, 32)))
>>> >
>>> > In that case we would probably have to require that all
>>> DW_OP_bit_pieces in non-key-call expressions are a subrange of those in the
>>> associated key call.
>>> >
>>> > This way all non-key-call additional locations are describing
>>> > alternative locations for (a subset of) the bits described by the
>>> > key-call location. Makes sense, and again would simplify the backend’s work.
>>> >
>>> >
>>> > Is there any information in the tokens that could not be recovered by
>>> a static analysis of the debug intrinsics?
>>> > Note that having redundant information available explicitly is not
>>> necessarily a bad thing.
>>> >
>>> > I am not entirely sure what you are proposing. You somehow need to be
>>> able to encode which dbg.values invalidate previous locations and which do
>>> not. Since we're describing front-end variables this will generally depend
>>> on front-end semantics, so I'm not sure what a generic analysis pass can do
>>> here without requiring language-specific analysis.
>>> >
>>> > Right. Determining whether two locations have equivalent contents is
>>> not generally decidable.
>>> >
>>> > The one difference I noticed so far is that alternative locations
>>> allow earlier locations to outlive locations that are dominated by them:
>>> >   %loc = dbg.value(%undef, var, ...)
>>> >   ...
>>> >   %alt = dbg.value(%loc, var, ...)
>>> >   ...
>>> >   ; alt becomes unavailable
>>> >   ...
>>> >   ; %loc is still available here.
>>> >
>>> > Any other advantages that I missed?
>>> >
>>> > -- adrian
>>> >
>>> >
>>> > One thing I’m wondering about is whether we couldn’t design a
>>> friendlier (assembler) syntax for the three different use-cases:
>>> >   %tok1 = call llvm.dbg.value(token %undef, %1, !var, !())
>>> >   %tok2 = call llvm.dbg.value(token %tok1, %2, !var, !())
>>> >   %tok3 = call llvm.dbg.value(token %tok1, %undef, !var, !())
>>> >
>>> > Could be written as e.g.:
>>> >
>>> >   %tok1 = call llvm.dbg.value.new(%1, !var, !())
>>> >   %tok2 = call llvm.dbg.value.add(token %tok1, %2, !var, !())
>>> >   %tok3 = call llvm.dbg.value.delete(token %tok1, !var, !())
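
If I read the suggestion right, the three spellings are fully determined by
which arguments of the single-intrinsic form are undef. A small Python
sketch of that mapping follows; `sugar_for` is a made-up helper, and the
intrinsic names are only the ones floated above, not existing intrinsics.

```python
UNDEF = None  # stands in for both `token undef` and an undef location


def sugar_for(token_arg, location):
    """Map one call of the single-intrinsic form to the suggested spelling."""
    if token_arg is UNDEF:
        return "llvm.dbg.value.new"     # key call: the value changed
    if location is UNDEF:
        return "llvm.dbg.value.delete"  # drop the location added by token_arg
    return "llvm.dbg.value.add"         # another location for the same value


assert sugar_for(UNDEF, "%1") == "llvm.dbg.value.new"
assert sugar_for("%tok1", "%2") == "llvm.dbg.value.add"
assert sugar_for("%tok1", UNDEF) == "llvm.dbg.value.delete"
```

Since the mapping is one-to-one in this sketch, the friendlier syntax would
not lose any information relative to the single-intrinsic form.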
>>> >
>>> > -- adrian
>>> > _______________________________________________
>>> > LLVM Developers mailing list
>>> > llvm-dev at lists.llvm.org
>>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev