[llvm-dev] Proposal for multi location debug info support in LLVM IR

Keno Fischer via llvm-dev llvm-dev at lists.llvm.org
Fri Jan 15 15:03:25 PST 2016


We do, ish, but it's not enforced as far as I can tell. I do think there is
a situation where clang can create such code (don't ask me how though, I
encountered it while hunting a different bug and just noticed it looked
odd). This was during an LTO build, so inlining related perhaps?

On Fri, Jan 15, 2016 at 11:55 PM, David Blaikie <dblaikie at gmail.com> wrote:

>
>
> On Fri, Jan 15, 2016 at 2:44 PM, Keno Fischer <
> kfischer at college.harvard.edu> wrote:
>
>> Adrian had proposed the following staging:
>>
>> 1. Remove offset argument from dbg.value
>> 2. Unify dbg.value and dbg.declare
>> 3. Full implementation
>>
>> I'm not yet sure what to do about the difference in dbg.declare
>> semantics. For example, I think the following currently works:
>>
>> ```
>> top:
>> %x = alloca
>> br else
>>
>> if:
>> dbg.declare(%x...
>> unreachable
>>
>> else:
>> # dbg.declare still applies here
>>
>
> Hmm - I thought we had some (perhaps undocumented) rule that dbg.declares
> should all go in the entry with the allocas? I assume Clang follows this
> rule at least.
>
>
>> ```
>>
>> I think it would be reasonable to switch to the proposed dominance
>> semantics during step 2, but we'll have to see if that negatively affects
>> any real-world test cases.
>>
>> On Fri, Jan 15, 2016 at 11:38 PM, David Blaikie via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> I'm reading/following along - discussion so far sounds reasonable to me.
>>>
>>> Only minor note: if dbg.value/declare can be narrowed down to one (I
>>> think you mentioned in your original proposal that it seemed like
>>> everything could be just dbg.value?) that'd be a good step, regardless -
>>> possibly ahead of/while this conversation is underway. Or is it the case
>>> that the proposed enhanced semantics are required before that transition
>>> (because currently dbg.value only goes to the end of the BB? if I recall
>>> correctly, whereas dbg.declare is the whole function)? In the latter case,
>>> perhaps it'd be a good first step/goal/transition to do as
>>> cleanup/generalization anyway.
>>>
>>> - Dave
>>>
>>>> On Jan 6, 2016, at 3:58 PM, Adrian Prantl via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> >
>>>> > On Jan 5, 2016, at 10:37 AM, Keno Fischer <
>>>> kfischer at college.harvard.edu> wrote:
>>>> > On Tue, Jan 5, 2016 at 6:59 PM, Adrian Prantl <aprantl at apple.com>
>>>> wrote:
>>>> > Thanks for the clarification, Paul!
>>>> > Keno, just a few more questions for my understanding:
>>>> >
>>>> > >     - Indicating that a value changed at source level (e.g. because
>>>> an
>>>> > >       assignment occurred)
>>>> >
>>>> > This is done by a key call.
>>>> >
>>>> > Correct
>>>> >
>>>> > >     - Indicating that the same value is now available in a new
>>>> location
>>>> >
>>>> > Additional, alternative locations with identical contents are added
>>>> by passing in the token from a key call.
>>>> >
>>>> > Correct
>>>> >
>>>> > >     - Indicating that a value is no longer available in some
>>>> location
>>>> >
>>>> > This is done by another key call (possibly with an %undef location).
>>>> >
>>>> > Not quite. Another key call could be used if all locations are now
>>>> invalid. However, to just remove a single value, I was proposing
>>>> >
>>>> > ; This is the key call
>>>> > %first = call token @llvm.dbg.value(token undef, %someloc,
>>>> >                                   metadata !var, metadata !())
>>>> >
>>>> > ; This adds a location
>>>> > %second = call token @llvm.dbg.value(token %first, %someotherloc,
>>>> >                                   metadata !var, metadata !())
>>>> >
>>>> > ; This removes the (%second) location
>>>> > %third = call token @llvm.dbg.value(token %second, metadata token
>>>> undef,
>>>> >                                   metadata !var, metadata !())
>>>> >
>>>> > Thus, to remove a location you always pass in the token of the call
>>>> that added the location. This is also the reason why I'm requiring the
>>>> second argument to be `token undef` because no valid location can be of
>>>> type token, and I wanted to avoid the situation in which a location gets
>>>> replaced by undef everywhere, accidentally turning into a removal of the
>>>> location specified by the key call.
>>>> >
>>>> > Makes sense. If I understand your comment correctly, the following
>>>> snippet:
>>>> >
>>>> > %1 = ...
>>>> > %token = call llvm.dbg.value(token %undef, %1, !var, !())
>>>> > %2 = ...
>>>> > call llvm.dbg.value(token %token, %undef, !var, !())
>>>> > call llvm.dbg.value(token %undef, %2, !var, !())
>>>> >
>>>> > is equivalent to
>>>> >
>>>> > %1 = ...
>>>> > call llvm.dbg.value(token %undef, %1, !var, !())
>>>> > %2 = ...
>>>> > call llvm.dbg.value(token %undef, %2, !var, !())
>>>> >
>>>> > and both are legal.
>>>> >
>>>> > > > >
>>>> > > > >     - To add a location with the same value for the same
>>>> variable, you
>>>> > > > pass the
>>>> > > > >       token of the FIRST llvm.dbg.value, as this
>>>> llvm.dbg.value's first
>>>> > > > argument
>>>> > > > >       E.g. to add another location for the variable above:
>>>> > > > >
>>>> > > > >         %second = call token @llvm.dbg.value(token %first,
>>>> metadata
>>>> > > > %val2,
>>>> > > > >                                             metadata !var,
>>>> metadata
>>>> > > > !expr2)
>>>> > > >
>>>> > > > Does this invalidate the first location, or does this add an
>>>> additional
>>>> > > > location
>>>> > > > to the set of locations for var at this point? If I want to add a
>>>> third
>>>> > > > location,
>>>> > > > which token do I pass in? Can you explain a bit more what
>>>> information the
>>>> > > > token
>>>> > > > allows us to express that is currently not possible?
>>>> > > >
>>>> > >
>>>> > > It adds a second location. If you want to add a third location you
>>>> pass in
>>>> > > the first token again.
>>>> > > Thus the first call (key call) indicates a change of values, and all
>>>> > > locations that have the same value should use the key call's token.
>>>> > >
>>>> >
>>>> > Ok. Looks like this is going to be somewhat verbose for partial
>>>> updates of SROA’ed aggregates as in the following example:
>>>> >
>>>> > // struct s { int i, j };
>>>> > // void foo(struct s) { s.j = 0; ... }
>>>> >
>>>> > define void @foo(i32 %i, i32 %j) {
>>>> >   %token = call llvm.dbg.value(token %undef, %i, !Struct,
>>>> !DIExpression(DW_OP_bit_piece(0, 32)))
>>>> >            call llvm.dbg.value(token %token, %j, !Struct,
>>>> !DIExpression(DW_OP_bit_piece(32, 32)))
>>>> >   ...
>>>> >
>>>> >   ; have to repeat %i here:
>>>> >   %tok2 = call llvm.dbg.value(token %undef, %i, !Struct,
>>>> !DIExpression(DW_OP_bit_piece(0, 32)))
>>>> >           call llvm.dbg.value(token %tok2, metadata i32 0, !Struct,
>>>> !DIExpression(DW_OP_bit_piece(32, 32)))
>>>> >
>>>> > On the upside, having all this information explicit could simplify
>>>> the code in DwarfDebug::buildLocationList().
>>>> >
>>>> > Yeah, this is true. We could potentially extend the semantics by
>>>> allowing separate key calls for pieces, i.e.
>>>> >
>>>> > %token = call llvm.dbg.value(token %undef, %i, !Struct,
>>>> !DIExpression(DW_OP_bit_piece(0, 32)))
>>>> >            call llvm.dbg.value(token undef, %j, !Struct,
>>>> !DIExpression(DW_OP_bit_piece(32, 32)))
>>>> >
>>>> > ; This now only invalidates the .j part
>>>> > %tok2 = call llvm.dbg.value(token %undef, %j, !Struct,
>>>> !DIExpression(DW_OP_bit_piece(32, 32)))
>>>> >
>>>> > In that case we would probably have to require that all
>>>> DW_OP_bit_pieces in non-key-call expressions are a subrange of those in the
>>>> associated key call.
>>>> >
>>>> > This way all non-key-call additional locations are describing
>>>> alternative locations for (a subset of) the bits described by the key-call
>>>> location. Makes sense, and again would simplify the backend’s work.
>>>> >
>>>> >
>>>> > Is there any information in the tokens that could not be recovered by
>>>> a static analysis of the debug intrinsics?
>>>> > Note that having redundant information available explicitly is not
>>>> necessarily a bad thing.
>>>> >
>>>> > I am not entirely sure what you are proposing. You somehow need to be
>>>> able to encode which dbg.values invalidate previous locations and which do
>>>> not. Since we're describing front-end variables this will generally depend
>>>> on front-end semantics, so I'm not sure what a generic analysis pass can do
>>>> here without requiring language-specific analysis.
>>>> >
>>>> > Right. Determining whether two locations have equivalent contents is
>>>> not generally decidable.
>>>> >
>>>> > The one difference I noticed so far is that alternative locations
>>>> allow earlier locations to outlive locations that are dominated by them:
>>>> >   %loc = dbg.value(%undef, var, ...)
>>>> >   ...
>>>> >   %alt = dbg.value(%loc, var, ...)
>>>> >   ...
>>>> >   ; alt becomes unavailable
>>>> >   ...
>>>> >   ; %loc is still available here.
>>>> >
>>>> > Any other advantages that I missed?
>>>> >
>>>> > -- adrian
>>>> >
>>>> >
>>>> > One thing I’m wondering about is whether we couldn’t design a
>>>> friendlier (assembler) syntax for the three different use-cases:
>>>> >   %tok1 = call llvm.dbg.value(token %undef, %1, !var, !())
>>>> >   %tok2 = call llvm.dbg.value(token %tok1, %2, !var, !())
>>>> >   %tok3 = call llvm.dbg.value(token %tok1, %undef, !var, !())
>>>> >
>>>> > Could be written as e.g.:
>>>> >
>>>> >   %tok1 = call llvm.dbg.value.new(%1, !var, !())
>>>> >   %tok2 = call llvm.dbg.value.add(token %tok1, %2, !var, !())
>>>> >   %tok3 = call llvm.dbg.value.delete(token %tok1, !var, !())
>>>> >
>>>> > -- adrian
>>>> > _______________________________________________
>>>> > LLVM Developers mailing list
>>>> > llvm-dev at lists.llvm.org
>>>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>
>>
>
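
For anyone following along in the archive: the key/add/remove token rules
quoted above can be modeled as a small set of live locations per variable.
This is only a sketch in Python, not LLVM code. The class VariableLocations,
its dbg_value method, the UNDEF placeholder, and the choice to reject an
addition that does not use the current key call's token are all assumptions
made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

UNDEF = None  # stands in for `token undef` or an undef location (assumption)


@dataclass
class VariableLocations:
    """Hypothetical model of the live locations of one source variable."""
    live: dict = field(default_factory=dict)  # token -> location
    key: Optional[int] = None                 # token of the current key call
    _next: int = 0                            # next token number to hand out

    def dbg_value(self, token, location):
        """Model one proposed llvm.dbg.value call; return the token it yields."""
        new_token = self._next
        self._next += 1
        if token is UNDEF:
            # Key call: the value changed, so all earlier locations are stale.
            self.live = {}
            self.key = new_token
            if location is not UNDEF:
                self.live[new_token] = location
        elif location is UNDEF:
            # Removal: drop the location added by the call that made `token`.
            self.live.pop(token, None)
        else:
            # Addition: another location holding the same value as the key call.
            if token != self.key:
                raise ValueError("additions must use the key call's token")
            self.live[new_token] = location
        return new_token

    def locations(self):
        return sorted(self.live.values())


# The key / add / remove sequence from the thread:
v = VariableLocations()
first = v.dbg_value(UNDEF, "%someloc")        # key call
second = v.dbg_value(first, "%someotherloc")  # adds a location
third = v.dbg_value(second, UNDEF)            # removes the %second location
print(v.locations())  # ['%someloc']
```

Under this model, Adrian's two snippets (explicitly removing %1 before the
new key call for %2, versus just issuing the new key call) leave the same
live set, which matches the claim that they are equivalent.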