[llvm-dev] Proposal for multi location debug info support in LLVM IR
Adrian Prantl via llvm-dev
llvm-dev at lists.llvm.org
Wed Jan 6 13:58:54 PST 2016
> On Jan 5, 2016, at 10:37 AM, Keno Fischer <kfischer at college.harvard.edu> wrote:
>
> On Tue, Jan 5, 2016 at 6:59 PM, Adrian Prantl <aprantl at apple.com> wrote:
> Thanks for the clarification, Paul!
> Keno, just a few more questions for my understanding:
>
> > - Indicating that a value changed at source level (e.g. because an
> > assignment occurred)
>
> This is done by a key call.
>
> Correct
>
> > - Indicating that the same value is now available in a new location
>
> Additional, alternative locations with identical contents are added by passing in the token from a key call.
>
> Correct
>
> > - Indicating that a value is no longer available in some location
>
> This is done by another key call (possibly with an %undef location).
>
> Not quite. Another key call could be used if all locations are now invalid. However, to just remove a single value, I was proposing
>
> ; This is the key call
> %first = call token @llvm.dbg.value(token undef, %someloc,
> metadata !var, metadata !())
>
> ; This adds a location
> %second = call token @llvm.dbg.value(token %first, %someotherloc,
> metadata !var, metadata !())
>
> ; This removes the (%second) location
> %third = call token @llvm.dbg.value(token %second, metadata token undef,
> metadata !var, metadata !())
>
> Thus, to remove a location you always pass in the token of the call that added the location. This is also why I'm requiring the second argument to be `token undef`: no valid location can be of type token, and I wanted to avoid the situation in which a location gets replaced by undef everywhere, accidentally turning into a removal of the location specified by the key call.
Makes sense. If I understand your comment correctly, the following snippet:
%1 = ...
%token = call llvm.dbg.value(token %undef, %1, !var, !())
%2 = ...
call llvm.dbg.value(token %token, %undef, !var, !())
call llvm.dbg.value(token %undef, %2, !var, !())
is equivalent to
%1 = ...
call llvm.dbg.value(token %undef, %1, !var, !())
%2 = ...
call llvm.dbg.value(token %undef, %2, !var, !())
and both are legal.
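A minimal sketch of why the two snippets end up equivalent, written as a toy Python model rather than LLVM code. The class `VarLocations`, its method names, and the use of integers as tokens are all assumptions made for illustration; the model only encodes the three rules discussed in this thread (key call, added location, removal).

```python
import itertools

UNDEF = None  # stands in for both `token undef` and an undef location


class VarLocations:
    """Toy model of the set of live locations for one source variable."""

    def __init__(self):
        self._next = itertools.count(1)
        self.live = {}  # token of the adding call -> location

    def dbg_value(self, token, loc):
        new_token = next(self._next)
        if token is UNDEF:
            # Key call: the value changed, so every older location is stale.
            self.live = {}
            if loc is not UNDEF:
                self.live[new_token] = loc
        elif loc is UNDEF:
            # Removal: drop the location added by the call that made `token`.
            self.live.pop(token, None)
        else:
            # Additional location with the same contents. This sketch does
            # not check that `token` really is the current key call's token.
            self.live[new_token] = loc
        return new_token

    def locations(self):
        return set(self.live.values())


def with_explicit_removal():
    v = VarLocations()
    token = v.dbg_value(UNDEF, "%1")
    v.dbg_value(token, UNDEF)   # remove %1 explicitly
    v.dbg_value(UNDEF, "%2")    # new key call
    return v.locations()


def without_removal():
    v = VarLocations()
    v.dbg_value(UNDEF, "%1")
    v.dbg_value(UNDEF, "%2")    # new key call already invalidates %1
    return v.locations()
```

In this model both functions leave `%2` as the only live location, because the second key call clears whatever the first one established.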
> > > >
> > > > - To add a location with the same value for the same variable, you pass the
> > > > token of the FIRST llvm.dbg.value, as this llvm.dbg.value's first argument.
> > > > E.g. to add another location for the variable above:
> > > >
> > > > %second = call token @llvm.dbg.value(token %first, metadata %val2,
> > > > metadata !var, metadata !expr2)
> > >
> > > Does this invalidate the first location, or does this add an additional location
> > > to the set of locations for var at this point? If I want to add a third location,
> > > which token do I pass in? Can you explain a bit more what information the token
> > > allows us to express that is currently not possible?
> > >
> >
> > It adds a second location. If you want to add a third location you pass in
> > the first token again.
> > Thus the first call (key call) indicates a change of values, and all
> > locations that have the same value should use the key call's token.
> >
>
> Ok. Looks like this is going to be somewhat verbose for partial updates of SROA’ed aggregates as in the following example:
>
> // struct s { int i, j; };
> // void foo(struct s s) { s.j = 0; ... }
>
> define void @foo(i32 %i, i32 %j) {
> %token = call llvm.dbg.value(token %undef, %i, !Struct, !DIExpression(DW_OP_bit_piece(0, 32)))
> call llvm.dbg.value(token %token, %j, !Struct, !DIExpression(DW_OP_bit_piece(32, 32)))
> ...
>
> ; have to repeat %i here:
> %tok2 = call llvm.dbg.value(token %undef, %i, !Struct, !DIExpression(DW_OP_bit_piece(0, 32)))
> call llvm.dbg.value(token %tok2, metadata i32 0, !Struct, !DIExpression(DW_OP_bit_piece(32, 32)))
>
> On the upside, having all this information explicit could simplify the code in DwarfDebug::buildLocationList().
>
> Yeah, this is true. We could potentially extend the semantics by allowing separate key calls for pieces, i.e.
>
> %token = call llvm.dbg.value(token %undef, %i, !Struct, !DIExpression(DW_OP_bit_piece(0, 32)))
> call llvm.dbg.value(token undef, %j, !Struct, !DIExpression(DW_OP_bit_piece(32, 32)))
>
> ; This now only invalidates the .j part
> %tok2 = call llvm.dbg.value(token %undef, %j, !Struct, !DIExpression(DW_OP_bit_piece(32, 32)))
>
> In that case we would probably have to require that all DW_OP_bit_pieces in non-key-call expressions are a subrange of those in the associated key call.
This way all non-key-call additional locations describe alternative locations for (a subset of) the bits described by the key-call location. Makes sense, and again would simplify the backend’s work.
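The subrange requirement can be sketched as a small check. This is only an illustration: the function name `is_subrange` is hypothetical, and pieces are assumed to be `(offset, size)` pairs in bits, matching how `DW_OP_bit_piece(32, 32)` is written in the examples above.

```python
def is_subrange(piece, key_piece):
    """Return True if `piece` lies entirely within `key_piece`.

    Both arguments are (offset_in_bits, size_in_bits) tuples.
    """
    off, size = piece
    key_off, key_size = key_piece
    return key_off <= off and off + size <= key_off + key_size
```

For example, an alternative location for bits (32, 32) would be accepted under a key call that describes (32, 32) or (0, 64), but rejected under a key call that only describes (0, 32).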
>
> Is there any information in the tokens that could not be recovered by a static analysis of the debug intrinsics?
> Note that having redundant information available explicitly is not necessarily a bad thing.
>
> I am not entirely sure what you are proposing. You somehow need to be able to encode which dbg.values invalidate previous locations and which do not. Since we're describing front-end variables this will generally depend on front-end semantics, so I'm not sure what a generic analysis pass can do here without requiring language-specific analysis.
Right. Determining whether two locations have equivalent contents is not generally decidable.
> The one difference I noticed so far is that alternative locations allow earlier locations to outlive locations that are dominated by them:
> %loc = dbg.value(%undef, var, ...)
> ...
> %alt = dbg.value(%loc, var, ...)
> ...
> ; alt becomes unavailable
> ...
> ; %loc is still available here.
>
> Any other advantages that I missed?
>
> -- adrian
One thing I’m wondering about is whether we couldn’t design a friendlier (assembler) syntax for the three different use-cases:
%tok1 = call llvm.dbg.value(token %undef, %1, !var, !())
%tok2 = call llvm.dbg.value(token %tok1, %2, !var, !())
%tok3 = call llvm.dbg.value(token %tok1, %undef, !var, !())
Could be written as e.g.:
%tok1 = call llvm.dbg.value.new(%1, !var, !())
%tok2 = call llvm.dbg.value.add(token %tok1, %2, !var, !())
%tok3 = call llvm.dbg.value.delete(token %tok1, !var, !())
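As a rough sketch of how the three spellings would map onto the single generic form, the following Python helpers build the operand tuple each call would carry. The function names mirror the suggested `.new` / `.add` / `.delete` suffixes but are hypothetical, and operands are plain strings here purely for illustration.

```python
UNDEF = "undef"


def dbg_value(token, loc, var, expr):
    """Generic form: the operands of a single llvm.dbg.value call."""
    return (token, loc, var, expr)


def dbg_value_new(loc, var, expr):
    # Key call: no incoming token.
    return dbg_value(UNDEF, loc, var, expr)


def dbg_value_add(key_token, loc, var, expr):
    # Additional location: reuse the key call's token.
    return dbg_value(key_token, loc, var, expr)


def dbg_value_delete(token, var, expr):
    # Removal: token of the call that added the location, undef location.
    return dbg_value(token, UNDEF, var, expr)
```

The point of the sketch is that the friendlier names carry no extra information; each one fixes an operand that the generic form leaves to the writer.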
-- adrian