[llvm-dev] DW_OP_implicit_pointer design/implementation in general

Alok Sharma via llvm-dev llvm-dev at lists.llvm.org
Mon Dec 23 11:03:40 PST 2019


Hi Paul,

David has already replied about how DW_OP_LLVM_explicit_pointer came
about; let me explain a bit more.

It addresses a case David raised: a variable that points to a temporary
(which happens with references). You have already suggested a solution for
the same case (using an artificial variable for the temporary).
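
For concreteness, source of roughly the following shape would produce this case. This is an assumed reconstruction, not the actual test file: the names `sink` and `p` come from the DWARF dump quoted further down (`_Z4sinkRKi` demangles to `sink(int const&)`), while the function body, `call_sink` and `check` are made up for illustration.

```cpp
#include <cassert>

// Assumed shape of the callee: it takes a const reference, matching
// _Z4sinkRKi in the dump. The body is hypothetical.
static inline int sink(const int &p) {
  return p + 1;
}

// Hypothetical caller. The literal 3 is materialized as an unnamed
// temporary so that 'p' has something to bind to. Once sink() is inlined
// and the temporary is folded to a constant, 'p' has no memory to point
// at and there is no named variable DIE to refer to; only the pointed-to
// value (3) is still known.
int call_sink() {
  return sink(3);
}

// Program behaviour is unaffected; only the debug description of 'p'
// is at stake.
void check() {
  assert(call_sink() == 4);
}
```

Here a debugger stopped inside the inlined `sink` could at best show `*p` as 3; the value of `p` itself no longer exists.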

To make the most of the discussion, I tried to provide an additional option
to choose from.

Note that even GCC does not handle this case. So whatever GCC does should
be a *must* for us, and anything beyond that should be *aspire*.

Now, to include that *aspire* case, we have two options:

1. Create an artificial variable (downside: we need to carry an extra
artificial DIE)
2. Define the value inline using DW_OP_LLVM_explicit_pointer (downside: a
new operator needs to be introduced)

I think we should go ahead with the *must* functionality anyway and choose
one of the options for the *aspire* functionality.

Regards,
Alok

Since this case


On Thu, Dec 19, 2019 at 9:57 PM Robinson, Paul <paul.robinson at sony.com>
wrote:

> I regret to say I also have not been following this with the attention it
> deserves, and I am pretty much on holiday until 14 January.
>
> I am particularly surprised by the appearance of something called
> DW_OP_LLVM_explicit_pointer, which I wouldn’t have thought necessary and
> don’t remember from the discussions that I did read.
>
> I will try to mend my ways and pay more attention when I return.
>
> --paulr
>
>
>
> *From:* David Blaikie <dblaikie at gmail.com>
> *Sent:* Wednesday, December 18, 2019 6:24 PM
> *To:* Alok Sharma <aloksharma.knit at gmail.com>; Adrian Prantl <
> aprantl at apple.com>; Jonas Devlieghere <jdevlieghere at apple.com>; Robinson,
> Paul <paul.robinson at sony.com>
> *Cc:* Jeremy Morse <jeremy.morse.llvm at gmail.com>; llvm-dev <
> llvm-dev at lists.llvm.org>; AlokKumar.Sharma at amd.com; Vedant Kumar <
> vedant_kumar at apple.com>
> *Subject:* Re: [llvm-dev] DW_OP_implicit_pointer design/implementation in
> general
>
>
>
> (I'm still pretty concerned that there are IR changes going in for a
> feature that seems incomplete and more invasive than really seems justified
> to me - though I admit I'm clearly not paying enough attention to this
> feature to have a nuanced/fully informed opinion & so maybe I just need to
> step back from all of this - but given the addition of new intrinsics, it
> seems like there should be more clear design discussion)
>
>
>
> On Tue, Dec 10, 2019 at 9:06 PM Alok Sharma <aloksharma.knit at gmail.com>
> wrote:
>
> Hi David,
>
>
>
> This is regarding the missing multilevel handling in the branch for
> explicit pointers.
>
>
>
> > * does the proposed IR format support multiple layers of dereference
> (eg: int ** where we know it ultimately points to the value 3 but can't
> describe either the first or second level pointers that get to that value)
> - it sounds like any intrinsic that's special cased to deref (like
> llvm.dbg.derefval) wouldn't be able to capture that, which seems like it's
> overly narrow/special case, then?
>
>
>
> The PoC of DW_OP_LLVM_explicit_pointer does not handle multilevel
> indirection, for the following reason.
>
>
>
>  The explicit pointer handles cases where a variable points to a temporary
> that contains a constant. Due to language standard constraints, we don't
> find pointers in such cases; what we get are references. Unlike pointers,
> references have a single level (a reference to a reference is just a
> reference, while a pointer to a pointer is a double pointer).
>
>  In the case of a reference to a reference, the second level can be
> handled using DW_OP_LLVM_explicit_pointer itself.
>
>  In the case of a pointer to a reference, the second level can be handled
> using DW_OP_implicit_pointer.
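
The parenthetical about levels can be checked directly in C++. The sketch below only illustrates the language rule (reference collapsing); the alias names are made up for the example and have nothing to do with the patches.

```cpp
#include <type_traits>

using Ref = const int &;  // a reference to int
using RefToRef = Ref &;   // "reference to reference" collapses back to Ref
using Ptr = const int *;  // a pointer to int
using PtrToPtr = Ptr *;   // a pointer to pointer is a genuinely new level

// A reference to a reference is just a reference.
static_assert(std::is_same<RefToRef, Ref>::value,
              "reference to reference collapses to a single level");

// A pointer to a pointer is a distinct, two-level type.
static_assert(!std::is_same<PtrToPtr, Ptr>::value,
              "pointer to pointer adds a level");
```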
>
>
>
> Though it would not be complex to make the explicit pointer multilevel, I
> avoided doing so for lack of a use case. Please let me know if I am missing
> something.
>
>
> Sorry, I couldn't understand your language related to references and
> pointers - I don't understand why they would be handled differently or
> represent challenges/tradeoffs for features related to collapsed
> indirection like this.
>
> Multi-level indirection seems to have as much use as single level
> indirection. (if a DWARF user may want to know what a pointer points to
> even when what it points to isn't in memory, the same would hold true for
> pointers to pointers, etc)
>
> I would expect this to be handled with a general OP saying "hey, I'm
> skipping one level of indirection in the resulting value,
> because that indirection is missing/not in the final program" and that this
> would be encoded in a llvm.dbg.value/DIExpression as usual, without the
> need for new IR intrinsics, though possibly with the need for an LLVM
> extension DWARF OP (DW_OP_LLVM_explicit_pointer?)
>
> To reconstitute that general form into the current DWARF limited
> "indirection needs to refer to another variable DIE" issue - as I think
> Paul speculated previously, we could always reconstitute a synthetic
> variable DIE & not try to reflect the case where the indirection lands at
> another named/known variable - as I expect that's the minority case. In
> most cases in C++ I expect pointers and references do not refer to named
> variables in the same function. They refer to return values from functions,
> they refer to array elements in dynamically allocated arrays, etc, etc.
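
One way to picture the "general OP" idea: each occurrence of the marker stands for one level of indirection that is missing from the final program, so multi-level cases fall out of repeating it. The sketch below is a toy model only. `Described`, `evaluate` and `demo` are hypothetical names, and this is not how LLVM or any debugger actually represents locations.

```cpp
#include <cassert>

// Hypothetical description of a variable whose pointer levels were all
// optimized away: 'skippedLevels' counts how many "skip one level of
// indirection" markers precede the value, 'pointee' is the final value.
struct Described {
  int skippedLevels;
  long pointee;
};

// Models a debugger evaluating the variable with 'derefs' leading '*'.
// Returns true and fills 'out' only when exactly the skipped levels are
// dereferenced; the intermediate pointers themselves have no value.
bool evaluate(const Described &d, int derefs, long &out) {
  if (derefs != d.skippedLevels)
    return false;
  out = d.pointee;
  return true;
}

// Usage: an 'int **pp' that ultimately points to 3 (two skipped levels).
void demo() {
  const Described pp{2, 3};
  long v = 0;
  assert(!evaluate(pp, 0, v));          // pp itself: not available
  assert(!evaluate(pp, 1, v));          // *pp: not available
  assert(evaluate(pp, 2, v) && v == 3); // **pp is 3
}
```

Under this model a single-level reference is just the `skippedLevels == 1` case, so nothing reference-specific is needed.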
>
>
>
>
> Regards,
>
> Alok
>
>
>
>
>
> On Fri, Nov 29, 2019 at 10:12 AM Alok Sharma <aloksharma.knit at gmail.com>
> wrote:
>
> Let me try to summarize the implementation first.
>
>
>
> At the moment, there are two branches.
>
>
>
> 1. When an existing variable is optimized out and that variable is used to
> get the de-referenced value pointed to by another pointer/reference
> variable.
>
>   Such cases are addressed using the DWARF operation
> DW_OP_implicit_pointer, as the de-referenced value of the pointer can be
> seen implicitly (through another variable). Before DWARF is emitted, we
> represent this in LLVM IR using dbg.derefval (which denotes the
> de-referenced value of a pointer or reference) and the
> DW_OP_LLVM_implicit_pointer operation.
>
>
>
> 2. When a temporary variable is optimized out and that variable is used to
> get the de-referenced value of another reference variable (AFAIK this
> cannot be reproduced with pointers).
>
>   Such cases are addressed using a new DWARF operation,
> DW_OP_explicit_pointer, as the de-referenced value can be displayed
> explicitly (in place). In LLVM IR, we represent it using dbg.derefval and
> the DW_OP_LLVM_explicit_pointer operation.
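
As a rough illustration of the difference between the two branches, consider source like the following. The function and variable names are invented for this sketch, and whether a given compiler actually optimizes these exactly as described depends on the optimization level.

```cpp
#include <cassert>

// Branch 1 (implicit pointer): 'p' points to the *named* local 'v'.
// If the optimizer keeps 'v' out of memory, 'p' has no address to hold,
// but the debug info can still say "*p is whatever 'v' is" by referring
// to the DIE of 'v'.
int branch1() {
  int v = 3;
  int *p = &v;
  return *p;
}

// Branch 2 (explicit pointer): 'r' binds to an unnamed temporary holding 3.
// After inlining there is no named variable to refer to, so the pointed-to
// value has to be described in place.
static inline int take(const int &r) { return r; }
int branch2() { return take(3); }

// Both functions compute the same result; only the debug description of
// the pointer/reference differs between the two branches.
void check() {
  assert(branch1() == 3);
  assert(branch2() == 3);
}
```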
>
>
>
> Both branches share some common implementation that defines the new
> operations (DWARF and IR): D70642, D70643, D69999, D69886.
>
> The first branch has additional patches (D70260, D70384, D70385, D70419).
>
> The second branch has an additional patch (D70833).
>
>
>
> Let me try to comment on the points you raised.
>
> - Branch 2 (patch D70833) handles cases where temporaries (not existing
> variables) are optimized out.
>
> - In patch D70385, I have included test cases showing that multi-layered
> pointers work
> (llvm/test/DebugInfo/dwarfdump-implicit_pointer_mem2reg.c).
>
>
>
> I feel that review of branch 1 (implicit pointer) can be resumed (which
> was halted due to current discussion), while we can continue to discuss
> branch 2 (explicit pointers D7083) if you want. David, what do you think?
>
>
>
> Regards,
>
> Alok
>
>
>
> On Fri, Nov 29, 2019 at 4:40 AM David Blaikie <dblaikie at gmail.com> wrote:
>
> Sorry I haven't been more engaged with this thread, I have been reading
> it, so hopefully my reply isn't completely out of line/irrelevant - but I
> still feel like having a custom dwarf expression operator (& no new
> intrinsics), like we have for one or two other DW_OP_LLVM_* (that aren't
> actually generated into the DWARF - though this one perhaps could be in
> some/all cases as an extension, maybe - or a synthesized variable could be
> created for compatibility with the current DWARF standard) would make the
> most sense.
>
> Some thought experiments that I think are relevant:
> * does the proposed IR format scale to pointers that don't point to
> existing variables (that I think has already been touched on in this thread)
> * does the proposed IR format support multiple layers of dereference (eg:
> int ** where we know it ultimately points to the value 3 but can't describe
> either the first or second level pointers that get to that value) - it
> sounds like any intrinsic that's special cased to deref (like
> llvm.dbg.derefval) wouldn't be able to capture that, which seems like it's
> overly narrow/special case, then?
>
>
>
> On Thu, Nov 28, 2019 at 2:29 PM Alok Sharma via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Hi folks,
>
>
>
> I am pushing a PoC patch, https://reviews.llvm.org/D70833, for review; it
> covers the case when a temporary is promoted.
>
>
>
> For such cases it generates IR as
>
>
>
>   call void @llvm.dbg.derefval(metadata i32 3, metadata !25, metadata
> !DIExpression(DW_OP_LLVM_explicit_pointer, DW_OP_LLVM_arg0)), !dbg !32
>
>
>
> And the llvm-dwarfdump output looks like
>
>
>
> -------------
>
> 0x0000007b:     DW_TAG_inlined_subroutine
>                   DW_AT_abstract_origin (0x0000004f "_Z4sinkRKi")
>                   DW_AT_low_pc  (0x00000000004004c6)
>                   DW_AT_high_pc (0x00000000004004d0)
>                   DW_AT_call_file       ("/home/alok/openllvm/llvm-project_derefval/build.d/david.cc")
>                   DW_AT_call_line       (10)
>                   DW_AT_call_column     (0x03)
>
> 0x00000088:       DW_TAG_formal_parameter
>                     DW_AT_location      (indexed (0x0) loclist = 0x00000010:
>                        [0x00000000004004c6, 0x00000000004004d4): DW_OP_explicit_pointer, DW_OP_lit3)
>                     DW_AT_abstract_origin       (0x00000055 "p")
>
> ------------
>
>
>
> Please note that DW_OP_explicit_pointer denotes that the value following
> it is the de-referenced value of an optimized-out pointer. With the
> necessary changes in the LLDB debugger, this DWARF info can be used to show
> the explicit de-referenced value of 'p'.
>
>
>
> Hi David,
>
>
>
> Should we keep working on the above case separately and resume the review
> of the implicit pointer patches independently now? They have been updated
> with many suggestions from this discussion.
>
>
>
> Regards,
>
> Alok
>
>
>
>
>
> On Wed, Nov 20, 2019 at 11:24 PM Jeremy Morse <jeremy.morse.llvm at gmail.com>
> wrote:
>
> Hi,
>
> For a new way of representing things,
>
> Adrian wrote:
> > llvm.dbg.value_new(DILocalVariable("y"), DIExpression(DW_OP_LLVM_arg0,
> DW_OP_LLVM_arg1, DW_OP_plus),
> >                    %ptr, %ofs)
>
> I think this would be great -- there're definitely some constructs
> created by the induction-variables pass and similar where one could
> recover an implicit variable value, if you could for example subtract
> one pointer from another.
>
> With the current model of storing DIExpressions as a vector of
> opcodes, it might become a pain to salvage a Value that gets optimised
> out -- in the example, if %ofs were salvaged, presumably
> DW_OP_LLVM_arg1 could have to be replaced with several extra
> operations. This isn't insurmountable, but I've repeatedly shied away
> from scanning through DIExpressions to patch them up. A vector of
> opcodes is the final output of the compiler, IMHO richer metadata
> should be used in the meantime.
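
To make the salvaging concern concrete, here is a toy stack evaluator for a multi-argument expression in the style of Adrian's example. It is only a sketch: the `Op` enumerators, `Elem`, `evaluate` and `salvageArg` are hypothetical stand-ins for DW_OP_LLVM_arg0/arg1 and DW_OP_plus, not LLVM's DIExpression API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy opcodes; hypothetical stand-ins, not LLVM's real enumerators.
enum class Op { Arg, Const, Plus, Minus };

struct Elem {
  Op op;
  std::int64_t operand; // Arg: argument index; Const: literal; else unused
};

// Evaluates the opcode vector as a stack machine over the given arguments.
std::int64_t evaluate(const std::vector<Elem> &expr,
                      const std::vector<std::int64_t> &args) {
  std::vector<std::int64_t> stack;
  for (const Elem &e : expr) {
    switch (e.op) {
    case Op::Arg:
      stack.push_back(args.at(static_cast<std::size_t>(e.operand)));
      break;
    case Op::Const:
      stack.push_back(e.operand);
      break;
    case Op::Plus:
    case Op::Minus: {
      assert(stack.size() >= 2 && "binary op needs two operands");
      std::int64_t rhs = stack.back();
      stack.pop_back();
      std::int64_t lhs = stack.back();
      stack.pop_back();
      stack.push_back(e.op == Op::Plus ? lhs + rhs : lhs - rhs);
      break;
    }
    }
  }
  assert(stack.size() == 1 && "expression must leave one value");
  return stack.back();
}

// "Salvage": every use of argument 'index' is replaced by a sequence of
// operations that recomputes it, so one opcode becomes several.
std::vector<Elem> salvageArg(const std::vector<Elem> &expr,
                             std::int64_t index,
                             const std::vector<Elem> &replacement) {
  std::vector<Elem> out;
  for (const Elem &e : expr) {
    if (e.op == Op::Arg && e.operand == index)
      out.insert(out.end(), replacement.begin(), replacement.end());
    else
      out.push_back(e);
  }
  return out;
}
```

For instance, `{Arg 0, Arg 1, Plus}` evaluated with `%ptr = 100` and `%ofs = 8` gives 108. If `%ofs` is optimized out but is known to be `%n - 1`, salvaging argument 1 with `{Arg 1, Const 1, Minus}` yields a five-element expression over `%ptr` and `%n`, which is the kind of in-place patching of an opcode vector the paragraph above is wary of.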
>
> IMHO the implicit pointer work doesn't need to block on this. As said
> my mild preference would be for a new intrinsic for this form of
> variable location.
>
> ~
>
> In re PR37682,
>
> > I’ve been reminded of PR37682, where a function with a reference
> parameter might spend all its time computing the “referenced” value in a
> temp, and only move the final value back to the referenced object at the
> end.  This is clearly a situation that could benefit from
> DW_OP_implicit_pointer, and there is really no other-object DIE for it to
> refer to.  Given the current spec, the compiler would need to produce a
> DW_TAG_dwarf_procedure for the parameter DIE to refer to.  Appendix D
> (Figure D.61) has an example of this construction, although it’s a more
> contrived source example.
>
> This has been working through my mind too, and I think it's slightly
> different to what implicit_pointer is trying to achieve. In the case
> implicit_pointer is designed for, it's a strict improvement in debug
> experience because you're recovering information that couldn't be
> expressed. However for PR37682 it's a trade-off between whether the
> user might want to examine the pointer, or the pointed-at integer:
> AFAIUI, we can only express one of the two, not both. Whereas for
> mem2reg'd variables referred to by DIE, there is never a pointer to be
> lost.
>
> I think my preference would always be to see temporarily-promoted
> values as there's no other way of observing them, but others might
> disagree.
>
> --
> Thanks,
> Jeremy
>