[llvm-dev] DW_OP_implicit_pointer design/implementation in general

Mon Nov 18 14:06:36 PST 2019

I’ve been reminded of PR37682, where a function with a reference parameter might spend all its time computing the “referenced” value in a temp, and only move the final value back to the referenced object at the end.  This is clearly a situation that could benefit from DW_OP_implicit_pointer, and there is really no other-object DIE for it to refer to.  Given the current spec, the compiler would need to produce a DW_TAG_dwarf_procedure for the parameter DIE to refer to.  Appendix D (Figure D.61) has an example of this construction, although it’s a more contrived source example.
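
A minimal sketch of the kind of function I mean, assuming nothing about the actual test case in PR37682 (count_up and demo_count_up are hypothetical names). With optimization enabled, a compiler is free to keep the running sum in a register and store it through the reference only once, after the loop, so partway through the loop there may be no memory location holding the value a user would expect to see for 'out'.

```cpp
#include <cassert>

// Hypothetical example, not taken from PR37682.
// An optimizer may keep the running sum in a register for the whole loop
// and write it back through `out` only once, at the end.
void count_up(int &out, int n) {
  for (int i = 0; i < n; ++i)
    out += i;
}

// Driver: 0 + 1 + 2 + 3 == 6.
int demo_count_up() {
  int total = 0;
  count_up(total, 4);
  assert(total == 6);
  return total;
}
```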

Does it have to be spec’d this way?  I think the spec as given is general enough to support DW_OP_implicit_pointer to an aggregate, with different locations for each member.  You could probably come up with a way to specify simpler cases more simply, although you’d need a new DW_OP to do that—there’s no explicit FORM describing the operand of a DW_OP, so we can’t just mess with how the operands are interpreted.
--paulr

From: David Blaikie <dblaikie at gmail.com>
Sent: Friday, November 15, 2019 12:54 PM
To: Robinson, Paul <paul.robinson at sony.com>
Cc: Adrian Prantl <aprantl at apple.com>; AlokKumar.Sharma at amd.com; Jonas Devlieghere <jdevlieghere at apple.com>; llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: DW_OP_implicit_pointer design/implementation in general

On Fri, Nov 15, 2019 at 8:07 AM Robinson, Paul <paul.robinson at sony.com> wrote:
| Any ideas why it wouldn't be more general to handle cases where the variable isn't named?

Couldn’t there be a DIE (flagged as artificial) to describe the return-value temp?

There could be, though it lacks precedent so far as I can tell: there are very few cases of artificial variables being generated in Clang/LLVM. The array bound example Adrian gave is the only one I know of, and even that seems unnecessary; GCC uses a different (& I think better/clearer/simpler) representation.

  You’d need such a DIE if you wanted the debugger to be able to look at the return value from source() anyway,

Not so far as I know. With GDB (& I assume LLDB), when you call a function and return from it (e.g. "finish", or a "step" that steps across the end of a function), the debugger prints out the return value, using the DW_AT_type of the DW_TAG_subprogram that was executing & its knowledge of the ABI to know where/how that value would be stored during the return. You can then query it and do other things using the artificial variable name GDB provides.

My example was slightly bogus: you can't take the address of a temporary in C++ like that, but you can bind a reference to it. So, updating & fleshing out the test:

__attribute__((optnone)) int source() {
  return 3;
}
__attribute__((optnone)) void f(int) {
}
inline void sink(const int& p) {
  f(p);
}
int main() {
  sink(source());
}

& then playing that through GDB:

(gdb) start
Temporary breakpoint 1 at 0x401131: file var.cpp, line 10.
Starting program: /usr/local/google/home/blaikie/dev/scratch/a.out

Temporary breakpoint 1, main () at var.cpp:10
10        sink(source());
(gdb) s
source () at var.cpp:2
2         return 3;
(gdb) fin
Run till exit from #0  source () at var.cpp:2
main () at var.cpp:10
10        sink(source());
Value returned is $1 = 3
(gdb) s
sink (p=<optimized out>) at var.cpp:7
7         f(p);

It'd be nice if the value of 'p' could be printed there, but it seems that without introducing artificial variables, implicit_pointer doesn't provide a way to do that. That seems to me like an unnecessary limitation & complication, in the DWARF and in LLVM's intermediate representation, compared to having 'p's DW_AT_location describe the value being pointed to directly, without the need for another variable.

- Dave

in the context of main() and in the absence of inlining.  And given that DIE, implicit_pointer within sink() can refer to it.

From: David Blaikie <dblaikie at gmail.com>
Sent: Thursday, November 14, 2019 5:32 PM
To: Robinson, Paul <paul.robinson at sony.com>
Cc: Adrian Prantl <aprantl at apple.com>; AlokKumar.Sharma at amd.com; Jonas Devlieghere <jdevlieghere at apple.com>; llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: DW_OP_implicit_pointer design/implementation in general

On Thu, Nov 14, 2019 at 1:53 PM Robinson, Paul <paul.robinson at sony.com> wrote:
My reading of the DWARF issue is that it was fairly specifically designed to handle the case of a function taking parameters by pointer/reference, which is then inlined, and the caller is passing local objects rather than other pointers/references.  So:

void inline_me(foo *ptr) {
  // ... does something with ptr->x or *ptr ...
}
void caller() {
  foo actual_obj;
  inline_me(&actual_obj);
}

After inlining, maintaining a pointer to actual_obj might be sub-optimal, but after a “step in” to inline_me, the user wants to look at an expression spelled *ptr even though the actual_obj might not have a memory address (because fields are SROA’d into registers, or whatever).  This is where DW_OP_implicit_pointer saves the day; *ptr and ptr->x are still evaluatable expressions, which expressions are secretly indirecting through the DIE for actual_obj.

I think it is not widely applicable outside of that kind of scenario.

Any ideas why it wouldn't be more general to handle cases where the variable isn't named? Such as:

foo source();
void f(foo);
inline void sink(foo* p) {
  f(*p);
}
int main() {
  sink(&source());
}

--paulr

From: David Blaikie <dblaikie at gmail.com>
Sent: Thursday, November 14, 2019 4:34 PM
To: Adrian Prantl <aprantl at apple.com>
Cc: AlokKumar.Sharma at amd.com; Robinson, Paul <paul.robinson at sony.com>; Jonas Devlieghere <jdevlieghere at apple.com>; llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: DW_OP_implicit_pointer design/implementation in general

On Thu, Nov 14, 2019 at 1:27 PM Adrian Prantl <aprantl at apple.com> wrote:

> On Nov 14, 2019, at 1:21 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
> Hey folks,
>
> Would you all mind having a bit of a design discussion around the feature both at the DWARF level and the LLVM implementation? It seems like what's currently being proposed/reviewed (based on the DWARF feature as spec'd) is a pretty big change & I'm not sure I understand the motivation, exactly.
>
> The core point of my confusion: Why does describing the thing a pointer points to require describing a named variable that it points to? What if it doesn't point to a named variable?

Without having looked at the motivational text when the feature was proposed to DWARF, my assumption was that this is similar to how bounds for variable-length arrays are implemented, where a (potentially) artificial variable is created by the compiler in order to have something to refer to.

I /sort/ of see that case as a bit different, because the array type needs to refer back into the function potentially (to use frame-relative, etc). I could think of other ways to do that in hindsight (like putting the array type definition inside the function to begin with & having the count describe the location directly, for instance).

In retrospect I find the entire specification of DW_OP_implicit_pointer to be strangely specific/limited (why one hard-coded offset instead of an arbitrary expression?), but that ship has sailed for DWARF 5 and I'm to blame for not voicing that concern earlier.

Sure, but we don't have to implement it if we don't find it to be super useful/worthwhile, right? (if something else would be particularly more general/useful we could instead implement that as an extension, though of course there's cost to that in terms of consumer support, etc)

-- adrian

>
> Seems like there should be a way to describe that situation - and that doing so would be a more general solution than one limited to only describing pointers that point to named variables. And would be a simpler implementation in LLVM - without having to deconstruct variables during optimizations, etc, to track one variable's value being concretely related to another variable's value.
>
> - David