[PATCH] D76336: [DWARF] Emit DW_AT_call_pc for tail calls

Thu Mar 19 15:54:31 PDT 2020

vsk added a comment.

In D76336#1932313 <https://reviews.llvm.org/D76336#1932313>, @dblaikie wrote:

> >> @dblaikie wrote:
> >> why is this only relevant in optimized builds?
> > 
> > I think the main benefit of the call-site information is using it together with call-site parameters to compute the actual value of a parameter, even when the location of the parameter was not provided (optimized-out).
>
> That presents some interestingly tricky challenges. We have -fstandalone-debug for when you are building one part of a program with debug info and others without - but there's nothing equivalent for when you're building part of a program optimized and other parts unoptimized. And this heuristic (only emit these attributes for optimized code) assumes the caller and callee are both equally optimized/similarly compiled - which isn't necessarily true.

That's ok though, because a debugger can handle call site entries being only partially available. I.e. there's no reason (afaik) for mixing and matching optimized/unoptimized .o's to regress debugger features enabled by call site entries.

>> That improves the user experience when debugging optimized code. In addition, in the case of tail calls, the call_site debug info is used for printing artificial call frames for the tail calls (and tail calls are typical of optimized code?).
> 
> The tail call case is easier - since that's the caller-side. It'd probably be better to just emit that on any tail call, optimized or unoptimized code - I guess I mean, ideally the choice wouldn't be made at the frontend, but at the backend if the call ends up being a tail call.

We only want to emit call site entries at tail-calling sites when the caller has debug info, though. We do that today by relying on the DIFlagAllCallsDescribed flag, which the frontend provides. Also fwiw the artificial frames debugger feature doesn't work if only the tail-calling sites are described -- all of the calls have to be described for the debugger to reconstruct feasible paths through the call graph.
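
To illustrate why partial coverage isn't enough, here is a rough sketch of the kind of search a debugger could do when it sees a frame gap left by tail calls. This is a toy model, not LLDB's actual implementation: the function name, the `tail_calls` map and the `described` set are all hypothetical. The idea is that a path is only usable if it is the single feasible one and every function on it has all of its calls described.

```python
from typing import Dict, List, Optional, Set


def find_unique_tail_call_path(
    tail_calls: Dict[str, List[str]],
    described: Set[str],
    start: str,
    target: str,
) -> Optional[List[str]]:
    """Return the only tail-call path from start to target, else None.

    tail_calls: hypothetical map from a function to the callees it tail-calls.
    described:  functions assumed to have *all* of their calls described.
    """
    paths: List[List[str]] = []
    incomplete = False

    def dfs(fn: str, path: List[str]) -> None:
        nonlocal incomplete
        new_path = path + [fn]
        if fn == target:
            paths.append(new_path)
            return
        if fn not in described:
            # Unknown outgoing calls: we cannot rule out other paths.
            incomplete = True
            return
        for callee in tail_calls.get(fn, []):
            if callee not in new_path:  # skip cycles
                dfs(callee, new_path)

    dfs(start, [])
    if incomplete or len(paths) != 1:
        return None
    return paths[0]


# b tail-calls c, c tail-calls d, and everything is described.
chain = {"b": ["c"], "c": ["d"]}
print(find_unique_tail_call_path(chain, {"b", "c", "d"}, "b", "d"))
# prints ['b', 'c', 'd']
```

If `c` is dropped from `described`, or a second route from `b` to `d` is added, the function returns None, which corresponds to the debugger declining to show artificial frames.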

> (I'm thinking of LTO situations, attribute optnone, other things like that - the frontend doesn't really know if something is optimized or not & really you can't tell if a callee is optimized because it's in another translation unit)

Hm, oh, good point. But, 99+% of the time, isn't the workflow to compile with -flto=thin + -O{1,2,3,s,z}? We handle that fine. But I guess if you're doing `-O0 -disable-O0-optnone -flto`, you wouldn't get call site entries. Hrm. Does that come up much? I guess we could fix that by adding a frontend flag?

https://reviews.llvm.org/D76336