[clang] [llvm] [clang][DebugInfo] Add virtual call-site target information in DWARF. (PR #167666)

Thu Dec 4 11:20:49 PST 2025

dwblaikie wrote:

> Carlos is away at the moment so I'm happy to take the nop-sled question. Sorry for the slight wait, I've also been in and out of the office this week.
> 
> > Problem is that DWARF isn't enough to identify ICF sites for indirect calls - because it doesn't record the originally intended destination of the call, only the actual call - so there's nothing to compare to to detect the difference.
> 
> That makes sense. However, I think there's more than just ICF detection at play here as I believe it's also the ability to disambiguate between tail calls and ICF that is valuable. The SCE debugger recovers entry values using call site parameter DIEs, so it's important to know whether or not there has been a tail call between that call site info (in the next frame up/parent frame) and the current call frame, as that would mean the call site information does not describe the call to the current frame.
> 
> Where there is a mismatch between the parent frame call_origin and current frame it could be due to either ICF or a tail call. Today, we can disambiguate those cases for direct calls using the call_origin as already mentioned.
> 
> However, I don't think there's a foolproof way to determine whether there's been a tail call for indirect calls currently. Right now the debugger defensively hides the entry_values if the parent frame doesn't have a call_origin at the call site (e.g., because it's indirect). Adding `virutal_call_origin` for virtual calls lets us perform that check over some indirect call sites, albeit imperfectly (a tail call through to another implementation of the virtual function would be missed, I think).

tail calls have `DW_AT_call_tail_call` on them, right?

> > I know this is way off in a different direction, but...
> 
> > Have you considered using a NOP-sled at the start of ICF'd functions (this would be a linker change - when ICF'ing, rather than resolving every address to the ICF'd version - resolve each unique original function to a unique NOP at the start of/before the start of the chosen preserved copy - that way each original function would still have a unique address at the call sites)? This would solve any cases of indirect calls, not just virtual ones, and would work in a debugger, would preserve uniqueness of addresses for C++ function address comparison semantics, etc.
> 
> That sounds interesting, I had not seen this idea before. I wonder, does this only help you detect ICF specifically on the first step into a call? I can't picture how this would work if you paused execution in the middle of the code-folded function.

So you already do this for statically known call sites by comparing the function you're in to the caller's call_site's target function - if they're different, it's ICF (and/or a tail call - though the caller's call site should tell you if it's a tail call, so you can know if you're ICF-that's-not-a-tail-call, but if it's a tail call you could be in tail call or tail call+ICF)

With a NOP sled you could do this for dynamic calls too - because you could differentiate between the start of the function you're in and the caller's call_site's call_target, if you can still evaluate that (if registers haven't been clobbered/not saved/etc). Which you couldn't do without the nop sled, because the caller's call_target would already be resolved to point to the ICF'd function.

> I'm not sure it helps us with the tail call detection for entry value resolution for indirect calls. I appreciate that isn't what you were suggesting or responding to, but I think we'd still like to solve that.

Rather than trying to detect tail calls - can't you rely the call_site to tell you if it's a tail call? The compiler knows and should be able to emit this information reliably (& seems to at least in some cases - if there's gaps, hopefully those are fixable)

& yeah, if you see a tail call - if you can get to the function the tail call is for (even if it's a dynamic dispatch - you can then lookup the function it's dispatching to in the DWARF because you know at runtime/post-runtime what it is) and if it isn't the callee, don't use the call_site_parameters, and if it is the callee you have to analyze all that direct call's calls (recursively), and up from the callee, until you have a graph from caller to callee following only tail call edges and check that the callee appears nowhere else in that graph (eg: in the simple case if the callee is tail recursive - you don't know whether you're in the first call or the Nth call, so you can't reliably use the call site parameters).

> > It seems like a misuse of DW_OP_call_origin & I think it'd be more suitable to use a distinct attribute for this since it has different semantics from the standard attribute.
> 
> That seems fair to me, and it looks like Carlos has taken this on board and added a new attribute.

Yep yep - wondering if we can avoid that/have some more general solution that solves some other problems too.

> > (but, yeah, I'd consider NOP sleds as a more general solution - they have some code size impact, but maybe it's small enough to keep them even in fully optimized builds)
> 
> Some of our users are quite constrained by code size (and it seems reasonable to assume it's likely to be those users that are using ICF), so that could potentially be an issue. Are there performance implications? Assuming that the call sites also get patched to the unique nops (otherwise the linker needs to potentially do special things for debug info, which sounds like it's not ideal, unless I'm not seeing the full picture?).

Yep, it'd have non-zero size and performance implications - I don't know how small the epsilon is & whether it's below noise/below some acceptable cost threshold.

https://github.com/llvm/llvm-project/pull/167666