[PATCH] D82129: [DebugInfo] Drop location ranges for variables which exist entirely outside the variable's scope

Mon Jun 29 01:34:38 PDT 2020

Orlando added a comment.

In D82129#2115399 <https://reviews.llvm.org/D82129#2115399>, @dblaikie wrote:

> In D82129#2115249 <https://reviews.llvm.org/D82129#2115249>, @probinson wrote:
>
> > In D82129#2114150 <https://reviews.llvm.org/D82129#2114150>, @Orlando wrote:
> >
> > > Looking closer at COFF/register-variables.ll, it doesn't look like a bug but instead another victim of how we model debug info. Before running -branch-folder (Control Flow Optimizer) all the instructions in the else block belong to the else block scope. The branch folder merges the common tails from 'then' and 'else' into 'else', merging the debug-locations while it does so. @jmorse summarised the situation well offline: Every time we call getMergedLocation, we are creating the conditions where this occurs, and eliminating it during compilation would be a high burden.
> >
>
>
> Hmm - could you explain that in more detail? If we merge the locations both if and else scopes would cease to exist (since we can't represent that ambiguity), right? But the dbg.value doesn't use/care about its !dbg, so it continues existing/describing a variable location over some unrelated instructions?

That's right.

> Fair enough.
> 
>> And yet, the variable was allocated to a register, and the variable's location information pointed to the correct instruction range.
>>  Inadequacies in our ability to represent the scope properly shouldn't cause us to eliminate *correct* location information for variables.
> 
> This is I think a point of disagreement (between you and I) - I don't think it's useful to emit DWARF that describes variables outside their scope. I don't think any consumer should do anything useful with that data & it seems like wasted bytes to me. (name lookup wouldn't find the variable at any point where it has a location, etc)
> 
> If we've failed to track a variable's location within its scope, we shouldn't emit any location for it. I don't think that variable location information is correct if it's not in the scope of the variable - in this case, there is no scope of the variable (or its been reduced) - so no range of instructions over which to describe the location of the variable, since it's not in-scope.
> 
> If that's the case - that the merged instructions drop the scope and leave behind dbg.values that describe the variable even though it's not in scope - where the fix would be to remove the dbg.values (because now they describe the location of a variable outside that variable's scope) would be impractical/ie: it's cheaper to remove it later - then I'm OK with that.

That summarises the situation as I see it, yeah.

> Though I worry about that this would leave around a lot of dead dbg.value intrinsics, perhaps? That we'd be better off cleaning up earlier, not just for the sake of the resulting DWARF.

I don't think we could do it before isel. One reason being that in IR we only know where locations start and not exactly how far they extend which means we can't do very precise scope range overlap checks. I suppose it could be possible to do it earlier post-isel? Though trimming here at the end is very safe because we have the final program structure to work with; knowing nothing is going to move around afterwards is nice.

> From my perspective it's a question of whether we should actively drop that erroneous debug info at the end now - knowing that (so far as I've seen in the discussion) all such instances of it /might/ be bugs (if someone can show one example where we don't consider it a bug in an LLVM optimization

I'm not sure I follow here, please could you rephrase this part?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82129/new/

https://reviews.llvm.org/D82129