[PATCH] D60831: [DebugInfo at O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion

Thu Apr 18 21:17:59 PDT 2019

aprantl added a comment.

In D60831#1472604 <https://reviews.llvm.org/D60831#1472604>, @jmellorcrummey wrote:

> > A good example of why arbitrarily picking one location during merging is problematic is when the two locations come from different inlined instances of different functions (or, perhaps even worse, two inlined instances of the same function). I would assume that even in profiling, a wrong backtrace would invalidate, or render untrustworthy, large parts of any analysis being done on this data.
>
> I think we fundamentally disagree on what is good for profiling.

That is possible.

> If for instance two "load" instructions from different inlined contexts merge, I would prefer that they be charged to one of the locations where the instruction came from and the other gets the benefit of that operation for free. (That's what common subexpression elimination is for!) Saying that one got merged into the other or vice versa is an acceptable view. I don't see this as misleading, untrustworthy, or invalidating anything.
> 
> If the instructions come from two different files, I guess that you won't even associate it with one of the files. So, I have instructions in the binary that won't be covered by line map entries at all. Having LITERALLY NO INFORMATION where they came from without tracing instruction generation through the compiler is something that I fundamentally oppose.

Precisely to that point, I was hoping to provide a few compelling counterexamples to demonstrate why potentially wrong information is actually worse than no information.

But I guess what this really boils down to is that all debug information in LLVM IR is (at the moment) "must" information that is supposed to be either 100% reliable or omitted. It sounds like for the kinds of analysis that you are doing, you would also benefit from a second category of "may" information that may or may not be valid. That's a legitimate ask, but if we wanted to include this in LLVM IR, we would need to qualify it as not reliable, so it doesn't, for example, leak into debug info that software developers rely on.
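
To make the "must" versus "may" distinction concrete, here is a minimal sketch in Python. It is a simplified model, not LLVM's actual merge logic: the names `Loc`, `Reliability`, `merge_must`, `merge_may`, and `visible_to_debugger` are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Reliability(Enum):
    MUST = "must"  # either 100% reliable or omitted
    MAY = "may"    # hypothetical second category: may or may not be valid


@dataclass(frozen=True)
class Loc:
    """Hypothetical source location; not an LLVM data structure."""
    file: str
    line: int
    reliability: Reliability = Reliability.MUST


def merge_must(a: Loc, b: Loc) -> Optional[Loc]:
    """'Must' policy: keep the location only if both inputs agree, else omit it."""
    return a if a == b else None


def merge_may(a: Loc, b: Loc) -> Loc:
    """Hypothetical 'may' policy: keep one input, but tag it as unreliable."""
    if a == b:
        return a
    return Loc(a.file, a.line, Reliability.MAY)


def visible_to_debugger(loc: Optional[Loc]) -> bool:
    """Only 'must' locations leak into the debug info developers rely on."""
    return loc is not None and loc.reliability is Reliability.MUST
```

In this model a profiler could consume both categories, while a debugger would filter with `visible_to_debugger` and see a merged instruction as having no location.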

What I would find more interesting is extending LLVM IR to support a one-to-many mapping from a PC address to source locations. This way we would also be up front about the fact that the source location is one out of a set, and we could then use additional contextual information (such as the current backtrace and DWARF call site information) to potentially disambiguate the candidates before use.
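
A one-to-many mapping of this kind might look roughly like the following Python sketch. Everything in it is an assumption made for illustration: `SourceLoc`, the `inlined_from` encoding of the inlining context, and `resolve` are hypothetical, and how a real consumer would derive the context from a backtrace and DWARF call site information is left open.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Sequence, Tuple


@dataclass(frozen=True)
class SourceLoc:
    """Hypothetical candidate location for one PC address."""
    file: str
    line: int
    # Assumed encoding of the inlining context, outermost caller first.
    inlined_from: Tuple[str, ...]


# Hypothetical one-to-many line table: each PC maps to a set of candidates.
LineTable = Dict[int, FrozenSet[SourceLoc]]


def resolve(
    table: LineTable,
    pc: int,
    context: Optional[Sequence[str]] = None,
) -> FrozenSet[SourceLoc]:
    """Return the candidates for pc, narrowed by the context when it matches."""
    candidates = table.get(pc, frozenset())
    if context is None or len(candidates) <= 1:
        return candidates
    matching = frozenset(c for c in candidates if c.inlined_from == tuple(context))
    # If the context rules out every candidate, stay honest and return the full set.
    return matching or candidates
```

For example, a table entry `{0x400: frozenset({a, b})}` with `a` inlined via `("main", "foo")` and `b` via `("main", "bar")` resolves to both candidates without context and to `a` alone when the context is `["main", "foo"]`.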

https://reviews.llvm.org/D60831