[PATCH] D57962: [DebugInfo] PR40628: Don't salvage load operations

Mon Feb 11 04:08:08 PST 2019

jmorse added a comment.

Hi,

In D57962#1391428 <https://reviews.llvm.org/D57962#1391428>, @aprantl wrote:

> Would it be feasible to inject a dbg_value(undef) at the end of the basic block or before the next store, whichever is earlier, instead?

That'd totally work, and could even be done quite late in compilation. However, I think the greater issue is that it becomes difficult to determine when code movement in optimisation passes affects dbg.values. Consider this example, compiled with "-O2 -g -fno-inline -fno-unroll-loops" using clang at r353515:

  static int qux[] = { 1, 2, 3, 4 };

  int
  foo(int baz, int *out)
  {
    int sum = 0;
    for (int i = 0; i < 4; i++) {
      int extra = *out;
      sum += qux[i] * baz;
      sum %= 4;
      *out = sum;
    }
    return sum;
  }

  int
  main(int argc, char **argv)
  {
    int out = 12;
    return foo(argc, &out);
  }

(The assignment to 'extra' is contrived, but I believe eliminating loads is something LLVM does all the time.) In this code, LICM promotes '*out' to an SSA register for the body of the loop. However, before LICM runs, the dbg.value for 'extra' has its load operand folded into a DW_OP_deref, so it points at the memory of the variable 'out' in 'main'. Stepping through this program in gdb, 'extra' always reads as '12' rather than any of the values computed in the loop. Invalidating the dbg.value at the next memory store wouldn't help, as the location specified by the dbg.value never holds the value we're interested in.
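To make the mismatch concrete, here is a hand-written C approximation of what the loop looks like once '*out' has been promoted. It is only a sketch of the effect, not the IR that LICM actually produces: the name 'foo_promoted', the local 'out_reg' and the two recording arrays are hypothetical additions for illustration. 'true_extra' records the value 'extra' really has in each iteration, and 'stale_extra' records what a DW_OP_deref of the original memory location would read at the same point.

```c
static int qux[] = { 1, 2, 3, 4 };

/* Hypothetical hand-promoted version of foo(): the load of '*out' is
 * hoisted above the loop and the store is sunk below it, which is
 * roughly the effect described for LICM. The two output arrays are
 * illustration-only additions. */
int
foo_promoted(int baz, int *out, int true_extra[4], int stale_extra[4])
{
  int sum = 0;
  int out_reg = *out;            /* hoisted load: '*out' now lives in a register */
  for (int i = 0; i < 4; i++) {
    true_extra[i] = out_reg;     /* the value 'extra' actually has */
    stale_extra[i] = *out;       /* what a deref of the old memory location reads */
    sum += qux[i] * baz;
    sum %= 4;
    out_reg = sum;               /* was '*out = sum' inside the loop */
  }
  *out = out_reg;                /* sunk store: memory is only updated here */
  return sum;
}
```

Called with baz = 1 and '*out' initially 12, 'stale_extra' stays at 12 in every iteration while 'true_extra' follows the values computed in the loop, which matches the gdb behaviour described above.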

My main point here is that not only would LICM need to be taught that moving the store to '*out' could invalidate some dbg.values; determining which dbg.values are affected would also involve digging into DIExpressions looking for dereferences, potentially through GEPs that were folded in too, and might even require alias analysis on any dbg.value with a deref. That is (as far as I'm aware) a reasonably large amount of code complexity and computation. DeadStoreElimination would need to behave similarly (a dbg.value might try to point at the memory of a store that's later eliminated). I'm not incredibly familiar with all of LLVM's passes, but there could be more that need to be taught.
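As a rough illustration of the first step only (spotting a dereference inside an expression), here is a sketch over a simplified model in which an expression is a flat array of 64-bit elements, each opcode followed by its operands. This is not LLVM's DIExpression API: 'operand_count' and 'expr_reads_memory' are hypothetical names, and only three DWARF opcodes are modelled. Even this toy version has to skip operands so that a constant is not mistaken for an opcode, and it says nothing about whether the dereferenced address aliases the store being moved.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode values as defined by the DWARF standard. */
enum {
  DW_OP_deref       = 0x06,
  DW_OP_plus_uconst = 0x23,
  DW_OP_stack_value = 0x9f
};

/* Number of operand elements following 'op' in this simplified model.
 * Returns -1 for any opcode the sketch does not model. */
static int
operand_count(uint64_t op)
{
  switch (op) {
  case DW_OP_deref:
  case DW_OP_stack_value:
    return 0;
  case DW_OP_plus_uconst:
    return 1;
  default:
    return -1;
  }
}

/* Returns 1 if the expression contains a DW_OP_deref, 0 if it does not,
 * and -1 if it is truncated or uses an unmodelled opcode (in which case
 * a caller would have to be conservative). */
int
expr_reads_memory(const uint64_t *elems, size_t n)
{
  size_t i = 0;
  while (i < n) {
    int nops = operand_count(elems[i]);
    if (nops < 0 || (size_t)nops > n - i - 1)
      return -1;
    if (elems[i] == DW_OP_deref)
      return 1;
    i += 1 + (size_t)nops;       /* skip the opcode and its operands */
  }
  return 0;
}
```

A pass would still need to work out which memory each such expression refers to and whether the store it is moving or deleting touches it, which is where the alias analysis cost mentioned above comes in.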

There might be some kind of half-way house, like inserting dbg.value(undef...) to terminate the range of any dbg.value containing DW_OP_deref at the point where such a pass changes visible stores, but I believe this would still involve teaching new things to passes.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57962/new/

https://reviews.llvm.org/D57962