[PATCH] D58238: [DebugInfo] MachineSink: Insert undef DBG_VALUEs when sinking instructions, try to forward copies

Fri Feb 15 02:54:22 PST 2019

bjope added a comment.

I have some doubts about this, mainly the part about inserting undefs. Some of those doubts, though, stem from the big black hole regarding how debug info is supposed to work for heavily optimized code (and more specifically how it is supposed to work in LLVM, given what we have today).

I'll try to describe this by an example. Assume the C-code looks like this:

  1:  x = y;
  2:  z = z & q1;
  3:  x = x + 3;
  4:  z = z | q2;

then we might end up with MIR like this as input to MachineSink:

  %1 = LOAD ...
  DBG_VALUE %1, %noreg, "x", ...
  %2 = AND ...
  %3 = ADD %1, 3
  DBG_VALUE %3, %noreg, "x", ...
  %4 = OR %2, ...

Now assume that the ADD is sunk (for simplicity in this example just to below the OR and not to another BB).

Isn't it perfectly OK to get

  %1 = LOAD ...
  DBG_VALUE %1, %noreg, "x", ...
  %2 = AND ...
  %4 = OR %2, ...
  %3 = ADD %1, 3
  DBG_VALUE %3, %noreg, "x", ...

making it appear as if "x" is equal to "y" up until the ADD has been executed?

As I understand it, this patch would instead introduce a DBG_VALUE saying that we do not know the value of "x" when doing the OR:

  %1 = LOAD ...
  DBG_VALUE %1, %noreg, "x", ...
  %2 = AND ...
  DBG_VALUE %noreg, %noreg, "x", ...
  %4 = OR %2, ...
  %3 = ADD %1, 3
  DBG_VALUE %3, %noreg, "x", ...

But we have just delayed the ADD a little bit (so we have not really executed line 3 yet).
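To make the difference between the two listings above concrete, here is a rough toy model in Python. It has nothing to do with the actual LLVM data structures: the tuple encoding and the function name `visible_locations` are made up for illustration, and `None` stands in for both "no DBG_VALUE seen yet" and an undef (%noreg) DBG_VALUE. It just records which location the most recent DBG_VALUE gave for "x" when stopping at each real instruction.

```python
# Toy model, not LLVM code. Each entry is either
#   ("inst", text)            a real machine instruction, or
#   ("dbg", variable, loc)    a DBG_VALUE; loc None models an undef (%noreg).

def visible_locations(stream, variable):
    """Return [(instruction, location)] for every real instruction, where
    location is what the most recent preceding DBG_VALUE said."""
    current = None  # nothing known before the first DBG_VALUE
    seen = []
    for entry in stream:
        if entry[0] == "dbg":
            _, var, loc = entry
            if var == variable:
                current = loc
        else:
            seen.append((entry[1], current))
    return seen

# ADD sunk below the OR, DBG_VALUEs left as they are.
plain_sink = [
    ("inst", "%1 = LOAD"),
    ("dbg", "x", "%1"),
    ("inst", "%2 = AND"),
    ("inst", "%4 = OR"),
    ("inst", "%3 = ADD"),
    ("dbg", "x", "%3"),
]

# Same sink, but with an undef DBG_VALUE left at the old position of the ADD.
undef_sink = [
    ("inst", "%1 = LOAD"),
    ("dbg", "x", "%1"),
    ("inst", "%2 = AND"),
    ("dbg", "x", None),
    ("inst", "%4 = OR"),
    ("inst", "%3 = ADD"),
    ("dbg", "x", "%3"),
]

for name, stream in (("plain", plain_sink), ("undef", undef_sink)):
    for inst, loc in visible_locations(stream, "x"):
        print(name, inst, "x ->", loc if loc is not None else "<unknown>")
```

In this model, "x" is still described by %1 when stopping at the OR in the first stream, while the second stream reports it as unknown there, even though %1 still holds the value that "x" had after line 1.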

It might also be tempting to express the add in the DIExpression (a "salvage" kind of solution):

  %1 = LOAD ...
  DBG_VALUE %1, %noreg, "x", ...
  %2 = AND ...
  DBG_VALUE %1, %noreg, "x", !DIExpression("%1 + 3")
  %4 = OR %2, ...
  %3 = ADD %1, 3
  DBG_VALUE %3, %noreg, "x", ...

I think the "salvage" kind of solution is incorrect here. We have not optimized away the add; it has just been moved. When debugging, it would be confusing if "x" already has the value "y + 3" before executing the ADD (which in this example is probably the only instruction with line 3 as its debug location).
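A small sketch of why that would be confusing, again as a toy Python model rather than anything resembling real LLVM code. The helper `value_of_x_at`, the stream encoding and the concrete value 10 for "y" are all assumptions made up for this illustration. Each DBG_VALUE is modelled as a function from register contents to the value a debugger would print for "x".

```python
# Toy model, not LLVM code. The register contents are invented for the example.

def value_of_x_at(stream, stop_at, regs):
    """Evaluate the most recent description of x seen before `stop_at`."""
    describe = None
    for kind, payload in stream:
        if kind == "inst":
            if payload == stop_at:
                break
        else:
            describe = payload
    return None if describe is None else describe(regs)

# Assume y == 10 was loaded into %1; %3 does not exist until the ADD has run.
regs = {"%1": 10}

salvaged = [
    ("inst", "LOAD"),
    ("dbg", lambda r: r["%1"]),       # x described by %1
    ("inst", "AND"),
    ("dbg", lambda r: r["%1"] + 3),   # salvaged: x described as %1 + 3
    ("inst", "OR"),
    ("inst", "ADD"),
    ("dbg", lambda r: r["%3"]),       # x described by %3
]

print("x at AND:", value_of_x_at(salvaged, "AND", regs))
print("x at OR: ", value_of_x_at(salvaged, "OR", regs))
```

With these made-up values, stopping at the OR already shows "x" as y + 3, although the only instruction belonging to line 3 has not executed yet.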

The alternative of making "x" appear as optimized out is also weird IMO. We have not lost track of the value of "x". It first gets the value %1 and then after the ADD it has the value %3.

Maybe it is a philosophical question: is the compiler scheduling/reordering source statements (so that in the debugger it will appear as if source statements are executed in a random order), or are we scheduling/reordering machine instructions (so that in the debugger it should still appear as if variable assignments are executed in the order of the source code)? VLIW scheduling, tail merging, etc. of course make things extra complicated, since we will basically be all over the place all the time.

The "copy-prop" part of this patch might be OK. But couldn't it just be seen as the same scenario as with the ADD above, where we are delaying the assignment?

(Sorry if my simplified examples don't make sense for the actual problem that you are attempting to solve.)

Does it matter that we actually sink into a later BB here? Is this fixing some problem where, from a debugging perspective, it would appear wrong if the variable does not get the new value before the end of the BB?

https://reviews.llvm.org/D58238