[PATCH] D56587: Fix sign/zero extension in Dwarf expressions.

Tue Jan 15 14:22:55 PST 2019

bjope added inline comments.

================
Comment at: lib/Transforms/Utils/Local.cpp:1856
+        // (((To >> (ToBits - 1)) * (~0)) << ToBits) | To
+        SmallVector<uint64_t, 11> Ops({dwarf::DW_OP_dup,
+                                       dwarf::DW_OP_constu, ToBits - 1,
----------------
I think this is still wrong, at least when we end up with a DWARF location description that refers to a memory location.

If, for example, the variable is 32 bits wide and we want to describe it using a 16-bit value, we will get something like this:
```
call void @llvm.dbg.value(metadata i16 %value, metadata "!variable", metadata DIExpression(DW_OP_dup, DW_OP_constu, 15, DW_OP_shr, DW_OP_lit0, DW_OP_not, DW_OP_mul, DW_OP_constu, 16, DW_OP_shl, DW_OP_or, DW_OP_stack_value))
```
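To make the arithmetic concrete, the expression can be traced with a toy stack evaluator. This is only a sketch, not LLVM's or any debugger's evaluator: `eval_expr`, `sext_via_expr` and the op-name strings are hypothetical, and it assumes 64-bit generic stack entries with wrap-around arithmetic.

```python
MASK64 = (1 << 64) - 1  # assumption: 64-bit generic DWARF stack entries


def eval_expr(ops, initial):
    """Evaluate a tiny subset of DWARF stack ops, starting from one stack entry."""
    stack = [initial & MASK64]
    it = iter(ops)
    for op in it:
        if op == "dup":
            stack.append(stack[-1])
        elif op == "constu":
            stack.append(next(it) & MASK64)  # the operand follows the opcode
        elif op == "lit0":
            stack.append(0)
        elif op == "not":
            stack.append(~stack.pop() & MASK64)
        elif op in ("shr", "shl", "mul", "or"):
            top = stack.pop()
            second = stack.pop()
            if op == "shr":
                res = second >> top  # logical shift right
            elif op == "shl":
                res = second << top
            elif op == "mul":
                res = second * top
            else:
                res = second | top
            stack.append(res & MASK64)
        else:
            raise ValueError(f"unsupported op: {op}")
    return stack[-1]


# The sign-extension expression from the dbg.value above (without DW_OP_stack_value).
SEXT_16_TO_32 = ["dup", "constu", 15, "shr", "lit0", "not", "mul",
                 "constu", 16, "shl", "or"]


def sext_via_expr(raw):
    """Run the expression on `raw` and keep the 32 bits the variable occupies."""
    return eval_expr(SEXT_16_TO_32, raw) & 0xFFFFFFFF
```

With a clean 16-bit input such as `0x8001` the sketch yields `0xFFFF8001`, and `0x7FFF` stays `0x7FFF`. If the initial stack entry already has unrelated bits set above bit 15 (for example `0xBEEF8001`), the result is no longer the sign-extended value, which is the concern raised in this comment.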
In llc this will become a DBG_VALUE. The value could either end up referring to a 16-bit register, or it could refer to a 16-bit stack slot (e.g. if this is an input argument passed on the stack). In the latter case we typically end up prepending the DIExpression with DW_OP_fbreg and an offset. The memory location will point to the 16-bit value.

But we do not really express that the debugger should read a 16-bit value here, right? The debugger will only see that the variable is 32 bits wide, so it will read 32 bits, right?

For a little endian target we will get garbage in bits 16-31 (since we read outside the 16-bit stack slot). For a big endian target we will get the wanted value in bits 16-31 and garbage in bits 0-15. Either way, the result would be wrong. For little endian we would need to clear bits 16-31 before the OR with the sign-extension mask. For big endian we aren't even operating on the correct bits.
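The endianness argument can be checked with a few lines of Python. This is an illustrative sketch only: `read_32_from_16_bit_slot` is a hypothetical helper, and `0xBEEF` stands in for whatever unrelated data happens to follow the 16-bit slot in memory.

```python
def read_32_from_16_bit_slot(value16, neighbour16, byteorder):
    """Read 4 bytes starting at a 2-byte slot; the next 2 bytes are unrelated data."""
    slot = value16.to_bytes(2, byteorder)
    neighbour = neighbour16.to_bytes(2, byteorder)  # hypothetical adjacent stack contents
    return int.from_bytes(slot + neighbour, byteorder)


little = read_32_from_16_bit_slot(0x8001, 0xBEEF, "little")
big = read_32_from_16_bit_slot(0x8001, 0xBEEF, "big")

print(hex(little))  # wanted value in bits 0-15, neighbour data in bits 16-31
print(hex(big))     # wanted value in bits 16-31, neighbour data in bits 0-15
```

On the little endian read, masking with `0xFFFF` recovers the value. On the big endian read, the value would have to be shifted down by 16 first, so the same expression cannot work for both.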

I'm not really sure what happens if the debugger finds a 16-bit register location for the 32-bit variable. Do we know that it only reads 16 bits onto the value stack?

One solution could be to use DW_OP_deref_size when reading from memory, to specify that we only want to read 16 bits. I'm not sure exactly how DwarfExpression could know when this is needed. I guess we cannot add the DW_OP_deref_size already here, because it would be wrong if we end up with a register location. But maybe we still need to do something more for the register location scenario as well when using this approach.
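As a rough model of why a sized dereference would help: DW_OP_deref_size reads only the given number of bytes and zero-extends the result, so the bytes after the slot never reach the stack. The `deref_size` function and the memory layout below are hypothetical stand-ins, not debugger code.

```python
def deref_size(memory, address, size, byteorder):
    """Model DW_OP_deref_size: read `size` bytes at `address` and zero-extend."""
    return int.from_bytes(memory[address:address + size], byteorder)


# A 16-bit slot holding 0x8001, followed by two bytes of unrelated data.
mem_le = (0x8001).to_bytes(2, "little") + b"\xef\xbe"
mem_be = (0x8001).to_bytes(2, "big") + b"\xbe\xef"

value_le = deref_size(mem_le, 0, 2, "little")
value_be = deref_size(mem_be, 0, 2, "big")
```

Both reads give the same clean 16-bit value, which is the precondition the sign-extension expression relies on.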

An alternative solution is to describe the variable using two dbg.value intrinsics. One using a fragment for bits 0-15, and another one using a fragment expression for bits 16-31. I guess it would look something like this:
```
call void @llvm.dbg.value(metadata i16 %value, metadata "!variable", metadata DIExpression(DW_OP_LLVM_fragment, 0, 16))
call void @llvm.dbg.value(metadata i16 %value, metadata "!variable", metadata DIExpression(DW_OP_constu, 15, DW_OP_shr, DW_OP_lit0, DW_OP_not, DW_OP_mul, DW_OP_LLVM_fragment, 16, 16))
```
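The intent of the two fragments can be sketched as follows. The function names are hypothetical, the stack is assumed to be 64 bits wide, and the input is assumed to be a clean 16-bit value; how a debugger would actually assemble the pieces is not modelled here.

```python
MASK64 = (1 << 64) - 1  # assumption: 64-bit generic DWARF stack entries


def low_fragment(value16):
    """Bits 0-15 of the variable: the 16-bit value itself."""
    return value16 & 0xFFFF


def high_fragment(value16):
    """Bits 16-31: (value >> 15) * ~0, truncated to the 16-bit fragment."""
    sign = value16 >> 15        # DW_OP_constu 15, DW_OP_shr
    all_ones = ~0 & MASK64      # DW_OP_lit0, DW_OP_not
    return (sign * all_ones) & 0xFFFF  # DW_OP_mul, then keep 16 bits


def compose(value16):
    """Reassemble the 32-bit variable from its two fragments."""
    return (high_fragment(value16) << 16) | low_fragment(value16)
```

For `0x8001` the high fragment is all ones and the composed value is `0xFFFF8001`. For `0x7FFF` the high fragment is zero and the value is unchanged.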

I've seen the discussion about DW_OP_convert. Would DW_OP_convert help in telling the debugger that any derefs should be 16 bits in this case? If so, I guess that would still be a good option for DWARF 5.

A similar problem also exists for the zext case below, at least for big endian when dereferencing memory, since we get the wrong value in the least significant bits when reading 32 bits from a 16-bit stack slot.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56587/new/

https://reviews.llvm.org/D56587