[PATCH] D26769: [IR] Remove the DIExpression field from DIGlobalVariable.

Fri Nov 18 15:49:39 PST 2016

> On Nov 18, 2016, at 1:33 PM, David Blaikie <dblaikie at gmail.com> wrote:

<snip>

>> 
>>>> 
>>>> I don't quite understand this concern. Couldn't any transformation simply ignore the LLVM_OP_piece at the end of the DIExpression since it has no effect?
>>> 
>>> That's roughly my concern - if we have to teach things to ignore it, and otherwise special case it (when producing DWARF later)), why build it into the DIExpression byte stream in general rather than pulling it out to a separate metadata operand, etc? It doesn't seem to fit there.
>> 
>> I think you might be overstating how much work it is to "teach" a transformation to ignore it. All (current) transformations that rewrite DIExpressions (with the exception of the ones creating DW_OP_pieces) work by concatenating a new expression with the existing expression, e.g., by putting a DW_OP_deref at the beginning of the expression.
> 
> Fair point - inserting things into the stack just after the implied raw element's basically free/orthogonal to the rest of the expression.
>  
>> The DIExpression -> DWARF Expression compiler has to special-case many operations already because it wants to find the most efficient encoding for the expression. This involves things like combining the location with the first operation (i.e., merging [reg], DW_OP_deref -> DW_OP_breg) and rewriting DW_OP_constu into, e.g, DW_OP_const1u. So I don't think it complicates the expression compiler at all; it needs to handle it anyway, and it needs to handle it at the end of the expression.
> 
> It seems the handling needs to be pretty different - since it'll be wanting to look across all the DIExpressions for the fragments and finding the holes to create those, etc.
> 
> It just seems not really part of the expression - at best silently ignored in some cases. But perhaps I'm really not appreciating the current emission code.
> 
> In fact from an API perspective it looks like it's embedded as a fairly first-level concept (looking at DebugLocEntry::finalize and it calling DIExpression::getBitPiece{Offset,Size}) - sort of supports the idea that this doesn't fit into a generic piece of an expression, but a fairly special case thing.
> 

Okay, I thought about this :-)

I think we need introduce a new DIExpression node format that has
the correct semantics for DW_OP_bit_piece and additional support
for (at least) also storing the offset into the source
variable. Below are two ideas for how we could do this.

Regardless of which implementation we choose we will also need to
introduce a new bitcode record for DIExpressions with the
corrected semantics in order to safely upgrade existing bitcode.

First let's revisit what DIExpression looks like:

// Slightly abbreviated in-memory representation.
class DIExpression {
  std::vector<uint64_t> Elements;
}

IR Syntax for the last 2 bytes of a 6-byte variable:
!DIExpression(..., DW_OP_bit_piece, 16, 32)

Proposal A
==========

We could add two extra fields to DIExpression to explicitly store
the size of the fragment this expression describes, together with
offset of the fragment inside the source variable. Use of
DW_OP_(bit_)piece is explicitly disallowed in this variant.

// In-memory.
class DIExpression {
  std::vector<uint64_t> Elements;
  uint64_t FragmentBitOffset = 0;
  optional<uint64_t> FragmentBitSize;
}

IR Syntax for the last 2 bytes of a 6-byte variable:
!DIExpression(..., fragmentBitSize: 16, fragmentBitOffset: 32)

Proposal B
==========

We could allow expressions with DW_OP_(bit_)piece and only store
the fragment offset. As shown in the second IR example, this
allows us to express things that would otherwise need masking and
shifting:

// In-memory.
class DIExpression {
  std::vector<uint64_t> Elements;
  uint64_t FragmentBitOffset = 0;
}

IR Syntax for the last 2 bytes of a 6-byte variable:
!DIExpression(..., DW_OP_piece, 2, fragmentBitOffset: 32)

IR Syntax for bits 8-11 of a variable stored in the upper nibble of a location.
!DIExpression(DW_OP_bit_piece, 4, 4, fragmentBitOffset: 8)

Proposal B doesn't meet your criteria for expression
transformations to not need to know about DW_OP_bit_piece, but at
least from my point of view that shouldn't cause a lot of trouble
in practice.

What do you think?
-- adrian