[LLVMdev] please review this fix for PR3510

Mon Feb 23 10:17:21 PST 2009

Hi Stuart,

> The original code was greatly simplified by using FieldNo to index the  
> LLVM fields and the initializer in lock-step.  Alas, that fails when  
> the initializer requires alignment padding; such padding is not  
> recorded in the LLVM structure type.  This implies that the  
> initialized ResultElts[] may have many more fields than the LLVM type.
> 
> The patched code tracks the HighWaterMarkInBits; it points to the next
> bit to allocate.  If the starting offset of the next initializing  
> value doesn't match, either byte-alignment or padding is indicated.   
> FieldNo counts the fields (initializers or padding) created in  
> ResultElts[], and LLVMFieldNo walks through the LLVM structure type  
> fields.  Note that ResultElts[] is now created to hold 2X the number  
> of LLVM fields in order to accommodate padding fields; it is shrunk-to- 
> fit after the initializer is complete.

these explanations should be in the code as comments.

> I've seen hints in the code that C++ can generate overlapping field  
> declarations, but I haven't personally seen this, and my patch has no  
> explicit provision for it.  If any reader has experience with this  
> issue, I would be grateful for assistance or a demonstrating  
> testcase.  I would not be surprised if my patch failed when  
> encountering an unholy convergence of packed designated bitfields or  
> whatever.

Please add assertions to catch such pathologies.

> The existing ProcessBitFieldInitialization() is invoked for every  
> bitfield; it expects to be passed the same FieldNo pointing at a  
> previously-created initializer so it can modify a previously-existing  
> constant value.  This is counter-intuitive when compared with the non- 
> bitfield case that increments FieldNo after every initialization is  
> processed, but I believe this is necessary.  However, there is an  
> awkward mental gear-shift when encountering a non-bitfield initializer  
> following a bitfield; this is handled with an ugly  
> "PredecessorWasBitfield" state-variable.  (Suggestions for a less- 
> inelegant solution are welcome.)

Please explain all this with comments in the code.

> The patch handles the PR3510 testcase correctly and has successfully  
> passed the GCC DejaGNU testsuite.  However, I don't think we have  
> sufficient tests for this issue, so I'm working on a few new testcases  
> in the background.

Good!

As for the patch itself:

> +++ llvm-gcc.test/gcc/testsuite/gcc.apple/6564697.c	(revision 0)

I'd rather have the test in the LLVM testsuite.  However I know
opinions differ on this.

>  /// ProcessBitFieldInitialization - Handle a static initialization of a

For the moment I will only comment on the non-bitfield case.

>  Constant *TreeConstantToLLVM::ConvertRecordCONSTRUCTOR(tree exp) {
>    const StructType *STy = cast<StructType>(ConvertType(TREE_TYPE(exp)));
>    std::vector<Constant*> ResultElts;
> -  ResultElts.resize(STy->getNumElements());
> -
> +  // In theory, /every/ field might need padding.
> +  ResultElts.resize(2 * STy->getNumElements());

I think you should just *reserve* this much size (as an optimization)
and use push_back to add elements.  Since the initializer can result
in generating a different LLVM type to STy, best not to make any serious
assumptions based on STy anywhere.

I think it is important to note that this routine is *not* required
to return a struct of the type STy (and in general it doesn't).  If
you don't even try to return STy, then that gives you some more freedom
which can perhaps be exploited to simplify things a bit.  For example,
it is valid to insert padding elements that may not exist in STy!  But
then it is probably very unwise to use STy anywhere, since it has no
particular connection with the final struct you construct.

> +  bool PredecessorWasBitfield = false;
>    tree NextField = TYPE_FIELDS(TREE_TYPE(exp));
>    unsigned HOST_WIDE_INT ix;
>    tree elt_value;
>    tree Field; // The fielddecl for the field.
> +  unsigned int FieldNo=0, LastBittenFieldNo;

What is LastBittenFieldNo (please add a comment)?  If you push elements onto the
back of ResultElts then you don't need FieldNo.

> +  uint64_t HighWaterMarkInBits=0;

What is HighWaterMarkInBits (please add a comment)?

> +  uint64_t FieldOffsetInBits;
>    FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (exp), ix, Field, elt_value) {
> -    unsigned FieldNo;
>      if (Field == 0) {           // If an explicit field is specified, use it.
>        Field = NextField;
>        // Advance to the next FIELD_DECL, skipping over other structure members
> @@ -7134,18 +7139,57 @@ Constant *TreeConstantToLLVM::ConvertRec
>          if (TREE_CODE(Field) == FIELD_DECL) break;
>        }
>      }
> +    // Get the offset (beginning_of_field - beginning_of_struct) from GCC.
> +    FieldOffsetInBits = getFieldOffsetInBits(Field);
>  
>      // Decode the field's value.
>      Constant *Val = Convert(elt_value);
>  
> +    // If there's a hole in the initializer, create padding for it.
> +    assert(HighWaterMarkInBits <= FieldOffsetInBits
> +	   && "misinterpreted struct initializer");

This assertion is probably correct, but it's hard to be sure without
knowing what HighWaterMarkInBits is exactly.  Presumably it is the
location of the first bit after the end of the last field in the new
LLVM type you are creating.  Did you run the gcc testsuite and see if
this fires anywhere?  It might fire I think if some field has DECL_SIZE
smaller than the TYPE_SIZE.  Sadly, this is likely to be possible.  If
this can occur then you need to return a packed LLVM type as the initializer,
which means starting all over again.

> +    uint64_t PaddingSizeInBits = FieldOffsetInBits - HighWaterMarkInBits;
> +    // If there's a gap of at least one byte, pad it.

Is it possible for this difference to not be a multiple of 8?  For example,
because the new field is a bitfield that starts partway in to a byte.  You 
should at least add an assertion here to check this.  Also, the padding may
not be needed if the new field is sufficiently aligned, since then the padding
may be automatically (implicitly) be present.

>        // If not, things are much simpler.
> -      unsigned int FieldNo = GetFieldIndex(Field);
> -      assert(FieldNo < ResultElts.size() && "Invalid struct field number!");
> +      const StructLayout *SL = getTargetData().getStructLayout(STy);
> +      unsigned LLVMFieldNo = SL->getElementContainingOffset(FieldOffsetInBits/8);

I think that you should avoid looking at STy if at all possible.  After all,
you are creating a potentially entirely new LLVM type using the initializer
elements - the declared type of the global simply isn't very relevant.

> +      assert((HighWaterMarkInBits & 7) == 0
> +	     && "expected byte alignment at minimum");
>  
>        // Example: struct X { int A; char C[]; } x = { 4, "foo" };
>        assert((TYPE_SIZE(getDeclaredType(Field)) ||
> @@ -7164,21 +7208,26 @@ Constant *TreeConstantToLLVM::ConvertRec
>        // "int" for example.  If we ignored this, we would lay out the
>        // initializer wrong.
>        if (TYPE_SIZE(getDeclaredType(Field)) &&
> -          Val->getType() != STy->getElementType(FieldNo))
> +          Val->getType() != STy->getElementType(LLVMFieldNo))
>          Val = ConvertStructFieldInitializerToType(Val,
> -                                                  STy->getElementType(FieldNo));
> +                                                  STy->getElementType(LLVMFieldNo));

I don't see the point of ConvertStructFieldInitializerToType.  Since in general
we are happy to return an initializer with a different type to the global, this
shouldn't be needed.  In my dream world STy would not be used anywhere in this
method.  Every time I see it a little red light flashes in my head saying "probable
bug"!

>  
> -      ResultElts[FieldNo] = Val;
> +      ResultElts[FieldNo++] = Val;

If the LLVM type is more aligned than the gcc type, then you may get
implicit padding here, meaning the LLVM field gets placed at the wrong
offset.  In this case you will need to returned a packed LLVM type, which
means starting all over again.

>      }
>  
> +    HighWaterMarkInBits += TREE_INT_CST_LOW(DECL_SIZE(Field));

Since HighWaterMarkInBits represents the position of the next LLVM
element (at least I think it does...) you shouldn't be using the gcc
field size to update it, you should use the LLVM padded size.  While
the LLVM padded size will be equal to the gcc type size, DECL_SIZE(field)
could be something else (usually smaller).  This may cause problems (see
above).

> +  assert(FieldNo <= ResultElts.size()
> +	 && "Either LLVM & GCC structs are dissimilar,"
> +	 " or too much padding was generated!");

I think they can be fairly dissimilar.  Best to use push_back and
drop this assertion.

>    
> -  // Fill in null elements with zeros.
> -  for (unsigned i = 0, e = ResultElts.size(); i != e; ++i) {
> -    if (ResultElts[i] == 0)
> -      ResultElts[i] = Constant::getNullValue(STy->getElementType(i));
> -  }
> +  ResultElts.resize(FieldNo);

I think the idea with this code is that an initializer is allowed
to be shorter than the type of the variable (omitted fields should
be zero initialized).  However it looks like the following code
takes care of it, so no problem!  (By the way, there's no need to
adjust newLLVMSize for the alignment, getTypePaddedSize has already
done this - can you please clean this while you're there.  Thanks!).

Ciao,

Duncan.