[PATCH] D63371: Rewrite ConstStructBuilder with a mechanism that can cope with splitting and updating constants.

Richard Smith - zygoloid via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Jun 17 13:13:49 PDT 2019


rsmith added inline comments.


================
Comment at: lib/CodeGen/CGExprConstant.cpp:73
+/// Incremental builder for an llvm::Constant* holding a structure constant.
+class ConstantBuilder : private ConstantBuilderUtils {
+  llvm::SmallVector<llvm::Constant*, 32> Elems;
----------------
rjmccall wrote:
> This seems like a very generic name for this type.
It is intended to be a very generic type. (I was trying to arrange it so that it could possibly be moved to LLVM eventually. I heard that globalopt would benefit from being able to do this kind of constant splitting / reforming.) Is `ConstantAggregateBuilder` sufficiently more precise?


================
Comment at: lib/CodeGen/CGExprConstant.cpp:75
+  llvm::SmallVector<llvm::Constant*, 32> Elems;
+  llvm::SmallVector<CharUnits, 32> Offsets;
+  CharUnits Size = CharUnits::Zero();
----------------
rjmccall wrote:
> Are there invariants about these?  I assume they're parallel arrays; are they kept sorted?
Added comments to explain.


================
Comment at: lib/CodeGen/CGExprConstant.cpp:76
+  llvm::SmallVector<CharUnits, 32> Offsets;
+  CharUnits Size = CharUnits::Zero();
+
----------------
rjmccall wrote:
> This is one past the last byte that's been covered by an actual `Constant*` value, or does it include unoccupied padding, or does it exclude even occupied padding?
Added comment to explain.


================
Comment at: lib/CodeGen/CGExprConstant.cpp:98
+    Offsets.reserve(N);
+  }
+
----------------
rjmccall wrote:
> Might be worth clarifying what `N` is here.
Looks like this ended up being unused; removed.


================
Comment at: lib/CodeGen/CGExprConstant.cpp:190
+bool ConstantBuilder::addBits(llvm::APInt Bits, uint64_t OffsetInBits,
+                              bool AllowOverwrite) {
+  const ASTContext &Context = CGM.getContext();
----------------
rjmccall wrote:
> `AllowOversized` (which you used in the interface) seems like a better name.
`AllowOversized` is used to mean "the size of the constant may be larger than the size of the type", and is a parameter to `build` / `buildFrom`.
`AllowOverwrite` is used to mean "adding this constant may overwrite something you've already been given", and is a parameter to `add` / `addBits`.

I can make these names more different from each other if that would help?


================
Comment at: lib/CodeGen/CGExprConstant.cpp:258
+          return false;
+        assert(CI->getBitWidth() == CharWidth && "splitAt failed");
+        assert((!(CI->getValue() & UpdateMask) || AllowOverwrite) &&
----------------
rjmccall wrote:
> Oh, because we're splitting before and after a single-`CharUnits` range?  That seems worthy of a somewhat clearer explanation in the code.
> 
> I guess we could have a non-`ConstantInt` single-byte value.  Unlikely but not impossible. :)
Yes. It's not too hard to craft a testcase where we'd get an explicit `undef` here; added handling for that and for all-zeroes constants, which are both correctly handled by overwriting the whole byte.


================
Comment at: lib/CodeGen/CGExprConstant.cpp:375
+
+  // FIXME: We could theoretically split a ConstantInt if the need ever arose.
+
----------------
rjmccall wrote:
> Does this not come up all the time with bit-fields?  I guess we emit them in single-`char` chunks, so it wouldn't.  Probably worth a comment.
Done.

We could hit this case for cases such as:

```
union U { int a; int b : 3; };
struct S { U u; };
S s = {(union U){1234}, .u.b = 5};
```

(which `CodeGen` currently rejects with "cannot compile this static initializer yet" in C), and splitting the `ConstantInt` would allow us to emit that initializer as a constant, but I'm not sure it's worthwhile unless it lets us simplify or improve bitfield emission in general. (The above isn't a case that C requires us to treat as a constant initializer, so rejecting it is not a conformance issue.)

Maybe instead of splitting bitfields into 1-byte chunks like we currently do, we should try to combine them into a single `iN`, like `CGRecordLayoutBuilder` does. But splitting to `i8` maintains the status quo, which is what I was aiming for in this patch.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63371/new/

https://reviews.llvm.org/D63371





More information about the cfe-commits mailing list