[PATCH] D62825: [C++2a] Add __builtin_bit_cast, used to implement std::bit_cast

Thu Jun 13 12:43:14 PDT 2019

erik.pilkington added a comment.

In D62825#1528329 <https://reviews.llvm.org/D62825#1528329>, @rsmith wrote:

> Can we also use the same bit reader/writer implementation to generalize the current implementation of `__builtin_memcpy` and friends? (I don't think we can remove the existing implementation, sadly, since we allow copying arrays of the same trivially-copyable type today even if it contains pointers and unions and such.)

Yeah, I think we could support that. Sounds like follow-up stuff though :)

================
Comment at: clang/include/clang/AST/OperationKinds.def:69-72
+/// CK_CXXBitCast - Represents std::bit_cast. Like CK_BitCast, but with very few
+/// semantic requirements, namely, the source and destination types just need to
+/// be of the same size and trivially copyable. There is no CK_LValueBitCast
+/// analog for this, since std::bit_cast isn't able to create a need for it.
----------------
rsmith wrote:
> I'm not sure I understand why we don't just use `CK_BitCast` for this. What's the difference? (The `CastKind` just specifies how to do the conversion, and doesn't have anything to do with the semantic requirements -- those should be checked by whatever code builds the cast.)
Sure, new patch just folds this into CK_BitCast.

================
Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:9715-9718
+def err_bit_cast_non_trivially_copyable : Error<
+  "__builtin_bit_cast %select{source|destination}0 type must be a trivially copyable">;
+def err_bit_cast_type_size_mismatch : Error<
+  "__builtin_bit_cast source size does not match destination size (%0 and %1)">;
----------------
rsmith wrote:
> If these diagnostics are going to be produced by instantiations of `std::bit_cast`, it'd be more user-friendly to avoid mentioning `__builtin_bit_cast` at all (we'll presumably have a diagnostic pointing at it). So maybe s/__builtin_bit_cast/bit cast/?
An invalid __builtin_bit_cast should never really be created, since `std::bit_cast` guards this with a SFINAE check, so I think its more appropriate to use the `__builtin` spelling. I just included these diagnostics for completeness, I don't think anyone would actually run into them.

================
Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:9718
+def err_bit_cast_type_size_mismatch : Error<
+  "__builtin_bit_cast source size does not match destination size (%0 and %1)">;
 } // end of sema component.
----------------
jfb wrote:
> What does "match" mean? Can this be clearer?
Sure, "does not match" -> "does not equal".

================
Comment at: clang/include/clang/Basic/Features.def:252
 EXTENSION(gnu_asm, LangOpts.GNUAsm)
+EXTENSION(builtin_bit_cast, true)

----------------
rsmith wrote:
> Should we identify this with `__has_builtin` instead of `__has_extension`? (We already list some of our keyword-shaped `__builtin_*`s there.)
Sure, that makes more sense.

================
Comment at: clang/lib/AST/ExprConstant.cpp:5359
+  // bit-fields.
+  SmallVector<Optional<unsigned char>, 32> Bytes;
+
----------------
rsmith wrote:
> This only works if `unsigned char` is large enough to hold one `CharUnit`, which is theoretically not guaranteed to be the case. Maybe we can defer handling that until we support bit-fields here (or beyond; this is far from the only place in Clang that's not yet ready for `char` being anything other than 8 bits) but we should at least have a FIXME.
Sure, I added a FIXME. 

================
Comment at: clang/lib/AST/ExprConstant.cpp:5364
+public:
+  APBuffer(CharUnits Width, bool TargetIsLittleEndian)
+      : Bytes(Width.getQuantity()),
----------------
jfb wrote:
> Use an `enum class` instead of a `bool`.
Why? Doesn't seem like it would be any more clear, and all the other endian values we're comparing against are bools, which would mean awkwardly translating from an bool -> enum -> bool whenever we want to use it. (i.e. llvm::sys::IsLittleEndianHost, `TargetInfo::isLittleEndian()` are both booleans)

================
Comment at: clang/lib/AST/ExprConstant.cpp:5377
+    }
+    if (llvm::sys::IsLittleEndianHost != TargetIsLittleEndian)
+      std::reverse(Output.begin(), Output.end());
----------------
jfb wrote:
> Similar to this, you probably want to assert that `CHAR_BIT == ` whatever we use to denote target bitwidth.
Sure, I added an assert down in `handleBitCast` (which has access to that information).

================
Comment at: clang/lib/AST/ExprConstant.cpp:5421-5423
+      Info.FFDiag(BCE->getBeginLoc(), diag::note_constexpr_uninitialized)
+          << true << Ty;
+      return false;
----------------
rsmith wrote:
> `APValue::None` means "outside lifetime", not "uninitialized".
New patch updates the diagnostic. 

================
Comment at: clang/lib/AST/ExprConstant.cpp:5447
+
+    case APValue::LValue: {
+      LValue LVal;
----------------
jfb wrote:
> rsmith wrote:
> > jfb wrote:
> > > rsmith wrote:
> > > > This will permit bitcasts from `nullptr_t`, providing a zero value. I think that's wrong; the bit representation of `nullptr_t` is notionally uninitialized (despite strangely having pointer size and alignment).
> > > > 
> > > > I think this is a specification bug: `bit_cast<intptr_t>(nullptr)` should be non-constant (it won't necessarily be `0` at runtime) but the current wording doesn't seem to permit us to treat it as non-constant.
> > > You brought this up in ABQ, the outcome was... poor notes so I have no idea what Core decided...
> > It looks like we never actually resolved what happens in this case ;-(
> IIRC we discussed having the same-ish issue with `bool`, or any other indeterminate bit pattern. Realistically it's zero...
The new patch writing anything (including an indeterminate value) to nullptr_t, and reading nullptr_t results in indeterminate bytes. 

================
Comment at: clang/lib/AST/ExprConstant.cpp:5459
+
+  bool visitRecord(const APValue &Val, QualType Ty, CharUnits Offset) {
+    const RecordDecl *RD = Ty->getAsRecordDecl();
----------------
jfb wrote:
> Does this properly fail when there's a vtable?
A virtual function would mean that the record isn't trivially copyable, so that case will be handled in Sema, or even in the SFINAE check. Added a test.

================
Comment at: clang/lib/AST/ExprConstant.cpp:5505
+    CharUnits ElemWidth = Info.Ctx.getTypeSizeInChars(CAT->getElementType());
+    expandArray(const_cast<APValue &>(Val), Val.getArraySize()-1);
+    for (unsigned I = 0, E = Val.getArraySize(); I != E; ++I) {
----------------
rsmith wrote:
> Expanding the array seems unnecessary (and isn't const-correct and might be modifying an actually-`const` temporary); it doesn't seem hard to use the array filler from here for elements past the last initialized one.
Yeah, good point. I use the filler in the new patch.

================
Comment at: clang/lib/AST/ExprConstant.cpp:5657
+
+  Optional<APValue> visit(const Type *Ty, CharUnits Offset) {
+    return unsupportedType(QualType(Ty, 0));
----------------
rsmith wrote:
> You support reading atomic types but not writing them. Is that intentional?
I haven't considered it until now, but its kinda a moot point, `_Atomic` types aren't trivially copyable, so a BuiltinBitCastExpr over an atomic type will never be created.

================
Comment at: clang/lib/Sema/SemaCast.cpp:2801-2802
+void CastOperation::CheckBitCast() {
+  if (isPlaceholder())
+    SrcExpr = Self.CheckPlaceholderExpr(SrcExpr.get());
+
----------------
rsmith wrote:
> Should we be performing the default lvalue conversions here too? Or maybe materializing a temporary? (Are we expecting a glvalue or prvalue as the operand of `bit_cast`? It seems unnecessarily complex for the AST representation to allow either.)
Okay, new patch filters the expression through DefaultLvalueConversion (which deals with placeholders too).

================
Comment at: clang/test/CodeGenCXX/builtin-bit-cast.cpp:8-15
+  // CHECK: [[BIT_CAST_TEMP_INT:%.*]] = alloca i32, align 4
+  // CHECK: [[BIT_CAST_TEMP_FLOAT:%.*]] = alloca float, align 4
+
+  // CHECK: store i32 43, i32* [[BIT_CAST_TEMP_INT]], align 4
+  // ...bitcast the memcpy operands to i8*...
+  // CHECK: call void @llvm.memcpy.p0i8.p0i8.i64
+  // CHECK-NEXT: [[TMP:%.*]] = bitcast i8* %1 to float*
----------------
rsmith wrote:
> This seems like an unnecessarily-complex lowering. Bitcasting via memory seems fine, but we don't need two separate `alloca`s here -- using only one would seem preferable (especially at -O0), we just need to make sure we don't emit any TBAA metadata for one side of the load/store pair. (Maybe in practice getting CodeGen to do that is too hard and it's not worth it though?)
The reason I did the double alloca was TBAA concerns, but it looks like convincing CodeGen to suppress it isn't that hard, so the new patch just does that. We also need to use the max(alignof(dest), alignof(src)) for the alloca.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62825/new/

https://reviews.llvm.org/D62825