[PATCH] D116203: [clang] adds unary type transformations as compiler built-ins

Thu Jul 7 13:47:29 PDT 2022

rsmith added inline comments.

================
Comment at: clang/lib/AST/ASTContext.cpp:10733-10736
   // For enums, get the underlying integer type of the enum, and let the general
   // integer type signchanging code handle it.
   if (const auto *ETy = T->getAs<EnumType>())
     T = ETy->getDecl()->getIntegerType();
----------------
This isn't the correct behavior for `make_unsigned`. We need to pick the integer type with the lowest rank with the same size as the enumeration (see [the corresponding wording](http://eel.is/c++draft/meta.trans#tab:meta.trans.sign-row-3-column-2-sentence-1)). For example, if the underlying type of `E` is `long` and `sizeof(long) == sizeof(int)`, then `make_unsigned<E>` is required to produce `unsigned int` not `unsigned long`.

It might be that the best approach here is for `BuiltinChangeSignedness` to only call `getCorrespondingUnsignedType` when `T` is a signed or unsigned integral type (notably excluding `char16_t` and friends -- see next comment -- but including `_BitInt`), and otherwise for it to look through the signed / unsigned integer types for the first one with the same size as `T` and pick that.

================
Comment at: clang/lib/AST/ASTContext.cpp:10747
+  case BuiltinType::Char16:
     return UnsignedShortTy;
   case BuiltinType::Int:
----------------
Mapping `char16_t` -> `unsigned short` seems target-specific to me. If plain `unsigned char` is 16 bits wide, then `make_unsigned<char16_t>` should be `unsigned char`. Same for `char32_t`, `wchar_t`.

================
Comment at: clang/lib/AST/ASTContext.cpp:5863
   // FIXME: derive from "Target" ?
-  return WCharTy;
+  return IntTy;
 }
----------------
cjdb wrote:
> rsmith wrote:
> > This change seems surprising. Can you explain what's going on here?
> Motivated by `__make_signed(wchar_t)` previously returning `wchar_t`, and that it was seemingly inconsistent with `getUnsignedWCharType` below.
It looks to me like this breaks the implementation of a GCC compatibility feature. In GCC 8 and before, `signed wchar_t` is [accepted and means `wchar_t`](https://godbolt.org/z/5af7n9bef) and similarly `unsigned wchar_t` is accepted and means `unsigned int`, and Clang provides that behavior here; with this change, `signed wchar_t` will resolve to `int` instead in Clang, deviating from GCC's behavior for this GCC compatibility feature.

I think I'd be happy to see that GCC compatibility feature removed entirely, especially given that GCC 9 onwards removed it, it's an error by default in Clang, and [essentially no-one seems to be using it](https://www.google.com/search?q=%22-Wno-signed-unsigned-wchar%22), but that should not be done in this patch, and we shouldn't change its behavior here either.

================
Comment at: clang/lib/Parse/ParseDeclCXX.cpp:1149-1150
+  default:
+    assert(false && "passed in an unhandled type transformation built-in");
+    llvm_unreachable("passed in an unhandled type transformation built-in");
+  }
----------------
We don't need both of these. Just the `llvm_unreachable` would suffice.

================
Comment at: clang/lib/Parse/ParseExpr.cpp:1755-1756
+      Tok.setKind(tok::identifier);
+      Diag(Tok, diag::ext_keyword_as_ident)
+          << Tok.getIdentifierInfo()->getName() << 0;
+      goto ParseIdentifier;
----------------
Is it feasible to also produce this warning for the corresponding case in `MaybeParseTypeTransformTypeSpecifier` / the callers of that function?

================
Comment at: clang/lib/Sema/SemaType.cpp:9228
+QualType Sema::BuiltinAddPointer(QualType BaseType, SourceLocation Loc) {
+  DeclarationName EntityName(BaseType.getBaseTypeIdentifier());
+  QualType PointerToT =
----------------
The name that's wanted by `BuildPointerType` is the name of the pointer variable, not the name of the type the pointer points to. Eg, for `int &*p;` it wants to say "`p` declared as a pointer to a reference". Passing in the name of the base type will mean that `__add_pointer(class X&)` could produce diagnostics like "`X` declared as a pointer to a reference", which is nonsense. Fortunately, that diagnostic is impossible here because we never pass it a reference, so you should be able to just pass a null `DeclarationName` instead.

================
Comment at: clang/lib/Sema/SemaType.cpp:9233-9234
+          : BaseType;
+  return Context.getUnaryTransformType(BaseType, PointerToT,
+                                       UTTKind::AddPointer);
+}
----------------
`BuildPointerType` can fail and return a null `QualType`; you'll need to detect that and do something different here -- either return the error to the caller or fall back to the original type for error recovery.

================
Comment at: clang/lib/Sema/SemaType.cpp:9244-9246
+  Qualifiers Quals = Pointee.getQualifiers();
+  return Context.getUnaryTransformType(
+      BaseType, Context.getQualifiedType(Pointee, Quals), UKind);
----------------
No need to add `Pointee`'s qualifiers to `Pointee`; it already has them.

================
Comment at: clang/lib/Sema/SemaType.cpp:9268
+  assert(LangOpts.CPlusPlus);
+  DeclarationName EntityName(BaseType.getBaseTypeIdentifier());
+  QualType PointerToT =
----------------
As above, this isn't the right name to pass, and we should not pass any name given that we don't have one.

================
Comment at: clang/lib/Sema/SemaType.cpp:9275
+          : BaseType;
+  return Context.getUnaryTransformType(BaseType, PointerToT, UKind);
+}
----------------
`BuildReferenceType` can fail and return a null `QualType`, and that will need to be handled here.

================
Comment at: clang/lib/Sema/SemaType.cpp:9335-9336
+  QualType Underlying = IsMakeSigned
+                            ? Context.getCorrespondingSignedType(BaseType)
+                            : Context.getCorrespondingUnsignedType(BaseType);
+  Underlying = Context.getQualifiedType(Underlying, BaseType.getQualifiers());
----------------
See comments in `getCorrespondingUnsignedType`: these don't perform the slightly odd "lowest rank type with this size" computation specified in the standard, in the case where the given type is an enum or character type.

================
Comment at: clang/test/SemaCXX/type-traits.cpp:3269
+  { int a[T(__is_same(remove_cvref_t<int S::*const volatile>, int S::*))]; }
+  { int a[T(__is_same(remove_cvref_t<int (S::*const volatile)()>, int(S::*)()))]; }
+  { int a[T(__is_same(remove_cvref_t<int (S::*const volatile)() &>, int(S::*)() &))]; }
----------------
cjdb wrote:
> rsmith wrote:
> > The tests you have here for references to pointers-to-members seem excessive (throughout); I don't think there's any reason to think that a reference to pointer-to-member would be so different from a reference to any other type that would justify this level of testing of the former case.
> > 
> > The cases that seem interesting here are the ones for non-referenceable types, but those are covered separately below.
> I've removed all but one pointer-to-member function.
I'm still seeing 39 pointer-to-member tests here in `check_remove_cvref`, 39 in `check_decay`, and 25 in `check_remove_reference`, all of which seem to be testing the same thing (that these traits don't look "inside" a pointer-to-member's pointee type in any way). Can those be cut down a bit?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116203/new/

https://reviews.llvm.org/D116203