[PATCH] D59105: [RFC] Create an Arbitrary Precision Integer Type.

Thu Mar 7 14:05:47 PST 2019

rsmith added a comment.

In principle, I think an extension in this space seems reasonable and useful. Considering the points in http://clang.llvm.org/get_involved.html, I'm happy to assume you meet points 2, 5, and 6, but I'd like to see a written-up rationale for the other points. (In particular, you do not yet meet the bar for points 3 and 7, and if you're not proposing this to WG14 or WG21 or similar, some amount of rationale justifying that is necessary.)

I don't like the name `ArbPrecIntType`; this is not an arbitrary-precision integer type, it's a fixed-width integer type. Something like `ExtIntegerType` might fit better.

Defining new types via attributed `typedef`s is a horrible hack, and I would not like to extend its scope any further. Have you considered alternative interfaces for specifying the type? For instance, we could support an `__int(N)` //type-specifier//. (I would expect such a specifier to behave like a normal integer width //type-specifier// (eg, like `short` or `long` or `__int128`), so we would allow things like `unsigned __int(54)`.)

Have you considered whether `__int(128)` (however you spell it) should be the same type as the existing `__int128`? (I think it makes sense for the `__int(N)` types to be distinct from all the standard types, even if `N` matches, unlike the behavior of the GNU `mode` attribute. But the relationship to `__int128` is less clear.)

How do integral promotions behave on these types? Are these types intended to follow the rules for extended integer types? If not, in what ways, and why not?

I'd like to see some tests for use of these types in these contexts:

- bit-fields
- enum-bases
- non-type template arguments (incl. mangling thereof)
- UBSan, and in particular `-fsanitize=signed-integer-overflow` (this will break; UBSan's static data representation assumes that integer types have power-of-two sizes)
- overload resolution with these types as parameters / arguments, particularly involving overload sets also including standard integer types
- TBAA metadata (I assume there is no aliasing relationship between these and standard integer types)

================
Comment at: include/clang/Basic/AttrDocs.td:4123-4126
+Since FPGAs have the ability to produce non-power-of-2 math units, there is a
+requirement to have an integer type that supports an arbitrary compile time bit
+representation. These need to be expressed in IR as the correct precision int
+types (such as i3, i46, etc).
----------------
Please frame the documentation around the feature itself rather than your motivating use case. Mentioning FPGAs as an aside might make some sense, but it shouldn't be the primary introduction to the extension's documentation.

================
Comment at: include/clang/Basic/AttrDocs.td:4139-4141
+this type has no affect on the ArbPrecInt. All operations on an ArbPrecInt
+should take place as an ArbPrecInt type, promoting to the larger of the two
+types.
----------------
Please mention that you convert to unsigned if the sizes match, like standard integer types do.

Please also describe the promotion rules for these types.

================
Comment at: lib/AST/ASTContext.cpp:1687-1693
+  // toCharUnitsFromBits always rounds down, which isn't a problem when the size
+  // Width is a multiple of chars, however ArbPrecInt makes this not a valid
+  // assumption.
+  return std::make_pair(toCharUnitsFromBits(Info.Width) +
+                            (Info.Width % getCharWidth() == 0
+                                 ? CharUnits::Zero()
+                                 : CharUnits::One()),
----------------
This should not be necessary. `getTypeInfo` should not be returning sizes that are not a multiple of the bit-width of `char`.

================
Comment at: lib/AST/ASTContext.cpp:1788
            "Overflow in array type bit size evaluation");
-    Width = EltInfo.Width * Size;
+    Width = llvm::alignTo(EltInfo.Width, EltInfo.Align) * Size;
     Align = EltInfo.Align;
----------------
Isn't this wrong for arrays of `__attribute__((aligned(N)))` types? Eg: https://godbolt.org/z/jOW_B5

(GCC changed this and broke ABI in version 5 onwards, but if/when we follow suit it should be an intentional decision, likely with a `-fclang-abi-compat=` flag, not a side-effect of an unrelated change such as this one.)

================
Comment at: lib/AST/ASTContext.cpp:2132
+    const ArbPrecIntType *AT = cast<ArbPrecIntType>(T);
+    Width = AT->getNumBits();
+    Align = std::min(std::max(getCharWidth(), llvm::PowerOf2Ceil(Width)),
----------------
Please do not add another case where a type's size is not a multiple of its alignment. This behavior of `__attribute__((aligned(N)))` is a mistake that we should not repeat. (Remember that `sizeof(T)` is supposed to be the size of `T` as an array element.) `Width` should include the padding necessary to reach a multiple of the alignment.

================
Comment at: lib/AST/ASTContext.cpp:2133-2134
+    Width = AT->getNumBits();
+    Align = std::min(std::max(getCharWidth(), llvm::PowerOf2Ceil(Width)),
+                     static_cast<uint64_t>(64));
+    break;
----------------
The use of 64 as the largest alignment here is arbitrary. Using the alignment of `long long` would make more sense, but that choice should be made by a method on `Target`.

================
Comment at: lib/AST/ASTContext.cpp:5710-5711

+  if (isa<ArbPrecIntType>(T))
+    return 7 + (getIntWidth(QualType(T, 0)) << 3);
+
----------------
This looks wrong: you're ranking `__int(2)` above `long`, which violates the integer conversion rank rules for extended integer types. You also give `__int128` and `__int(128)` the same rank despite being different types, which also violates the rank rules. (But I'd suggest fixing that by making them the same type.)

================
Comment at: lib/AST/ItaniumMangle.cpp:3377
+void CXXNameMangler::mangleType(const ArbPrecIntType *T) {
+  Out << "11clang_apint_";
+  mangleNumber(T->getNumBits());
----------------
Please use a reserved vendor mangling.

If you take my suggestion for the source-level interface, something like `U5__intILi120Ei` might make sense. With your current approach it should be `Ua8__ap_intILi120Ei`.

================
Comment at: lib/Sema/SemaExpr.cpp:9030-9046
+  // Always convert to the biggest type.
+  if (LHSBits > RHSBits) {
+    RHS = doIntegralCast(*this, RHS.get(), LHSType);
+    return LHSType;
+  }
+
+  if (LHSBits < RHSBits) {
----------------
This is moderately terrible. I mean, no worse than the existing language rules, but still, moderately terrible. =(

================
Comment at: lib/Sema/SemaExpr.cpp:9153-9156
+  if (LHS.get()->getType()->isArbPrecIntType() ||
+      RHS.get()->getType()->isArbPrecIntType())
+    return CheckArbPrecIntOperands(LHS, RHS, Loc, IsCompAssign,
+                                   /*IsShift=*/false);
----------------
Should this be part of `UsualArithmeticConversions` instead of being duplicated everywhere?

================
Comment at: lib/Sema/SemaOverload.cpp:7399

   /// The set of vector types that will be used in the built-in
   /// candidates.
----------------
"vector"

================
Comment at: lib/Sema/SemaType.cpp:2074-2076
+  if (!NumBitsExpr->isTypeDependent() && !NumBitsExpr->isValueDependent()) {
+    llvm::APSInt Bits(32);
+    if (!NumBitsExpr->isIntegerConstantExpr(Bits, Context)) {
----------------
Use `VerifyIntegerConstantExpression` here. It'll do contextual implicit conversion to an integer type for you (in C++11 onwards), and properly diagnose why the expression is non-constant if necessary.

================
Comment at: lib/Sema/SemaType.cpp:2100
+    }
+    return Context.getArbPrecIntType(T, NumBits, AttrLoc);
+  }
----------------
We should enforce an upper bound on the size of such a type. (Eg, we shouldn't permit an integer type that doesn't fit in the address space, and we may want a lower limit than that for sanity.)

================
Comment at: test/Sema/arb_prec_int.c:18
+typedef unsigned int __attribute__((__ap_int(0))) ZeroSizeUnsigned;
+// expected-error at +1{{signed __ap_int must have a size of at least 2}}
+typedef int __attribute__((__ap_int(1))) SignedOneSize;
----------------
Why? A signed bitfield can have a width of 1 (representing 0 and -1).

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59105/new/

https://reviews.llvm.org/D59105