[PATCH] D59105: [RFC] Create an Arbitrary Precision Integer Type.

Fri Mar 8 06:41:46 PST 2019

erichkeane marked 7 inline comments as done.
erichkeane added a comment.

Thank you @rsmith for the quick review!  I really appreciate it.

In D59105#1422088 <https://reviews.llvm.org/D59105#1422088>, @rsmith wrote:

> In principle, I think an extension in this space seems reasonable and useful. Considering the points in http://clang.llvm.org/get_involved.html, I'm happy to assume you meet points 2, 5, and 6, but I'd like to see a written-up rationale for the other points. (In particular, you do not yet meet the bar for points 3 and 7, and if you're not proposing this to WG14 or WG21 or similar, some amount of rationale justifying that is necessary.)

I can definitely work on the test suite, and I'll attempt to improve the specification.  At the moment I don't believe we have an effort to get this into WG14/WG21, however I'll see if it is possible to figure out our interest here.

> I don't like the name `ArbPrecIntType`; this is not an arbitrary-precision integer type, it's a fixed-width integer type. Something like `ExtIntegerType` might fit better.
> 
> Defining new types via attributed `typedef`s is a horrible hack, and I would not like to extend its scope any further. Have you considered alternative interfaces for specifying the type? For instance, we could support an `__int(N)` //type-specifier//. (I would expect such a specifier to behave like a normal integer width //type-specifier// (eg, like `short` or `long` or `__int128`), so we would allow things like `unsigned __int(54)`.)

I'd considered a type specifier, though I'd convinced myself that doing this more akin to the vector types would be more acceptable.

> Have you considered whether `__int(128)` (however you spell it) should be the same type as the existing `__int128`? (I think it makes sense for the `__int(N)` types to be distinct from all the standard types, even if `N` matches, unlike the behavior of the GNU `mode` attribute. But the relationship to `__int128` is less clear.)

In our opinion, __int(64) should be different than long/long long/etc, and __int(16) shouldn't be short.  That is because these types are useful when they don't follow common promotion rules, which force it to promote at in-opportune times (though, moreso with short, the rest were due to creating new rules anyway).  Additionally, the ability to overload on these types has been useful to our customers.  I'm not sure I understand enough of __int128 to determine whether there is sufficient motivation to make these separate types.

> How do integral promotions behave on these types? Are these types intended to follow the rules for extended integer types? If not, in what ways, and why not?

This question feels like a trap :)   While I wouldn't dare to claim understanding of the rules, I believe the only difference is exclusion from integral promotion rules.

> I'd like to see some tests for use of these types in these contexts:
> 
> - bit-fields
> - enum-bases
> - non-type template arguments (incl. mangling thereof)
> - UBSan, and in particular `-fsanitize=signed-integer-overflow` (this will break; UBSan's static data representation assumes that integer types have power-of-two sizes)
> - overload resolution with these types as parameters / arguments, particularly involving overload sets also including standard integer types
> - TBAA metadata (I assume there is no aliasing relationship between these and standard integer types)

================
Comment at: lib/AST/ASTContext.cpp:1788
            "Overflow in array type bit size evaluation");
-    Width = EltInfo.Width * Size;
+    Width = llvm::alignTo(EltInfo.Width, EltInfo.Align) * Size;
     Align = EltInfo.Align;
----------------
rsmith wrote:
> Isn't this wrong for arrays of `__attribute__((aligned(N)))` types? Eg: https://godbolt.org/z/jOW_B5
> 
> (GCC changed this and broke ABI in version 5 onwards, but if/when we follow suit it should be an intentional decision, likely with a `-fclang-abi-compat=` flag, not a side-effect of an unrelated change such as this one.)
Oh dear... I was unaware of aligned.  Thats... special.

================
Comment at: lib/AST/ASTContext.cpp:2132
+    const ArbPrecIntType *AT = cast<ArbPrecIntType>(T);
+    Width = AT->getNumBits();
+    Align = std::min(std::max(getCharWidth(), llvm::PowerOf2Ceil(Width)),
----------------
rsmith wrote:
> Please do not add another case where a type's size is not a multiple of its alignment. This behavior of `__attribute__((aligned(N)))` is a mistake that we should not repeat. (Remember that `sizeof(T)` is supposed to be the size of `T` as an array element.) `Width` should include the padding necessary to reach a multiple of the alignment.
I think you're right here of course.  I believe you've likely noticed I end up doing an 'align-to' all over the place.  I think I should have just done it here.

================
Comment at: lib/AST/ASTContext.cpp:2133-2134
+    Width = AT->getNumBits();
+    Align = std::min(std::max(getCharWidth(), llvm::PowerOf2Ceil(Width)),
+                     static_cast<uint64_t>(64));
+    break;
----------------
rsmith wrote:
> The use of 64 as the largest alignment here is arbitrary. Using the alignment of `long long` would make more sense, but that choice should be made by a method on `Target`.
Makes sense.

================
Comment at: lib/AST/ASTContext.cpp:5710-5711

+  if (isa<ArbPrecIntType>(T))
+    return 7 + (getIntWidth(QualType(T, 0)) << 3);
+
----------------
rsmith wrote:
> This looks wrong: you're ranking `__int(2)` above `long`, which violates the integer conversion rank rules for extended integer types. You also give `__int128` and `__int(128)` the same rank despite being different types, which also violates the rank rules. (But I'd suggest fixing that by making them the same type.)
Hmm... this actually ends up being a pretty interesting thing to try to work through. There is probably some pretty interesting work I'll have to do to make sure they better fit in this list.

================
Comment at: lib/Sema/SemaExpr.cpp:9030-9046
+  // Always convert to the biggest type.
+  if (LHSBits > RHSBits) {
+    RHS = doIntegralCast(*this, RHS.get(), LHSType);
+    return LHSType;
+  }
+
+  if (LHSBits < RHSBits) {
----------------
rsmith wrote:
> This is moderately terrible. I mean, no worse than the existing language rules, but still, moderately terrible. =(
I'd hoped to merge this with the usual conversion functions, however they were quite insistent on converting things to "int" when smaller than that, which absolutely trashes FPGA performance.

================
Comment at: lib/Sema/SemaExpr.cpp:9153-9156
+  if (LHS.get()->getType()->isArbPrecIntType() ||
+      RHS.get()->getType()->isArbPrecIntType())
+    return CheckArbPrecIntOperands(LHS, RHS, Loc, IsCompAssign,
+                                   /*IsShift=*/false);
----------------
rsmith wrote:
> Should this be part of `UsualArithmeticConversions` instead of being duplicated everywhere?
I ended up pushing this down the stack at one point, and had problems with the all-math-at-int rules, but perhaps this could be put at the beginning of UsualArithmeticConversions.

================
Comment at: test/Sema/arb_prec_int.c:18
+typedef unsigned int __attribute__((__ap_int(0))) ZeroSizeUnsigned;
+// expected-error at +1{{signed __ap_int must have a size of at least 2}}
+typedef int __attribute__((__ap_int(1))) SignedOneSize;
----------------
rsmith wrote:
> Why? A signed bitfield can have a width of 1 (representing 0 and -1).
Well, huh, TIL.  That is somewhere between interesting and horrifying.  

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59105/new/

https://reviews.llvm.org/D59105