[PATCH] D46915: [Fixed Point Arithmetic] Fixed Point Precision Bits and Fixed Point Literals

Tue Jun 12 08:57:51 PDT 2018

ebevhan added inline comments.

================
Comment at: include/clang/Basic/TargetInfo.h:89
+  // corresponding unsaturated types.
+  unsigned char ShortAccumFBits, ShortAccumIBits;
+  unsigned char AccumFBits, AccumIBits;
----------------
leonardchan wrote:
> ebevhan wrote:
> > I suspect it's still possible to calculate the ibits based on the fbits, even for _Accum.
> > 
> > Are the unsigned values needed? The fbits for unsigned _Fract are the same as for signed _Fract if SameFBits is set, and +1 otherwise. The same should go for unsigned _Accum, but I don't think it's entirely clear how this affects the integral part.
> Similar to the previous comment, if we choose to make SameFBits dependent on/controlled by the target, then I think it would be better for the target to explicitly specify the integral bits since there could be some cases where there may be padding (such as for unsigned _Accum types if this flag is set), and in this case, I think it should be explicit that the `bit_width != IBits + FBits`.
> 
> We also can't fill in that padding for the unsigned _Accum types as an extra integral bit since the standard says that `"each signed accum type has at least as many integral bits as its corresponding unsigned accum type".`
> 
> For the unsigned _Fract values, I think we can get rid of them if we choose to keep the flag instead, but it would no longer apply to unsigned _Accum types since it would allow for extra padding in these types and would conflict with the logic of `bit_width == IBits + FBits`
> 
> For now, I actually think the simplest option is to keep these target properties, but have the target override them individually, with checks to make sure the values adhere to the standard.
Is it not the case that `bitwidth != IBits + FBits` for signed types only? Signed types require a sign bit (which is neither a fractional bit nor an integral bit, according to spec) but unsigned types do not, and if IBits and FBits for the signed and unsigned types are the same, the MSB is simply a padding bit, at least for _Accum (for _Fract everything above the fbits is padding). 

My reasoning for keeping the number of configurable values down was to limit the number of twiddly knobs to make the implementation simpler. Granting a lot of freedom is nice, but I suspect that it will be quite hard to get the implementation working properly for every valid configuration. I also don't really see much of a reason for `FBits != UFBits` in general. I know the spec gives a recommendation to implement it that way, but I think the benefit of normalizing the signed and unsigned representations outweighs the lost bit in the unsigned type.

It's hard to say what the differences are beyond that since I'm not sure how you plan on treating padding bits. In our implementation, padding bits (the MSB in all of the unsigned types for us) after any operation are zeroed.

================
Comment at: lib/Basic/TargetInfo.cpp:797
+  // corresponding unsigned accum type.
+  assert(ShortAccumIBits == UnsignedShortAccumIBits ||
+         ShortAccumIBits == UnsignedShortAccumIBits - 1);
----------------
Shouldn't these be `ShortAccumIBits >= UnsignedShortAccumIBits` etc.?

================
Comment at: lib/Lex/LiteralSupport.cpp:737
+  if (!hadError && saw_fixed_point_suffix) {
+    assert(isFract || isAccum);
+    assert(radix == 16 || radix == 10);
----------------
Is `saw_fixed_point_suffix` only used for this assertion? Doesn't `isFract || isAccum` imply `saw_fixed_point_suffix`?

================
Comment at: lib/Lex/LiteralSupport.cpp:1049
+
+bool NumericLiteralParser::GetFixedPointValue(llvm::APInt &StoreVal, unsigned Scale) {
+  assert(radix == 16 || radix == 10);
----------------
The function does a lot of overflow checks that I think are superfluous if the required bits are calculated properly. The only overflow check that should be needed is the one at the very bottom that truncates the APInt to the final size.

================
Comment at: lib/Lex/LiteralSupport.cpp:1065
+
+    while (!IsExponentPart(*Ptr)) ++Ptr;
+    ++Ptr;
----------------
This whole block is very diligent but I wonder how common overflow in the exponent is.

I'm unsure if LLVM has a helper function with `atoll`-like functionality, but if it does it's a lot less code to just use that instead. There's also APInt's `fromString`, but that seems to assert if the integer doesn't fit in the bitwidth. The required bitwidth can be calculated here, though.

================
Comment at: lib/Lex/LiteralSupport.cpp:1085
+
+  if (radix == 10) {
+    // Number of bits needed for decimal literal is
----------------
These two calculations look very similar to me. The only difference seems to be that the exponent is multiplied by 4 in the decimal case, and not in the hexadecimal case.

================
Comment at: lib/Lex/LiteralSupport.cpp:1141
+    }
+    if ((radix == 16 && (*Ptr == 'p' || *Ptr == 'P')) ||
+        (radix == 10 && (*Ptr == 'e' || *Ptr == 'E'))) {
----------------
I don't think breaking is dependent on this, it's dependent on whether or not the input is a hex digit.

We've already parsed the exponent further up, so we should be able to have an `ExponentBegin` that we can use as the loop bound. 

(It's the same as `SuffixBegin` if there is no exponent)

================
Comment at: lib/Lex/LiteralSupport.cpp:1143
+        (radix == 10 && (*Ptr == 'e' || *Ptr == 'E'))) {
+      FoundExponent = true;
+      ++Ptr;
----------------
Does FoundExponent matter?

================
Comment at: lib/Lex/LiteralSupport.cpp:1175
+      // number of digits past the radix point.
+      --FractBaseShift;
+    }
----------------
This is technically Exponent.

================
Comment at: lib/Lex/LiteralSupport.cpp:1184
+  auto OldVal = Val;
+  Val <<= Scale;
+  IntOverflowOccurred |= Val.lshr(Scale) != OldVal;
----------------
If you want to verify overflow here, you can check that `Val.countLeadingZeroes() >= Scale`.

================
Comment at: lib/Lex/LiteralSupport.cpp:1187
+
+  uint64_t Base = (radix == 16) ? 2 : 10;
+  if (BaseShift > 0) {
----------------
I don't think loops are needed here. BaseShift is essentially Exponent so it should be possible to implement as a `pow(radix, abs(BaseShift))` and then either a mul or a div based on whether or not it was positive or negative.

================
Comment at: lib/Lex/LiteralSupport.cpp:1199
+
+  auto MaxVal = llvm::APInt::getMaxValue(StoreVal.getBitWidth());
+  if (Val.getBitWidth() > StoreVal.getBitWidth()) {
----------------
I think the final overflow check is the only one you need.

================
Comment at: lib/Sema/SemaExpr.cpp:1248
+  bool RHSFixed = RHSType->isFixedPointType();
+
+  if (LHSFixed && RHSFixed) {
----------------
leonardchan wrote:
> ebevhan wrote:
> > I don't see how these semantics work properly. The specification requires that operations be done in the full precision of both types. You cannot convert the types before performing the operation like this, since the operation will not be done in full precision in that case.
> > 
> > The operator semantics of Embedded-C require the operand types of binary operators to be different. It's only when you've performed the operation that you are allowed to convert the result to the resulting type.
> Initially the idea was to convert both sides to fixed point types, then perform standard binary operations between the fixed point types.
> 
> For the example, a `fract * int` would have the int converted to a fixed point type by left shifting it by the scale of the fract, multiplying, then right shifting by the scale again to get the resulting fract. The only unhandled thing is overflow, but the precision of the fract remains the same. The operands would also be casted up beforehand so there was enough space to store the result, which was casted down back to the original fract after performing the right shift by the scale.
> 
> Operations between fixed point types would follow a similar process of casting both operands to the higher rank fixed point type, and depending on the operation, more underlying shifting and casting would be done to retain full precision of the higher ranked type.
> 
> Though I will admit that I did not realize until now that multiplying a fixed point type by an integer does not require shifting the integer.
I see how you've reasoned; this is how C normally works. The `fract` is of higher rank than `int` and therefore is the 'common type' of the operation. However, even though it is higher rank there is no guarantee that you can perform the operation without overflowing. And overflow matters here; the spec says that it must be done in the full precision (integral + fractional) of both types.

> The only unhandled thing is overflow, but the precision of the fract remains the same. The operands would also be casted up beforehand so there was enough space to store the result, which was casted down back to the original fract after performing the right shift by the scale.

The precision remains the same (and while it doesn't have to be the same to perform an operation, it makes the implementation more regular; things like addition and subtraction 'just work'), but you cannot perform a conversion to `fract` *before* the operation itself, since if you do, there's nothing to 'cast up'. Casting up is needed for things like `fract * fract` to prevent overflow, but for `fract * int` you need to cast to a type that can fit both all values of the int and all values of the fract, and *then* you can cast up before doing the multiplication.

> Operations between fixed point types would follow a similar process of casting both operands to the higher rank fixed point type, and depending on the operation, more underlying shifting and casting would be done to retain full precision of the higher ranked type.

This might work, but I feel there could be edge cases. The E-C fixed-point ranks are very odd as they don't reflect reality; `short _Accum` cannot be considered strictly 'above' `long _Fract`, but the former has a higher rank than the latter. Depending on how the types are specified for a target, implicit casts between fixed-point types might inadvertantly discard bits, even though the spec says that operations must be done in full precision.

================
Comment at: lib/Sema/SemaExpr.cpp:1336
+  // Handle fixed point types
+  if (LHSType->isFixedPointType() || RHSType->isFixedPointType())
+    return handleFixedPointConversion(*this, LHS, RHS, LHSType, RHSType,
----------------
leonardchan wrote:
> ebevhan wrote:
> > I guess you haven't gotten there yet, but this should probably be before handleFloatConversion if you want to handle 'float + _fract' later.
> A later patch adds a case in `handleFloatConversion` where we check if one of the operands is an integer and performs a cast from there.
> one of the operands is an integer 

I assume you meant fixed-point here.

Repository:
  rC Clang

https://reviews.llvm.org/D46915