[PATCH] D53738: [Fixed Point Arithmetic] Fixed Point Addition

Bevin Hansson via Phabricator via cfe-commits <cfe-commits at lists.llvm.org>
Wed Dec 5 07:34:52 PST 2018


ebevhan added inline comments.


================
Comment at: clang/lib/Sema/SemaExpr.cpp:1304
+    RHSTy = ResultTy;
+  }
+
----------------
rjmccall wrote:
> leonardchan wrote:
> > rjmccall wrote:
> > > leonardchan wrote:
> > > > rjmccall wrote:
> > > > > leonardchan wrote:
> > > > > > rjmccall wrote:
> > > > > > > ebevhan wrote:
> > > > > > > > rjmccall wrote:
> > > > > > > > > leonardchan wrote:
> > > > > > > > > > rjmccall wrote:
> > > > > > > > > > > ebevhan wrote:
> > > > > > > > > > > > rjmccall wrote:
> > > > > > > > > > > > > Hmm.  So adding a signed integer to an unsigned fixed-point type always converts the integer to unsigned?  That's a little weird, but only because the fixed-point rule seems to be the other way.  Anyway, I assume it's not a bug in the spec.
> > > > > > > > > > > > `handleFixedPointConversion` only calculates the result type of the expression, not the calculation type. The final result type of the operation will be the unsigned fixed-point type, but the calculation itself will be done in a signed type with enough precision to represent both the signed integer and the unsigned fixed-point type. 
> > > > > > > > > > > > 
> > > > > > > > > > > > Though, if the result ends up being negative, the final result is undefined unless the type is saturating. I don't think there is a test for the saturating case (signed integer + unsigned saturating fixed-point) in the SaturatedAddition tests.
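> > > > > > > > > > > > 
> > > > > > > > > > > > As a rough sketch of that calculation in plain C (assuming a 16-bit u.16 unsigned fixed-point layout; the helper is purely illustrative, not what Clang emits):
> > > > > > > > > > > > 
> > > > > > > > > > > > ```
> > > > > > > > > > > > #include <stdint.h>
> > > > > > > > > > > > 
> > > > > > > > > > > > /* signed 16-bit int + unsigned u.16 fixed point, computed in a
> > > > > > > > > > > >    signed type wide enough to represent both operands exactly. */
> > > > > > > > > > > > uint16_t add_int_ufix(int16_t i, uint16_t u) {
> > > > > > > > > > > >   int64_t wide = ((int64_t)i << 16) + u; /* both at scale 16 */
> > > > > > > > > > > >   /* A negative (or too large) sum is undefined for the
> > > > > > > > > > > >      non-saturating unsigned result type; a saturating type
> > > > > > > > > > > >      would clamp instead. */
> > > > > > > > > > > >   return (uint16_t)wide;
> > > > > > > > > > > > }
> > > > > > > > > > > > ```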
> > > > > > > > > > > > `handleFixedPointConversion` only calculates the result type of the expression, not the calculation type.
> > > > > > > > > > > 
> > > > > > > > > > > Right, I understand that, but the result type of the expression obviously impacts the expressive range of the result, since you can end up with a negative value.
> > > > > > > > > > > 
> > > > > > > > > > > Hmm.  With that said, if the general principle is to perform the operation with full precision on the original operand values, why are unsigned fixed-point operands coerced to their corresponding signed types *before* the operation?  This is precision-limiting if the unsigned representation doesn't use a padding bit.  That seems like a bug in the spec.
> > > > > > > > > > > Hmm. With that said, if the general principle is to perform the operation with full precision on the original operand values, why are unsigned fixed-point operands coerced to their corresponding signed types *before* the operation? This is precision-limiting if the unsigned representation doesn't use a padding bit. That seems like a bug in the spec.
> > > > > > > > > > 
> > > > > > > > > > Possibly. When the standard mentions converting from unsigned to signed fixed point, the only requirement involved is conservation of magnitude (the number of integral bits, excluding the sign):
> > > > > > > > > > 
> > > > > > > > > > ```
> > > > > > > > > > when signed and unsigned fixed-point types are mixed, the unsigned type is converted to the corresponding signed type, and this should go without loss of magnitude
> > > > > > > > > > ```
> > > > > > > > > > 
> > > > > > > > > > This is just speculation, but I'm under the impression that not as much "attention" was given to unsigned types as to signed types, since "`In the embedded processor world, support for unsigned fixed-point data types is rare; normally only signed fixed-point data types are supported`", but they were kept anyway since unsigned types are used a lot.
> > > > > > > > > > 
> > > > > > > > > > ```
> > > > > > > > > > However, to disallow unsigned fixed-point arithmetic from programming languages (in general, and from C in particular) based on this observation, seems overly restrictive.
> > > > > > > > > > ```
> > > > > > > > > > 
> > > > > > > > > > I also imagine that if the programmer needs more precision, the correct approach would be to cast up to a type with a higher scale. The standard seems to make an effort to expose as much about the underlying fixed-point types as possible:
> > > > > > > > > > 
> > > > > > > > > > ```
> > > > > > > > > > it should be possible to write fixed-point algorithms that are independent of the actual fixed-point hardware support. This implies that a programmer (or a running program) should have access to all parameters that define the behavior of the underlying hardware (in other words: even if these parameters are implementation-defined).
> > > > > > > > > > ```
> > > > > > > > > > 
> > > > > > > > > > So the user would at least know that unsigned types may not have the same scale as their corresponding signed types if the hardware handles them with different scales.
> > > > > > > > > > 
> > > > > > > > > > Also added test for signed integer + unsigned saturating fixed point.
> > > > > > > > > As long as we maintain the same typing behavior, does the standard permit us to just Do The Right Thing here and do the extended arithmetic with the original unsigned operand?  I'm sure there are some cases where we would produce a slightly different value than an implementation that does the coercion before the operation, but that might be permitted under the standard, and as you say, it would only affect some situations that it doesn't seem the standard has given much thought to.
> > > > > > > > The coercion of unsigned to signed is likely done for pragmatic reasons; if you have a signed and unsigned type of the same rank, operating on them together won't require any 'extra bits'. This means that if your hardware has native support for the operations, you won't end up in a situation where you really need 33 or 34 bits (for a 32 bit value) to actually perform the operation. This is one of the benefits of these kinds of implicit conversions.
> > > > > > > > 
> > > > > > > > It probably makes the implementation a bit easier as well, since you don't have to consider cases where both sides have different signedness. The loss of the extra bit of precision from the unsigned operand is probably not considered to be very important. As Leonard said, if you wanted higher precision you can simply use the next rank of types. ...Of course, they seem to have forgotten that that exact issue with differing signedness also applies to operations with fixed-point and integer operands.
> > > > > > > > 
> > > > > > > > Personally, I think the spec could be a bit more restrictive with how it handles these conversions, as the way they are now makes it really hard to generate code sensibly for machines that have native support for these kinds of operations; in many cases you'll end up requiring a couple bits too many or a couple bits too few to fit in a sensible register size, and ensuring that the scales match what your hardware can handle is also a challenge.
> > > > > > > > The coercion of unsigned to signed is likely done for pragmatic reasons; if you have a signed and unsigned type of the same rank, operating on them together won't require any 'extra bits'. This means that if your hardware has native support for the operations, you won't end up in a situation where you really need 33 or 34 bits (for a 32 bit value) to actually perform the operation. This is one of the benefits of these kinds of implicit conversions.
> > > > > > > 
> > > > > > > Yeah, I can see that.  I think it really depends on the target.  If you have hardware support, and you're using an unpadded representation for unsigned fixed-point values, then preserving the extra bit on mixed-signedness operations will sometimes prevent you from using the hardware optimally.  On the other hand, if you don't have hardware support, preserving the extra bit is usually easier than not: e.g., IIUC, multiplying a 32-bit signed _Fract with a 32-bit unpadded unsigned _Fract means just doing a 64-bit multiplication and dropping the low 32 bits, whereas doing the conversion first actually requires extra operations to ensure that the low bit doesn't contribute to the result.  And a target with hardware support is probably going to use a padded format for unsigned fixed-point precisely so that it can take advantage of the hardware.
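> > > > > > > 
> > > > > > > As a sketch of that full-precision multiply in plain C (s.31 and unpadded u.32 layouts assumed; just the idea, not Clang's actual lowering):
> > > > > > > 
> > > > > > > ```
> > > > > > > #include <stdint.h>
> > > > > > > 
> > > > > > > /* s.31 signed _Fract times unpadded u.32 unsigned _Fract: the raw
> > > > > > >    64-bit product has 31 + 32 = 63 fraction bits, so dropping the
> > > > > > >    low 32 bits leaves an s.31 result. */
> > > > > > > int32_t mul_sfract_ufract(int32_t s, uint32_t u) {
> > > > > > >   int64_t prod = (int64_t)s * (int64_t)u;
> > > > > > >   return (int32_t)(prod >> 32);
> > > > > > > }
> > > > > > > ```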
> > > > > > > 
> > > > > > > > ...Of course, they seem to have forgotten that that exact issue with differing signedness also applies to operations with fixed-point and integer operands.
> > > > > > > 
> > > > > > > Right.
> > > > > > For signed-unsigned multiplication, we could do that without having to do the conversion first, but in the case of addition and subtraction, we would still need to drop a bit with unpadded unsigned types so both have the same scales.
> > > > > > 
> > > > > > I don't want to make too many assumptions about hardware support, but I imagine that if one supports 32-bit signed addition and unsigned types did not have padding, it would be better to LSHR the unsigned fixed-point operand and then do a 32-bit signed addition, rather than SEXT+SHL the signed fixed-point operand and then do a 33-bit signed addition.
> > > > > > [I]n the case of addition and subtraction, we would still need to drop a bit with unpadded unsigned types so both have the same scales.
> > > > > 
> > > > > Right, but I don't think it matters because you're dropping that bit either way: with the specified semantics, you do the conversion beforehand which drops the bit, and with full-precision semantics, the bit isn't going to be representable and dropping it during the computation is equivalent to either rounding down (addition) or up (subtraction), either of which is allowed.
> > > > > 
> > > > > > I don't want to make too many assumptions about hardware support, but I imagine that if one supports 32-bit signed addition and unsigned types did not have padding, it would be better to LSHR the unsigned fixed-point operand and then do a 32-bit signed addition, rather than SEXT+SHL the signed fixed-point operand and then do a 33-bit signed addition.
> > > > > 
> > > > > My point is just that (1) the 33-bit signed addition can just be a 32-bit signed addition given that you ultimately have to produce a 32-bit result and (2) targets with hardware support are unlikely to use an unpadded unsigned type anyway.  Although I suppose (2) is a bit questionable because hardware support could be added well after an ABI has been settled on.
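> > > > > 
> > > > > Concretely, something like this sketch (s.31 plus unpadded u.32 with an s.31 result; layouts assumed):
> > > > > 
> > > > > ```
> > > > > #include <stdint.h>
> > > > > 
> > > > > /* The full-precision sum at scale 32 would need 33 bits, but the
> > > > >    s.31 result cannot hold the extra fraction bit anyway, so
> > > > >    dropping it up front (rounding down) and adding in 32 bits is
> > > > >    equivalent. */
> > > > > int32_t add_sfract_ufract(int32_t s, uint32_t u) {
> > > > >   /* unsigned add for well-defined wraparound; overflowing the
> > > > >      result is the usual non-saturating undefined case either way */
> > > > >   return (int32_t)((uint32_t)s + (u >> 1));
> > > > > }
> > > > > ```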
> > > > Ok. So should anything be changed here, then, if both semantics work? Currently we still do the 32-bit addition for signed and unsigned types.
> > > Well, most of what we're doing here is trying to lay good groundwork for all operations, and "both semantics work" is only true for same-rank addition, which is why I'm asking whether we should consider what semantics we actually want in general.
> > I think the semantics for all binary arithmetic operators (+, -, *, /) can be simplified to:
> > 
> > 1. If one operand has a signed fixed-point type and the other an unsigned one, convert the unsigned operand to the corresponding signed type (which may involve reducing the scale).
> > 2. Find a common type that can hold both the larger scale and the larger magnitude of the two operands, and convert both operands to it.
> > 3. Perform the binary operation.
> > 4. Convert to the result type.
> > 
> > I think the semantics for addition are already settled, and I believe subtraction shares the same semantics. As you said before, the extra bit would not even matter with full precision since it would be trimmed anyway when converting from the common type to the result, and rounding can be either up or down.
> > 
> > With these semantics, signed integer addition/subtraction with an unsigned fixed-point type only leads to undefined behavior if the computed result is negative, since a negative value cannot be represented by an unsigned fixed-point type (unless the result type is saturating).
> > 
> > For multiplication and division, the result could technically also be calculated without converting either side, although to use the fixed-point mul/div intrinsics, both operands would need to be cast to a common type with the same scale anyway. If a bit does end up getting dropped when casting an operand from unsigned to signed in (1), that dropped bit would not be representable when converting to the result anyway, and the standard still allows us to return a rounded result.
> > 
> > When multiplying or dividing a fixed-point type and an integer, though, we could just perform a regular integer mul/div if the result is non-saturating, without performing the steps above. For saturating mul/div with an int, it might be easier to cast both sides to a common type and then call the saturating fixed-point mul/div intrinsics. In either case, I do not think rounding will be an issue here, since we can still calculate the result precisely and the result will be either rounded up or down.
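> > 
> > For the non-saturating case, that could be as simple as this sketch (assuming a 32-bit s16.15 _Accum layout):
> > 
> > ```
> > #include <stdint.h>
> > 
> > /* Multiplying an s16.15 _Accum by an int is just an integer
> >    multiply; the scale is unchanged, so no shifting is needed. */
> > int32_t mul_accum_int(int32_t a, int32_t i) {
> >   return (int32_t)((int64_t)a * i); /* still at scale 15 */
> > }
> > ```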
> > 
> > These semantics do not apply to comparison/relational operators; to compare operands of different types or ranks, we would just need to find the common type of the two and perform the comparison there.
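> > 
> > For reference, here is a plain-C sketch of steps (2)-(4) above for a mixed-rank addition, signed _Accum plus signed _Fract (32-bit s16.15 and s.31 layouts assumed):
> > 
> > ```
> > #include <stdint.h>
> > 
> > /* The common type must hold scale 31 plus 16 integral bits,
> >    i.e. s16.31, which fits comfortably in an int64_t. */
> > int32_t add_accum_fract(int32_t accum_s15, int32_t fract_s31) {
> >   int64_t a = (int64_t)accum_s15 << 16; /* s16.15 -> s16.31 */
> >   int64_t b = fract_s31;                /* already at scale 31 */
> >   int64_t sum = a + b;                  /* step 3: exact sum */
> >   return (int32_t)(sum >> 16);          /* step 4: back to s16.15 */
> > }
> > ```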
> > As you said before, the extra bit would not even matter with full precision since it would be trimmed anyway when converting from the common type to the result, and rounding can be either up or down.
> 
> Note that this is only true if the unsigned value has a scale that's greater than or equal to the scale of the result type, which is true for same-rank addition but not necessarily with mixed-rank.
> 
> > For multiplication and division, the result could technically also be calculated without converting either side, although to use the fixed-point mul/div intrinsics, both operands would need to be cast to a common type with the same scale anyway. If a bit does end up getting dropped when casting an operand from unsigned to signed in (1), that dropped bit would not be representable when converting to the result anyway, and the standard still allows us to return a rounded result.
> 
> In a `_Fract` multiplication (or a mixed-rank multiplication), the extra bit of precision can affect higher bits in the result.  Consider an 8-bit format where you're multiplying unsigned `000.00001` with signed `+010.0000`: if you do this multiplication in full precision, the result is `+000.0001`, but if you do the conversion first, the result is `0`.
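> 
> Checking that numerically (u3.5 and s3.4 layouts assumed, with raw integers standing in for the fixed-point values):
> 
> ```
> #include <stdint.h>
> #include <stdio.h>
> 
> int main(void) {
>   uint8_t u = 0x01; /* u3.5: 000.00001 = 1/32 */
>   int8_t  s = 0x20; /* s3.4: +010.0000 = 2.0  */
> 
>   /* Full precision: the product has 5 + 4 = 9 fraction bits;
>      rescale to the s3.4 result by dropping 5 of them. */
>   int full = (u * s) >> 5;     /* = 1, i.e. +000.0001 */
> 
>   /* Convert-first: u3.5 -> s3.4 drops the only set bit. */
>   int conv  = u >> 1;          /* = 0 */
>   int first = (conv * s) >> 4; /* = 0 */
> 
>   printf("full=%d convert-first=%d\n", full, first);
>   return 0;
> }
> ```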
> 
> On some level, it just seems really strange to me that the spec calls for doing arbitrary-precision arithmetic and then converting to the result type for every situation except a signed/unsigned mismatch.
Wait, wait... You might be on to something.

The spec says in 4.1.4:
> In order to get useful and attainable results, the usual arithmetic conversions do not apply to the combination of an integer type and a fixed-point type, or the combination of two fixed-point types.
> In these cases:
> 1. the result of the operation is calculated using the values of the two operands, with their full precision;
> 2. if one operand has signed fixed-point type and the other operand has unsigned fixed-point type, then the unsigned fixed-point operand is converted to its corresponding signed fixed-point type and the resulting type is the type of the converted operand;
> 3. the result type is the type with the highest rank, [...]

(1) comes before (2), not after. In other words, shouldn't the true result of the operation be computed **before** the unsigned-signed correction? Point 2 is just worded really oddly, with 'operand is converted' and 'resulting type'.

It says that "the unsigned fixed-point operand is converted to its corresponding signed fixed-point type and the resulting type is the type of the converted operand", but this doesn't mean that the resulting converted operand is actually used, only its type is. This means that for the purpose of result type calculation in (3), the converted operand type from (2) will be used as that operand's type, rather than the unsigned type itself. Or?

The wording here is very hard to interpret.
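
To make the difference concrete, here is a plain-C sketch of the two readings for same-rank s.31 + unpadded u.32 (layouts assumed):

```
#include <stdint.h>

/* Reading A: (2) converts the operand's *value* first. */
int32_t add_convert_first(int32_t s, uint32_t u) {
  return (int32_t)((uint32_t)s + (u >> 1)); /* u.32 -> s.31 drops a bit */
}

/* Reading B: (2) only fixes the *type*; (1) already computed the
   sum from the original values at full precision (scale 32). */
int32_t add_full_precision(int32_t s, uint32_t u) {
  int64_t sum = ((int64_t)s << 1) + u; /* both at scale 32 */
  return (int32_t)(sum >> 1);          /* convert to the s.31 result */
}
```

For addition the two readings agree (the dropped bit rounds down either way), but for multiplication they can differ, as in the 8-bit example above, so the interpretation genuinely matters.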


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D53738/new/

https://reviews.llvm.org/D53738




