[PATCH] D48661: [Fixed Point Arithmetic] Fixed Point Constant

Leonard Chan via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Jul 2 16:14:11 PDT 2018


leonardchan added inline comments.


================
Comment at: include/clang/Basic/FixedPoint.h:31
+  SatNoPadding,
+};
+
----------------
ebevhan wrote:
> rjmccall wrote:
> > I figured you'd want this to be a struct which include the scale, width, signed-ness, and saturating-ness; and then `APFixedPoint` can just store one of these next to a bare `APInt`.
> That's what I was expecting too, similar to the APFloat version.
> 
> What width would it contain? The width with or without padding? Probably the one with padding as the APInt itself has the width without padding.
Woops my bad, I thought you were referring to the `APFloatSemantics`. I actually didn't know about `fltSemantics` until now.


================
Comment at: lib/Basic/FixedPoint.cpp:20
+
+llvm::APSInt ShrToZero(const llvm::APSInt &Val, unsigned Amt) {
+  if (Val < 0)
----------------
ebevhan wrote:
> I'm unsure if this operation is useful for anything but converting fixed-point to integer. Make this a member of APFixedPoint, drop the arguments and give it a name that makes it clearer that it is a round-to-zero integer conversion? `toIntegerRoundToZero`?
You're right. Removed it altogether.


================
Comment at: lib/Basic/FixedPoint.cpp:53
+    // We can overflow here
+    unsigned ShiftAmt = DstScale - Scale;
+    if (Val < 0 && Val.countLeadingOnes() >= ShiftAmt)
----------------
ebevhan wrote:
> ebevhan wrote:
> > I think saturation can be modeled a bit better. It should be possible to do the overflow check/saturation once instead of twice, and also have it fit in better with the other conversions.
> > 
> > Saturation on a value is essentially checking whether all bits above and including a certain bit are all the same, and 'clamping' to either the minimum (the mask) or maximum (inverse of the mask) if not.
> > ```
> > // Saturate Value to SatWidth. This will clamp Value at the signed min/max of a value that is SatWidth long.
> > Saturate(SatWidth):
> >   Mask = highBits(Width, SatWidth + 1)
> >   Masked = Value & Mask
> >   Result = Value
> >   if (Masked == Mask || Masked == 0) {
> >     if (Masked.isNegative())
> >       Result = Mask
> >     else
> >       Result = ~Mask
> >   }
> > ```
> > This cannot be done on the value in the source scale, since upscaling after clamping would produce an incorrect value - we would shift in 0's from the right even though we should have saturated all bits completely. Also, we cannot upscale without first extending or we might shift out bits on the left that could affect saturation.
> > 
> > One thing to note is that (I'm ***pretty sure*** this is the case) fractional bits cannot affect saturation. This means that we can find a common scale, then perform the saturation operation, and then resize, and the value should just fall into place. Basically:
> >   # Upscale if necessary, but extend first to ensure that we don't drop any bits on the left. Do this by resizing to `SrcWidth - SrcScale + std::max(SrcScale, DstScale)` and then upscaling normally by `DstScale - SrcScale`.
> >   # Downscale if necessary by `SrcScale - DstScale`. (I think; in our downstream, we saturate first and then downscale, but we also assume that saturation can only occur on _Fract, and doing saturation first makes the saturation width calculation a bit messier because it will be a `max` expression. I'm unsure if the order can be changed.)
> >   # At this point, the scale of the value should be `DstScale`. Saturate with `Saturate(DstWidth)`.
> >   # Now the value should be in range for the new width, and will be at the right scale as well. Resize with `extOrTrunc(DstWidth)`.
> >   # (Also; if the result was negative and the dest type is unsigned, just make the result zero here instead.)
> > 
> > Note that the order of operations is different than what I've mentioned before with non-saturated conversion (downscale if needed, resize, upscale if needed), but it works because we only do the upscale as a resize+upscale. Somewhere in here you also need to change the signedness of the value, but I'm not entirely certain where. Likely after 4.
> > 
> > Everything I've mentioned here is semi-conjectural, since our implementation has different type widths than the defaults in this one, we can only saturate on _Fract, and our unsigned types have padding. It's possible that there are some assumptions that cause this method to fail for unsigned without padding, or perhaps for some other type conversion, but I haven't come up with a counterexample yet.
> Now that I think about it a bit more, it's clear that the Saturate routine does not work for unsigned fixed-point without padding. That would need to be taken into consideration, but the rest should work.
Thanks for the deeper explanation. The implementation is essentially the same and catching the unsigned fixed point without padding is just checking if the value is negative and we are casting to an unsigned.


================
Comment at: lib/Basic/FixedPoint.cpp:53
+    // We can overflow here
+    unsigned ShiftAmt = DstScale - Scale;
+    if (Val < 0 && Val.countLeadingOnes() >= ShiftAmt)
----------------
leonardchan wrote:
> ebevhan wrote:
> > ebevhan wrote:
> > > I think saturation can be modeled a bit better. It should be possible to do the overflow check/saturation once instead of twice, and also have it fit in better with the other conversions.
> > > 
> > > Saturation on a value is essentially checking whether all bits above and including a certain bit are all the same, and 'clamping' to either the minimum (the mask) or maximum (inverse of the mask) if not.
> > > ```
> > > // Saturate Value to SatWidth. This will clamp Value at the signed min/max of a value that is SatWidth long.
> > > Saturate(SatWidth):
> > >   Mask = highBits(Width, SatWidth + 1)
> > >   Masked = Value & Mask
> > >   Result = Value
> > >   if (Masked == Mask || Masked == 0) {
> > >     if (Masked.isNegative())
> > >       Result = Mask
> > >     else
> > >       Result = ~Mask
> > >   }
> > > ```
> > > This cannot be done on the value in the source scale, since upscaling after clamping would produce an incorrect value - we would shift in 0's from the right even though we should have saturated all bits completely. Also, we cannot upscale without first extending or we might shift out bits on the left that could affect saturation.
> > > 
> > > One thing to note is that (I'm ***pretty sure*** this is the case) fractional bits cannot affect saturation. This means that we can find a common scale, then perform the saturation operation, and then resize, and the value should just fall into place. Basically:
> > >   # Upscale if necessary, but extend first to ensure that we don't drop any bits on the left. Do this by resizing to `SrcWidth - SrcScale + std::max(SrcScale, DstScale)` and then upscaling normally by `DstScale - SrcScale`.
> > >   # Downscale if necessary by `SrcScale - DstScale`. (I think; in our downstream, we saturate first and then downscale, but we also assume that saturation can only occur on _Fract, and doing saturation first makes the saturation width calculation a bit messier because it will be a `max` expression. I'm unsure if the order can be changed.)
> > >   # At this point, the scale of the value should be `DstScale`. Saturate with `Saturate(DstWidth)`.
> > >   # Now the value should be in range for the new width, and will be at the right scale as well. Resize with `extOrTrunc(DstWidth)`.
> > >   # (Also; if the result was negative and the dest type is unsigned, just make the result zero here instead.)
> > > 
> > > Note that the order of operations is different than what I've mentioned before with non-saturated conversion (downscale if needed, resize, upscale if needed), but it works because we only do the upscale as a resize+upscale. Somewhere in here you also need to change the signedness of the value, but I'm not entirely certain where. Likely after 4.
> > > 
> > > Everything I've mentioned here is semi-conjectural, since our implementation has different type widths than the defaults in this one, we can only saturate on _Fract, and our unsigned types have padding. It's possible that there are some assumptions that cause this method to fail for unsigned without padding, or perhaps for some other type conversion, but I haven't come up with a counterexample yet.
> > Now that I think about it a bit more, it's clear that the Saturate routine does not work for unsigned fixed-point without padding. That would need to be taken into consideration, but the rest should work.
> Thanks for the deeper explanation. The implementation is essentially the same and catching the unsigned fixed point without padding is just checking if the value is negative and we are casting to an unsigned.
Also I wasn't sure if this was a typo, but I have

```
if (!(Masked == Mask || Masked == 0)) {
```

instead of 

```
if (Masked == Mask || Masked == 0) {
```

because I think the mask comparison checks if the bits above the sign changed which indicates saturation.


================
Comment at: lib/Basic/FixedPoint.cpp:61
+  } else if (DstScale < Scale) {
+    Val = ShrToZero(Val, Scale - DstScale);
+  }
----------------
ebevhan wrote:
> This is a bit tricky. The spec does indeed say that conversion to integer must round toward zero, but does not say that conversion between fixed-point types must. If we have these routines perform conversion like this, we would likely have to emit extra code (either an addition or the two negations) during fixed-to-fixed conversion to match the consteval behavior.
> 
> I think fixed-to-fixed conversion should keep using a normal shift, and reserve the round-to-zero shift for conversion to integer.
You're right. The spec also says that shifting falls under the typical rounding behavior which is implementation-defined.


================
Comment at: lib/Basic/FixedPoint.cpp:89
+  if (DstIsSaturated && Overflowed) {
+    if (IsNegative) {
+      // Use min
----------------
ebevhan wrote:
> You can use getMax and getMin here to simplify the logic, I think?
> 
> As I mention in my other comment I think that the saturation logic can be more strongly integrated into the conversion.
Removed `saturatedRescaleAndResize` and `rescaleAndResize` altogether since the logic in both can be performed in `convert`.


Repository:
  rC Clang

https://reviews.llvm.org/D48661





More information about the cfe-commits mailing list