[PATCH] D109676: [HardwareLoops] put +1 for loop count before zero extension

Sam Parker via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 18 01:43:36 PDT 2021


samparker added inline comments.


================
Comment at: llvm/lib/Analysis/TargetTransformInfo.cpp:188
+  // as not overflow. Hareware count register will handle the count 0 case well.
+  // For example, on PowerPC, if the value in the Count Register is 0 before
+  // being decremented, it is -1 afterward.
----------------
shchenz wrote:
> samparker wrote:
> > This comment is making me uneasy... I'm don't think target specific stuff should be referenced in this generic layer. But either way, why would a count register ever have a value of -1, especially when we're expending quite some effort to test against zero in this transform?
> PowerPC don't have test and set form. On PowerPC, if the count register is 0 before being decremented, it will be wrapped to -1(all bits are 1) afterward, so we can handle the overflow case when `ExitCount` type is same with `CountType`. The total loop count is u64_MAX + 1
> 
> But on other targets like ARM, the hardware loops pass handles the overflow case for `ExitCount = SE.getAddExpr(ExitCount, SE.getOne(CountType));` when `ExitCount` type is same with `CountType` with test and set form? If this is the case, yes, the comments is not right, we need to update it for different targets.
> 
> But for all targets, making ` SE.getAddExpr(ExitCount, SE.getOne(CountType));` as not overflow when `SE.getTypeSizeInBits(ExitCount->getType()) == CountType->getBitWidth()` should be OK. Right?
Regardless of test.set support, all the intrinsics which set the counter use the trip count. So, if this is zero, as you're suggesting, the loop should never execute and so your counter should never be decremented to -1. What am I missing here? Could you point me at a test for an example?

The test.set forms, for ARM, are just so we can try to map on to 'while' loops and we can't presume anything in this method about test.set being successfully generated.

> But for all targets, making  SE.getAddExpr(ExitCount, SE.getOne(CountType)); as not overflow when SE.getTypeSizeInBits(ExitCount->getType()) == CountType->getBitWidth() should be OK. Right?
Yes, this sounds fine and I think this would be the only condition needed for the overflow check.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109676/new/

https://reviews.llvm.org/D109676



More information about the llvm-commits mailing list