[PATCH] D109676: [HardwareLoops] put +1 for loop count before zero extension

Philip Reames via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 13 10:03:05 PDT 2021


reames added inline comments.


================
Comment at: llvm/lib/Analysis/TargetTransformInfo.cpp:170
     ExitBlock = BB;
-    ExitCount = EC;
+    TripCount = EC;
     break;
----------------
You're using a variable called TripCount to store an ExitCount (e.g. BTC) before adding the appropriate offset.  I strongly suggest using two variables.  


================
Comment at: llvm/lib/Analysis/TargetTransformInfo.cpp:178
+  bool NeedExtend =
+      !TripCount->getType()->isPointerTy() && TripCount->getType() != CountType;
+  // If we are going to extend the TripCount and we can prove that TripCount + 1
----------------
There appears to be an unasserted assumption that CountType is of a wider type than the exitcount's type.  The code structure does not make it clear this is true, at a minimum, please assert.  


================
Comment at: llvm/lib/Analysis/TargetTransformInfo.cpp:178
+  bool NeedExtend =
+      !TripCount->getType()->isPointerTy() && TripCount->getType() != CountType;
+  // If we are going to extend the TripCount and we can prove that TripCount + 1
----------------
reames wrote:
> There appears to be an unasserted assumption that CountType is of a wider type than the exitcount's type.  The code structure does not make it clear this is true, at a minimum, please assert.  
I'd suggest turning the pointer type check into an early return false unless you have a motivating case for the complexity.  Pointer typed trip counts should be rare.  


================
Comment at: llvm/lib/Analysis/TargetTransformInfo.cpp:185
+          L, ICmpInst::ICMP_NE,
+          SE.getAddExpr(TripCount, SE.getOne(TripCount->getType())),
+          SE.getZero(TripCount->getType())))
----------------
Please use a comparison of the exit count against maximum unsigned value instead.  The optimizer will probably turn it into the same, but the form you've written has potential interaction with SCEVs flag inference which is best avoided.  


================
Comment at: llvm/lib/Analysis/TargetTransformInfo.cpp:193
+  else
+    TripCount = SE.getAddExpr(TripCount, SE.getOne(TripCount->getType()));
+
----------------
If you use getNoopOrZeroExtend, you don't need the explicit else case.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109676/new/

https://reviews.llvm.org/D109676



More information about the llvm-commits mailing list