[PATCH] D150851: [LoopVectorize] Vectorize select-cmp reduction pattern for increasing integer induction variable

Mon Jun 12 06:23:45 PDT 2023

Mel-Chen added inline comments.

================
Comment at: llvm/include/llvm/Analysis/IVDescriptors.h:57
+                ///< loop induction PHI
+  SelectIVFCmp, ///< Integer select(fcmp(),x,y) where one of (x,y) is increasing
+                ///< loop induction PHI
----------------
fhahn wrote:
> To keep the diff more compact, could you split the FP handling off? It also looks like codegen is at least not tested for the FP case?
I am afraid there is misunderstanding. 
The test functions starting with "@select_fcmp" are testing the SelectIVFCmp reduction pattern. SelectIVFCmp is similar in semantics to SelectFCmp, where the operand types of the select instruction are integer, and the cmp instruction is fcmp.

================
Comment at: llvm/lib/Analysis/IVDescriptors.cpp:674
+    const SCEVAddRecExpr *AR = cast<SCEVAddRecExpr>(SE->getSCEV(Phi));
+    if (!AR->hasNoSignedWrap())
+      return false;
----------------
fhahn wrote:
> Do we need to distinguish here between no signed/no unsigned wrap and then chose `smax/umax` during codegen?
That's a good point. Implementing the patch for both signed and unsigned induction variables can be challenging in practice. 
The current patch only focuses on the signed IV because in most applications, we often encounter induction variables in the form of IV {0, +, step}. When the IV is signed, we can use -1 or the minimum value of the signed data type as a sentinel value. However, when the IV is unsigned, we don't have a value smaller than 0 to use. This doesn't mean that unsigned IV cannot be vectorized, but rather they require additional handling and a more refined approach.

Of course, if an unsigned IV is {1, +, step}, we can directly use the method implemented in this patch. However, such cases are less common, so we decided to focus on handling signed IV first.

================
Comment at: llvm/lib/Transforms/Utils/LoopUtils.cpp:1088
+  case RecurKind::SelectIVFCmp:
+    // TODO: Decreasing induction need fix here
+    return Builder.CreateIntMaxReduce(Src, true);
----------------
fhahn wrote:
> What does this mean? The current patch only handles increasing inductions, so there's no mis-compile that needs fixing here( which the comment kind-of implies)?
My bad. This maybe the note I leave when I implemented this patch. Will Remove it.

================
Comment at: llvm/test/Transforms/LoopVectorize/iv-select-cmp-no-wrap.ll:1
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 2
+; RUN: opt -passes=loop-vectorize -force-vector-interleave=1 -force-vector-width=4 -S < %s | FileCheck %s --check-prefix=CHECK
----------------
fhahn wrote:
> Could you submit the tests separately?
Sure. Will do.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150851/new/

https://reviews.llvm.org/D150851