[llvm] [LV] Don't predicate uniform divides with loop-invariant divisor. (PR #98904)

Wed Jul 17 14:20:27 PDT 2024

================
@@ -3710,6 +3713,15 @@ bool LoopVectorizationCostModel::isPredicatedInst(Instruction *I) const {
   case Instruction::SDiv:
   case Instruction::SRem:
   case Instruction::URem:
+    // When folding the tail, at least one of the lanes must execute
+    // unconditionally. If the divisor is loop-invariant no predication is
+    // needed, as predication would not prevent the divide-by-0 on the executed
+    // lane.
+    if (foldTailByMasking() && !Legal->blockNeedsPredication(I->getParent()) &&
+        TheLoop->isLoopInvariant(I->getOperand(1)) &&
+        (IsKnownUniform || isUniformAfterVectorization(I, VF)))
+      return false;
----------------
ayalz wrote:

This is similar to unpredicating loads and stores above, and should have consistent explanations and implementations. Checking if !Legal->blockNeedsPredication(I->getParent()) implies foldTailByMasking() given than we passed blockNeedsPredicationForAnyReason(I->getParent()) to get here. Why are IsKnownUniform and isUniformAfterVectorization needed - would Legal->isInvariant(I->getOperand(1)) suffice, as in the load-from-uniform-address case above?

A side-effecting instruction under mask could be unmasked provided (a) its side-effect operand(s) is uniform (divisor for div/rem, address for load, address and value for store), and (b) its mask is known to have its first lane set (conceivably extensible to having any lane set?). This is the case for instructions subject only to fold-tail mask. Would be good to have an explicit API, say, `MaskKnownToExcludeFirstLane()`.

Would it suffice to have
```
    if (Legal->isInvariant(I->getOperand(1)) &&
        !Legal->blockNeedsPredication(I->getParent()))
      return false;
```
as in the load-from-uniform-address case above?

https://github.com/llvm/llvm-project/pull/98904