[PATCH] D42309: [LV] Use Demanded Bits and ValueTracking for reduction type-shrinking

Mon Jan 22 21:07:36 PST 2018

dcaballe added a comment.

Hi Matthew,

thanks for taking care of this! I like the general idea of the fix but I have a concern regarding the TODO in line 408:

  // TODO: We should not rely on InstCombine to rewrite the reduction in the
  //       smaller type. We should just generate a correctly typed expression
  //       to begin with.

The cost model is relying on an optimization that will hopefully be applied by InstCombine. I wonder if it would be too complicated to implement the actual type shrinking in LV code gen as part of the fix.
That would better align cost modeling with LV code generation, which is one the major concerns of the current infrastructure.

In the future VPlan infrastructure, we will definitely need to address this optimization as a VPlan-to-VPlan transformation. Thanks for bringing up this issue.
Diego

================
Comment at: lib/Transforms/Utils/LoopUtils.cpp:132

-      // Terminate the traversal if the operand is not an instruction, or we
-      // reach the starting value.
-      Instruction *J = dyn_cast<Instruction>(U.get());
-      if (!J || J == Start)
-        continue;
+  if (MaxBitWidth == DL.getTypeSizeInBits(Exit->getType()) && AC && DT) {
+    // If demanded bits wasn't able to limit the bit width, we can try to use
----------------
(I don't know DB/ValueTracking in depth so maybe I'm missing something.)
If I understand correctly, we would try with value tracking in the following scenario:

```
117    MaxBitWidth = 64
...
127    MaxBitWidth = 33
...
129    MaxBitWidth = 64
...
132    First condition is true
}
```

But we wouldn't in the following one:

```
117    MaxBitWidth = 64
...
127    MaxBitWidth = 31
...
129    MaxBitWidth = 32
...
132    First condition is false
}
```

Is this the expected behavior? In other words, if DB returns a width narrower than the original one and it's later rounded up to the original width (first scenario), could value tracking return an even narrower width?

If so, shouldn't we always try with value tracking, even when MaxBitWidth != DL.getTypeSizeInBits?
Otherwise, shouldn't we skip value tracking for those cases (first scenario)? We could invoke isPowerOf2_64 only once at the end of the function.

================
Comment at: lib/Transforms/Utils/LoopUtils.cpp:389
+    // looked through an 'and' instruction when evaluating a potential
+    // arithmetic reduction to determine if it may been type-promoted.
+    //
----------------
may been -> may have been?

Repository:
  rL LLVM

https://reviews.llvm.org/D42309