[PATCH] D22867: [LV] Untangle the concepts of uniform and scalar

Fri Jul 29 12:08:09 PDT 2016

mssimpso added a comment.

From Wei:

> What I mean is that only vector register pressure is considered in selectVectorizationFactor. Scalar register pressure is considered only in selectInterleaveCount, and only when VF == 1. Since selectVectorizationFactor does not consider scalar register pressure, we don't want to count a scalarized IV as a live range usage.

All right, I think I can agree with that. I think the confusing part here is that the "meaning" of calculateRegisterUsage seems to be different depending on who calls it.

> Yes, calculateRegisterUsage is only intended to track the pressure of vector registers in selectVectorizationFactor. The reason is explained above. Yes, I think it is better to add values to VecValuesToIgnore if isScalarAfterVectorization is true.

Okay. I think we're saying that calculateRegisterUsage is used to track vector register pressure for vectorization factor selection, vector register pressure for unrolling (VF > 1), and scalar register pressure for unrolling (VF = 1). It doesn't distinguish between general-purpose and vector registers, so if the loop will have both scalar and vector values, the estimate may be less precise. In that case, I think I agree here: if VF > 1, the intended purpose is to track only the pressure of vector registers, so we should ignore all scalar values.
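A minimal standalone sketch of that rule, for illustration only. It does not use LLVM types: the Value struct and countTrackedValues are hypothetical names, and the real calculateRegisterUsage works on live ranges rather than a flat count.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR value; not an LLVM type.
struct Value {
  std::string Name;
  bool IsScalarAfterVectorization; // stays scalar in the vectorized loop
};

// Counts the values that contribute to the register-pressure estimate.
// For VF > 1 the estimate is meant to cover vector registers only, so
// values that remain scalar are skipped. For VF == 1 every value is
// scalar and every value is counted.
unsigned countTrackedValues(const std::vector<Value> &Values, unsigned VF) {
  assert(VF >= 1 && "vectorization factor must be at least 1");
  unsigned Count = 0;
  for (const Value &V : Values) {
    if (VF > 1 && V.IsScalarAfterVectorization)
      continue; // ignored for vector register pressure
    ++Count;
  }
  return Count;
}
```

With one scalarized IV and two widened values, this counts two values at VF = 4 and all three at VF = 1.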

From Michael:

> We have three types of values:
>
> 1. Uniform scalar - every value for which we generate a single scalar - the primary IV, base pointer for consecutive pointer values, etc.
> 2. Non-uniform "scalar" - values that are non-uniform, but for which we generate VF scalar values, instead of a single vector. For instance, the "scalar IVs".
> 3. Non-uniform non-scalar - values that actually get vectorized.

...

> In any case, as Wei wrote, we need two different groupings:
>
>  (a) One for 1 vs. 2 + 3, for operation cost estimation. This is what should go into ValuesNotWidened
>  (b) The other, for 1 + 2 vs. 3, for vector register pressure estimation.

I think I agree, but want to clarify since I removed ValuesNotWidened.

For cost estimation, we will consider (1): isUniformAfterVectorization(). For register pressure estimation and IV scalarization, we will separate out (3) with isScalarAfterVectorization(), which is true for both (1) and (2). In that case, I think if I replace isUniformAfterVectorization with isScalarAfterVectorization in collectValuesToIgnore, we will all be happy.
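The two groupings can be written out as a small standalone sketch. The ValueKind enum is hypothetical and exists only to label Michael's three categories. The predicates here take that enum, whereas the functions in the patch take instructions, so this only illustrates which categories each query covers.

```cpp
// Hypothetical labels for the three categories; not part of LLVM.
enum class ValueKind {
  UniformScalar,    // (1) a single scalar is generated
  NonUniformScalar, // (2) VF scalar copies are generated
  Vectorized        // (3) an actual vector is generated
};

// Grouping (a): 1 vs. 2 + 3, used for operation cost estimation.
constexpr bool isUniformAfterVectorization(ValueKind K) {
  return K == ValueKind::UniformScalar;
}

// Grouping (b): 1 + 2 vs. 3, used for vector register pressure
// estimation. Every uniform value is also scalar, so (1) is a subset
// of what this returns true for.
constexpr bool isScalarAfterVectorization(ValueKind K) {
  return K != ValueKind::Vectorized;
}
```

The only category on which the two queries disagree is (2), which is why swapping one for the other in collectValuesToIgnore changes how scalarized IVs are treated.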

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:2005
@@ +2004,3 @@
+    // scalar after vectorization.
+    auto ScalarIndUpdate = all_of(IndUpdate->users(), [&](User *U) -> bool {
+      if (TheLoop->isLoopInvariant(U) || U == Ind)
----------------
mkuper wrote:
> Maybe return false early if !ScalarInd ?
> There's no need to compute ScalarIndUpdate in that case, is there?
Right, good point! I removed the optimization when trying to make the code easier to read. I'll update the patch.
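A rough standalone sketch of the early exit being suggested. It is not the patch code: the User struct, its StaysScalar flag, and isScalarInduction are hypothetical stand-ins, and the real check in the hunk above also treats loop-invariant users and the induction variable itself specially.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a user of the induction variable or of its
// update; not llvm::User.
struct User {
  bool StaysScalar;
};

// Returns true only if every user of the induction variable and every
// user of its update stays scalar. The update's users are not examined
// when the induction variable already fails the check.
bool isScalarInduction(const std::vector<User> &IndUsers,
                       const std::vector<User> &IndUpdateUsers) {
  auto StaysScalar = [](const User &U) { return U.StaysScalar; };

  bool ScalarInd = std::all_of(IndUsers.begin(), IndUsers.end(), StaysScalar);
  if (!ScalarInd)
    return false; // early exit: no need to compute ScalarIndUpdate

  return std::all_of(IndUpdateUsers.begin(), IndUpdateUsers.end(),
                     StaysScalar);
}
```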

https://reviews.llvm.org/D22867