[PATCH] D92285: [VPlan] Manage scalarized values using VPValues.

Mon Feb 15 14:43:18 PST 2021

gilr accepted this revision.
gilr added inline comments.
This revision is now accepted and ready to land.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9045
+  State.ILV->widenPHIInstruction(cast<PHINode>(getUnderlyingValue()), RdxDesc,
+                                 StartV, State.UF, State.VF, this, State);
 }
----------------
gilr wrote:
> No need to pass as arguments now that State itself is passed.
> No need to pass as arguments now that State itself is passed.

Does that make sense?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9207
+    State.set(this, VPhi, VPhi, Part);
+    State.set(getOperand(0), VPhi, Part);
   } else {
----------------
fhahn wrote:
> fhahn wrote:
> > gilr wrote:
> > > This is porting the resetVectorValue to State operations, which I think is fine by itself (I tend to agree with passing nullptr as the underlying value to VPPredInstPHI), but something else bothers me: It seems VPPredInstPHI doesn't take over the VPReplicate as the VPValue mapped to the predicated IR value during VPlan construction, which means that recipes constructed for users of predicated instructions use the ReplicateRecipe although it doesn't dominate them. We get away with this by overriding the IR's vector value, but for correct VP def-use we should have VPPredInstPHI take over as the registered VPValue in VPlan during construction. Does all that make sense?
> > I think the problem here is how we generate code for a scalarization region, like the one below. Uses after the region use the `PHI-PREDICATED-INSTRUCTION ` VPValue.
> > 
> > But when generating code for the region itself, we execute the same blocks once for each vector element. So in the second iteration `  "REPLICATE ir<%div> = udiv ir<16>, ir<%counter1.0> (S->V)\l" ` currently needs to pack its result in the result vector from the previous iteration. Currently that is done by overwriting the state of the def `"REPLICATE ir<%div> = udiv ir<16>, ir<%counter1.0> (S->V)\l"` with the predicated PHI from the previous iteration, so the vector-pack logic picks it up.
> > 
> > It might be possible to instead change the code to just have scalar phis and do the packing as part of the phi creation. I think for transformations, the VPValue def-use chain should be correct, and it is 'just' a code-gen improvement. Do you think we need to fix this before this one goes in?
> > 
> > 
> > 
> > ```
> >   subgraph cluster_N2 {
> >     fontname=Courier
> >     label="\<xVFxUF\> pred.udiv"
> >     N1 [label =
> >       "pred.udiv.entry:\n" +
> >         " +
> >         "BRANCH-ON-MASK vp<%4>\l"\l" +
> >          "CondBit: vp<%4> (for.body3)\l"
> >     ]
> >     N1 -> N3 [ label="T"]
> >     N1 -> N4 [ label="F"]
> >     N3 [label =
> >       "pred.udiv.if:\n" +
> >         "REPLICATE ir<%div> = udiv ir<16>, ir<%counter1.0> (S->V)\l"
> >     ]
> >     N3 -> N4 [ label=""]
> >     N4 [label =
> >       "pred.udiv.continue:\n" +
> >         "PHI-PREDICATED-INSTRUCTION vp<%6> = ir<%div>\l"
> >     ]
> >   }
> > ```
> @gilr I added a note explaining why we need to also update the operand. Do you think that is sufficient to move forward with this patch?
Yes, thanks!

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9207
+    State.set(this, VPhi, VPhi, Part);
+    State.set(getOperand(0), VPhi, Part);
   } else {
----------------
gilr wrote:
> fhahn wrote:
> > fhahn wrote:
> > > gilr wrote:
> > > > This is porting the resetVectorValue to State operations, which I think is fine by itself (I tend to agree with passing nullptr as the underlying value to VPPredInstPHI), but something else bothers me: It seems VPPredInstPHI doesn't take over the VPReplicate as the VPValue mapped to the predicated IR value during VPlan construction, which means that recipes constructed for users of predicated instructions use the ReplicateRecipe although it doesn't dominate them. We get away with this by overriding the IR's vector value, but for correct VP def-use we should have VPPredInstPHI take over as the registered VPValue in VPlan during construction. Does all that make sense?
> > > I think the problem here is how we generate code for a scalarization region, like the one below. Uses after the region use the `PHI-PREDICATED-INSTRUCTION ` VPValue.
> > > 
> > > But when generating code for the region itself, we execute the same blocks once for each vector element. So in the second iteration `  "REPLICATE ir<%div> = udiv ir<16>, ir<%counter1.0> (S->V)\l" ` currently needs to pack its result in the result vector from the previous iteration. Currently that is done by overwriting the state of the def `"REPLICATE ir<%div> = udiv ir<16>, ir<%counter1.0> (S->V)\l"` with the predicated PHI from the previous iteration, so the vector-pack logic picks it up.
> > > 
> > > It might be possible to instead change the code to just have scalar phis and do the packing as part of the phi creation. I think for transformations, the VPValue def-use chain should be correct, and it is 'just' a code-gen improvement. Do you think we need to fix this before this one goes in?
> > > 
> > > 
> > > 
> > > ```
> > >   subgraph cluster_N2 {
> > >     fontname=Courier
> > >     label="\<xVFxUF\> pred.udiv"
> > >     N1 [label =
> > >       "pred.udiv.entry:\n" +
> > >         " +
> > >         "BRANCH-ON-MASK vp<%4>\l"\l" +
> > >          "CondBit: vp<%4> (for.body3)\l"
> > >     ]
> > >     N1 -> N3 [ label="T"]
> > >     N1 -> N4 [ label="F"]
> > >     N3 [label =
> > >       "pred.udiv.if:\n" +
> > >         "REPLICATE ir<%div> = udiv ir<16>, ir<%counter1.0> (S->V)\l"
> > >     ]
> > >     N3 -> N4 [ label=""]
> > >     N4 [label =
> > >       "pred.udiv.continue:\n" +
> > >         "PHI-PREDICATED-INSTRUCTION vp<%6> = ir<%div>\l"
> > >     ]
> > >   }
> > > ```
> > @gilr I added a note explaining why we need to also update the operand. Do you think that is sufficient to move forward with this patch?
> Yes, thanks!
> Uses after the region use the `PHI-PREDICATED-INSTRUCTION ` VPValue.

Right, I missed the part where you replace the predicated instruction's VPValue with the new PredInstRecipe.

> It might be possible to instead change the code to just have scalar phis and do the packing as part of the phi creation. I think for transformations, the VPValue def-use chain should be correct, and it is 'just' a code-gen improvement. Do you think we need to fix this before this one goes in?

Not sure recipes should create values in other places than their designated location. Anyway, this shouldn't block this patch.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92285/new/

https://reviews.llvm.org/D92285