[PATCH] D78353: [LV] Don't emit the Vector IV if it is not used.

Mon Apr 27 01:33:41 PDT 2020

SjoerdMeijer marked an inline comment as done.
SjoerdMeijer added inline comments.
Herald added a project: LLVM.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1920
   Value *ScalarIV = CreateScalarIV(Step);
   CreateSplatIV(ScalarIV, Step);
   buildScalarSteps(ScalarIV, Step, EntryVal, ID);
----------------
Ayal wrote:
> SjoerdMeijer wrote:
> > Ayal wrote:
> > > Would it be possible to avoid generating the undesired broadcasting by wrapping this CreateSplatIV() under an "if (needsVectorInduction(EntryVal))" that checks if any user !shouldScalarizeInstruction(), analogous to needsScalarInduction(EntryVal) above?
> > > 
> > > Better avoid generating and carrying potential garbage to begin with if it's easy; or potentially augment cse() clean-up at the end.
> > > Would it be possible to avoid generating the undesired broadcasting by wrapping this CreateSplatIV() under an "if (needsVectorInduction(EntryVal))" that checks if any user !shouldScalarizeInstruction(), analogous to needsScalarInduction(EntryVal) above?
> > 
> > I agree that this would be nicer, but it looks like  `needsScalarInduction` is an easier case as it simply looks at orginal loop instructions and see if they need to be scalarised. For the possibly new `needsVectorInduction` variant, I think we would need to look at the uses of the vector IV in the vector code, which hasn't been created yet. And since we need to look at the uses, which is what this patch is doing, I thought doing this afterwards in the fixup routine was a simple and reasonable compromise. For example, I see cases of masked/load stores using the vector IV, and I don't think I have all the information at this point to check all uses. 
> > 
> > > Better avoid generating and carrying potential garbage to begin with if it's easy; or potentially augment cse() clean-up at the end.
> > 
> > As I wrote above, I am not sure if it is easier or even possible to avoid generating the dead code. But I will have a look again to see if I haven't missed anything.  If this is not possible, I understand your preference is to have some form of DCE running here?  I thought about that, but wasn't sure if that would be overkill for just removing the vectorIV...
> This indeed deserves some cleaning up.
> 
> The current decision is to vectorize (vs. scalarize) an IV if it or its update have any immediate user that will be vectorized;  see documentation noted below. That decision should be aware that fold-mask will add an immediate vector user to the primary induction; suggest to fix that below. 
> 
> The claim is that (then), this last default case of widenIntOrFpInduction() where EntryVal is scalarized, should not call CreateSplatIV() at all, as EntryVal must have no vector users.
> 
> Note that users first marked vector (!isScalarAfterVectorization; causing IV to be marked vector) may later be found to be isProfitableToScalarize; so IV might end up being vectorized with no vector users. But the converse cannot hold -- an initially marked vector IV cannot later be found isProfitableToScalarize because the latter applies to instructions in predicated blocks only.
> 
> An alternative approach, if/when supporting splatting scalar IV's into vectors is desired, may be to introduce them lazily, along with the scalar-to-vector broadcasts of ILV's getOrCreateVectorValue(). Going forward though, all such scalars-to-vector/vector-to-scalars should be modeled explicitly, during VPlan construction, rather than during a recipe's execution.
> 
> ILV does employ a self-cleaning cse() post-pass. But going forward, better have things cleaned up in VPlan.
Thanks Ayal for sharing and explaining this! I am looking into this now.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78353/new/

https://reviews.llvm.org/D78353