[PATCH] D79976: [LV] Handle Fold-Tail of loops with vectorizarion factor (VF) equal to 1

Bardia Mahjour via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 15 16:52:28 PDT 2020


bmahjour added inline comments.


================
Comment at: llvm/test/Transforms/LoopVectorize/tail-folding-vectorization-factor-1-scalar.ll:17
+; CHECK:         [[INDEX_NEXT]] = add i64 [[INDEX]], 4
+; CHECK-NEXT:    [[TMP4:%.*]] = icmp eq i64 [[INDEX_NEXT]], 16
+; CHECK-NEXT:    br i1 [[TMP4]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !0
----------------
anhtuyen wrote:
> Ayal wrote:
> > bmahjour wrote:
> > > How is it that the original loop executes 15 iterations, but the vector loop iterates 16? It seems the minimum iteration count check branch at the top should branch to the scalar loop instead of vector.ph.
> > (Thanks for asking, reminded to check above and below that fold-tail emits the desired scalar `icmp ule`'s, which are the focus of this patch.)
> > 
> > Fold-tail is responsible for rounding-up the trip count from 15 to 16, see https://reviews.llvm.org/D50480.
> > Regarding minimum iteration count check branch, fold-tail is also responsible in general for branching directly to vector.ph w/o an "if (trip-count < VF*UF)", which in this case is known to be false anyhow.
> Ayal @Ayal , thank you very much for your help to clarify. My guess is that Bardia @bmahjour might have been concerned whether by any chance the effective addition of the 16th iteration would affect the correctness of the generated code. Because I have neither evidence nor counterevidence to address his concern, if you can shed some light on it when you have some time, that will be great.  
> 
> Back to this patch but given the fact that the value of the trip-count is not the main focus hereof, I take the liberty of omitting its value from the checks. If that is not acceptable to any of us here, please let me know. 
> 
I guess I would understand how the rounding-up would work, if the instructions in the body were somehow predicated, but I don't see any predication in the output IR. Is that because there are no instructions in this test case with side-effects, read/writes, etc?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79976/new/

https://reviews.llvm.org/D79976





More information about the llvm-commits mailing list