[PATCH] D139074: Vectorization Of Conditional Statements Using BOSCC

Ashutosh Nema via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Feb 26 00:40:30 PST 2023


Ashutosh updated this revision to Diff 500522.
Ashutosh added a comment.

Addressed the regressions in TSVC and updated the patch.

The regressions were caused by extra shuffles generated by the append-mask functionality.

Previously, the patch appended the compare masks of all unroll parts and then used the appended mask in the vector guard check, i.e.:

  VectorMask1 = (X[i,i+1,i+2,i+3] != <0,0,0,0>);      // Mask From UnrollPart1
  VectorMask2 = (X[i+4,i+5,i+6,i+7] != <0,0,0,0>);    // Mask From UnrollPart2

  AppendedVectorMask = AppendVectorMask VectorMask1, VectorMask2;
  VectorMaskScalar = VectorToScalarCast AppendedVectorMask;

  if (VectorMaskScalar) {
    Mask.Vector.Store.A[i,i+1,i+2,i+3] = Mask.Vector.Load.B[i,i+1,i+2,i+3]
                                       + Mask.Vector.Load.C[i,i+1,i+2,i+3]; // Based on VectorMask1
    Mask.Vector.Store.A[i+4,i+5,i+6,i+7] = Mask.Vector.Load.B[i+4,i+5,i+6,i+7]
                                         + Mask.Vector.Load.C[i+4,i+5,i+6,i+7]; // Based on VectorMask2
  }

This was not optimal, especially when the unroll factor is >= 4, because the mask has to be promoted to match the types.

The guard check now considers the compare mask of each unroll part separately:

  VectorMask1 = (X[i,i+1,i+2,i+3] != <0,0,0,0>);      // Mask From UnrollPart1
  VectorMask2 = (X[i+4,i+5,i+6,i+7] != <0,0,0,0>);    // Mask From UnrollPart2

  VectorMaskScalar1 = VectorToScalarCast VectorMask1;
  VectorMaskScalar2 = VectorToScalarCast VectorMask2;
  VectorMaskScalar = VectorMaskScalar1 OR VectorMaskScalar2;

  if (VectorMaskScalar) {
    Mask.Vector.Store.A[i,i+1,i+2,i+3] = Mask.Vector.Load.B[i,i+1,i+2,i+3]
                                       + Mask.Vector.Load.C[i,i+1,i+2,i+3]; // Based on VectorMask1
    Mask.Vector.Store.A[i+4,i+5,i+6,i+7] = Mask.Vector.Load.B[i+4,i+5,i+6,i+7]
                                         + Mask.Vector.Load.C[i+4,i+5,i+6,i+7]; // Based on VectorMask2
  }
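
To illustrate why the two guard checks decide the same thing, here is a minimal scalar sketch in C++. It is only a model of the pseudocode above, not code from the patch: the names `compareNotZero`, `maskToScalar`, `guardAppended`, `guardPerPart`, `guardedAdd` and `selfCheck` are hypothetical, and VF = 4 with two unroll parts is assumed to match the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 4 lanes per unroll part and 2 unroll parts, as in the example.
constexpr std::size_t VF = 4;
using Mask = std::array<bool, VF>;

// Models "X[i..i+3] != <0,0,0,0>" for one unroll part.
Mask compareNotZero(const int *X) {
  Mask M{};
  for (std::size_t L = 0; L < VF; ++L)
    M[L] = X[L] != 0;
  return M;
}

// Models VectorToScalarCast: lane L of the mask becomes bit L.
std::uint32_t maskToScalar(const Mask &M) {
  std::uint32_t Bits = 0;
  for (std::size_t L = 0; L < VF; ++L)
    Bits |= static_cast<std::uint32_t>(M[L]) << L;
  return Bits;
}

// Old guard: append both masks into one wider mask, then test it.
bool guardAppended(const Mask &M1, const Mask &M2) {
  std::uint32_t Appended = maskToScalar(M1) | (maskToScalar(M2) << VF);
  return Appended != 0;
}

// New guard: cast each part separately and OR the scalars.
bool guardPerPart(const Mask &M1, const Mask &M2) {
  return (maskToScalar(M1) | maskToScalar(M2)) != 0;
}

// One guarded iteration covering both unroll parts. Returns true when the
// guarded block ran; inside it, stores are still masked per lane.
bool guardedAdd(int *A, const int *B, const int *C, const int *X) {
  Mask M1 = compareNotZero(X);
  Mask M2 = compareNotZero(X + VF);
  if (!guardPerPart(M1, M2))
    return false; // every lane is inactive, skip the whole block
  for (std::size_t L = 0; L < VF; ++L) {
    if (M1[L])
      A[L] = B[L] + C[L];
    if (M2[L])
      A[VF + L] = B[VF + L] + C[VF + L];
  }
  return true;
}

// Exhaustively checks that both guards agree for every pair of 4-lane masks.
void selfCheck() {
  for (unsigned Bits = 0; Bits < (1u << (2 * VF)); ++Bits) {
    Mask M1{}, M2{};
    for (std::size_t L = 0; L < VF; ++L) {
      M1[L] = (Bits >> L) & 1u;
      M2[L] = (Bits >> (VF + L)) & 1u;
    }
    assert(guardAppended(M1, M2) == guardPerPart(M1, M2));
  }
}
```

In this model both guards are true exactly when at least one lane in any unroll part is active, so ORing the per-part scalars preserves the branch decision while avoiding the wider appended mask.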


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139074/new/

https://reviews.llvm.org/D139074

Files:
  llvm/include/llvm/Analysis/VectorUtils.h
  llvm/lib/Analysis/VectorUtils.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
  llvm/lib/Transforms/Vectorize/VPlan.cpp
  llvm/lib/Transforms/Vectorize/VPlan.h
  llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
  llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
  llvm/lib/Transforms/Vectorize/VPlanValue.h
  llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp
  llvm/test/Transforms/LoopVectorize/boscc0.ll
  llvm/test/Transforms/LoopVectorize/boscc1.ll
  llvm/test/Transforms/LoopVectorize/boscc2.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D139074.500522.patch
Type: text/x-patch
Size: 67979 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230226/f4af815a/attachment-0001.bin>

