[llvm] [LoopVectorize] Add cost of generating tail-folding mask to the loop (PR #90191)

David Sherwood via llvm-commits llvm-commits at lists.llvm.org
Tue May 14 04:09:37 PDT 2024


================
@@ -430,116 +430,41 @@ define void @conditional_uniform_load(ptr noalias nocapture %a, ptr noalias noca
 ;
 ; TF-SCALABLE-LABEL: @conditional_uniform_load(
 ; TF-SCALABLE-NEXT:  entry:
-; TF-SCALABLE-NEXT:    br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
----------------
david-arm wrote:

Both TF-SCALABLE and TF-FIXED now fail to vectorise because the additional lane mask cost pushes the cost of the vector loop just above that of the scalar loop. At least for the scalable case, this may indicate that the cost model for get.active.lane.mask calls needs improving?
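
To illustrate how a small extra cost can flip the decision, here is a minimal sketch of a per-lane profitability check. It is a simplified model, not the actual LoopVectorize implementation: the function name `is_vectorization_profitable` and every number below are hypothetical and chosen only to show a borderline case.

```python
def is_vectorization_profitable(scalar_iter_cost, vector_body_cost, mask_cost, vf):
    """Simplified, hypothetical model of a vectorization profitability check.

    One vector iteration processes `vf` scalar iterations, so the vector loop
    wins when (vector_body_cost + mask_cost) / vf < scalar_iter_cost.
    The comparison is cross-multiplied to avoid division.
    """
    vector_iter_cost = vector_body_cost + mask_cost
    return vector_iter_cost < scalar_iter_cost * vf


# Hypothetical costs, picked so the loop is only marginally profitable.
scalar_cost = 4
vector_body_cost = 15
vf = 4

# Without the lane mask cost: 15 < 16, so the vector loop is chosen.
before = is_vectorization_profitable(scalar_cost, vector_body_cost, 0, vf)

# With a lane mask cost of 2: 17 < 16 is false, so the scalar loop is kept.
after = is_vectorization_profitable(scalar_cost, vector_body_cost, 2, vf)

print(before, after)
```

In this model, a loop that was only marginally profitable stops vectorising as soon as the mask cost is counted, so an overestimated mask cost could have the same effect as a genuinely expensive one.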

https://github.com/llvm/llvm-project/pull/90191

