[PATCH] D154314: [LV] Remove the reminder loop if we know the mask is always true

David Sherwood via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 6 01:39:47 PDT 2023

david-arm added inline comments.

Comment at: llvm/test/Transforms/LoopVectorize/AArch64/eliminate-tail-predication.ll:34
 ; CHECK:       middle.block:
-; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i64 1024, [[N_VEC]]
-; CHECK-NEXT:    br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
I'm guessing that InstCombine does not determine this is guaranteed to always be true? However, I thought that someone did work in the DAGCombiner that will replace this with

  br i1 true, label [[EXIT:%.*]], label [[SCALAR_PH]]

when vscale is known to be a power of 2? Are you hoping to benefit from eliminating the scalar tail in IR because it helps us to make better decisions later in the pipeline? I can imagine it's beneficial for LTO where the scalar tail could prevent inlining.

If I remember correctly, one of the problems with folding away the icmp in InstCombine is that it doesn't have access to the TTI interface, so we cannot query the target.
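
To make the arithmetic behind the comment above concrete, here is a minimal sketch. It assumes the vector trip count is the trip count rounded down to a multiple of VF * vscale. The helper names, VF = 4, and the vscale range of 1 to 16 are illustrative assumptions, not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): models the vector trip count as
// TC rounded down to a multiple of VF * vscale.
constexpr std::uint64_t vectorTripCount(std::uint64_t TC, std::uint64_t VF,
                                        std::uint64_t VScale) {
  return TC - TC % (VF * VScale);
}

// The middle-block compare "icmp eq TC, N_VEC" is true exactly when no
// scalar iterations remain.
constexpr bool scalarTailIsDead(std::uint64_t TC, std::uint64_t VF,
                                std::uint64_t VScale) {
  return vectorTripCount(TC, VF, VScale) == TC;
}

// Checks every power-of-2 vscale up to MaxVScale (assumed upper bound).
constexpr bool deadForAllPow2VScale(std::uint64_t TC, std::uint64_t VF,
                                    std::uint64_t MaxVScale) {
  for (std::uint64_t VScale = 1; VScale <= MaxVScale; VScale *= 2)
    if (!scalarTailIsDead(TC, VF, VScale))
      return false;
  return true;
}

// With TC = 1024 and an assumed VF = 4, every power-of-2 vscale in [1, 16]
// divides the trip count evenly, so the compare is always true.
static_assert(deadForAllPow2VScale(1024, 4, 16),
              "scalar tail is dead for all power-of-2 vscale values");

// A non-power-of-2 vscale (3 here) leaves a remainder, which is why the
// fold needs to know that vscale is a power of 2.
static_assert(!scalarTailIsDead(1024, 4, 3),
              "vscale = 3 leaves scalar iterations");
```

Under these assumptions the branch to the scalar preheader is dead only because vscale is known to be a power of 2, and that is a target property.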

Comment at: llvm/test/Transforms/LoopVectorize/X86/constant-fold.ll:30
 ; CHECK:       middle.block:
-; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i32 2, 2
-; CHECK-NEXT:    br i1 [[CMP_N]], label [[BB3:%.*]], label [[SCALAR_PH]]
For all the fixed-length vector tests this icmp will get replaced with "i1 true" by InstCombine, so the scalar tail should get automatically deleted.

Comment at: llvm/test/Transforms/LoopVectorize/dont-fold-tail-for-divisible-TC.ll:4
 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
Hmm, I know this is not introduced by your patch, but I don't think we should have tests with target-specifics in the top level LoopVectorize directory.


