[PATCH] D93617: [LoopVectorize] Take non-instruction's size into consideration when computing leader's demanded bits
guopeilin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 4 01:31:25 PST 2021
guopeilin added inline comments.
================
Comment at: llvm/lib/Analysis/VectorUtils.cpp:539
+ if (!isa<Instruction>(Val)) {
+ if (!isa<Argument>(Val))
+ continue;
----------------
nikic wrote:
> I think we can't entirely ignore non-arguments (aka constants) here. Take a look at this modified form of `llvm/test/Transforms/LoopVectorize/avoid-truncate-icmp-operands.ll`:
>
> ```
> ; RUN: opt -loop-vectorize -S < %s | FileCheck %s
> target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
> target triple = "aarch64-unknown-linux-gnu"
>
> @a = dso_local local_unnamed_addr global i64 0, align 8
> @b = dso_local local_unnamed_addr global i16 0, align 4
>
> define dso_local void @myFunc(i32 %d, i32 %e) {
> ; CHECK: pred.store.continue2:
> ; CHECK-NEXT: %{{[0-9]+}} = icmp ult <2 x i64> %broadcast.splat{{[0-9]*}}, %broadcast.splat{{[0-9]*}}
> for.body29.lr.ph:
> br label %for.body29
>
> for.cond25.for.cond.cleanup28_crit_edge: ; preds = %for.inc
> ret void
>
> for.body29: ; preds = %for.inc, %for.body29.lr.ph
> %n.078 = phi i16 [ undef, %for.body29.lr.ph ], [ %add34, %for.inc ]
> br i1 undef, label %for.inc, label %if.then
>
> if.then: ; preds = %for.body29
> %conv31 = zext i8 undef to i64
> store i64 %conv31, i64* @a, align 8
> %cmp.i = icmp ult i32 %e, %d
> %.sroa.speculated = select i1 %cmp.i, i64 10000000000, i64 20000000000
> %conv32 = trunc i64 %.sroa.speculated to i16
> store i16 %conv32, i16* @b, align 4
> br label %for.inc
>
> for.inc: ; preds = %if.then, %for.body29
> %add34 = add nsw i16 %n.078, 2
> %cmp27 = icmp slt i16 %add34, 16
> br i1 %cmp27, label %for.body29, label %for.cond25.for.cond.cleanup28_crit_edge, !llvm.loop !6
> }
>
> !6 = distinct !{!6, !7}
> !7 = !{!"llvm.loop.vectorize.enable", i1 true}
> ```
>
> This gets vectorized with i32 width, but the two select constants are larger than 32-bit and get truncated.
>
> I think the logic here should be: For `ConstantInt`, take the `getActiveBits()` of the constant value, for everything else (including Arguments) take the getTypeSizeInBits (what you're effectively doing now).
Thanks a lot for reviewing this patch. And a new patch has been updated, which takes case `ConstantInt` into consideration, too.
And there still exists something that confuses me. Firstly, I don't know why `the two select constants are larger than 32-bit`? As far as I can see, both `%d` and `%e` are defined as an `i32 argument`, what is the purpose that we take `ConstantInt` into consideration here?
Secondly, for `Constant`, there are `ConstantFP`, `ConstantInt` and some other kind of constant, so shall we take them into consideration, too?
It would be a great appreciation if you could explain it a bit more. Again, thanks a lot for reviewing.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D93617/new/
https://reviews.llvm.org/D93617
More information about the llvm-commits
mailing list