[PATCH] D29449: [SLP] Generalization of vectorization of CmpInst operands, NFC.
Alexey Bataev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 7 01:00:24 PST 2017
ABataev added a comment.
In https://reviews.llvm.org/D29449#668395, @mkuper wrote:
> Sorry, I'm failing to communicate the example I have in mind.
> Here it is, concretely:
>
> declare void @bar(i1)
>
> define void @foo(i32* %A, i32 %k, i32 %n) {
>   %idx0 = getelementptr inbounds i32, i32* %A, i64 0
>   %idx4 = getelementptr inbounds i32, i32* %A, i64 4
>   %load0 = load i32, i32* %idx0, align 8
>   %load4 = load i32, i32* %idx4, align 8
>   %mul0 = mul i32 %load0, %k
>   %mul4 = mul i32 %load4, %k
>   %res = add i32 %mul0, %mul4
>   %cmp = icmp eq i32 %res, %n
>   call void @bar(i1 %cmp)
>   ret void
> }
>
>
>
> With the current code, we get:
> $ bin/opt -slp-vectorizer < ~/llvm/temp/cmpslp.ll -S -o - -slp-threshold=-10
>
> declare void @bar(i1)
>
> define void @foo(i32* %A, i32 %k, i32 %n) {
>   %idx0 = getelementptr inbounds i32, i32* %A, i64 0
>   %idx4 = getelementptr inbounds i32, i32* %A, i64 4
>   %load0 = load i32, i32* %idx0, align 8
>   %load4 = load i32, i32* %idx4, align 8
>   %1 = insertelement <2 x i32> undef, i32 %k, i32 0
>   %2 = insertelement <2 x i32> %1, i32 %k, i32 1
>   %3 = insertelement <2 x i32> undef, i32 %load0, i32 0
>   %4 = insertelement <2 x i32> %3, i32 %load4, i32 1
>   %5 = mul <2 x i32> %2, %4
>   %6 = extractelement <2 x i32> %5, i32 0
>   %7 = extractelement <2 x i32> %5, i32 1
>   %res = add i32 %6, %7
>   %cmp = icmp eq i32 %res, %n
>   call void @bar(i1 %cmp)
>   ret void
> }
>
>
> The new code will not be able to vectorize this.
>
> I agree with you that (a) what we do now is generally pretty bad, and (b) we handle this case more or less by accident.
> But this patch is not NFC, and it has the potential to regress cases like this one.
Michael, I understand this.
What should I do then? Should I prepare a patch that vectorizes CallInst arguments first, and then prepare an NFC patch for CmpInst? Or do you have something different in mind?
https://reviews.llvm.org/D29449