[PATCH] D36213: [InstCombine] Remove check for sext of vector icmp from shouldOptimizeCast
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 2 09:19:52 PDT 2017
spatel added subscribers: efriedma, mcrosier, t.p.northover.
spatel added a comment.
I pushed 'test7' through llc for x86 and PPC64LE and saw no problems. But then I tried AArch64 and ARM, and they went nuts on the narrowed version, whether the logic op was an 'xor' or an 'and':
define <2 x i64> @test7(<4 x float> %a, <4 x float> %b) {
  %cmp = fcmp ult <4 x float> %a, zeroinitializer
  %cmp4 = fcmp ult <4 x float> %b, zeroinitializer
  %sext = sext <4 x i1> %cmp to <4 x i32>
  %sext5 = sext <4 x i1> %cmp4 to <4 x i32>
  %and = and <4 x i32> %sext, %sext5
  %conv = bitcast <4 x i32> %and to <2 x i64>
  ret <2 x i64> %conv
}

define <2 x i64> @test7_better(<4 x float> %a, <4 x float> %b) {
  %cmp = fcmp ult <4 x float> %a, zeroinitializer
  %cmp4 = fcmp ult <4 x float> %b, zeroinitializer
  %and1 = and <4 x i1> %cmp, %cmp4
  %and = sext <4 x i1> %and1 to <4 x i32>
  %conv = bitcast <4 x i32> %and to <2 x i64>
  ret <2 x i64> %conv
}

$ ./llc -o - vcmp.ll -mtriple=aarch64
test7:                                  // @test7
        fcmge   v0.4s, v0.4s, #0.0
        mvn     v0.16b, v0.16b
        fcmge   v1.4s, v1.4s, #0.0
        bic     v0.16b, v0.16b, v1.16b
        ret

test7_better:                           // @test7_better
// BB#0:
        fcmge   v0.4s, v0.4s, #0.0
        fcmge   v1.4s, v1.4s, #0.0
        mvn     v0.16b, v0.16b
        mvn     v1.16b, v1.16b
        xtn     v0.4h, v0.4s
        xtn     v1.4h, v1.4s
        and     v0.8b, v0.8b, v1.8b
        ushll   v0.4s, v0.4h, #0
        shl     v0.4s, v0.4s, #31
        sshr    v0.4s, v0.4s, #31
        ret

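For reference, the two IR forms should be equivalent lane by lane, so the difference above is purely a codegen matter. The following is a minimal Python sketch, not part of the patch, that models each <4 x i1> compare result as a list of booleans and checks that applying the logic op after the sext gives the same bits as applying it before. The helper names (sext_i1_to_i32, logic_of_sexts, sext_of_logic, forms_agree) are hypothetical and exist only for this illustration.

```python
from itertools import product
from operator import and_, xor

MASK32 = 0xFFFFFFFF  # bit pattern of an i32 with every bit set


def sext_i1_to_i32(bit):
    """Model 'sext i1 to i32': true becomes all-ones, false becomes zero."""
    return MASK32 if bit else 0


def logic_of_sexts(op, cmp, cmp4):
    """Shape of @test7: sign-extend each lane, then apply the logic op."""
    return [op(sext_i1_to_i32(x), sext_i1_to_i32(y)) & MASK32
            for x, y in zip(cmp, cmp4)]


def sext_of_logic(op, cmp, cmp4):
    """Shape of @test7_better: apply the logic op on i1 lanes, then sign-extend."""
    return [sext_i1_to_i32(op(x, y)) for x, y in zip(cmp, cmp4)]


def forms_agree(op):
    """Exhaustively compare both shapes over every pair of 4-lane i1 vectors."""
    lanes = list(product([False, True], repeat=4))
    return all(logic_of_sexts(op, a, b) == sext_of_logic(op, a, b)
               for a in lanes for b in lanes)


if __name__ == "__main__":
    print("and:", forms_agree(and_))
    print("xor:", forms_agree(xor))
```

This only models the integer part of the functions; the fcmp results are taken as given inputs, and nothing here says anything about which form a backend lowers better.
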
Given that the more common problem patterns already exist independently of this patch, I would agree to proceed. But let's ping people with an ARM stake for their opinions: @t.p.northover @efriedma @mcrosier ?
https://reviews.llvm.org/D36213