[PATCH] D27385: [x86] fold fand (fxor X, -1) Y --> fandn X, Y
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Dec 3 11:45:24 PST 2016
spatel added inline comments.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:31745
+
+ if (!((VT == MVT::f32 && Subtarget.hasSSE1()) ||
+ (VT == MVT::f64 && Subtarget.hasSSE2())))
----------------
delena wrote:
> It should work for scalar and vector types, right? You check only scalar VT (f32, f64) here.
I think vector types always use combineANDXORWithAllOnesIntoANDNP() for this transform because we peek through the bitcasts to find the integer logic ops for vectors. For scalars, we transform to the X86-specific FP-logic nodes, which is why they need a separate code path. I'm not sure the split is strictly necessary, but we had load-folding bugs when we tried to handle vectors and scalars together.
So this example already works without this patch:
define <2 x double> @FsANDNPSrr(<2 x double> %x, <2 x double> %y) {
  %bc1 = bitcast <2 x double> %x to <2 x i64>
  %bc2 = bitcast <2 x double> %y to <2 x i64>
  %not = xor <2 x i64> %bc2, <i64 -1, i64 -1>
  %and = and <2 x i64> %bc1, %not
  %bc3 = bitcast <2 x i64> %and to <2 x double>
  ret <2 x double> %bc3
}
$ ./llc -o - andn.ll
	andnps	%xmm0, %xmm1
	movaps	%xmm1, %xmm0
	retq
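
For the scalar case that this patch targets, the bit-level identity behind the fold can be checked outside of LLVM. The following is only a sketch in standalone C++, not code from the patch: the helper names (bitsOf, fandOfNotBits, fandnBits, sameBits) are hypothetical, and it assumes a 32-bit IEEE float. It models "fand (fxor X, -1), Y" and "fandn X, Y" on the raw bits and compares the results.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper for illustration only; this is not an LLVM API.
// Reinterprets a float as its raw 32-bit pattern (assumes 32-bit IEEE float).
static uint32_t bitsOf(float F) {
  static_assert(sizeof(float) == sizeof(uint32_t), "expects 32-bit float");
  uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return Bits;
}

// Models "fand (fxor X, -1), Y": invert X with an all-ones xor, then and with Y.
static uint32_t fandOfNotBits(float X, float Y) {
  uint32_t NotX = bitsOf(X) ^ 0xFFFFFFFFu;
  return NotX & bitsOf(Y);
}

// Models "fandn X, Y": (~X) & Y, i.e. the first operand is the inverted one.
static uint32_t fandnBits(float X, float Y) {
  return ~bitsOf(X) & bitsOf(Y);
}

// Returns true when both forms produce the same bit pattern.
static bool sameBits(float X, float Y) {
  return fandOfNotBits(X, Y) == fandnBits(X, Y);
}
```

With X = -0.0f (only the sign bit set), fandnBits(X, Y) clears the sign bit of Y, so fandnBits(-0.0f, -3.0f) equals bitsOf(3.0f). Results are kept as integers so that no float return path can alter NaN payloads.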
https://reviews.llvm.org/D27385