[PATCH] D27385: [x86] fold fand (fxor X, -1) Y --> fandn X, Y

Sat Dec 3 11:45:24 PST 2016

spatel added inline comments.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:31745
+
+  if (!((VT == MVT::f32 && Subtarget.hasSSE1()) ||
+        (VT == MVT::f64 && Subtarget.hasSSE2())))
----------------
delena wrote:
> It should work for scalar and vector types, right? You check only scalar VT (f32, f64) here.
I think vector types always go through combineANDXORWithAllOnesIntoANDNP() for this transform, because for vectors we peek through the bitcasts to find the integer logic ops. For scalars, we transform to the X86-specific FP-logic nodes, so they need a separate path. I'm not sure the split is strictly necessary, but we had load folding bugs when we tried to handle vectors and scalars together.

So this example already works without this patch:
  define <2 x double> @FsANDNPSrr(<2 x double> %x, <2 x double> %y) {
    %bc1 = bitcast <2 x double> %x to <2 x i64>
    %bc2 = bitcast <2 x double> %y to <2 x i64>
    %not = xor <2 x i64> %bc2, <i64 -1, i64 -1>
    %and = and <2 x i64> %bc1, %not
    %bc3 = bitcast <2 x i64> %and to <2 x double>
    ret <2 x double> %bc3
  }

  $ ./llc -o - andn.ll
    andnps	%xmm0, %xmm1
    movaps	%xmm1, %xmm0
    retq
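
To spell out the operand order in that output: ANDNPS inverts its destination operand and ANDs it with the source, so "andnps %xmm0, %xmm1" computes ~%xmm1 & %xmm0, which is the same as the and(x, xor(y, -1)) in the IR. Here is a rough scalar model of that bit-level identity in plain C++. The helper names (toBits, fromBits, andOfNot, andn) are made up for illustration and are not LLVM APIs; this only models the bit patterns, not the DAG combine itself.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret a double as its 64-bit pattern (stands in for an IR bitcast).
static std::uint64_t toBits(double d) {
  std::uint64_t u;
  std::memcpy(&u, &d, sizeof u);
  return u;
}

// Reinterpret a 64-bit pattern as a double.
static double fromBits(std::uint64_t u) {
  double d;
  std::memcpy(&d, &u, sizeof d);
  return d;
}

// The pattern from the IR above: and(x, xor(y, -1)).
static double andOfNot(double x, double y) {
  return fromBits(toBits(x) & (toBits(y) ^ ~std::uint64_t{0}));
}

// ANDN semantics: the FIRST operand is the one that gets inverted.
static double andn(double a, double b) {
  return fromBits(~toBits(a) & toBits(b));
}

// Hypothetical self-check: both forms must agree bit for bit.
static void selfCheck() {
  assert(toBits(andOfNot(1.5, -2.0)) == toBits(andn(-2.0, 1.5)));
}
```

Note that the operands swap between the two forms: andOfNot(x, y) matches andn(y, x). With a = -0.0 (only the sign bit set), andn(a, b) clears the sign bit of b.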

https://reviews.llvm.org/D27385