[PATCH] D46031: [DAGCombiner] Masked merge: if 'B' is constant, de-canonicalize the pattern (invert the mask).

Wed May 2 13:33:34 PDT 2018

spatel added inline comments.

================
Comment at: test/CodeGen/AArch64/unfold-masked-merge-scalar-variablemask.ll:350-355
-; CHECK-NEXT:    and w8, w0, w2
-; CHECK-NEXT:    orn w0, w8, w2
+; CHECK-NEXT:    mvn w8, w0
+; CHECK-NEXT:    bic w8, w8, w2
+; CHECK-NEXT:    orr w0, w8, w0
 ; CHECK-NEXT:    ret
   %n0 = xor i32 %x, -1 ; %x
   %n1 = and i32 %n0, %mask
   %r = xor i32 %n1, -1
----------------
lebedev.ri wrote:
> spatel wrote:
> > How does this happen? Isn't that a miscompile?
> Hm, at first i thought it was indeed (https://reviews.llvm.org/D45733#1077183), but now i do not think so.
> https://godbolt.org/g/L4hDjW
> ^ so neither of our outputs is fully optimized.
> But if i manually transform that assembly to C, the end result tells me that DAGCombine/arm isel is simply missing some optimizations.
> I could be wrong, of course.
Ok, I was seeing an extra 'not' in there somewhere, so no miscompile. 

And the conclusion is that we don't care about this diff because it's already sub-optimal and instcombine should have folded it anyway.

That raises the question of why are we testing this in the first place though. Add a comment to explain that or just delete?

================
Comment at: test/CodeGen/AArch64/unfold-masked-merge-scalar-variablemask.ll:401-403
+; CHECK-NEXT:    eor w8, w0, w8
 ; CHECK-NEXT:    bic w8, w8, w2
+; CHECK-NEXT:    orr w0, w8, w0
----------------
This is a real regression, or am I seeing things that aren't there?

================
Comment at: test/CodeGen/X86/unfold-masked-merge-scalar-variablemask.ll:573-574
 ; CHECK-BMI:       # %bb.0:
-; CHECK-BMI-NEXT:    andl %edx, %edi
-; CHECK-BMI-NEXT:    notl %edx
-; CHECK-BMI-NEXT:    orl %edi, %edx
-; CHECK-BMI-NEXT:    movl %edx, %eax
+; CHECK-BMI-NEXT:    andnl %edx, %edi, %eax
+; CHECK-BMI-NEXT:    notl %eax
 ; CHECK-BMI-NEXT:    retq
----------------
lebedev.ri wrote:
> This improves.
But as with AArch, we don't care because instcombine would fold this?

Repository:
  rL LLVM

https://reviews.llvm.org/D46031