[PATCH] D48768: [X86] When have BMI2, prefer shifts to clear low/high bits, rather than variable mask.
Craig Topper via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Jun 30 11:03:13 PDT 2018
craig.topper added inline comments.
================
Comment at: lib/Target/X86/X86InstrInfo.td:2566
+ // x << (64 - y) >> (64 - y)
+ def : Pat<(srl (shl GR64:$src, (i8 (trunc (sub 64, GR64:$lz)))),
+ (i8 (trunc (sub 64, GR64:$lz)))),
----------------
craig.topper wrote:
> The fact that you needed this pattern is actually a bug in the code that turns load+and into the bzhi pattern. Normally the trunc would have narrowed the subtract to 32 bits. But the trunc, shift, and subtract were created after legalize ops, and the shift amount was created as i64. It should have been created directly as an i8. The re-legalize we run as part of the last DAG combine caught it and fixed it, but it was too late to get the subtract fixed.
>
> I suppose this pattern might be useful if the subtract had other users besides the truncate, which would prevent the normal combine from kicking in. But by that logic we would need to replicate every pattern that looks for a truncated subtract. And it wouldn't be limited to just the BZHI64 patterns: you could have 32-bit data but a shift amount that starts out as 64 bits.
I've fixed the bad shift creation in r336051
Repository:
rL LLVM
https://reviews.llvm.org/D48768