[PATCH] D36395: [InstCombine] narrow rotate left/right patterns to eliminate zext/trunc (PR34046)

Mon Aug 7 12:27:27 PDT 2017

efriedma added a comment.

I spent a bit of time playing with the generated for for various implementations of rotate for 16-bit values.  It looks like this transform actually makes things worse; e.g. on ARM the transformed version compiles to one more instruction.  Granted, that's probably not important; I think the optimal lowering for a 16-bit rotate on ARM actually involves lowering it to a 32-bit rotate and fixing up the result, and that's probably a transform which doesn't make sense until legalization.

Also, on x86, it looks like we're missing an important pattern here: we generate the sequence `andb $15, %cl; rolw %cl, %ax`.

================
Comment at: lib/Transforms/InstCombine/InstCombineCasts.cpp:468
+  Value *X;
+  if (!match(ShVal, m_ZExt(m_Value(X))) || X->getType() != DestTy)
+    return nullptr;
----------------
Would it make sense to use known bits here rather than explicitly checking for zext?  This matters because we tend to transform `zext(trunc(x)) -> x & c`.

================
Comment at: lib/Transforms/InstCombine/InstCombineCasts.cpp:492
+  // pattern does not include this guarantee, we could still do the transform,
+  // but we would need to generate this mask for the shift amount.
+  unsigned WideBitWidth = ShAmt->getType()->getScalarSizeInBits();
----------------
It seems like generating the mask is worth doing here, if we think it'll be eliminated by later lowering.

https://reviews.llvm.org/D36395