[PATCH] [X86][AVX] Fix wrong lowering of VPERM2X128 nodes.

Michael Kuperstein michael.m.kuperstein at intel.com
Sun Mar 8 04:48:03 PDT 2015


LGTM, except one thing I'm not sure about (see comment).


================
Comment at: lib/Target/X86/X86ISelLowering.cpp:9090
@@ -9089,2 +9089,3 @@
   // FIXME: Detect zero-vector inputs and use the VPERM2X128 to zero that half.
-  unsigned PermMask = Mask[0] / 2 | (Mask[2] / 2) << 4;
+  int MaskLO = Mask[0] == SM_SentinelUndef ? Mask[1] : Mask[0];
+  int MaskHI = Mask[2] == SM_SentinelUndef ? Mask[3] : Mask[2];
----------------
Can we end up with both mask elements being 'u'? E.g. <u, u, 0, 1>?
Or will all these cases be caught by different code paths?

Not that it really matters in practice: even if such a mask ends up here, we'll just get -1 / 2 == 0 as the VPERM mask. But I don't think we want the mask to depend on the numerical value of SM_SentinelUndef.
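
To make the concern concrete, here is a minimal standalone sketch (not the code under review) of one way to handle an all-undef half explicitly, instead of relying on the sentinel dividing down to 0. The names kUndef, pickDefined and buildPermMask are hypothetical; kUndef stands in for SM_SentinelUndef and is assumed to be -1, as the arithmetic above implies.

```cpp
#include <cassert>

// Hypothetical stand-in for SM_SentinelUndef (assumed to be -1).
static const int kUndef = -1;

// C++11 integer division truncates toward zero, which is why an undef
// element currently happens to select lane 0.
static_assert(-1 / 2 == 0, "division truncates toward zero");

// Return the first defined element of a pair, or kUndef if both are undef.
static int pickDefined(int A, int B) { return A == kUndef ? B : A; }

// Build the two-lane immediate for a 4-element mask over two inputs
// (indices 0..7). An all-undef half is mapped to lane 0 explicitly, so the
// result does not depend on the numerical value of the sentinel.
static unsigned buildPermMask(const int Mask[4]) {
  for (int I = 0; I < 4; ++I)
    assert(Mask[I] >= kUndef && Mask[I] < 8 && "mask element out of range");

  int MaskLO = pickDefined(Mask[0], Mask[1]);
  int MaskHI = pickDefined(Mask[2], Mask[3]);

  unsigned Lo = MaskLO == kUndef ? 0u : static_cast<unsigned>(MaskLO / 2);
  unsigned Hi = MaskHI == kUndef ? 0u : static_cast<unsigned>(MaskHI / 2);
  return Lo | (Hi << 4);
}
```

With this sketch, <u, u, 0, 1> yields 0x00 and <2, 3, u, 5> yields 0x21. Whether lane 0 is the best choice for an all-undef half is a separate question; the point is only that the choice is made explicitly.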

http://reviews.llvm.org/D8119
