[PATCH] [X86][AVX] Fix wrong lowering of VPERM2X128 nodes.
Michael Kuperstein
michael.m.kuperstein at intel.com
Sun Mar 8 04:48:03 PDT 2015
LGTM, except one thing I'm not sure about (see comment).
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:9090
@@ -9089,2 +9089,3 @@
// FIXME: Detect zero-vector inputs and use the VPERM2X128 to zero that half.
- unsigned PermMask = Mask[0] / 2 | (Mask[2] / 2) << 4;
+ int MaskLO = Mask[0] == SM_SentinelUndef ? Mask[1] : Mask[0];
+ int MaskHI = Mask[2] == SM_SentinelUndef ? Mask[3] : Mask[2];
----------------
Can we end up with both mask elements being 'u'? E.g. <u, u, 0, 1>?
Or will all these cases be caught by different code paths?
Not that it really matters in practice, since even if such a mask gets here we'll just end up with -1 / 2 == 0 as the VPERM mask, but I don't think we want the mask to depend on the numerical value of SM_SentinelUndef.
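To make the concern concrete, here is a minimal standalone sketch (not the code in the patch) of how the immediate could be computed so that an all-undef half gets an explicit 0 instead of relying on -1 / 2 == 0. The names kSentinelUndef, pickLaneElt and computePerm2X128Mask are hypothetical, and the sentinel is assumed to be -1 as in the arithmetic above.

```cpp
#include <array>
#include <cassert>

// Stand-in for SM_SentinelUndef; assumed to be -1 for this sketch.
constexpr int kSentinelUndef = -1;

// Hypothetical helper: return the first defined element of a mask half,
// or 0 when both elements are undef, so the result never depends on the
// numerical value of the sentinel.
int pickLaneElt(int First, int Second) {
  if (First != kSentinelUndef)
    return First;
  if (Second != kSentinelUndef)
    return Second;
  return 0; // Both undef: any lane is acceptable, choose lane 0 explicitly.
}

// Hypothetical helper: build the immediate from a 4-element shuffle mask
// whose elements index into two concatenated 4-element inputs (0..7).
// Dividing an element index by 2 gives the 128-bit lane it comes from.
unsigned computePerm2X128Mask(const std::array<int, 4> &Mask) {
  for (int M : Mask)
    assert(M >= kSentinelUndef && M < 8 && "mask element out of range");
  int MaskLO = pickLaneElt(Mask[0], Mask[1]);
  int MaskHI = pickLaneElt(Mask[2], Mask[3]);
  return static_cast<unsigned>(MaskLO / 2) |
         (static_cast<unsigned>(MaskHI / 2) << 4);
}
```

With this shape, <u, u, 0, 1> yields the same immediate as <0, 1, 0, 1>, and that would still hold if the sentinel were ever given a different value.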
http://reviews.llvm.org/D8119