[llvm-commits] [llvm] r145487 - /llvm/trunk/lib/Target/X86/X86ISelLowering.cpp

Duncan Sands baldrick at free.fr
Wed Nov 30 01:47:37 PST 2011


Hi Craig,

> Add instruction selection support for AVX2 horizontal add/sub instructions.

thanks for doing this.  Please add a testcase.

> --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Nov 30 03:10:50 2011
> @@ -14329,7 +14329,9 @@
>       return false;
>
>     EVT VT = LHS.getValueType();
> -  unsigned N = VT.getVectorNumElements();
> +  unsigned NumElts = VT.getVectorNumElements();
> +  unsigned NumLanes = VT.getSizeInBits()/128;

Here you assume that the number of bits is always a multiple of 128.  While this
is currently true (and probably always will be) how about adding an assertion
checking this?

> +  for (unsigned l = 0; l != NumLanes; ++l) {
> +    unsigned LaneStart = l*NumLaneElts;
> +    for (unsigned i = 0; i != NumLaneElts/2; ++i) {

Here you assume that NumLaneElts is even, in particular that it is not 1.  Is it
really impossible for it to be equal to 1?  If it is impossible, please add an
assertion that checks that.  If it is possible, please handle it!

> +      unsigned LIdx = LMask[i+LaneStart];
> +      unsigned RIdx = RMask[i+LaneStart];
> +
> +      // Ignore any UNDEF components.
> +      if (LIdx>= 2*NumElts || RIdx>= 2*NumElts ||
> +          (!A.getNode()&&  (LIdx<  NumElts || RIdx<  NumElts)) ||
> +          (!B.getNode()&&  (LIdx>= NumElts || RIdx>= NumElts)))
> +        continue;
> +
> +      // Check that successive elements are being operated on.  If not, this is
> +      // not a horizontal operation.
> +      if (!(LIdx == 2*i + LaneStart&&  RIdx == 2*i + LaneStart + 1)&&
> +          !(isCommutative&&  LIdx == 2*i + LaneStart + 1&&  RIdx == 2*i + LaneStart))
> +        return false;
> +    }

This loop and the one below are essentially doing the same thing.  How about
unifying them, but introducing an additional loop that loops over the values
0 and 1:
   for (unsigned h = 0; h != 2; ++h)
     for (unsigned i = 0; i != NumLaneElts/2; ++i) {
       unsigned LIdx = LMask[i+LaneStart+h*(NumLaneElts/2)];
and so on.

Ciao, Duncan.

PS: I didn't check the correctness of your multi-lane logic.



More information about the llvm-commits mailing list