[PATCH] Adjust the cost of vectorized SHL/SRL/SRA

Nadav Rotem nrotem at apple.com
Thu May 21 16:18:47 PDT 2015


Looks good to me assuming that you write the cost model tests, and that you write tests for the ISel changes.


> On May 21, 2015, at 4:15 PM, Wei Mi <wmi at google.com> wrote:
> 
> Hi nadav, aschwaighofer,
> 
> The patch is to solve the problem in https://llvm.org/bugs/show_bug.cgi?id=23582. It adjusts the cost of vectorized SHL/SRL/SRA and makes sure they are lowered to vectorized shift instruction.
> 
> There are a bunch of testcases needed to be adjusted. send out the patch first to see if it is ok generally. Will update the patch with adjusted tests.
> 
> REPOSITORY
>  rL LLVM
> 
> http://reviews.llvm.org/D9923
> 
> Files:
>  lib/Target/X86/X86ISelLowering.cpp
>  lib/Target/X86/X86TargetTransformInfo.cpp
> 
> Index: lib/Target/X86/X86ISelLowering.cpp
> ===================================================================
> --- lib/Target/X86/X86ISelLowering.cpp
> +++ lib/Target/X86/X86ISelLowering.cpp
> @@ -16402,6 +16402,23 @@
>   SDValue R = Op.getOperand(0);
>   SDValue Amt = Op.getOperand(1);
> 
> +  if (Subtarget->hasSSE2() &&
> +      (VT == MVT::v8i16 || VT == MVT::v4i32 || VT == MVT::v2i16) &&
> +      (VT != MVT::v2i64 || Op.getOpcode() != ISD::SRA)) {
> +    assert((VT == R.getSimpleValueType() && VT == Amt.getSimpleValueType()) &&
> +           "Unexpected operand type");
> +    switch (Op.getOpcode()) {
> +    default:
> +      llvm_unreachable("Unknown shift opcode!");
> +    case ISD::SHL:
> +      return DAG.getNode(X86ISD::VSHL, dl, VT, R, Op.getOperand(1));
> +    case ISD::SRL:
> +      return DAG.getNode(X86ISD::VSRL, dl, VT, R, Op.getOperand(1));
> +    case ISD::SRA:
> +      return DAG.getNode(X86ISD::VSRA, dl, VT, R, Op.getOperand(1));
> +    }
> +  }
> +
>   if ((VT == MVT::v2i64 && Op.getOpcode() != ISD::SRA) ||
>       VT == MVT::v4i32 || VT == MVT::v8i16 ||
>       (Subtarget->hasInt256() &&
> Index: lib/Target/X86/X86TargetTransformInfo.cpp
> ===================================================================
> --- lib/Target/X86/X86TargetTransformInfo.cpp
> +++ lib/Target/X86/X86TargetTransformInfo.cpp
> @@ -248,19 +248,19 @@
>     // used for vectorization and we don't want to make vectorized code worse
>     // than scalar code.
>     { ISD::SHL,  MVT::v16i8,  30 }, // cmpeqb sequence.
> -    { ISD::SHL,  MVT::v8i16,  8*10 }, // Scalarized.
> -    { ISD::SHL,  MVT::v4i32,  2*5 }, // We optimized this using mul.
> -    { ISD::SHL,  MVT::v2i64,  2*10 }, // Scalarized.
> +    { ISD::SHL,  MVT::v8i16,  1 },
> +    { ISD::SHL,  MVT::v4i32,  1 },
> +    { ISD::SHL,  MVT::v2i64,  1 },
>     { ISD::SHL,  MVT::v4i64,  4*10 }, // Scalarized.
> 
>     { ISD::SRL,  MVT::v16i8,  16*10 }, // Scalarized.
> -    { ISD::SRL,  MVT::v8i16,  8*10 }, // Scalarized.
> -    { ISD::SRL,  MVT::v4i32,  4*10 }, // Scalarized.
> -    { ISD::SRL,  MVT::v2i64,  2*10 }, // Scalarized.
> +    { ISD::SRL,  MVT::v8i16,  1 },
> +    { ISD::SRL,  MVT::v4i32,  1 },
> +    { ISD::SRL,  MVT::v2i64,  1 },
> 
>     { ISD::SRA,  MVT::v16i8,  16*10 }, // Scalarized.
> -    { ISD::SRA,  MVT::v8i16,  8*10 }, // Scalarized.
> -    { ISD::SRA,  MVT::v4i32,  4*10 }, // Scalarized.
> +    { ISD::SRA,  MVT::v8i16,  1 },
> +    { ISD::SRA,  MVT::v4i32,  1 },
>     { ISD::SRA,  MVT::v2i64,  2*10 }, // Scalarized.
> 
>     // It is not a good idea to vectorize division. We have to scalarize it and
> 
> EMAIL PREFERENCES
>  http://reviews.llvm.org/settings/panel/emailpreferences/
> <D9923.26282.patch>





More information about the llvm-commits mailing list