[llvm] r177421 - Optimize sext <4 x i8> and <4 x i16> to <4 x i64>.

Muhammad Tauqir Ahmad mtahmed at uwaterloo.ca
Wed Mar 20 09:38:54 PDT 2013


I noticed that this particular pattern could be optimized further
(building on Elena's patch for these sext patterns), so I decided to do it.
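
For readers following the thread, here is a minimal scalar sketch of why the
fold in the patch, (sext (vzext x)) -> (vsext x), should be safe. It models a
single lane only: sign-extending "in register" a zero-extended value with
Shift-Left + Shift-Right gives the same result as sign-extending the narrow
value directly. The function names are hypothetical and are not part of the
patch or of LLVM; the code assumes two's-complement conversion and arithmetic
right shift of signed values (guaranteed since C++20, and the behaviour of
mainstream compilers before that).

```cpp
#include <cassert>
#include <cstdint>

// Fallback path from the patch, modelled on one 64-bit lane: the narrow value
// has already been zero-extended, so move its sign bit to bit 63 and shift it
// back arithmetically (the VSHLI + VSRAI pair).
inline int64_t sext_inreg_via_shifts(uint64_t zext_value, unsigned narrow_bits) {
  const unsigned shift = 64 - narrow_bits;
  // The left shift is done on an unsigned value, so it is well defined.
  return static_cast<int64_t>(zext_value << shift) >> shift;
}

// Direct path, modelled on one lane: a plain sign extension (the VSEXT node).
inline int64_t sext_direct_i8(int8_t x) { return static_cast<int64_t>(x); }

// Exhaustively compare both paths for every 8-bit input.
inline bool fold_holds_for_all_i8() {
  for (int v = -128; v <= 127; ++v) {
    const int8_t x = static_cast<int8_t>(v);
    const uint64_t z = static_cast<uint8_t>(x);  // zero extension of x
    if (sext_inreg_via_shifts(z, 8) != sext_direct_i8(x))
      return false;
  }
  return true;
}
```

Calling fold_holds_for_all_i8() checks all 256 inputs. The same argument
applies per lane to the <4 x i16> case with narrow_bits set to 16.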

Muhammad Tauqir Ahmad
----------------------------------------------------
Candidate for Honours Computer Science
Combinatorics and Optimizations Minor
University of Waterloo


On Wed, Mar 20, 2013 at 12:01 PM, Nadav Rotem <nrotem at apple.com> wrote:
> Hi Jan,
>
> The IR may contain <4 x i8> to <4 x i64> sext conversions. This patch
> optimizes it from 8 cycles to 6. I am not sure why Muhammad is interested in
> this pattern.
>
> Nadav
>
>
> On Mar 20, 2013, at 6:57 AM, Jan Sjodin <jan_sjodin at yahoo.com> wrote:
>
> Is there a reason to expand it to <4 x i64> instead of <4 x i32>, and if so,
> shouldn't <4 x i32> be expanded as well? Would it be equally good to expand
> to <4 x i32> since not all processors have 256-bit registers?
>
>
> - Jan
>
>
> ________________________________
> From: Nadav Rotem <nrotem at apple.com>
> To: llvm-commits at cs.uiuc.edu
> Sent: Tuesday, March 19, 2013 2:38 PM
> Subject: [llvm] r177421 - Optimize sext <4 x i8> and <4 x i16> to <4 x i64>.
>
> Author: nadav
> Date: Tue Mar 19 13:38:27 2013
> New Revision: 177421
>
> URL: http://llvm.org/viewvc/llvm-project?rev=177421&view=rev
> Log:
> Optimize sext <4 x i8> and <4 x i16> to <4 x i64>.
> Patch by Ahmad, Muhammad T <muhammad.t.ahmad at intel.com>
>
>
> Modified:
>     llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
>     llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp
>     llvm/trunk/test/Analysis/CostModel/X86/cast.ll
>     llvm/trunk/test/CodeGen/X86/avx-sext.ll
>
> Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=177421&r1=177420&r2=177421&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Tue Mar 19 13:38:27 2013
> @@ -11827,8 +11827,23 @@ SDValue X86TargetLowering::LowerSIGN_EXT
>        // fall through
>      case MVT::v4i32:
>      case MVT::v8i16: {
> -      SDValue Tmp1 = getTargetVShiftNode(X86ISD::VSHLI, dl, VT,
> -                                         Op.getOperand(0), ShAmt, DAG);
> +      // (sext (vzext x)) -> (vsext x)
> +      SDValue Op0 = Op.getOperand(0);
> +      SDValue Op00 = Op0.getOperand(0);
> +      SDValue Tmp1;
> +      // Hopefully, this VECTOR_SHUFFLE is just a VZEXT.
> +      if (Op0.getOpcode() == ISD::BITCAST &&
> +          Op00.getOpcode() == ISD::VECTOR_SHUFFLE)
> +        Tmp1 = LowerVectorIntExtend(Op00, DAG);
> +      if (Tmp1.getNode()) {
> +        SDValue Tmp1Op0 = Tmp1.getOperand(0);
> +        assert(Tmp1Op0.getOpcode() == X86ISD::VZEXT &&
> +               "This optimization is invalid without a VZEXT.");
> +        return DAG.getNode(X86ISD::VSEXT, dl, VT, Tmp1Op0.getOperand(0));
> +      }
> +
> +      // If the above didn't work, then just use Shift-Left + Shift-Right.
> +      Tmp1 = getTargetVShiftNode(X86ISD::VSHLI, dl, VT, Op0, ShAmt, DAG);
>        return getTargetVShiftNode(X86ISD::VSRAI, dl, VT, Tmp1, ShAmt, DAG);
>      }
>    }
>
> Modified: llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp?rev=177421&r1=177420&r2=177421&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp Tue Mar 19 13:38:27 2013
> @@ -257,8 +257,8 @@ unsigned X86TTI::getCastInstrCost(unsign
>      { ISD::ZERO_EXTEND, MVT::v8i32, MVT::v8i1,  6 },
>      { ISD::SIGN_EXTEND, MVT::v8i32, MVT::v8i1,  9 },
>      { ISD::SIGN_EXTEND, MVT::v4i64, MVT::v4i1,  8 },
> -    { ISD::SIGN_EXTEND, MVT::v4i64, MVT::v4i8,  8 },
> -    { ISD::SIGN_EXTEND, MVT::v4i64, MVT::v4i16, 8 },
> +    { ISD::SIGN_EXTEND, MVT::v4i64, MVT::v4i8,  6 },
> +    { ISD::SIGN_EXTEND, MVT::v4i64, MVT::v4i16, 6 },
>      { ISD::TRUNCATE,    MVT::v8i32, MVT::v8i64, 3 },
>    };
>
>
> Modified: llvm/trunk/test/Analysis/CostModel/X86/cast.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/CostModel/X86/cast.ll?rev=177421&r1=177420&r2=177421&view=diff
> ==============================================================================
> --- llvm/trunk/test/Analysis/CostModel/X86/cast.ll (original)
> +++ llvm/trunk/test/Analysis/CostModel/X86/cast.ll Tue Mar 19 13:38:27 2013
> @@ -44,9 +44,9 @@ define i32 @zext_sext(<8 x i1> %in) {
>    %B = zext <8 x i16> undef to <8 x i32>
>    ;CHECK: cost of 1 {{.*}} sext
>    %C = sext <4 x i32> undef to <4 x i64>
> -  ;CHECK: cost of 8 {{.*}} sext
> +  ;CHECK: cost of 6 {{.*}} sext
>    %C1 = sext <4 x i8> undef to <4 x i64>
> -  ;CHECK: cost of 8 {{.*}} sext
> +  ;CHECK: cost of 6 {{.*}} sext
>    %C2 = sext <4 x i16> undef to <4 x i64>
>
>    ;CHECK: cost of 1 {{.*}} zext
>
> Modified: llvm/trunk/test/CodeGen/X86/avx-sext.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-sext.ll?rev=177421&r1=177420&r2=177421&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/avx-sext.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/avx-sext.ll Tue Mar 19 13:38:27 2013
> @@ -165,3 +165,24 @@ define <4 x i64> @sext_4i8_to_4i64(<4 x
>    ret <4 x i64> %extmask
> }
>
> +; AVX: sext_4i8_to_4i64
> +; AVX: vpmovsxbd
> +; AVX: vpmovsxdq
> +; AVX: vpmovsxdq
> +; AVX: ret
> +define <4 x i64> @load_sext_4i8_to_4i64(<4 x i8> *%ptr) {
> + %X = load <4 x i8>* %ptr
> + %Y = sext <4 x i8> %X to <4 x i64>
> + ret <4 x i64>%Y
> +}
> +
> +; AVX: sext_4i16_to_4i64
> +; AVX: vpmovsxwd
> +; AVX: vpmovsxdq
> +; AVX: vpmovsxdq
> +; AVX: ret
> +define <4 x i64> @load_sext_4i16_to_4i64(<4 x i16> *%ptr) {
> + %X = load <4 x i16>* %ptr
> + %Y = sext <4 x i16> %X to <4 x i64>
> + ret <4 x i64>%Y
> +}
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits


