[llvm] r177421 - Optimize sext <4 x i8> and <4 x i16> to <4 x i64>.
Nadav Rotem
nrotem at apple.com
Wed Mar 20 09:01:50 PDT 2013
Hi Jan,
The IR may contain sext conversions from <4 x i8> to <4 x i64>. This patch reduces the cost of that conversion from 8 cycles to 6. I am not sure why Muhammad is interested in this pattern.
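
For illustration only, here is a rough scalar sketch of why the rewrite in the patch is sound. For a single i8 lane, the shift-left + arithmetic-shift-right fallback and a direct sign extension give the same i64 value. The function names are hypothetical and not part of LLVM. The real lowering works on whole vectors through X86ISD::VSEXT rather than on scalars.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): sign-extend one i8 lane to i64 the
// way the fallback path does it: zero-extend, shift left, then shift right
// arithmetically by the same amount.
int64_t sextViaShifts(uint8_t x) {
  uint64_t z = x;          // zero-extend i8 -> i64
  uint64_t shl = z << 56;  // move the i8 sign bit into bit 63
  // Assumption: right shift of a negative signed value is arithmetic. This is
  // guaranteed since C++20 and is what mainstream compilers do before that.
  return static_cast<int64_t>(shl) >> 56;
}

// Hypothetical helper: sign-extend the lane directly, which is what a single
// vsext-style instruction computes per element.
int64_t sextDirect(uint8_t x) {
  // Assumption: two's complement narrowing (guaranteed since C++20).
  return static_cast<int64_t>(static_cast<int8_t>(x));
}

// Check every possible i8 bit pattern.
bool allAgree() {
  for (int v = 0; v < 256; ++v) {
    uint8_t x = static_cast<uint8_t>(v);
    if (sextViaShifts(x) != sextDirect(x))
      return false;
  }
  return true;
}
```

Since both forms agree on every input, replacing the zext + shift pair with a direct sext changes only the cost, not the result.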
Nadav
On Mar 20, 2013, at 6:57 AM, Jan Sjodin <jan_sjodin at yahoo.com> wrote:
> Is there a reason to expand it to <4 x i64> instead of <4 x i32>, and if so, shouldn't <4 x i32> be expanded as well? Would it be equally good to expand to <4 x i32> since not all processors have 256-bit registers?
>
>
> - Jan
>
>
>> ________________________________
>> From: Nadav Rotem <nrotem at apple.com>
>> To: llvm-commits at cs.uiuc.edu
>> Sent: Tuesday, March 19, 2013 2:38 PM
>> Subject: [llvm] r177421 - Optimize sext <4 x i8> and <4 x i16> to <4 x i64>.
>>
>> Author: nadav
>> Date: Tue Mar 19 13:38:27 2013
>> New Revision: 177421
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=177421&view=rev
>> Log:
>> Optimize sext <4 x i8> and <4 x i16> to <4 x i64>.
>> Patch by Ahmad, Muhammad T <muhammad.t.ahmad at intel.com>
>>
>>
>> Modified:
>> llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
>> llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp
>> llvm/trunk/test/Analysis/CostModel/X86/cast.ll
>> llvm/trunk/test/CodeGen/X86/avx-sext.ll
>>
>> Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=177421&r1=177420&r2=177421&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
>> +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Tue Mar 19 13:38:27 2013
>> @@ -11827,8 +11827,23 @@ SDValue X86TargetLowering::LowerSIGN_EXT
>> // fall through
>> case MVT::v4i32:
>> case MVT::v8i16: {
>> - SDValue Tmp1 = getTargetVShiftNode(X86ISD::VSHLI, dl, VT,
>> - Op.getOperand(0), ShAmt, DAG);
>> + // (sext (vzext x)) -> (vsext x)
>> + SDValue Op0 = Op.getOperand(0);
>> + SDValue Op00 = Op0.getOperand(0);
>> + SDValue Tmp1;
>> + // Hopefully, this VECTOR_SHUFFLE is just a VZEXT.
>> + if (Op0.getOpcode() == ISD::BITCAST &&
>> + Op00.getOpcode() == ISD::VECTOR_SHUFFLE)
>> + Tmp1 = LowerVectorIntExtend(Op00, DAG);
>> + if (Tmp1.getNode()) {
>> + SDValue Tmp1Op0 = Tmp1.getOperand(0);
>> + assert(Tmp1Op0.getOpcode() == X86ISD::VZEXT &&
>> + "This optimization is invalid without a VZEXT.");
>> + return DAG.getNode(X86ISD::VSEXT, dl, VT, Tmp1Op0.getOperand(0));
>> + }
>> +
>> + // If the above didn't work, then just use Shift-Left + Shift-Right.
>> + Tmp1 = getTargetVShiftNode(X86ISD::VSHLI, dl, VT, Op0, ShAmt, DAG);
>> return getTargetVShiftNode(X86ISD::VSRAI, dl, VT, Tmp1, ShAmt, DAG);
>> }
>> }
>>
>> Modified: llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp?rev=177421&r1=177420&r2=177421&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp (original)
>> +++ llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp Tue Mar 19 13:38:27 2013
>> @@ -257,8 +257,8 @@ unsigned X86TTI::getCastInstrCost(unsign
>> { ISD::ZERO_EXTEND, MVT::v8i32, MVT::v8i1, 6 },
>> { ISD::SIGN_EXTEND, MVT::v8i32, MVT::v8i1, 9 },
>> { ISD::SIGN_EXTEND, MVT::v4i64, MVT::v4i1, 8 },
>> - { ISD::SIGN_EXTEND, MVT::v4i64, MVT::v4i8, 8 },
>> - { ISD::SIGN_EXTEND, MVT::v4i64, MVT::v4i16, 8 },
>> + { ISD::SIGN_EXTEND, MVT::v4i64, MVT::v4i8, 6 },
>> + { ISD::SIGN_EXTEND, MVT::v4i64, MVT::v4i16, 6 },
>> { ISD::TRUNCATE, MVT::v8i32, MVT::v8i64, 3 },
>> };
>>
>>
>> Modified: llvm/trunk/test/Analysis/CostModel/X86/cast.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/CostModel/X86/cast.ll?rev=177421&r1=177420&r2=177421&view=diff
>> ==============================================================================
>> --- llvm/trunk/test/Analysis/CostModel/X86/cast.ll (original)
>> +++ llvm/trunk/test/Analysis/CostModel/X86/cast.ll Tue Mar 19 13:38:27 2013
>> @@ -44,9 +44,9 @@ define i32 @zext_sext(<8 x i1> %in) {
>> %B = zext <8 x i16> undef to <8 x i32>
>> ;CHECK: cost of 1 {{.*}} sext
>> %C = sext <4 x i32> undef to <4 x i64>
>> - ;CHECK: cost of 8 {{.*}} sext
>> + ;CHECK: cost of 6 {{.*}} sext
>> %C1 = sext <4 x i8> undef to <4 x i64>
>> - ;CHECK: cost of 8 {{.*}} sext
>> + ;CHECK: cost of 6 {{.*}} sext
>> %C2 = sext <4 x i16> undef to <4 x i64>
>>
>> ;CHECK: cost of 1 {{.*}} zext
>>
>> Modified: llvm/trunk/test/CodeGen/X86/avx-sext.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-sext.ll?rev=177421&r1=177420&r2=177421&view=diff
>> ==============================================================================
>> --- llvm/trunk/test/CodeGen/X86/avx-sext.ll (original)
>> +++ llvm/trunk/test/CodeGen/X86/avx-sext.ll Tue Mar 19 13:38:27 2013
>> @@ -165,3 +165,24 @@ define <4 x i64> @sext_4i8_to_4i64(<4 x
>> ret <4 x i64> %extmask
>> }
>>
>> +; AVX: sext_4i8_to_4i64
>> +; AVX: vpmovsxbd
>> +; AVX: vpmovsxdq
>> +; AVX: vpmovsxdq
>> +; AVX: ret
>> +define <4 x i64> @load_sext_4i8_to_4i64(<4 x i8> *%ptr) {
>> + %X = load <4 x i8>* %ptr
>> + %Y = sext <4 x i8> %X to <4 x i64>
>> + ret <4 x i64>%Y
>> +}
>> +
>> +; AVX: sext_4i16_to_4i64
>> +; AVX: vpmovsxwd
>> +; AVX: vpmovsxdq
>> +; AVX: vpmovsxdq
>> +; AVX: ret
>> +define <4 x i64> @load_sext_4i16_to_4i64(<4 x i16> *%ptr) {
>> + %X = load <4 x i16>* %ptr
>> + %Y = sext <4 x i16> %X to <4 x i64>
>> + ret <4 x i64>%Y
>> +}
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits