[llvm-commits] [PATCH] ARM: Allow long extensions to combine into vmull

Pete Couperus pjcoup at gmail.com
Mon Nov 12 22:01:55 PST 2012


Hi David,

Thanks for taking a look.
My apologies, I hadn't seen your patch when I'd written this up.
And, you're right, I neglected the v2i8 case.
I originally had logic in for handling long sign/zero extension as well,
but I couldn't trigger that with any of the basic test cases I thought of.
Curiously, the unsigned/long zero extension cases work in 3.1, but the
codegen for unsigned v2i16->v2i64 followed by mul is substantially worse
than for the signed case.
All the test cases that you sent pass with this new patch (allowing the
v2i8->v2i64 extension).
Perhaps someone else wants to take a look :).

Pete


On Mon, Nov 12, 2012 at 10:00 AM, David Peixotto <dpeixott at codeaurora.org>wrote:

> Hi Pete,****
>
> ** **
>
> I sent a patch for this a while back, but never got any feedback and
> failed to push it all the way through. I’ve attached my patch for reference.
> ****
>
> ** **
>
> It looks like the patches are fairly similar in their intent. I have a few
> more test cases, so please include those in your patch as well. It would
> also be good to include some of the comments, particularly the
> SkipExtension comment that explains the motivation for that function.****
>
> ** **
>
> The main differences between the patches that I see are:****
>
> ** **
>
> **1)      **Handling a sign extend that is not large enough****
>
> **2)      **Handling v2i8 vector size****
>
> ** **
>
> I’ll expand on these points below.****
>
> ** **
>
> Index: lib/Target/ARM/ARMISelLowering.cpp****
>
> ===================================================================****
>
> --- lib/Target/ARM/ARMISelLowering.cpp            (revision 167684)****
>
> +++ lib/Target/ARM/ARMISelLowering.cpp         (working copy)****
>
> @@ -4922,11 +4922,30 @@****
>
> static SDValue SkipExtension(SDNode *N, SelectionDAG &DAG) {****
>
>    if (N->getOpcode() == ISD::SIGN_EXTEND || N->getOpcode() ==
> ISD::ZERO_EXTEND)****
>
>      return N->getOperand(0);****
>
> ** **
>
> What it N is a small vector here? Don’t we need to make sure that we have
> a sign/zero extend that will extend the size to 64-bit vector?****
>
> ** **
>
> -  if (LoadSDNode *LD = dyn_cast<LoadSDNode>(N))****
>
> -    return DAG.getLoad(LD->getMemoryVT(), N->getDebugLoc(),
> LD->getChain(),****
>
> -                       LD->getBasePtr(), LD->getPointerInfo(),
> LD->isVolatile(),****
>
> -                       LD->isNonTemporal(), LD->isInvariant(),****
>
> -                       LD->getAlignment());****
>
> +****
>
> +  if (LoadSDNode *LD = dyn_cast<LoadSDNode>(N)) {****
>
> +    SDValue BareLD = DAG.getLoad(LD->getMemoryVT(), N->getDebugLoc(),****
>
> +                                LD->getChain(), LD->getBasePtr(),****
>
> +                                LD->getPointerInfo(), LD->isVolatile(),**
> **
>
> +                                LD->isNonTemporal(), LD->isInvariant(),**
> **
>
> +                                LD->getAlignment());****
>
> +    if (BareLD.getValueType().is64BitVector())****
>
> +      return BareLD;****
>
> +****
>
> +    // This is a long extension, v4i8 -> v4i32 or v2i16 -> v2i64****
>
> +    MVT ExtVT;****
>
> +    if (LD->getMemoryVT() == MVT::v4i8)****
>
> +      ExtVT = MVT::v4i16;****
>
> +    else if (LD->getMemoryVT() == MVT::v2i16)****
>
> +      ExtVT = MVT::v2i32;****
>
> ** **
>
> What about MVT::v2i8. We can handle that case similar to v2i16 by
> extending the size to v2i32.****
>
> ** **
>
> +****
>
> +    if (LD->getExtensionType() == ISD::SEXTLOAD)****
>
> +      return DAG.getNode(ISD::SIGN_EXTEND, BareLD.getDebugLoc(), ExtVT,
> BareLD);****
>
> +    else if (LD->getExtensionType() == ISD::ZEXTLOAD)****
>
> +      return DAG.getNode(ISD::ZERO_EXTEND, BareLD.getDebugLoc(), ExtVT,
> BareLD);****
>
> +****
>
> +    llvm_unreachable("Unexpected extended load type for SkipExtension.");
> ****
>
> +  }****
>
>    // Otherwise, the value must be a BUILD_VECTOR.  For v2i64, it will****
>
>    // have been legalized as a BITCAST from v4i32.****
>
>    if (N->getOpcode() == ISD::BITCAST) {****
>
> ** **
>
> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation****
>
> ** **
>
> ** **
>
> *From:* llvm-commits-bounces at cs.uiuc.edu [mailto:
> llvm-commits-bounces at cs.uiuc.edu] *On Behalf Of *Pete Couperus
> *Sent:* Saturday, November 10, 2012 11:48 PM
> *To:* llvm-commits at cs.uiuc.edu
> *Subject:* [llvm-commits] [PATCH] ARM: Allow long extensions to combine
> into vmull****
>
> ** **
>
> Hello,
>
> The ARM backend currently crashes when trying to combine:
>         %tmp1 = load <4 x i8>* %A
>         %tmp2 = load <4 x i8>* %B
>         %tmp3 = sext <4 x i8> %tmp1 to <4 x i32>
>         %tmp4 = sext <4 x i8> %tmp2 to <4 x i32>
>         %tmp5 = mul <4 x i32> %tmp3, %tmp4
>
> into a vmull.  This was reported as PR 12281.
> This patch fixes this problem.
> Please review!
> Thanks,
>
> Pete****
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20121112/1f185102/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: longext_vmull_redux.diff
Type: application/octet-stream
Size: 5012 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20121112/1f185102/attachment.obj>


More information about the llvm-commits mailing list