[llvm-commits] [PATCH] ARM: Allow long extensions to combine into vmull

Pete Couperus pjcoup at gmail.com
Wed Nov 14 22:20:40 PST 2012


Hi David,

This mostly looks fine to me.
Minor comments:
1. The body in your SkipLoadExtension is overindented (4 spaces, should be
2).
2. I'd reverse the conditional in AddRequiredExtension from:
  if (OrigTy.getSizeInBits() < 64) {
    [stuff]
  }
  return N;
to:
  if (OrigTy.is64BitVector())
    return N;
  [stuff]

Maybe 2 is personal preference, but I think it reads easier this way.
Otherwise, I think this is good.

Pete

On Wed, Nov 14, 2012 at 5:06 PM, David Peixotto <dpeixott at codeaurora.org>wrote:

> Hi Pete,****
>
> ** **
>
> As long as we are fixing it I think it makes sense to handle both cases
> (bare sign/zero extend and sign/zero load extend). If you have no
> objections, let’s move ahead with my patch since it is already factored
> that way.****
>
> ** **
>
> I’ve attached an updated version that I rebased against the current head
> (contents are the same as my original patch). Please take a look and let me
> know what you think and we can work together to get it merged.****
>
> ** **
>
> -David****
>
> ** **
>
> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation****
>
> ** **
>
> ** **
>
> *From:* Pete Couperus [mailto:pjcoup at gmail.com]
> *Sent:* Tuesday, November 13, 2012 10:09 PM
>
> *To:* David Peixotto
> *Cc:* llvm-commits at cs.uiuc.edu
> *Subject:* Re: [llvm-commits] [PATCH] ARM: Allow long extensions to
> combine into vmull****
>
> ** **
>
> Hi David,
>
> I had basically aimed to minimize the changes, to address one problem.  I
> agree with you, and think there are potential problems in this area...zext
> vs. sext, and SIGN/ZERO_EXTEND vs. [SZ]EXTLOAD.  Without the long sign/zero
> extension change, we'll still trigger the outer assertion...but just
> writing that makes me feel that we might as well put that logic in as
> well.  Since that is essentially your patch, want to have another go?  Or,
> would you like me to fold it into mine?  I'll try and be pleasantly
> persistent to get either of them in :).
>
> Pete
>
> ****
>
> On Tue, Nov 13, 2012 at 1:33 PM, David Peixotto <dpeixott at codeaurora.org>
> wrote:****
>
> Hi Pete,****
>
>  ****
>
> Thanks for updating your patch. ****
>
>  ****
>
> I don’t think I was able to trigger the long sign/zero extension with a
> simple test case either, although I don’t know of a reason why it would not
> be possible to trigger that path. If we don’t want to put the full logic in
> to support the possibility how about we add a comment and an assertion to
> verify that the underlying node is a 64-bit vector. That way if our
> assumption is wrong we can quickly identify and fix the issue.****
>
>  ****
>
> -David****
>
>  ****
>
> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation****
>
>  ****
>
>  ****
>
> *From:* Pete Couperus [mailto:pjcoup at gmail.com]
> *Sent:* Monday, November 12, 2012 10:02 PM
> *To:* David Peixotto
> *Cc:* llvm-commits at cs.uiuc.edu
> *Subject:* Re: [llvm-commits] [PATCH] ARM: Allow long extensions to
> combine into vmull****
>
>  ****
>
> Hi David,
>
> Thanks for taking a look.
> My apologies, I hadn't seen your patch when I'd written this up.
> And, you're right, I neglected the v2i8 case.
> I originally had logic in for handling long sign/zero extension as well,
> but I couldn't trigger that with any of the basic test cases I thought of.
> Curiously, the unsigned/long zero extension cases work in 3.1, but the
> codegen for unsigned v2i16->v2i64 followed by mul is substantially worse
> than for the signed case.
> All the test cases that you sent pass with this new patch (allowing the
> v2i8->v2i64 extension).
> Perhaps someone else wants to take a look :).
>
> Pete****
>
> On Mon, Nov 12, 2012 at 10:00 AM, David Peixotto <dpeixott at codeaurora.org>
> wrote:****
>
> Hi Pete,****
>
>  ****
>
> I sent a patch for this a while back, but never got any feedback and
> failed to push it all the way through. I’ve attached my patch for reference.
> ****
>
>  ****
>
> It looks like the patches are fairly similar in their intent. I have a few
> more test cases, so please include those in your patch as well. It would
> also be good to include some of the comments, particularly the
> SkipExtension comment that explains the motivation for that function.****
>
>  ****
>
> The main differences between the patches that I see are:****
>
>  ****
>
> 1)      Handling a sign extend that is not large enough****
>
> 2)      Handling v2i8 vector size****
>
>  ****
>
> I’ll expand on these points below.****
>
>  ****
>
> Index: lib/Target/ARM/ARMISelLowering.cpp****
>
> ===================================================================****
>
> --- lib/Target/ARM/ARMISelLowering.cpp            (revision 167684)****
>
> +++ lib/Target/ARM/ARMISelLowering.cpp         (working copy)****
>
> @@ -4922,11 +4922,30 @@****
>
> static SDValue SkipExtension(SDNode *N, SelectionDAG &DAG) {****
>
>    if (N->getOpcode() == ISD::SIGN_EXTEND || N->getOpcode() ==
> ISD::ZERO_EXTEND)****
>
>      return N->getOperand(0);****
>
>  ****
>
> What it N is a small vector here? Don’t we need to make sure that we have
> a sign/zero extend that will extend the size to 64-bit vector?****
>
>  ****
>
> -  if (LoadSDNode *LD = dyn_cast<LoadSDNode>(N))****
>
> -    return DAG.getLoad(LD->getMemoryVT(), N->getDebugLoc(),
> LD->getChain(),****
>
> -                       LD->getBasePtr(), LD->getPointerInfo(),
> LD->isVolatile(),****
>
> -                       LD->isNonTemporal(), LD->isInvariant(),****
>
> -                       LD->getAlignment());****
>
> +****
>
> +  if (LoadSDNode *LD = dyn_cast<LoadSDNode>(N)) {****
>
> +    SDValue BareLD = DAG.getLoad(LD->getMemoryVT(), N->getDebugLoc(),****
>
> +                                LD->getChain(), LD->getBasePtr(),****
>
> +                                LD->getPointerInfo(), LD->isVolatile(),**
> **
>
> +                                LD->isNonTemporal(), LD->isInvariant(),**
> **
>
> +                                LD->getAlignment());****
>
> +    if (BareLD.getValueType().is64BitVector())****
>
> +      return BareLD;****
>
> +****
>
> +    // This is a long extension, v4i8 -> v4i32 or v2i16 -> v2i64****
>
> +    MVT ExtVT;****
>
> +    if (LD->getMemoryVT() == MVT::v4i8)****
>
> +      ExtVT = MVT::v4i16;****
>
> +    else if (LD->getMemoryVT() == MVT::v2i16)****
>
> +      ExtVT = MVT::v2i32;****
>
>  ****
>
> What about MVT::v2i8. We can handle that case similar to v2i16 by
> extending the size to v2i32.****
>
>  ****
>
> +****
>
> +    if (LD->getExtensionType() == ISD::SEXTLOAD)****
>
> +      return DAG.getNode(ISD::SIGN_EXTEND, BareLD.getDebugLoc(), ExtVT,
> BareLD);****
>
> +    else if (LD->getExtensionType() == ISD::ZEXTLOAD)****
>
> +      return DAG.getNode(ISD::ZERO_EXTEND, BareLD.getDebugLoc(), ExtVT,
> BareLD);****
>
> +****
>
> +    llvm_unreachable("Unexpected extended load type for SkipExtension.");
> ****
>
> +  }****
>
>    // Otherwise, the value must be a BUILD_VECTOR.  For v2i64, it will****
>
>    // have been legalized as a BITCAST from v4i32.****
>
>    if (N->getOpcode() == ISD::BITCAST) {****
>
>  ****
>
> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation****
>
>  ****
>
>  ****
>
> *From:* llvm-commits-bounces at cs.uiuc.edu [mailto:
> llvm-commits-bounces at cs.uiuc.edu] *On Behalf Of *Pete Couperus
> *Sent:* Saturday, November 10, 2012 11:48 PM
> *To:* llvm-commits at cs.uiuc.edu
> *Subject:* [llvm-commits] [PATCH] ARM: Allow long extensions to combine
> into vmull****
>
>  ****
>
> Hello,
>
> The ARM backend currently crashes when trying to combine:
>         %tmp1 = load <4 x i8>* %A
>         %tmp2 = load <4 x i8>* %B
>         %tmp3 = sext <4 x i8> %tmp1 to <4 x i32>
>         %tmp4 = sext <4 x i8> %tmp2 to <4 x i32>
>         %tmp5 = mul <4 x i32> %tmp3, %tmp4
>
> into a vmull.  This was reported as PR 12281.
> This patch fixes this problem.
> Please review!
> Thanks,
>
> Pete****
>
>  ****
>
> ** **
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20121114/095bdcdd/attachment.html>


More information about the llvm-commits mailing list