[llvm-commits] [PATCH] ARM: Allow long extensions to combine into vmull

Sebastian Pop spop at codeaurora.org
Fri Nov 30 10:10:41 PST 2012


Hi David,

I will commit your patch.

I haven't realized that you don't have svn write access.
Can you please work with Chris to get svn write access?
See the instructions here:
http://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access

Thanks,
Sebastian

On Fri, Nov 30, 2012 at 11:22 AM, David Peixotto
<dpeixott at codeaurora.org> wrote:
> Ping
>
>
>
> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
> by The Linux Foundation
>
>
>
>
>
> From: Evan Cheng [mailto:evan.cheng at apple.com]
> Sent: Tuesday, November 27, 2012 9:17 PM
>
>
> To: David Peixotto
> Cc: Pete Couperus; llvm-commits at cs.uiuc.edu
> Subject: Re: [llvm-commits] [PATCH] ARM: Allow long extensions to combine
> into vmull
>
>
>
> LGTM
>
> On Nov 26, 2012, at 4:47 PM, David Peixotto <dpeixott at codeaurora.org> wrote:
>
> Thanks for the review, Evan. I’ve attached an updated patch that addresses
> your concern by renaming the functions to indicate they are for VMULL
> lowering (e.g. SkipExtensionForVMULL). I also updated the unit test to work
> with –march=arm –mattr=+neon instead of specifying –mcpu=cortex-a8.
>
>
>
> Please review further or commit if it looks good. Thanks!
>
>
>
> -David
>
>
>
> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
> by The Linux Foundation
>
>
>
>
>
> From: Evan Cheng [mailto:evan.cheng at apple.com]
> Sent: Sunday, November 25, 2012 12:02 PM
> To: David Peixotto
> Cc: 'Pete Couperus'; llvm-commits at cs.uiuc.edu
> Subject: Re: [llvm-commits] [PATCH] ARM: Allow long extensions to combine
> into vmull
>
>
>
> The patch looks fine to me. However, I don't like the generic names of the
> functions AddRequiredExtension, SkipLoadExtension, and SkipExtension. These
> functions are only used for VMULL lowering and should be named as such.
>
>
>
> Thanks,
>
>
>
> Evan
>
>
>
> On Nov 15, 2012, at 10:47 AM, David Peixotto <dpeixott at codeaurora.org>
> wrote:
>
>
>
>
> Thanks for the review, Pete. I updated the patch based on your comments and
> attached the new version.
>
>
>
> Reviewers, please send any other feedback.
>
>
>
> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
> by The Linux Foundation
>
>
>
>
>
> From: Pete Couperus [mailto:pjcoup at gmail.com]
> Sent: Wednesday, November 14, 2012 10:21 PM
> To: David Peixotto
> Cc: llvm-commits at cs.uiuc.edu
> Subject: Re: [llvm-commits] [PATCH] ARM: Allow long extensions to combine
> into vmull
>
>
>
> Hi David,
>
> This mostly looks fine to me.
> Minor comments:
> 1. The body in your SkipLoadExtension is overindented (4 spaces, should be
> 2).
> 2. I'd reverse the conditional in AddRequiredExtension from:
>   if (OrigTy.getSizeInBits() < 64) {
>     [stuff]
>   }
>   return N;
> to:
>   if (OrigTy.is64BitVector())
>     return N;
>   [stuff]
>
> Maybe 2 is personal preference, but I think it reads easier this way.
> Otherwise, I think this is good.
>
> Pete
>
> On Wed, Nov 14, 2012 at 5:06 PM, David Peixotto <dpeixott at codeaurora.org>
> wrote:
>
> Hi Pete,
>
>
>
> As long as we are fixing it I think it makes sense to handle both cases
> (bare sign/zero extend and sign/zero load extend). If you have no
> objections, let’s move ahead with my patch since it is already factored that
> way.
>
>
>
> I’ve attached an updated version that I rebased against the current head
> (contents are the same as my original patch). Please take a look and let me
> know what you think and we can work together to get it merged.
>
>
>
> -David
>
>
>
> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
> by The Linux Foundation
>
>
>
>
>
> From: Pete Couperus [mailto:pjcoup at gmail.com]
> Sent: Tuesday, November 13, 2012 10:09 PM
>
>
> To: David Peixotto
> Cc: llvm-commits at cs.uiuc.edu
> Subject: Re: [llvm-commits] [PATCH] ARM: Allow long extensions to combine
> into vmull
>
>
>
> Hi David,
>
> I had basically aimed to minimize the changes, to address one problem.  I
> agree with you, and think there are potential problems in this area...zext
> vs. sext, and SIGN/ZERO_EXTEND vs. [SZ]EXTLOAD.  Without the long sign/zero
> extension change, we'll still trigger the outer assertion...but just writing
> that makes me feel that we might as well put that logic in as well.  Since
> that is essentially your patch, want to have another go?  Or, would you like
> me to fold it into mine?  I'll try and be pleasantly persistent to get
> either of them in :).
>
> Pete
>
> On Tue, Nov 13, 2012 at 1:33 PM, David Peixotto <dpeixott at codeaurora.org>
> wrote:
>
> Hi Pete,
>
>
>
> Thanks for updating your patch.
>
>
>
> I don’t think I was able to trigger the long sign/zero extension with a
> simple test case either, although I don’t know of a reason why it would not
> be possible to trigger that path. If we don’t want to put the full logic in
> to support the possibility how about we add a comment and an assertion to
> verify that the underlying node is a 64-bit vector. That way if our
> assumption is wrong we can quickly identify and fix the issue.
>
>
>
> -David
>
>
>
> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
> by The Linux Foundation
>
>
>
>
>
> From: Pete Couperus [mailto:pjcoup at gmail.com]
> Sent: Monday, November 12, 2012 10:02 PM
> To: David Peixotto
> Cc: llvm-commits at cs.uiuc.edu
> Subject: Re: [llvm-commits] [PATCH] ARM: Allow long extensions to combine
> into vmull
>
>
>
> Hi David,
>
> Thanks for taking a look.
> My apologies, I hadn't seen your patch when I'd written this up.
> And, you're right, I neglected the v2i8 case.
> I originally had logic in for handling long sign/zero extension as well, but
> I couldn't trigger that with any of the basic test cases I thought of.
> Curiously, the unsigned/long zero extension cases work in 3.1, but the
> codegen for unsigned v2i16->v2i64 followed by mul is substantially worse
> than for the signed case.
> All the test cases that you sent pass with this new patch (allowing the
> v2i8->v2i64 extension).
> Perhaps someone else wants to take a look :).
>
> Pete
>
> On Mon, Nov 12, 2012 at 10:00 AM, David Peixotto <dpeixott at codeaurora.org>
> wrote:
>
> Hi Pete,
>
>
>
> I sent a patch for this a while back, but never got any feedback and failed
> to push it all the way through. I’ve attached my patch for reference.
>
>
>
> It looks like the patches are fairly similar in their intent. I have a few
> more test cases, so please include those in your patch as well. It would
> also be good to include some of the comments, particularly the SkipExtension
> comment that explains the motivation for that function.
>
>
>
> The main differences between the patches that I see are:
>
>
>
> 1)      Handling a sign extend that is not large enough
>
> 2)      Handling v2i8 vector size
>
>
>
> I’ll expand on these points below.
>
>
>
> Index: lib/Target/ARM/ARMISelLowering.cpp
>
> ===================================================================
>
> --- lib/Target/ARM/ARMISelLowering.cpp            (revision 167684)
>
> +++ lib/Target/ARM/ARMISelLowering.cpp         (working copy)
>
> @@ -4922,11 +4922,30 @@
>
> static SDValue SkipExtension(SDNode *N, SelectionDAG &DAG) {
>
>    if (N->getOpcode() == ISD::SIGN_EXTEND || N->getOpcode() ==
> ISD::ZERO_EXTEND)
>
>      return N->getOperand(0);
>
>
>
> What it N is a small vector here? Don’t we need to make sure that we have a
> sign/zero extend that will extend the size to 64-bit vector?
>
>
>
> -  if (LoadSDNode *LD = dyn_cast<LoadSDNode>(N))
>
> -    return DAG.getLoad(LD->getMemoryVT(), N->getDebugLoc(), LD->getChain(),
>
> -                       LD->getBasePtr(), LD->getPointerInfo(),
> LD->isVolatile(),
>
> -                       LD->isNonTemporal(), LD->isInvariant(),
>
> -                       LD->getAlignment());
>
> +
>
> +  if (LoadSDNode *LD = dyn_cast<LoadSDNode>(N)) {
>
> +    SDValue BareLD = DAG.getLoad(LD->getMemoryVT(), N->getDebugLoc(),
>
> +                                LD->getChain(), LD->getBasePtr(),
>
> +                                LD->getPointerInfo(), LD->isVolatile(),
>
> +                                LD->isNonTemporal(), LD->isInvariant(),
>
> +                                LD->getAlignment());
>
> +    if (BareLD.getValueType().is64BitVector())
>
> +      return BareLD;
>
> +
>
> +    // This is a long extension, v4i8 -> v4i32 or v2i16 -> v2i64
>
> +    MVT ExtVT;
>
> +    if (LD->getMemoryVT() == MVT::v4i8)
>
> +      ExtVT = MVT::v4i16;
>
> +    else if (LD->getMemoryVT() == MVT::v2i16)
>
> +      ExtVT = MVT::v2i32;
>
>
>
> What about MVT::v2i8. We can handle that case similar to v2i16 by extending
> the size to v2i32.
>
>
>
> +
>
> +    if (LD->getExtensionType() == ISD::SEXTLOAD)
>
> +      return DAG.getNode(ISD::SIGN_EXTEND, BareLD.getDebugLoc(), ExtVT,
> BareLD);
>
> +    else if (LD->getExtensionType() == ISD::ZEXTLOAD)
>
> +      return DAG.getNode(ISD::ZERO_EXTEND, BareLD.getDebugLoc(), ExtVT,
> BareLD);
>
> +
>
> +    llvm_unreachable("Unexpected extended load type for SkipExtension.");
>
> +  }
>
>    // Otherwise, the value must be a BUILD_VECTOR.  For v2i64, it will
>
>    // have been legalized as a BITCAST from v4i32.
>
>    if (N->getOpcode() == ISD::BITCAST) {
>
>
>
> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
> by The Linux Foundation
>
>
>
>
>
> From: llvm-commits-bounces at cs.uiuc.edu
> [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Pete Couperus
> Sent: Saturday, November 10, 2012 11:48 PM
> To: llvm-commits at cs.uiuc.edu
> Subject: [llvm-commits] [PATCH] ARM: Allow long extensions to combine into
> vmull
>
>
>
> Hello,
>
> The ARM backend currently crashes when trying to combine:
>         %tmp1 = load <4 x i8>* %A
>         %tmp2 = load <4 x i8>* %B
>         %tmp3 = sext <4 x i8> %tmp1 to <4 x i32>
>         %tmp4 = sext <4 x i8> %tmp2 to <4 x i32>
>         %tmp5 = mul <4 x i32> %tmp3, %tmp4
>
> into a vmull.  This was reported as PR 12281.
> This patch fixes this problem.
> Please review!
> Thanks,
>
> Pete
>
>
>
>
>
>
>
> <0001-Codegen-failure-for-vmull-with-small-vectors.patch>_______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>
> <0001-Codegen-failure-for-vmull-with-small-vectors.patch>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>



--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation




More information about the llvm-commits mailing list