[llvm-commits] [PATCH] ARM: Allow long extensions to combine into vmull

David Peixotto dpeixott at codeaurora.org
Thu Nov 15 10:47:52 PST 2012


Thanks for the review, Pete. I updated the patch based on your comments and
attached the new version.

 

Reviewers, please send any other feedback.

 

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation

 

 

From: Pete Couperus [mailto:pjcoup at gmail.com] 
Sent: Wednesday, November 14, 2012 10:21 PM
To: David Peixotto
Cc: llvm-commits at cs.uiuc.edu
Subject: Re: [llvm-commits] [PATCH] ARM: Allow long extensions to combine
into vmull

 

Hi David,

This mostly looks fine to me.
Minor comments:
1. The body in your SkipLoadExtension is overindented (4 spaces, should be
2).
2. I'd reverse the conditional in AddRequiredExtension from:
  if (OrigTy.getSizeInBits() < 64) {
    [stuff]
  }
  return N;
to:
  if (OrigTy.is64BitVector())
    return N;
  [stuff]

Maybe 2 is personal preference, but I think it reads easier this way.
Otherwise, I think this is good.

Pete

On Wed, Nov 14, 2012 at 5:06 PM, David Peixotto <dpeixott at codeaurora.org>
wrote:

Hi Pete,

 

As long as we are fixing it I think it makes sense to handle both cases
(bare sign/zero extend and sign/zero load extend). If you have no
objections, let's move ahead with my patch since it is already factored that
way.

 

I've attached an updated version that I rebased against the current head
(contents are the same as my original patch). Please take a look and let me
know what you think and we can work together to get it merged.

 

-David

 

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation

 

 

From: Pete Couperus [mailto:pjcoup at gmail.com] 
Sent: Tuesday, November 13, 2012 10:09 PM


To: David Peixotto
Cc: llvm-commits at cs.uiuc.edu
Subject: Re: [llvm-commits] [PATCH] ARM: Allow long extensions to combine
into vmull

 

Hi David,

I had basically aimed to minimize the changes, to address one problem.  I
agree with you, and think there are potential problems in this area...zext
vs. sext, and SIGN/ZERO_EXTEND vs. [SZ]EXTLOAD.  Without the long sign/zero
extension change, we'll still trigger the outer assertion...but just writing
that makes me feel that we might as well put that logic in as well.  Since
that is essentially your patch, want to have another go?  Or, would you like
me to fold it into mine?  I'll try and be pleasantly persistent to get
either of them in :).

Pete

On Tue, Nov 13, 2012 at 1:33 PM, David Peixotto <dpeixott at codeaurora.org>
wrote:

Hi Pete,

 

Thanks for updating your patch. 

 

I don't think I was able to trigger the long sign/zero extension with a
simple test case either, although I don't know of a reason why it would not
be possible to trigger that path. If we don't want to put the full logic in
to support the possibility how about we add a comment and an assertion to
verify that the underlying node is a 64-bit vector. That way if our
assumption is wrong we can quickly identify and fix the issue.

 

-David

 

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation

 

 

From: Pete Couperus [mailto:pjcoup at gmail.com] 
Sent: Monday, November 12, 2012 10:02 PM
To: David Peixotto
Cc: llvm-commits at cs.uiuc.edu
Subject: Re: [llvm-commits] [PATCH] ARM: Allow long extensions to combine
into vmull

 

Hi David,

Thanks for taking a look.
My apologies, I hadn't seen your patch when I'd written this up.
And, you're right, I neglected the v2i8 case.
I originally had logic in for handling long sign/zero extension as well, but
I couldn't trigger that with any of the basic test cases I thought of.
Curiously, the unsigned/long zero extension cases work in 3.1, but the
codegen for unsigned v2i16->v2i64 followed by mul is substantially worse
than for the signed case.
All the test cases that you sent pass with this new patch (allowing the
v2i8->v2i64 extension).  
Perhaps someone else wants to take a look :).

Pete

On Mon, Nov 12, 2012 at 10:00 AM, David Peixotto <dpeixott at codeaurora.org>
wrote:

Hi Pete,

 

I sent a patch for this a while back, but never got any feedback and failed
to push it all the way through. I've attached my patch for reference.

 

It looks like the patches are fairly similar in their intent. I have a few
more test cases, so please include those in your patch as well. It would
also be good to include some of the comments, particularly the SkipExtension
comment that explains the motivation for that function.

 

The main differences between the patches that I see are:

 

1)      Handling a sign extend that is not large enough

2)      Handling v2i8 vector size

 

I'll expand on these points below.

 

Index: lib/Target/ARM/ARMISelLowering.cpp

===================================================================

--- lib/Target/ARM/ARMISelLowering.cpp            (revision 167684)

+++ lib/Target/ARM/ARMISelLowering.cpp         (working copy)

@@ -4922,11 +4922,30 @@

static SDValue SkipExtension(SDNode *N, SelectionDAG &DAG) {

   if (N->getOpcode() == ISD::SIGN_EXTEND || N->getOpcode() ==
ISD::ZERO_EXTEND)

     return N->getOperand(0);

 

What it N is a small vector here? Don't we need to make sure that we have a
sign/zero extend that will extend the size to 64-bit vector?

 

-  if (LoadSDNode *LD = dyn_cast<LoadSDNode>(N))

-    return DAG.getLoad(LD->getMemoryVT(), N->getDebugLoc(), LD->getChain(),

-                       LD->getBasePtr(), LD->getPointerInfo(),
LD->isVolatile(),

-                       LD->isNonTemporal(), LD->isInvariant(),

-                       LD->getAlignment());

+

+  if (LoadSDNode *LD = dyn_cast<LoadSDNode>(N)) {

+    SDValue BareLD = DAG.getLoad(LD->getMemoryVT(), N->getDebugLoc(),

+                                LD->getChain(), LD->getBasePtr(),

+                                LD->getPointerInfo(), LD->isVolatile(),

+                                LD->isNonTemporal(), LD->isInvariant(),

+                                LD->getAlignment());

+    if (BareLD.getValueType().is64BitVector())

+      return BareLD;

+

+    // This is a long extension, v4i8 -> v4i32 or v2i16 -> v2i64

+    MVT ExtVT;

+    if (LD->getMemoryVT() == MVT::v4i8)

+      ExtVT = MVT::v4i16;

+    else if (LD->getMemoryVT() == MVT::v2i16)

+      ExtVT = MVT::v2i32;

 

What about MVT::v2i8. We can handle that case similar to v2i16 by extending
the size to v2i32.

 

+

+    if (LD->getExtensionType() == ISD::SEXTLOAD)

+      return DAG.getNode(ISD::SIGN_EXTEND, BareLD.getDebugLoc(), ExtVT,
BareLD);

+    else if (LD->getExtensionType() == ISD::ZEXTLOAD)

+      return DAG.getNode(ISD::ZERO_EXTEND, BareLD.getDebugLoc(), ExtVT,
BareLD);

+

+    llvm_unreachable("Unexpected extended load type for SkipExtension.");

+  }

   // Otherwise, the value must be a BUILD_VECTOR.  For v2i64, it will

   // have been legalized as a BITCAST from v4i32.

   if (N->getOpcode() == ISD::BITCAST) {

 

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation

 

 

From: llvm-commits-bounces at cs.uiuc.edu
[mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Pete Couperus
Sent: Saturday, November 10, 2012 11:48 PM
To: llvm-commits at cs.uiuc.edu
Subject: [llvm-commits] [PATCH] ARM: Allow long extensions to combine into
vmull

 

Hello,

The ARM backend currently crashes when trying to combine:
        %tmp1 = load <4 x i8>* %A
        %tmp2 = load <4 x i8>* %B
        %tmp3 = sext <4 x i8> %tmp1 to <4 x i32>
        %tmp4 = sext <4 x i8> %tmp2 to <4 x i32>
        %tmp5 = mul <4 x i32> %tmp3, %tmp4

into a vmull.  This was reported as PR 12281.
This patch fixes this problem.
Please review!
Thanks,

Pete

 

 

 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20121115/36644179/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Codegen-failure-for-vmull-with-small-vectors.patch
Type: application/octet-stream
Size: 10414 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20121115/36644179/attachment.obj>


More information about the llvm-commits mailing list