review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo code generates unrecognized code

David Peixotto dpeixott at codeaurora.org
Fri Oct 18 10:35:21 PDT 2013


Ok, can you file a bug with a test case that demonstrates the issue? One
change I made in the byval lowering for thumb1 was to generate the constant
pool load using tLDRpci instead of LDRcp. I see that in
ARMConstantPoolIslandsPass.cpp it recognizes that tLDRpci uses imm8 so I
agree that it is a problem if it is still placing the constant pool too far
away.

 

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation

 

 

From: 林作健 [mailto:manjian2006 at gmail.com] 
Sent: Friday, October 18, 2013 9:36 AM
To: David Peixotto
Cc: llvm-commits; bob.wilson at apple.com; Manman Ren
Subject: RE: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo code
generates unrecognized code

 

it is a different bug.together with the copy struct one makes compiling
armv5 impossible. when encounter a long long function accessing constant
pool ,it may generates code too far away to access it.i should have reported
two bugs.As you can see,armv5 compatible ldr encoding just has an imm8
offset instead of imm12.Current constant pool lowering code uses imm12
improperly.

在 2013年10月19日 上午12:16,"David Peixotto" <dpeixott at codeaurora.org>写
道:

Hi林作健,

 

I don’t completely understand the issue you are trying to solve. Is this a
bug in the struct byval lowering or a problem with the constant island pass?
Do you have a test case that shows the problem?

 

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation

 

 

From: 林作健 [mailto:manjian2006 at gmail.com] 
Sent: Friday, October 18, 2013 12:49 AM
To: David Peixotto; bob.wilson at apple.com
Cc: Manman Ren; llvm-commits
Subject: Re: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo code
generates unrecognized code

 

内嵌图片 1

 

Dear Bob,please note Encoding T2 ONLY supports imm8.And Encoding T3 is ONLY
supported  by ARMv6T2 and later.So my patch will fix this bug.

 

2013/10/18 林作健 <manjian2006 at gmail.com>

Dear all,

please note that the patch to ARMConstantIslandPass.cpp also effectively fix
a bug,which causing as in thumb arm5te mode  fails to generate code.Because
the ldr code fails to access constant pool too far away from it.

 

2013/10/18 David Peixotto <dpeixott at codeaurora.org>

Committed in r192915 and r192916.

 

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation

 

 

From: Manman Ren [mailto:manman.ren at gmail.com] 
Sent: Thursday, October 17, 2013 11:14 AM
To: David Peixotto
Cc: 林作健; llvm-commits


Subject: Re: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo code
generates unrecognized code

 

 

Please commit,

 

Thanks,

Manman

 

On Thu, Oct 17, 2013 at 9:32 AM, David Peixotto <dpeixott at codeaurora.org>
wrote:

Manman,

 

Thanks for the tip on how to improve the test case. I’ve updated it with
your suggestion to only produce the output once. I’ve attached the final
patches.

 

Ok to commit?

 

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation

 

 

From: Manman Ren [mailto:manman.ren at gmail.com] 
Sent: Wednesday, October 16, 2013 4:39 PM
To: David Peixotto
Cc: 林作健; llvm-commits


Subject: Re: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo code
generates unrecognized code

 

+;RUN: llc < %s -mtriple=thumbv5-none-linux-gnueabi
-verify-machineinstrs -filetype=obj | llvm-objdump -triple
thumbv5-none-linux-gnueabi -disassemble - | FileCheck %s
--check-prefix=THUMB1

+;RUN: llc < %s -mtriple=thumbv5-none-linux-gnueabi
-verify-machineinstrs -filetype=obj | llvm-objdump -triple
thumbv5-none-linux-gnueabi -disassemble - | FileCheck %s
--check-prefix=T1POST

<-- These two run lines are the same except the prefix, just wondering why?

 

I was trying to have a test that checked for the presence of the correct
load/store instructions and the absence of the post-increment version for
thumb1. I want to pass on output like this:

 

ldr r0, [r1]

add r0, #4

 

and fail on output like this:

ldr r0, [r1], #4

 

It wasn’t clear to me how to combine those two checks into a single pass
with FileCheck and so I used two passes. One to check for expected output
and one to check for lack of bad output.

 

About the testing case, I am not sure how to do it in a single pass, but you
can save the objdump to a temp file and use the temp file to perform two
checks:

cat %t | FileCheck %s --check-prefix=THUMB1

cat %t | FileCheck %s --check-prefix=T1POST

 

LGTM otherwise,

 

Thanks,

Manman

 

On Wed, Oct 16, 2013 at 3:57 PM, David Peixotto <dpeixott at codeaurora.org>
wrote:

Attached the updated patches based on the feedback so far. Please help
review these changes. Thanks!

 

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation

 

 

From: David Peixotto [mailto:dpeixott at codeaurora.org] 
Sent: Wednesday, October 16, 2013 11:45 AM
To: 'Manman Ren'
Cc: '林作健'; 'llvm-commits'
Subject: RE: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo code
generates unrecognized code

 

Thanks for the feedback. I will make the changes you suggested and upload a
new set of patches shortly. I responded below to your questions.

 

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation

 

 

From: Manman Ren [mailto:manman.ren at gmail.com] 
Sent: Wednesday, October 16, 2013 10:32 AM
To: David Peixotto
Cc: 林作健; llvm-commits


Subject: Re: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo code
generates unrecognized code

 

 

A few nitpicks on the refactoring patch:

 

+#include "llvm/Support/Debug.h"

 #include "llvm/Support/ErrorHandling.h"

 #include "llvm/Support/MathExtras.h"

 #include "llvm/Support/raw_ostream.h"

 #include "llvm/Target/TargetOptions.h"

+#include <utility>

<-- I am not sure whether these two files are needed or you forgot to clean
it up

 

The <utility> include was added because I’m using std::pair. The Debug.h
was leftover from debugging. I will remove it.

 

Could you add comments to each member function of TargetStructByvalEmitter
and StructByvalEmitter?

 

Sure, no problem

 

@@ -7286,26 +7603,19 @@ EmitStructByval(MachineInstr *MI, MachineBasicBlock
*BB) const {

                        Attribute::NoImplicitFloat) &&

         Subtarget->hasNEON()) {

       if ((Align % 16 == 0) && SizeVal >= 16) {

-        ldrOpc = ARM::VLD1q32wb_fixed;

-        strOpc = ARM::VST1q32wb_fixed;

         UnitSize = 16;

-        TRC_Vec = (const TargetRegisterClass*)&ARM::DPairRegClass;

-      }

-      else if ((Align % 8 == 0) && SizeVal >= 8) {

-        ldrOpc = ARM::VLD1d32wb_fixed;

-        strOpc = ARM::VST1d32wb_fixed;

+      } else if ((Align % 8 == 0) && SizeVal >= 8) {

         UnitSize = 8;

-        TRC_Vec = (const TargetRegisterClass*)&ARM::DPRRegClass;

       }

     }

     // Can't use NEON instructions.

     if (UnitSize == 0) {

-      ldrOpc = isThumb2 ? ARM::t2LDR_POST : ARM::LDR_POST_IMM;

-      strOpc = isThumb2 ? ARM::t2STR_POST : ARM::STR_POST_IMM;

       UnitSize = 4;

     }

   }

<-- Can you remove the parenthesis around the single statement?

 

Will do.

 

On the second patch:

+  unsigned emitConstantLoad(MachineBasicBlock *BB, MachineInstr *MI,

+                            DebugLoc &dl, unsigned Constant,

+                            const DataLayout *DL) {

+    unsigned constReg = MRI.createVirtualRegister(TRC);

+    unsigned Idx = getConstantPoolIndex(BB->getParent(), DL, Constant);

+    AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(ARM::tLDRpci)).addReg(

+        constReg, RegState::Define).addConstantPoolIndex(Idx));

+    return constReg;

+    return 0;

+  }

+

<-- Please remove the extra "return 0".

 

Will do.

 

+;RUN: llc < %s -mtriple=thumbv5-none-linux-gnueabi
-verify-machineinstrs -filetype=obj | llvm-objdump -triple
thumbv5-none-linux-gnueabi -disassemble - | FileCheck %s
--check-prefix=THUMB1

+;RUN: llc < %s -mtriple=thumbv5-none-linux-gnueabi
-verify-machineinstrs -filetype=obj | llvm-objdump -triple
thumbv5-none-linux-gnueabi -disassemble - | FileCheck %s
--check-prefix=T1POST

<-- These two run lines are the same except the prefix, just wondering why?

 

I was trying to have a test that checked for the presence of the correct
load/store instructions and the absence of the post-increment version for
thumb1. I want to pass on output like this:

 

ldr r0, [r1]

add r0, #4

 

and fail on output like this:

ldr r0, [r1], #4

 

It wasn’t clear to me how to combine those two checks into a single pass
with FileCheck and so I used two passes. One to check for expected output
and one to check for lack of bad output.

 

 

Thanks,

Manman

 

On Tue, Oct 15, 2013 at 3:27 PM, David Peixotto <dpeixott at codeaurora.org>
wrote:

Hi Manman,

 

I split the patch into two parts as requested. Most of the changes are in
the refactoring patch. The thumb1 support is pretty straightforward after
the refactoring is done.

 

Thanks for your help reviewing these patches.

 

 

 

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation

 

 

From: Manman Ren [mailto:manman.ren at gmail.com] 
Sent: Tuesday, October 15, 2013 2:01 PM
To: David Peixotto
Cc: 林作健; llvm-commits


Subject: Re: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo code
generates unrecognized code

 

 

Hi David,

 

Thanks for working on this.

Can you separate your patch to a refactoring patch and a patch for Thumb1?

 

It is easier to review and we should submit it as two commits.

 

Manman

 

On Tue, Oct 15, 2013 at 1:16 PM, David Peixotto <dpeixott at codeaurora.org>
wrote:

I’ve attached a patch that fixes the issue for thumb1 and thumb2. I
refactored the code a bit to isolate all the target-specific to a helper
class. It makes the code a bit longer, but it helped me to see where we need
different operations for each subtarget and to implement them in isolation.
I’m not opposed to doing it all inline, but I wasn’t sure I would be able
to get all the details right that way.

 

This commit refactors the code so that we can easily support thumb1

targets and implements the thumb1 lowering.

 

The original lowering algorithm is unchanged, but it now uses a

helper class to generate the machine instructions.  The refactoring

introduces a StructByvalEmitter class that encapsulates the

operations needed to lower the COPY_STRUCT pseudo instruction. The

basic operations (e.g. emit load, emit store) are all wrappers to

the correct machine instruction for each target. The operations are

then defined for Arm, Thumb1, and Thumb2 targets.

 

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation

 

 

From: Manman Ren [mailto:manman.ren at gmail.com] 
Sent: Tuesday, October 15, 2013 11:51 AM
To: David Peixotto
Cc: 林作健; llvm-commits


Subject: Re: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo code
generates unrecognized code

 

 

 

On Tue, Oct 15, 2013 at 10:13 AM, David Peixotto <dpeixott at codeaurora.org>
wrote:

Why does this patch need to make a change to the ARMConstantIslandsPass? I
think it may be because we are still using an arm instruction for the
constant load instead of the equivalent thumb instruction (tLDRpci).

 

I think your patch is incomplete for supporting thumb1. You need to modify
the lowering for COPY_STRUCT_BYVAL_I32 in a few more places. We should be
using the thumb1 opcodes for all the instructions generated by the lowering.
You have handled the load/store case for the values, but you still need to
take care of the constant pool load, the subtract (for the loop counter) and
the branch. These instructions should use the tLDRpci (for constant pool
load), tSubi8 (for the loop counter subtract), and tBcc (for the loop
branch).

 

I didn’t see your bug and had filed another bug for this same issue:
http://llvm.org/bugs/show_bug.cgi?id=17309. Please also check that your
patch works with the test case I uploaded to that bug (also attached to this
email). In addition to checks for thumb1, this test case also exposes a bug
in the current code with the thumb2 lowering. It will trigger an assertion
because an operand is added to a full instruction. It looks like the thumb2
failure comes from a copy/paste error when building the machine
instructions.

 

Thanks for the testing case. Yes, the thumb2 failure is from a copy/pasto. I
will fix that.

 

Manman 

 

 

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation

 

 

From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.
edu] On Behalf Of ???
Sent: Monday, October 14, 2013 6:29 PM
To: Manman Ren
Cc: llvm-commits
Subject: Re: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo code
generates unrecognized code

 

I didn't add THUMB1 lines for function h simple armv5te doesn't support vldr
instr.

And here is the patch for formatting.

 

2013/10/15 Manman Ren <manman.ren at gmail.com>

Hi,

 

The changes to ARMISelLowering.cpp look good to me.

 

--- lib/Target/ARM/ARMConstantIslandPass.cpp (版本 187190)

+++ lib/Target/ARM/ARMConstantIslandPass.cpp (工作副本)

@@ -634,6 +634,7 @@

 initializeFunctionInfo(const std::vector<MachineInstr*> &CPEMIs) {

   BBInfo.clear();

   BBInfo.resize(MF->getNumBlockIDs());

+  const ARMSubtarget * Subtarget =
&MF->getTarget().getSubtarget<ARMSubtarget>();

 

   // First thing, compute the size of all basic blocks, and see if the
function

   // has any inline assembly in it. If so, we have to be conservative about

@@ -757,7 +758,12 @@

           case ARM::LDRi12:

           case ARM::LDRcp:

           case ARM::t2LDRpci:

-            Bits = 12;  // +-offset_12

+            if(Subtarget->hasV7Ops())

+            {

+              Bits = 12;  // +-offset_12

+            } else {

+              Bits = 8;  // +-offset_8

+            }

 

Please simplify above by removing the parenthesis, or using "? 12 : 8".

I am not really familiar with the const island pass, so I cc'ed Bob for him
to review it.

 

The testing case looks good in general, is there any reason you didn't add
THUMB1 lines for function h()?

 

Thanks,

Manman

 

 

On Sun, Oct 13, 2013 at 4:48 AM, 林作健 <manjian2006 at gmail.com> wrote:

 


_______________________________________________
llvm-commits mailing list
llvm-commits at cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

 

 

 

 

 

 

 

 

 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131018/60e63f23/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 343970 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131018/60e63f23/attachment.png>


More information about the llvm-commits mailing list