review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo code generates unrecognized code

Wed Oct 16 11:44:33 PDT 2013

Thanks for the feedback. I will make the changes you suggested and upload a new set of patches shortly. I responded below to your questions.

 

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

 

 

From: Manman Ren [mailto:manman.ren at gmail.com] 
Sent: Wednesday, October 16, 2013 10:32 AM
To: David Peixotto
Cc: 林作健; llvm-commits
Subject: Re: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo code generates unrecognized code

 

 

A few nitpicks on the refactoring patch:

 

+#include "llvm/Support/Debug.h"

 #include "llvm/Support/ErrorHandling.h"

 #include "llvm/Support/MathExtras.h"

 #include "llvm/Support/raw_ostream.h"

 #include "llvm/Target/TargetOptions.h"

+#include <utility>

<-- I am not sure whether these two files are needed or you forgot to clean it up

 

The <utility> include was added because I’m using std::pair. The Debug.h was leftover from debugging. I will remove it.

 

Could you add comments to each member function of TargetStructByvalEmitter and StructByvalEmitter?

 

Sure, no problem

 

@@ -7286,26 +7603,19 @@ EmitStructByval(MachineInstr *MI, MachineBasicBlock *BB) const {

                        Attribute::NoImplicitFloat) &&

         Subtarget->hasNEON()) {

       if ((Align % 16 == 0) && SizeVal >= 16) {

-        ldrOpc = ARM::VLD1q32wb_fixed;

-        strOpc = ARM::VST1q32wb_fixed;

         UnitSize = 16;

-        TRC_Vec = (const TargetRegisterClass*)&ARM::DPairRegClass;

-      }

-      else if ((Align % 8 == 0) && SizeVal >= 8) {

-        ldrOpc = ARM::VLD1d32wb_fixed;

-        strOpc = ARM::VST1d32wb_fixed;

+      } else if ((Align % 8 == 0) && SizeVal >= 8) {

         UnitSize = 8;

-        TRC_Vec = (const TargetRegisterClass*)&ARM::DPRRegClass;

       }

     }

     // Can't use NEON instructions.

     if (UnitSize == 0) {

-      ldrOpc = isThumb2 ? ARM::t2LDR_POST : ARM::LDR_POST_IMM;

-      strOpc = isThumb2 ? ARM::t2STR_POST : ARM::STR_POST_IMM;

       UnitSize = 4;

     }

   }

<-- Can you remove the parenthesis around the single statement?

 

Will do.

 

On the second patch:

+  unsigned emitConstantLoad(MachineBasicBlock *BB, MachineInstr *MI,

+                            DebugLoc &dl, unsigned Constant,

+                            const DataLayout *DL) {

+    unsigned constReg = MRI.createVirtualRegister(TRC);

+    unsigned Idx = getConstantPoolIndex(BB->getParent(), DL, Constant);

+    AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(ARM::tLDRpci)).addReg(

+        constReg, RegState::Define).addConstantPoolIndex(Idx));

+    return constReg;

+    return 0;

+  }

+

<-- Please remove the extra "return 0".

 

Will do.

 

+;RUN: llc < %s -mtriple=thumbv5-none-linux-gnueabi              -verify-machineinstrs -filetype=obj | llvm-objdump -triple thumbv5-none-linux-gnueabi -disassemble - | FileCheck %s --check-prefix=THUMB1

+;RUN: llc < %s -mtriple=thumbv5-none-linux-gnueabi              -verify-machineinstrs -filetype=obj | llvm-objdump -triple thumbv5-none-linux-gnueabi -disassemble - | FileCheck %s --check-prefix=T1POST

<-- These two run lines are the same except the prefix, just wondering why?

 

I was trying to have a test that checked for the presence of the correct load/store instructions and the absence of the post-increment version for thumb1. I want to pass on output like this:

 

ldr r0, [r1]

add r0, #4

 

and fail on output like this:

ldr r0, [r1], #4

 

It wasn’t clear to me how to combine those two checks into a single pass with FileCheck and so I used two passes. One to check for expected output and one to check for lack of bad output.

 

 

Thanks,

Manman

 

On Tue, Oct 15, 2013 at 3:27 PM, David Peixotto <dpeixott at codeaurora.org> wrote:

Hi Manman,

 

I split the patch into two parts as requested. Most of the changes are in the refactoring patch. The thumb1 support is pretty straightforward after the refactoring is done.

 

Thanks for your help reviewing these patches.

 

 

 

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

 

 

From: Manman Ren [mailto:manman.ren at gmail.com] 
Sent: Tuesday, October 15, 2013 2:01 PM
To: David Peixotto
Cc: 林作健; llvm-commits


Subject: Re: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo code generates unrecognized code

 

 

Hi David,

 

Thanks for working on this.

Can you separate your patch to a refactoring patch and a patch for Thumb1?

 

It is easier to review and we should submit it as two commits.

 

Manman

 

On Tue, Oct 15, 2013 at 1:16 PM, David Peixotto <dpeixott at codeaurora.org> wrote:

I’ve attached a patch that fixes the issue for thumb1 and thumb2. I refactored the code a bit to isolate all the target-specific to a helper class. It makes the code a bit longer, but it helped me to see where we need different operations for each subtarget and to implement them in isolation. I’m not opposed to doing it all inline, but I wasn’t sure I would be able to get all the details right that way.

 

This commit refactors the code so that we can easily support thumb1

targets and implements the thumb1 lowering.

 

The original lowering algorithm is unchanged, but it now uses a

helper class to generate the machine instructions.  The refactoring

introduces a StructByvalEmitter class that encapsulates the

operations needed to lower the COPY_STRUCT pseudo instruction. The

basic operations (e.g. emit load, emit store) are all wrappers to

the correct machine instruction for each target. The operations are

then defined for Arm, Thumb1, and Thumb2 targets.

 

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

 

 

From: Manman Ren [mailto:manman.ren at gmail.com] 
Sent: Tuesday, October 15, 2013 11:51 AM
To: David Peixotto
Cc: 林作健; llvm-commits


Subject: Re: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo code generates unrecognized code

 

 

 

On Tue, Oct 15, 2013 at 10:13 AM, David Peixotto <dpeixott at codeaurora.org> wrote:

Why does this patch need to make a change to the ARMConstantIslandsPass? I think it may be because we are still using an arm instruction for the constant load instead of the equivalent thumb instruction (tLDRpci).

 

I think your patch is incomplete for supporting thumb1. You need to modify the lowering for COPY_STRUCT_BYVAL_I32 in a few more places. We should be using the thumb1 opcodes for all the instructions generated by the lowering. You have handled the load/store case for the values, but you still need to take care of the constant pool load, the subtract (for the loop counter) and the branch. These instructions should use the tLDRpci (for constant pool load), tSubi8 (for the loop counter subtract), and tBcc (for the loop branch).

 

I didn’t see your bug and had filed another bug for this same issue: http://llvm.org/bugs/show_bug.cgi?id=17309. Please also check that your patch works with the test case I uploaded to that bug (also attached to this email). In addition to checks for thumb1, this test case also exposes a bug in the current code with the thumb2 lowering. It will trigger an assertion because an operand is added to a full instruction. It looks like the thumb2 failure comes from a copy/paste error when building the machine instructions.

 

Thanks for the testing case. Yes, the thumb2 failure is from a copy/pasto. I will fix that.

 

Manman 

 

 

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

 

 

From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of ???
Sent: Monday, October 14, 2013 6:29 PM
To: Manman Ren
Cc: llvm-commits
Subject: Re: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo code generates unrecognized code

 

I didn't add THUMB1 lines for function h simple armv5te doesn't support vldr instr.

And here is the patch for formatting.

 

2013/10/15 Manman Ren <manman.ren at gmail.com>

Hi,

 

The changes to ARMISelLowering.cpp look good to me.

--- lib/Target/ARM/ARMConstantIslandPass.cpp (版本 187190)

+++ lib/Target/ARM/ARMConstantIslandPass.cpp (工作副本)

@@ -634,6 +634,7 @@

 initializeFunctionInfo(const std::vector<MachineInstr*> &CPEMIs) {

   BBInfo.clear();

   BBInfo.resize(MF->getNumBlockIDs());

+  const ARMSubtarget * Subtarget = &MF->getTarget().getSubtarget<ARMSubtarget>();

 

   // First thing, compute the size of all basic blocks, and see if the function

   // has any inline assembly in it. If so, we have to be conservative about

@@ -757,7 +758,12 @@

           case ARM::LDRi12:

           case ARM::LDRcp:

           case ARM::t2LDRpci:

-            Bits = 12;  // +-offset_12

+            if(Subtarget->hasV7Ops())

+            {

+              Bits = 12;  // +-offset_12

+            } else {

+              Bits = 8;  // +-offset_8

+            }

 

Please simplify above by removing the parenthesis, or using "? 12 : 8".

I am not really familiar with the const island pass, so I cc'ed Bob for him to review it.

 

The testing case looks good in general, is there any reason you didn't add THUMB1 lines for function h()?

 

Thanks,

Manman

 

 

On Sun, Oct 13, 2013 at 4:48 AM, 林作健 <manjian2006 at gmail.com> wrote:

 


_______________________________________________
llvm-commits mailing list
llvm-commits at cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

 

 

 

 

 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131016/3fe45f75/attachment.html>