review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo code generates unrecognized code

Fri Oct 18 00:48:56 PDT 2013

[image: 内嵌图片 1]

Dear Bob,please note Encoding T2 ONLY supports imm8.And Encoding T3 is
ONLYsupported  by
ARMv6T2 and later.So my patch will fix this bug.

2013/10/18 林作健 <manjian2006 at gmail.com>

> Dear all,
> please note that the patch to ARMConstantIslandPass.cpp also effectively
> fix a bug,which causing as in thumb arm5te mode  fails to generate
> code.Because the ldr code fails to access constant pool too far away from
> it.
>
>
> 2013/10/18 David Peixotto <dpeixott at codeaurora.org>
>
>> Committed in r192915 and r192916.****
>>
>> ** **
>>
>> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>> hosted by The Linux Foundation****
>>
>> ** **
>>
>> ** **
>>
>> *From:* Manman Ren [mailto:manman.ren at gmail.com]
>> *Sent:* Thursday, October 17, 2013 11:14 AM
>> *To:* David Peixotto
>> *Cc:* 林作健; llvm-commits
>>
>> *Subject:* Re: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo
>> code generates unrecognized code****
>>
>> ** **
>>
>> ** **
>>
>> Please commit,****
>>
>> ** **
>>
>> Thanks,****
>>
>> Manman****
>>
>> ** **
>>
>> On Thu, Oct 17, 2013 at 9:32 AM, David Peixotto <dpeixott at codeaurora.org>
>> wrote:****
>>
>> Manman,****
>>
>>  ****
>>
>> Thanks for the tip on how to improve the test case. I’ve updated it with
>> your suggestion to only produce the output once. I’ve attached the final
>> patches.****
>>
>>  ****
>>
>> Ok to commit?****
>>
>>  ****
>>
>> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>> hosted by The Linux Foundation****
>>
>>  ****
>>
>>  ****
>>
>> *From:* Manman Ren [mailto:manman.ren at gmail.com]
>> *Sent:* Wednesday, October 16, 2013 4:39 PM
>> *To:* David Peixotto
>> *Cc:* 林作健; llvm-commits****
>>
>>
>> *Subject:* Re: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo
>> code generates unrecognized code****
>>
>>  ****
>>
>> +;RUN: llc < %s -mtriple=thumbv5-none-linux-gnueabi
>>  -verify-machineinstrs -filetype=obj | llvm-objdump -triple
>> thumbv5-none-linux-gnueabi -disassemble - | FileCheck %s
>> --check-prefix=THUMB1****
>>
>> +;RUN: llc < %s -mtriple=thumbv5-none-linux-gnueabi
>>  -verify-machineinstrs -filetype=obj | llvm-objdump -triple
>> thumbv5-none-linux-gnueabi -disassemble - | FileCheck %s
>> --check-prefix=T1POST****
>>
>> <-- These two run lines are the same except the prefix, just wondering
>> why?****
>>
>>  ****
>>
>> I was trying to have a test that checked for the presence of the correct
>> load/store instructions and the absence of the post-increment version for
>> thumb1. I want to pass on output like this:****
>>
>>  ****
>>
>> ldr r0, [r1]****
>>
>> add r0, #4****
>>
>>  ****
>>
>> and fail on output like this:****
>>
>> ldr r0, [r1], #4****
>>
>>  ****
>>
>> It wasn’t clear to me how to combine those two checks into a single pass
>> with FileCheck and so I used two passes. One to check for expected output
>> and one to check for lack of bad output.****
>>
>>  ****
>>
>> About the testing case, I am not sure how to do it in a single pass, but
>> you can save the objdump to a temp file and use the temp file to perform
>> two checks:****
>>
>> cat %t | FileCheck %s --check-prefix=THUMB1****
>>
>> cat %t | FileCheck %s --check-prefix=T1POST****
>>
>>  ****
>>
>> LGTM otherwise,****
>>
>>  ****
>>
>> Thanks,****
>>
>> Manman****
>>
>>  ****
>>
>> On Wed, Oct 16, 2013 at 3:57 PM, David Peixotto <dpeixott at codeaurora.org>
>> wrote:****
>>
>> Attached the updated patches based on the feedback so far. Please help
>> review these changes. Thanks!****
>>
>>  ****
>>
>> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>> hosted by The Linux Foundation****
>>
>>  ****
>>
>>  ****
>>
>> *From:* David Peixotto [mailto:dpeixott at codeaurora.org]
>> *Sent:* Wednesday, October 16, 2013 11:45 AM
>> *To:* 'Manman Ren'
>> *Cc:* '林作健'; 'llvm-commits'
>> *Subject:* RE: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo
>> code generates unrecognized code****
>>
>>  ****
>>
>> Thanks for the feedback. I will make the changes you suggested and upload
>> a new set of patches shortly. I responded below to your questions.****
>>
>>  ****
>>
>> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>> hosted by The Linux Foundation****
>>
>>  ****
>>
>>  ****
>>
>> *From:* Manman Ren [mailto:manman.ren at gmail.com <manman.ren at gmail.com>]
>> *Sent:* Wednesday, October 16, 2013 10:32 AM
>> *To:* David Peixotto
>> *Cc:* 林作健; llvm-commits****
>>
>>
>> *Subject:* Re: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo
>> code generates unrecognized code****
>>
>>  ****
>>
>>  ****
>>
>> A few nitpicks on the refactoring patch:****
>>
>>  ****
>>
>> +#include "llvm/Support/Debug.h"****
>>
>>  #include "llvm/Support/ErrorHandling.h"****
>>
>>  #include "llvm/Support/MathExtras.h"****
>>
>>  #include "llvm/Support/raw_ostream.h"****
>>
>>  #include "llvm/Target/TargetOptions.h"****
>>
>> +#include <utility>****
>>
>> <-- I am not sure whether these two files are needed or you forgot to
>> clean it up****
>>
>>  ****
>>
>> The <utility> include was added because I’m using std::pair. The Debug.h
>> was leftover from debugging. I will remove it.****
>>
>>  ****
>>
>> Could you add comments to each member function of
>> TargetStructByvalEmitter and StructByvalEmitter?****
>>
>>  ****
>>
>> Sure, no problem****
>>
>>  ****
>>
>> @@ -7286,26 +7603,19 @@ EmitStructByval(MachineInstr *MI,
>> MachineBasicBlock *BB) const {****
>>
>>                         Attribute::NoImplicitFloat) &&****
>>
>>          Subtarget->hasNEON()) {****
>>
>>        if ((Align % 16 == 0) && SizeVal >= 16) {****
>>
>> -        ldrOpc = ARM::VLD1q32wb_fixed;****
>>
>> -        strOpc = ARM::VST1q32wb_fixed;****
>>
>>          UnitSize = 16;****
>>
>> -        TRC_Vec = (const TargetRegisterClass*)&ARM::DPairRegClass;****
>>
>> -      }****
>>
>> -      else if ((Align % 8 == 0) && SizeVal >= 8) {****
>>
>> -        ldrOpc = ARM::VLD1d32wb_fixed;****
>>
>> -        strOpc = ARM::VST1d32wb_fixed;****
>>
>> +      } else if ((Align % 8 == 0) && SizeVal >= 8) {****
>>
>>          UnitSize = 8;****
>>
>> -        TRC_Vec = (const TargetRegisterClass*)&ARM::DPRRegClass;****
>>
>>        }****
>>
>>      }****
>>
>>      // Can't use NEON instructions.****
>>
>>      if (UnitSize == 0) {****
>>
>> -      ldrOpc = isThumb2 ? ARM::t2LDR_POST : ARM::LDR_POST_IMM;****
>>
>> -      strOpc = isThumb2 ? ARM::t2STR_POST : ARM::STR_POST_IMM;****
>>
>>        UnitSize = 4;****
>>
>>      }****
>>
>>    }****
>>
>> <-- Can you remove the parenthesis around the single statement?****
>>
>>  ****
>>
>> Will do.****
>>
>>  ****
>>
>> On the second patch:****
>>
>> +  unsigned emitConstantLoad(MachineBasicBlock *BB, MachineInstr *MI,****
>>
>> +                            DebugLoc &dl, unsigned Constant,****
>>
>> +                            const DataLayout *DL) {****
>>
>> +    unsigned constReg = MRI.createVirtualRegister(TRC);****
>>
>> +    unsigned Idx = getConstantPoolIndex(BB->getParent(), DL, Constant);*
>> ***
>>
>> +    AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(ARM::tLDRpci)).addReg(*
>> ***
>>
>> +        constReg, RegState::Define).addConstantPoolIndex(Idx));****
>>
>> +    return constReg;****
>>
>> +    return 0;****
>>
>> +  }****
>>
>> +****
>>
>> <-- Please remove the extra "return 0".****
>>
>>  ****
>>
>> Will do.****
>>
>>  ****
>>
>> +;RUN: llc < %s -mtriple=thumbv5-none-linux-gnueabi
>>  -verify-machineinstrs -filetype=obj | llvm-objdump -triple
>> thumbv5-none-linux-gnueabi -disassemble - | FileCheck %s
>> --check-prefix=THUMB1****
>>
>> +;RUN: llc < %s -mtriple=thumbv5-none-linux-gnueabi
>>  -verify-machineinstrs -filetype=obj | llvm-objdump -triple
>> thumbv5-none-linux-gnueabi -disassemble - | FileCheck %s
>> --check-prefix=T1POST****
>>
>> <-- These two run lines are the same except the prefix, just wondering
>> why?****
>>
>>  ****
>>
>> I was trying to have a test that checked for the presence of the correct
>> load/store instructions and the absence of the post-increment version for
>> thumb1. I want to pass on output like this:****
>>
>>  ****
>>
>> ldr r0, [r1]****
>>
>> add r0, #4****
>>
>>  ****
>>
>> and fail on output like this:****
>>
>> ldr r0, [r1], #4****
>>
>>  ****
>>
>> It wasn’t clear to me how to combine those two checks into a single pass
>> with FileCheck and so I used two passes. One to check for expected output
>> and one to check for lack of bad output.****
>>
>>  ****
>>
>>  ****
>>
>> Thanks,****
>>
>> Manman****
>>
>>  ****
>>
>> On Tue, Oct 15, 2013 at 3:27 PM, David Peixotto <dpeixott at codeaurora.org>
>> wrote:****
>>
>> Hi Manman,****
>>
>>  ****
>>
>> I split the patch into two parts as requested. Most of the changes are in
>> the refactoring patch. The thumb1 support is pretty straightforward after
>> the refactoring is done.****
>>
>>  ****
>>
>> Thanks for your help reviewing these patches.****
>>
>>  ****
>>
>>  ****
>>
>>  ****
>>
>> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>> hosted by The Linux Foundation****
>>
>>  ****
>>
>>  ****
>>
>> *From:* Manman Ren [mailto:manman.ren at gmail.com]
>> *Sent:* Tuesday, October 15, 2013 2:01 PM
>> *To:* David Peixotto
>> *Cc:* 林作健; llvm-commits****
>>
>>
>> *Subject:* Re: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo
>> code generates unrecognized code****
>>
>>  ****
>>
>>  ****
>>
>> Hi David,****
>>
>>  ****
>>
>> Thanks for working on this.****
>>
>> Can you separate your patch to a refactoring patch and a patch for Thumb1?
>> ****
>>
>>  ****
>>
>> It is easier to review and we should submit it as two commits.****
>>
>>  ****
>>
>> Manman****
>>
>>  ****
>>
>> On Tue, Oct 15, 2013 at 1:16 PM, David Peixotto <dpeixott at codeaurora.org>
>> wrote:****
>>
>> I’ve attached a patch that fixes the issue for thumb1 and thumb2. I
>> refactored the code a bit to isolate all the target-specific to a helper
>> class. It makes the code a bit longer, but it helped me to see where we
>> need different operations for each subtarget and to implement them in
>> isolation. I’m not opposed to doing it all inline, but I wasn’t sure I
>> would be able to get all the details right that way.****
>>
>>  ****
>>
>> This commit refactors the code so that we can easily support thumb1****
>>
>> targets and implements the thumb1 lowering.****
>>
>>  ****
>>
>> The original lowering algorithm is unchanged, but it now uses a****
>>
>> helper class to generate the machine instructions.  The refactoring****
>>
>> introduces a StructByvalEmitter class that encapsulates the****
>>
>> operations needed to lower the COPY_STRUCT pseudo instruction. The****
>>
>> basic operations (e.g. emit load, emit store) are all wrappers to****
>>
>> the correct machine instruction for each target. The operations are****
>>
>> then defined for Arm, Thumb1, and Thumb2 targets.****
>>
>>  ****
>>
>> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>> hosted by The Linux Foundation****
>>
>>  ****
>>
>>  ****
>>
>> *From:* Manman Ren [mailto:manman.ren at gmail.com]
>> *Sent:* Tuesday, October 15, 2013 11:51 AM
>> *To:* David Peixotto
>> *Cc:* 林作健; llvm-commits****
>>
>>
>> *Subject:* Re: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo
>> code generates unrecognized code****
>>
>>  ****
>>
>>  ****
>>
>>  ****
>>
>> On Tue, Oct 15, 2013 at 10:13 AM, David Peixotto <dpeixott at codeaurora.org>
>> wrote:****
>>
>> Why does this patch need to make a change to the ARMConstantIslandsPass?
>> I think it may be because we are still using an arm instruction for the
>> constant load instead of the equivalent thumb instruction (tLDRpci).****
>>
>>  ****
>>
>> I think your patch is incomplete for supporting thumb1. You need to
>> modify the lowering for COPY_STRUCT_BYVAL_I32 in a few more places. We
>> should be using the thumb1 opcodes for all the instructions generated by
>> the lowering. You have handled the load/store case for the values, but you
>> still need to take care of the constant pool load, the subtract (for the
>> loop counter) and the branch. These instructions should use the tLDRpci
>> (for constant pool load), tSubi8 (for the loop counter subtract), and tBcc
>> (for the loop branch).****
>>
>>  ****
>>
>> I didn’t see your bug and had filed another bug for this same issue:
>> http://llvm.org/bugs/show_bug.cgi?id=17309. Please also check that your
>> patch works with the test case I uploaded to that bug (also attached to
>> this email). In addition to checks for thumb1, this test case also exposes
>> a bug in the current code with the thumb2 lowering. It will trigger an
>> assertion because an operand is added to a full instruction. It looks like
>> the thumb2 failure comes from a copy/paste error when building the machine
>> instructions.****
>>
>>  ****
>>
>> Thanks for the testing case. Yes, the thumb2 failure is from a
>> copy/pasto. I will fix that.****
>>
>>  ****
>>
>> Manman ****
>>
>>  ****
>>
>>  ****
>>
>> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>> hosted by The Linux Foundation****
>>
>>  ****
>>
>>  ****
>>
>> *From:* llvm-commits-bounces at cs.uiuc.edu [mailto:
>> llvm-commits-bounces at cs.uiuc.edu] *On Behalf Of *???
>> *Sent:* Monday, October 14, 2013 6:29 PM
>> *To:* Manman Ren
>> *Cc:* llvm-commits
>> *Subject:* Re: review request Bug 16545 - COPY_STRUCT_BYVAL_I32 pseudo
>> code generates unrecognized code****
>>
>>  ****
>>
>> I didn't add THUMB1 lines for function h simple armv5te doesn't support
>> vldr instr.****
>>
>> And here is the patch for formatting.****
>>
>>  ****
>>
>> 2013/10/15 Manman Ren <manman.ren at gmail.com>****
>>
>> Hi,****
>>
>>  ****
>>
>> The changes to ARMISelLowering.cpp look good to me.****
>>
>>  ****
>>
>> --- lib/Target/ARM/ARMConstantIslandPass.cpp (版本 187190)****
>>
>> +++ lib/Target/ARM/ARMConstantIslandPass.cpp (工作副本)****
>>
>> @@ -634,6 +634,7 @@****
>>
>>  initializeFunctionInfo(const std::vector<MachineInstr*> &CPEMIs) {****
>>
>>    BBInfo.clear();****
>>
>>    BBInfo.resize(MF->getNumBlockIDs());****
>>
>> +  const ARMSubtarget * Subtarget =
>> &MF->getTarget().getSubtarget<ARMSubtarget>();****
>>
>>  ****
>>
>>    // First thing, compute the size of all basic blocks, and see if the
>> function****
>>
>>    // has any inline assembly in it. If so, we have to be conservative
>> about****
>>
>> @@ -757,7 +758,12 @@****
>>
>>            case ARM::LDRi12:****
>>
>>            case ARM::LDRcp:****
>>
>>            case ARM::t2LDRpci:****
>>
>> -            Bits = 12;  // +-offset_12****
>>
>> +            if(Subtarget->hasV7Ops())****
>>
>> +            {****
>>
>> +              Bits = 12;  // +-offset_12****
>>
>> +            } else {****
>>
>> +              Bits = 8;  // +-offset_8****
>>
>> +            }****
>>
>>  ****
>>
>> Please simplify above by removing the parenthesis, or using "? 12 : 8".**
>> **
>>
>> I am not really familiar with the const island pass, so I cc'ed Bob for
>> him to review it.****
>>
>>  ****
>>
>> The testing case looks good in general, is there any reason you didn't
>> add THUMB1 lines for function h()?****
>>
>>  ****
>>
>> Thanks,****
>>
>> Manman****
>>
>>  ****
>>
>>  ****
>>
>> On Sun, Oct 13, 2013 at 4:48 AM, 林作健 <manjian2006 at gmail.com> wrote:****
>>
>>  ****
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits****
>>
>>  ****
>>
>>  ****
>>
>>  ****
>>
>>  ****
>>
>>  ****
>>
>>  ****
>>
>> ** **
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131018/abfd6a93/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screenshot from 2013-10-18 15:45:35.png
Type: image/png
Size: 312607 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131018/abfd6a93/attachment.png>