Fix PR15293: ARM codegen ice - expected larger existing stack allocation

Stepan Dyatkovskiy stpworld at narod.ru
Wed May 1 11:36:00 PDT 2013


Hello,
Please find patch in attachment that fixes byval support in ARM backend 
for llvm level.

-Stepan.

Stepan Dyatkovskiy wrote:
> Commited in r180774. Now I'm working on milestone patch, that fixes
> byval support for llvm level. I published it before, now I need to
> update it. Hope it come soon.
>
> -Stepan.
>
> Manman Ren wrote:
>>
>> LGTM,
>>
>> Thanks,
>> Manman
>>
>> On Apr 28, 2013, at 10:50 AM, Stepan Dyatkovskiy wrote:
>>
>>> Hi Manman,
>>> Just for keeping things actual: I reattached patch with new names.
>>> -Stepan
>>>
>>> Stepan Dyatkovskiy wrote:
>>>> Hi Manman,
>>>>
>>>>> Since it represents the stack area needed to save parameters that are
>>>> actually passed in registers, how about RegsSaveSizeForParams or just
>>>> RegsSaveSize?
>>>>
>>>> In AAPCS R0-R3 are called like:
>>>> "Argument / result / scratch register".
>>>> So maybe we'll use "set/getArgRegsSaveSize" names?
>>>>
>>>>> Also it is weird to have "unsigned VARegSaveSize =
>>>> AFI->getByValRegsSaveSize();"
>>>>
>>>> Yea :-) Here replace "VAReg" with "ArgRegs": ArgRegsSaveSize,
>>>> ArgRegsSize.
>>>>
>>>> -Stepan
>>>>
>>>> Manman Ren wrote:
>>>>>
>>>>> Hi Stepan,
>>>>>
>>>>> Thanks for working on this.
>>>>>
>>>>> I put in a comment about the refactor patch sometime ago:
>>>>>>> I am not sure about the renaming patch though. Functions
>>>>>>> get|setVarArgsRegSaveSize are used for both vararg and byval.
>>>>>>> It does not sound right to replace VarARgs with ByVal.
>>>>>>
>>>>>> Actually it is used not for vararg function, as I found it main role
>>>>>> of this param in emitPrologue, it is used for proper stack
>>>>>> allocation.
>>>>>> Stack size should enough for storing all registers, used for byval
>>>>>> parameters. And yes, if it is VA function, we must also extend this
>>>>>> stack area for all unallocated R0-R3 regs storage.
>>>>>
>>>>> Agree. That is why I think ByVal is not a great name either.
>>>>> Since it represents the stack area needed to save parameters that are
>>>>> actually passed in registers, how about RegsSaveSizeForParams or just
>>>>> RegsSaveSize?
>>>>> Also it is weird to have "unsigned VARegSaveSize =
>>>>> AFI->getByValRegsSaveSize();" where we assign return value of
>>>>> getByValRegsSaveSize to a variable "VARegSaveSize".
>>>>>
>>>>> Manman
>>>>>
>>>>> On Apr 26, 2013, at 11:12 AM, Stepan Dyatkovskiy wrote:
>>>>>
>>>>>> Hi Manman and Renato,
>>>>>> Patch was tested with test-suite all test have bee passed.
>>>>>> -Stepan
>>>>>>
>>>>>> Stepan Dyatkovskiy wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> AAPCS, 5.5 Parameters Passing, Stage C, C4 and C5 have been fixed,
>>>>>>> r180011.
>>>>>>>
>>>>>>> So, next patch is in attachment: VarArgsStyleRegisters refactoring.
>>>>>>>
>>>>>>> -Stepan.
>>>>>>>
>>>>>>> Stepan Dyatkovskiy wrote:
>>>>>>>> Hello, Manman.
>>>>>>>> That's kind of snowball!
>>>>>>>> Today test-suite finished with two new failures on my beagle.
>>>>>>>> It helped me find out one more hidden logic dependency in ARM
>>>>>>>> backend.
>>>>>>>>
>>>>>>>> May be its not a mistake, but look:
>>>>>>>>
>>>>>>>> In ARMCallingConv.h, "f64AssignAAPCS" allocates registers for f64.
>>>>>>>> This
>>>>>>>> method is called for VA functions, since there is no CPRCs, and
>>>>>>>> *any*
>>>>>>>> parameters could be allocated in R0-R3. Either R0,R1 or R2,R3
>>>>>>>> could be
>>>>>>>> allocated here. If all the registers except R3 were allocated
>>>>>>>> somewhere
>>>>>>>> before, f64AssignAAPCS just moves whole f64 to stack, it doesn't
>>>>>>>> waste
>>>>>>>> R3, so R3 could be allocated later. But logic seems to be same
>>>>>>>> as in
>>>>>>>> ARMTargetLowering::HandleByVal.
>>>>>>>>
>>>>>>>> My main fix in calling convention was:
>>>>>>>>
>>>>>>>> def CC_ARM_AAPCS_Common : CallingConv<[
>>>>>>>> ...
>>>>>>>> -  CCIfType<[i32], CCIf<"State.getNextStackOffset() == 0 &&"
>>>>>>>> -                       "ArgFlags.getOrigAlign() != 8",
>>>>>>>> +  CCIfType<[i32], CCIf<"ArgFlags.getOrigAlign() != 8",
>>>>>>>>                          CCAssignToReg<[R0, R1, R2, R3]>>>,
>>>>>>>>
>>>>>>>> Now I got some picture why "State.getNextStackOffset() == 0" was
>>>>>>>> added:
>>>>>>>> due to f64AssignAAPCS implementation. May be the are more reasons:
>>>>>>>> next
>>>>>>>> test-suite results will show it then :-)
>>>>>>>>
>>>>>>>> Anyway, original CCIfType brakes C.4 and C.5, while my fix
>>>>>>>> brakes some
>>>>>>>> implicit statement defined in f64AssignAAPCS:
>>>>>>>>
>>>>>>>> "stack-offset != 0 doesn't mean all registers are allocated"
>>>>>>>>
>>>>>>>> So I've fixed f64AssignAAPCS too. Once we waste registers in
>>>>>>>> HandleByVal
>>>>>>>> (when we send parameter to stack), I think it would be good to keep
>>>>>>>> same
>>>>>>>> behaviour for f64AssignAAPCS since its HandleByVal's analogue
>>>>>>>> for f64
>>>>>>>> params.
>>>>>>>>
>>>>>>>> I attached new patch, now with 3 tests :-)
>>>>>>>> (3rd test is 2013-04-21-AAPCS-VA-C.1.cp.ll)
>>>>>>>>
>>>>>>>> -Stepan.
>>>>>>>>
>>>>>>>> Manman Ren wrote:
>>>>>>>>>
>>>>>>>>> LGTM,
>>>>>>>>>
>>>>>>>>> Thanks a lot,
>>>>>>>>> Manman
>>>>>>>>>
>>>>>>>>> On Apr 18, 2013, at 8:41 PM, Stepan Dyatkovskiy wrote:
>>>>>>>>>
>>>>>>>>>> Fixed :-) New patch (+2 tests) is attached.
>>>>>>>>>> -Stepan
>>>>>>>>>>
>>>>>>>>>> Manman Ren wrote:
>>>>>>>>>>>
>>>>>>>>>>> "AAPCS-C.4-vs-VFP-2013-04-17.patch" looks good to me.
>>>>>>>>>>>
>>>>>>>>>>> As a side note, we may have issues with 9 floats, 3 ints and a
>>>>>>>>>>> small
>>>>>>>>>>> byval.
>>>>>>>>>>> The standard says we should split the byval only when NSAA is
>>>>>>>>>>> equal
>>>>>>>>>>> to SP (C.5), in the case of 9 floats, 3 ints and a small byval,
>>>>>>>>>>> we should not split the byval since we advanced NSAA with the
>>>>>>>>>>> 9th
>>>>>>>>>>> float.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Manman
>>>>>>>>>>>
>>>>>>>>>>> On Apr 16, 2013, at 8:51 PM, Stepan Dyatkovskiy wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Manman,
>>>>>>>>>>>>
>>>>>>>>>>>>> AAPCS-C.4-vs-VFP-2013-04-16.patch looks good except that the
>>>>>>>>>>>>> testing case.
>>>>>>>>>>>>> Should it have 9 doubles and 1 integer instead of a single
>>>>>>>>>>>>> double
>>>>>>>>>>>>> and a single integer?
>>>>>>>>>>>>> The testing case passes even without the fix.
>>>>>>>>>>>> That was my midnight mistake, sorry.
>>>>>>>>>>>> I reattached patch with proper test. In the end of test
>>>>>>>>>>>> file, I've
>>>>>>>>>>>> added comments for @doFoo: small description of how parameters
>>>>>>>>>>>> must
>>>>>>>>>>>> be passed:
>>>>>>>>>>>> first eight doubles --> D0-D7
>>>>>>>>>>>> 9th double --> Stack
>>>>>>>>>>>> 10th param (first i32) --> R0.
>>>>>>>>>>>>
>>>>>>>>>>>> It is important to have 9 double params here, since we must
>>>>>>>>>>>> fill-up
>>>>>>>>>>>> all possible VFPs, then get compiler to use stack offset,
>>>>>>>>>>>> and only
>>>>>>>>>>>> after start use GPRs.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am not sure about the renaming patch though. Functions
>>>>>>>>>>>>> get|setVarArgsRegSaveSize are used for both vararg and byval.
>>>>>>>>>>>>> It does not sound right to replace VarARgs with ByVal.
>>>>>>>>>>>>
>>>>>>>>>>>> Actually it is used not for vararg function, as I found it main
>>>>>>>>>>>> role of this param in emitPrologue, it is used for proper stack
>>>>>>>>>>>> allocation.
>>>>>>>>>>>> Stack size should enough for storing all registers, used for
>>>>>>>>>>>> byval
>>>>>>>>>>>> parameters. And yes, if it is VA function, we must also extend
>>>>>>>>>>>> this
>>>>>>>>>>>> stack area for all unallocated R0-R3 regs storage.
>>>>>>>>>>> Agree. That is why I think ByVal is not a great name either.
>>>>>>>>>>> Since it represents the stack area needed to save parameters
>>>>>>>>>>> that
>>>>>>>>>>> are actually passed in registers, how about
>>>>>>>>>>> RegsSaveSizeForParams or
>>>>>>>>>>> just RegsSaveSize?
>>>>>>>>>>> Also it is weird to have "unsigned VARegSaveSize =
>>>>>>>>>>> AFI->getByValRegsSaveSize();" where we assign return value of
>>>>>>>>>>> getByValRegsSaveSize to a variable "VARegSaveSize".
>>>>>>>>>>>
>>>>>>>>>>> Thanks for working on this.
>>>>>>>>>>>>
>>>>>>>>>>>> -Stepan.
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for working on this.
>>>>>>>>>>>>> Manman
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Apr 15, 2013, at 9:23 AM, Stepan Dyatkovskiy wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Manman,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> *4.* One new change I've made is
>>>>>>>>>>>>>>>> CC_ARM_AAPCS_Common rule.
>>>>>>>>>>>>>>> Can we fix the above issue in a separate patch with
>>>>>>>>>>>>>>> additional
>>>>>>>>>>>>>>> testing case?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Well.. We can use next case: Co-Processor register candidates
>>>>>>>>>>>>>> may be
>>>>>>>>>>>>>> either in VFP or in stack, so when all VFP are allocated
>>>>>>>>>>>>>> stack is
>>>>>>>>>>>>>> used. We can use stack without GPR allocation in that case,
>>>>>>>>>>>>>> passing 9 f64 params, for example: first eight goes to d0-d7,
>>>>>>>>>>>>>> night goes to stack. Now as 10th parameter we pass i32,
>>>>>>>>>>>>>> according
>>>>>>>>>>>>>> to AAPCS, 5.5 Parameters Passing, Stage C, C.4:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [quote]
>>>>>>>>>>>>>> If the size in words of the argument is not more than r4
>>>>>>>>>>>>>> minus
>>>>>>>>>>>>>> NCRN, the argument is copied into
>>>>>>>>>>>>>> core registers, starting at the NCRN. The NCRN is
>>>>>>>>>>>>>> incremented by
>>>>>>>>>>>>>> the number of registers used.
>>>>>>>>>>>>>> Successive registers hold the parts of the argument they
>>>>>>>>>>>>>> would
>>>>>>>>>>>>>> hold if its value were loaded into
>>>>>>>>>>>>>> those registers from memory using an LDM instruction. The
>>>>>>>>>>>>>> argument has now been allocated.
>>>>>>>>>>>>>> [/quote]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It must to R0 in our case. But currently it goes to stack.
>>>>>>>>>>>>>> I also checked this case with GCC - it works as in AAPCS
>>>>>>>>>>>>>> written.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> About refactoring. I propose do it in two stages, first
>>>>>>>>>>>>>> rename
>>>>>>>>>>>>>> "getVarArgsRegSaveSize" to "get/setByValRegsSaveSize", and
>>>>>>>>>>>>>> detach
>>>>>>>>>>>>>> "StoreByValRegs" functionality from "VarArgStyleRegisters"
>>>>>>>>>>>>>> method.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I attached first refactor patch, Calling-Convention patch
>>>>>>>>>>>>>> with
>>>>>>>>>>>>>> test-case, and calling-convention .c test case.
>>>>>>>>>>>>>> Try run .c file with "arm-linux-gnueabihf-gcc" and
>>>>>>>>>>>>>> "./arm-linux-gnueabihf-clang -mcpu=cortex-a9".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> P.S.: My beagleboard is very very slow, "release" build takes
>>>>>>>>>>>>>> about 15 hours, test-suite may be 15-20 hours too. I'll
>>>>>>>>>>>>>> try ask
>>>>>>>>>>>>>> somebody to present me faster device, but it take some time.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Stepan.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Manman Ren wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Apr 12, 2013, at 5:15 AM, Stepan Dyatkovskiy wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hello Manman,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I reattached full patch with fixes.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. I've removedskipWastedByValRegs, addByValWastedRegs and
>>>>>>>>>>>>>>>> Wasted flag.
>>>>>>>>>>>>>>>> 2. Fixed usage of "offset" variable in LowerFormalArguments
>>>>>>>>>>>>>>>> 3. And fixed regression test: added function with two byval
>>>>>>>>>>>>>>>> structures.
>>>>>>>>>>>>>>> The above looks good.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> *4.* One new change I've made is
>>>>>>>>>>>>>>>> CC_ARM_AAPCS_Common rule.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Currently, for align == 4 and parameter type <=i32, we
>>>>>>>>>>>>>>>> try to
>>>>>>>>>>>>>>>> assign it to the first unallocated register from [r0-r3]
>>>>>>>>>>>>>>>> range,
>>>>>>>>>>>>>>>> but only if getNextStackOffset() == 0.
>>>>>>>>>>>>>>>> I suppose that "getNextStackOffset() != 0" is treated as
>>>>>>>>>>>>>>>> fact?
>>>>>>>>>>>>>>>> that all registers are used? and params goes to the
>>>>>>>>>>>>>>>> stack now.
>>>>>>>>>>>>>>>> After byval structures will be implemented, the next
>>>>>>>>>>>>>>>> case is
>>>>>>>>>>>>>>>> possible:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> void @f(%i32_struct* byval, i32)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Here, we need to allocate stack for first parameter, but
>>>>>>>>>>>>>>>> after
>>>>>>>>>>>>>>>> we still have unallocated registers. And second parameter
>>>>>>>>>>>>>>>> should be sent to r1. Current CC_ARM_AAPCS_Common
>>>>>>>>>>>>>>>> implementation will send second parameter to the stack.
>>>>>>>>>>>>>>>> Though, I'm still not sure about this change, just
>>>>>>>>>>>>>>>> because may
>>>>>>>>>>>>>>>> be there are more reasons to check nextStackOffset value.
>>>>>>>>>>>>>>> Can we fix the above issue in a separate patch with
>>>>>>>>>>>>>>> additional
>>>>>>>>>>>>>>> testing case?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> +    // for byval parameters.
>>>>>>>>>>>>>>>>>>> +    {
>>>>>>>>>>>>>>>>>>> +      unsigned InRegsParamsCount =
>>>>>>>>>>>>>>>>>>> State->getInRegsParamsCount();
>>>>>>>>>>>>>>>>>>> +      if (InRegsParamsCount) {
>>>>>>>>>>>>>>>>>>> +        unsigned RB, RE;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> State->getInRegsParamInfo(InRegsParamsCount-1, RB,
>>>>>>>>>>>>>>>>>>> RE);
>>>>>>>>>>>>>>>>>>> +        assert(reg >= RE);
>>>>>>>>>>>>>>>>>>> +      }
>>>>>>>>>>>>>>>>>>> +    }
>>>>>>>>>>>>>>>>>> Is this only for assertion? Is this necessary?
>>>>>>>>>>>>>>>>> Yes. I've just moved this check from "if" statement,
>>>>>>>>>>>>>>>>> since it
>>>>>>>>>>>>>>>>> must be
>>>>>>>>>>>>>>>>> true always.
>>>>>>>>>>>>>>>> How do you think, do we need this assertion?
>>>>>>>>>>>>>>> I don't really think it is necessary :)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I didn't present refactor patch, just because main patch
>>>>>>>>>>>>>>>> didn't
>>>>>>>>>>>>>>>> get LGTM community and hypothetically may contain some
>>>>>>>>>>>>>>>> mistakes..
>>>>>>>>>>>>>>> LGTM otherwise.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Manman
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -Stepan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Stepan Dyatkovskiy wrote:
>>>>>>>>>>>>>>>>> Manman Ren wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Stepan,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> One high-level question: do we need to insert the wasted
>>>>>>>>>>>>>>>>>> record to
>>>>>>>>>>>>>>>>>> ByValRegs and later on skip those records?
>>>>>>>>>>>>>>>>>> It seems to me that we can only insert the records for
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> actual byval
>>>>>>>>>>>>>>>>>> parameters.
>>>>>>>>>>>>>>>>>> If that is true, we can remove addByValWastedRegs and
>>>>>>>>>>>>>>>>>> skipWastedByValRegs.
>>>>>>>>>>>>>>>>> Hm.. yes its reasonable. I'll try to fix.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +        // "offset" value would be usefull if byval
>>>>>>>>>>>>>>>>>>> parameter outsides
>>>>>>>>>>>>>>>>>>> +        // the registers area. Note, in that case
>>>>>>>>>>>>>>>>>>> nextInRegsParam()
>>>>>>>>>>>>>>>>>>> will not
>>>>>>>>>>>>>>>>>>> +        // skip any waste registers, since parameter
>>>>>>>>>>>>>>>>>>> fillup
>>>>>>>>>>>>>>>>>>> the whole
>>>>>>>>>>>>>>>>>>> rest
>>>>>>>>>>>>>>>>>>> +        // of registers set.
>>>>>>>>>>>>>>>>>>> +        unsigned RegsUsed = RegEnd - RegBegin;
>>>>>>>>>>>>>>>>>>> +        if (!CCInfo.nextInRegsParam() &&
>>>>>>>>>>>>>>>>>>> Flags.getByValSize() >
>>>>>>>>>>>>>>>>>>> 4*RegsUsed )
>>>>>>>>>>>>>>>>>>> +          offset = RegEnd - RegBegin;
>>>>>>>>>>>>>>>>>>>        }
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> -      if (Flags.getByValSize() - 4*offset > 0) {
>>>>>>>>>>>>>>>>>>> +      if (offset) {
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This still does not look right.
>>>>>>>>>>>>>>>>>> There are 3 cases:
>>>>>>>>>>>>>>>>>> If the byval parameter is in registers partially, we
>>>>>>>>>>>>>>>>>> should
>>>>>>>>>>>>>>>>>> create a
>>>>>>>>>>>>>>>>>> COPY_STRUCT_BYVAL with the correct offset;
>>>>>>>>>>>>>>>>>> If the byval parameter is all in registers, there is
>>>>>>>>>>>>>>>>>> no need
>>>>>>>>>>>>>>>>>> to create a
>>>>>>>>>>>>>>>>>> COPY_STRUCT_BYVAL;
>>>>>>>>>>>>>>>>>> If the byval parameter is all on stack, we should
>>>>>>>>>>>>>>>>>> create a
>>>>>>>>>>>>>>>>>> COPY_STRUCT_BYVAL with offset equal to 0.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> When looking at the above code, offset is default to
>>>>>>>>>>>>>>>>>> zero, if
>>>>>>>>>>>>>>>>>> CurByValIdx >= ByValArgsCount (this byval parameter is
>>>>>>>>>>>>>>>>>> all on
>>>>>>>>>>>>>>>>>> stack),
>>>>>>>>>>>>>>>>>> offset will still be zero,
>>>>>>>>>>>>>>>>>> and we will not create a COPY_STRUCT_BYVAL.
>>>>>>>>>>>>>>>>>> The correct logic seems to be always updating offset
>>>>>>>>>>>>>>>>>> inside
>>>>>>>>>>>>>>>>>> the if and
>>>>>>>>>>>>>>>>>> check Flags.getByValSize() > 4*offset
>>>>>>>>>>>>>>>>>> if (CurByValIdx < ByValArgsCount) {
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>    offset = RegEnd - RegBegin; //number of registers
>>>>>>>>>>>>>>>>>> occupied
>>>>>>>>>>>>>>>>>> by this
>>>>>>>>>>>>>>>>>> byval parameter
>>>>>>>>>>>>>>>>>>    CCInfo.nextInRegsParam(); // go to the next byval
>>>>>>>>>>>>>>>>>> parameter that is
>>>>>>>>>>>>>>>>>> in registers partially or all
>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>> if (Flags.getByValSize() > 4*offset) {
>>>>>>>>>>>>>>>>>>    // Part of the byval parameter is on stack.
>>>>>>>>>>>>>>>>>>    ...
>>>>>>>>>>>>>>>>> Oh.. I missed third case. Then we keep first variant here.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> -  if ((!State->isFirstByValRegValid()) &&
>>>>>>>>>>>>>>>>>>> -      (ARM::R0 <= reg) && (reg <= ARM::R3)) {
>>>>>>>>>>>>>>>>>>> +  if ((ARM::R0 <= reg) && (reg <= ARM::R3)) {
>>>>>>>>>>>>>>>>>>> +    // All registers reserved for byval parameters
>>>>>>>>>>>>>>>>>>> should
>>>>>>>>>>>>>>>>>>> be allocated.
>>>>>>>>>>>>>>>>>>> +    // Check that first unallocated register is higher
>>>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>> last one
>>>>>>>>>>>>>>>>>>> used
>>>>>>>>>>>>>>>>>>> +    // for byval parameters.
>>>>>>>>>>>>>>>>>>> +    {
>>>>>>>>>>>>>>>>>>> +      unsigned InRegsParamsCount =
>>>>>>>>>>>>>>>>>>> State->getInRegsParamsCount();
>>>>>>>>>>>>>>>>>>> +      if (InRegsParamsCount) {
>>>>>>>>>>>>>>>>>>> +        unsigned RB, RE;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> State->getInRegsParamInfo(InRegsParamsCount-1, RB,
>>>>>>>>>>>>>>>>>>> RE);
>>>>>>>>>>>>>>>>>>> +        assert(reg >= RE);
>>>>>>>>>>>>>>>>>>> +      }
>>>>>>>>>>>>>>>>>>> +    }
>>>>>>>>>>>>>>>>>> Is this only for assertion? Is this necessary?
>>>>>>>>>>>>>>>>> Yes. I've just moved this check from "if" statement,
>>>>>>>>>>>>>>>>> since it
>>>>>>>>>>>>>>>>> must be
>>>>>>>>>>>>>>>>> true always.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> -void
>>>>>>>>>>>>>>>>>>> -ARMTargetLowering::VarArgStyleRegisters(CCState
>>>>>>>>>>>>>>>>>>> &CCInfo,
>>>>>>>>>>>>>>>>>>> SelectionDAG
>>>>>>>>>>>>>>>>>>> &DAG,
>>>>>>>>>>>>>>>>>>> -                                        DebugLoc dl,
>>>>>>>>>>>>>>>>>>> SDValue &Chain,
>>>>>>>>>>>>>>>>>>> -                                        const Value
>>>>>>>>>>>>>>>>>>> *OrigArg,
>>>>>>>>>>>>>>>>>>> -                                        unsigned
>>>>>>>>>>>>>>>>>>> OffsetFromOrigArg,
>>>>>>>>>>>>>>>>>>> -                                        unsigned
>>>>>>>>>>>>>>>>>>> ArgOffset,
>>>>>>>>>>>>>>>>>>> -                                        bool
>>>>>>>>>>>>>>>>>>> ForceMutable)
>>>>>>>>>>>>>>>>>>> const {
>>>>>>>>>>>>>>>>>>> +// Return: The frame index registers were stored into.
>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>> +ARMTargetLowering::StoreByValRegs(CCState &CCInfo,
>>>>>>>>>>>>>>>>>>> SelectionDAG &DAG,
>>>>>>>>>>>>>>>>>>> +                                  DebugLoc dl, SDValue
>>>>>>>>>>>>>>>>>>> &Chain,
>>>>>>>>>>>>>>>>>>> +                                  const Value *OrigArg,
>>>>>>>>>>>>>>>>>>> +                                  unsigned
>>>>>>>>>>>>>>>>>>> InRegsParamRecordIdx,
>>>>>>>>>>>>>>>>>>> +                                  unsigned
>>>>>>>>>>>>>>>>>>> OffsetFromOrigArg,
>>>>>>>>>>>>>>>>>>> +                                  unsigned ArgOffset,
>>>>>>>>>>>>>>>>>>> +                                  bool ForceMutable)
>>>>>>>>>>>>>>>>>>> const {
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> It would be much cleaner if we can move the
>>>>>>>>>>>>>>>>>> refactoring of
>>>>>>>>>>>>>>>>>> VarArgStyleRegisters to a separate patch.
>>>>>>>>>>>>>>>>> For sure.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> P.S.: Today I had some troubles with electricity, so
>>>>>>>>>>>>>>>>> I'll try
>>>>>>>>>>>>>>>>> present
>>>>>>>>>>>>>>>>> new patch tomorrow.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -Stepan.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>> llvm-commits mailing list
>>>>>>>>>>>>>>>>> llvm-commits at cs.uiuc.edu
>>>>>>>>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> <pr15293-2013-04-12.patch>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> <stack-offset.c><VarArgsRegSaveSize-2013-04-16.patch><AAPCS-C.4-vs-VFP-2013-04-16.patch>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> <AAPCS-C.4-vs-VFP-2013-04-17.patch>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> <AAPCS-C.4-vs-VFP-2013-04-19.patch>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> <StoreByValRegs-2013-04-17.patch>
>>>>>
>>>>
>>>
>>> <StoreByValRegs-2013-04-28.patch>
>>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

-------------- next part --------------
A non-text attachment was scrubbed...
Name: pr15293-2013-05-01.patch
Type: text/x-diff
Size: 18398 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130501/7c51ded2/attachment.patch>


More information about the llvm-commits mailing list