[llvm-dev] Conditional Register Assignment based on the no of loop iterations

hameeza ahmed via llvm-dev llvm-dev at lists.llvm.org
Mon Jul 10 21:00:45 PDT 2017


Thank you for your reply.
No, I am not asking for software pipelining. My goal is different.

I am targeting hardware with 64-element vector operations. There are two
scenarios:

If my loop has >=2048 iterations I use vector width=2048, and if my loop has
fewer than 2048 iterations I use vector width=64. My hardware does not support
vector width=2048, though; it only supports vector width=64.
So when iterations >=2048, the operations are split again into v64i32 (I have
implemented this). Up to this point instruction selection is working fine.
The only difference should be in the register set. The registers are the same
for both scenarios, but their ordering needs to be different, as follows.
If the number of iterations is >=2048, the ordering should be that of Reg_A:

R_0_V_0, R_0_V_1, R_0_V_2, R_0_V_3,
R_1_V_0, R_1_V_1, R_1_V_2, R_1_V_3,
R_2_V_0, R_2_V_1, R_2_V_2, R_2_V_3.
These registers are defined in object Reg_A.

Here:

1st load will take place in R_0_V_0
2nd load will take place in R_0_V_1
3rd load will take place in R_0_V_2
4th load will take place in R_0_V_3

If the number of iterations is <2048, the ordering should be that of Reg_B:

R_0_V_0, R_1_V_0, R_2_V_0,  //here R changes
R_0_V_1, R_1_V_1, R_2_V_1,
R_0_V_2, R_1_V_2, R_2_V_2,
R_0_V_3, R_1_V_3, R_2_V_3.

Here:

1st load to take place in R_0_V_0
2nd load to take place in R_1_V_0
3rd load to take place in R_2_V_0


What I need is this: if the number of iterations is <2048 and I set vector
width=64, I want my operations to be performed on Reg_B; if the number of
iterations is >=2048, the operations should be performed on Reg_A.
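
To make the two orderings concrete, here is a minimal standalone sketch in
plain C++. It is not LLVM code and does not touch the register allocator;
the function names allocationOrder and orderForTripCount are hypothetical
and only illustrate which register sequence each case should produce.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not an LLVM API: lists the 12 register names.
// vFastest == true  -> Reg_A order: V changes first (R_0_V_0, R_0_V_1, ...)
// vFastest == false -> Reg_B order: R changes first (R_0_V_0, R_1_V_0, ...)
std::vector<std::string> allocationOrder(bool vFastest, int numR = 3, int numV = 4) {
  std::vector<std::string> order;
  const int outer = vFastest ? numR : numV;
  const int inner = vFastest ? numV : numR;
  for (int o = 0; o < outer; ++o) {
    for (int i = 0; i < inner; ++i) {
      const int r = vFastest ? o : i;
      const int v = vFastest ? i : o;
      order.push_back("R_" + std::to_string(r) + "_V_" + std::to_string(v));
    }
  }
  return order;
}

// Threshold taken from the text: >=2048 iterations selects the Reg_A order,
// anything smaller selects the Reg_B order.
std::vector<std::string> orderForTripCount(unsigned long tripCount) {
  return allocationOrder(tripCount >= 2048);
}
```

With this, orderForTripCount(2048) starts R_0_V_0, R_0_V_1, ... while
orderForTripCount(100) starts R_0_V_0, R_1_V_0, ...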

I have thought of several solutions to this, but none has worked yet.

Could it be something like this?

I provide support for v2048i32 in LLVM and then use Expand like this:

    setOperationAction(ISD::LOAD     , MVT::v2048i32  , Expand);

so that it can split v2048i32 into v64i32, and in instructioninfo.td I
associate Reg_A with it instead of Reg_B. There in instructioninfo.td I would
use v2048i32 for >=2048 iterations and v64i32 for <2048 iterations, but
v2048i32 still has to be lowered to v64i32 (in isellowering.cpp) because the
hardware has no support for v2048i32. So I would use v2048i32 here just to
distinguish the number of iterations.
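
As a rough illustration of the arithmetic only (again plain C++, not the
SelectionDAG legalizer), splitting one logical 2048-element vector into
64-element hardware vectors gives 2048 / 64 = 32 pieces. The names kHwWidth
and splitIntoHwVectors are hypothetical, and the sketch assumes the logical
width is a multiple of 64.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Widest vector the hardware supports, as stated in the text.
constexpr std::size_t kHwWidth = 64;

// Hypothetical helper: returns the [begin, end) element range covered by each
// hardware-width piece. Assumes logicalWidth is a multiple of kHwWidth.
std::vector<std::pair<std::size_t, std::size_t>>
splitIntoHwVectors(std::size_t logicalWidth) {
  std::vector<std::pair<std::size_t, std::size_t>> parts;
  for (std::size_t begin = 0; begin < logicalWidth; begin += kHwWidth)
    parts.emplace_back(begin, begin + kHwWidth);
  return parts;
}
```

For a width of 2048 this yields 32 ranges, the last one covering elements
1984 to 2047.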

What do you say?
Please help. I am seriously stuck here.

Thank you





On Tue, Jul 11, 2017 at 7:48 AM, Matthias Braun <mbraun at apple.com> wrote:

> Are you asking for software pipelining? There is a MachinePipeliner pass
> that targets can put into their pass pipeline.
>
> - Matthias
>
> On Jul 10, 2017, at 3:46 AM, hameeza ahmed via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Can someone suggest a solution for this problem?
> I need serious help. My work is stuck.
>
> On Mon, Jul 10, 2017 at 10:22 AM, hameeza ahmed <hahmed2305 at gmail.com>
> wrote:
>
>> Basically my problem here is the vector width, since I have used v64i32 in my
>> backend. If vector width=64, I want the Reg_B class registers to be
>> assigned, and if vector width=2048, I want the Reg_A registers to be assigned
>> to the instruction.
>>
>> Should I incorporate the solution in the lowering stage? Something like:
>>
>> addRegisterClass(MVT::v2048i32, &X86::Reg_B);
>>
>>               setOperationAction(ISD::MNLOAD, MVT::v2048i32, Custom);
>>
>> Then, in the function LowerOperation(SDValue Op, SelectionDAG &DAG),
>>
>> I should do:
>>
>>   case ISD::MNLOAD:               return LOAD2048(Op, Subtarget, DAG);
>>
>> Then I will implement
>> static SDValue LOAD2048(SDValue Op, const X86Subtarget &Subtarget,
>>                           SelectionDAG &DAG)
>> {
>>
>> // I don't know the details of this part,
>> // but here I plan to encode the 2048 elements again as 32 v64i32, but with a
>> // different instruction name; e.g. previously it was
>> // load<LD256; I intend to make it load<LD256_N
>> }
>>
>> so that in instructioninfo.td, during pattern matching, LD256 and
>> LD256_N are treated separately. One will use the Reg_B registers and the
>> other will use Reg_A, respectively.
>>
>>
>> Is this fine?
>> Please guide me.
>> I need serious help, please.
>>
>> Thank You
>>
>>
>>
>> On Mon, Jul 10, 2017 at 9:29 AM, hameeza ahmed <hahmed2305 at gmail.com>
>> wrote:
>>
>>> Or should I write a condition in registerinfo.td to define the
>>> registers in object Reg_A in a specific order according to the loop iterations?
>>>
>>> On Mon, Jul 10, 2017 at 9:17 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I have a situation where I have to assign registers to instructions
>>>> based on the number of loop iterations.
>>>>
>>>> For example,
>>>> the registers are:
>>>> R_0_V_0, R_0_V_1, R_0_V_2, R_0_V_3,
>>>> R_1_V_0, R_1_V_1, R_1_V_2, R_1_V_3,
>>>> R_2_V_0, R_2_V_1, R_2_V_2, R_2_V_3.
>>>> These registers are defined in object Reg_A.
>>>>
>>>> There are 12 registers in total. I will use them contiguously; here I
>>>> define them in the order shown above, i.e. changing V first, then R.
>>>>
>>>>
>>>> For example,
>>>> if the number of iterations is >=4:
>>>> 1st load will take place in R_0_V_0
>>>> 2nd load will take place in R_0_V_1
>>>> 3rd load will take place in R_0_V_2
>>>> 4th load will take place in R_0_V_3
>>>> I am getting this required behavior for iterations>=4. I want this to
>>>> happen only if there are 4 or more iterations in the loop.
>>>>
>>>>
>>>> But if there are fewer than 4 iterations, say 3,
>>>> it will again do the same thing:
>>>> 1st load will take place in R_0_V_0
>>>> 2nd load will take place in R_0_V_1
>>>> 3rd load will take place in R_0_V_2
>>>>
>>>> Here I don't want the above to happen; rather, it should increment R
>>>> instead of V in this case.
>>>> It should do something as follows:
>>>> 1st load to take place in R_0_V_0
>>>> 2nd load to take place in R_1_V_0
>>>> 3rd load to take place in R_2_V_0
>>>>
>>>> Now, how can I achieve this?
>>>>
>>>> Can I specify some condition in the instructioninfo.td file?
>>>> And in the registerinfo.td file, instead of one object Reg_A, there would be two
>>>> objects, Reg_A and Reg_B,
>>>> where Reg_B defines the same registers but in a different order.
>>>>
>>>> Reg_B:
>>>>
>>>> R_0_V_0, R_1_V_0, R_2_V_0,  //here R changes
>>>> R_0_V_1, R_1_V_1, R_2_V_1,
>>>> R_0_V_2, R_1_V_2, R_2_V_2,
>>>> R_0_V_3, R_1_V_3, R_2_V_3.
>>>>
>>>> So that in the instructioninfo.td file it would be something like:
>>>>
>>>> if (no of iterations>=4)
>>>>
>>>> load ................$Reg_A      ; here all register operands will come
>>>> from Reg_A instance.
>>>>
>>>> if (no of iterations<4)
>>>>
>>>> load ................$Reg_B      ; here all register operands will come
>>>> from Reg_B instance.
>>>>
>>>>
>>>> Is the above approach possible? If yes, how can we obtain the number
>>>> of iterations in the instructioninfo.td file?
>>>>
>>>> Or can you suggest a better way?
>>>>
>>>> Looking forward to your response.
>>>>
>>>> Thank You.
>>>>
>>>>
>>>>
>>>>
>>>
>>