[llvm-dev] Error in v64i32 type in x86 backend
Craig Topper via llvm-dev
llvm-dev at lists.llvm.org
Fri Jul 7 21:59:48 PDT 2017
You don't want RI. That's used for instructions that need a REX prefix. You
need to use $src1 and $src2 in the assembly string too. It also looks like
you have two closing ] brackets.
~Craig
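
For reference, a minimal sketch of the quoted definition with the three fixes
above applied: class I instead of RI, both source operands named in the
assembly string, and a single closing bracket on the pattern. VR2048 is the
poster's own register class. The opcode and the MRMDestReg form are carried
over unchanged from the quoted definition; this thread does not settle whether
that encoding is valid for a three-operand instruction, so treat those parts
as assumptions:

```tablegen
// Sketch only: opcode 0xFE and MRMDestReg are copied from the quoted
// definition and are not verified as a valid three-operand encoding.
def VADD_256B : I<0xFE, MRMDestReg, (outs VR2048:$dst),
                  (ins VR2048:$src1, VR2048:$src2),
                  // AT&T operand order | Intel operand order
                  "VADD_256B\t{$src2, $src1, $dst|$dst, $src1, $src2}",
                  [(set VR2048:$dst,
                        (add VR2048:$src1, VR2048:$src2))]>;
```

If VR2048 holds more than one value type, the pattern would likely need an
explicit type, for example (v64i32 (add ...)), so that TableGen can infer the
types.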
On Fri, Jul 7, 2017 at 9:55 PM, hameeza ahmed <hahmed2305 at gmail.com> wrote:
> Thank you.
> I have changed it as follows. Is it fine now?
>
> def VADD_256B : I<0xFE, MRMDestReg, (outs VR2048:$dst), (ins VR2048:$src1,
> VR2048:$src2),
> "VADD_256B\t{$src, $dst|$dst, $src}", [(set VR2048:$dst,
> (add VR2048:$src1, VR2048:$src2))]]>;
>
> Also, here I have changed the class from RI to I. Does it make any difference?
>
>
>
> On Sat, Jul 8, 2017 at 9:38 AM, Craig Topper <craig.topper at gmail.com>
> wrote:
>
>> IIC_XADD_REG is used to associate latency and other information for use
>> by the instruction scheduling pass.
>>
>> You're missing a pattern in the square bracket to match an add node. You
>> also need two VR2048 registers in the 'ins'
>>
>> ~Craig
>>
>> On Fri, Jul 7, 2017 at 9:29 PM, hameeza ahmed <hahmed2305 at gmail.com>
>> wrote:
>>
>>> Can you please tell me whether the following definition is correct for
>>> adding two 64 x i32 vectors?
>>>
>>> def VADD_256B : RI<0xFE, MRMDestReg, (outs VR2048:$dst), (ins VR2048
>>> :$src),
>>> "VADD_256B\t{$src, $dst|$dst, $src}", [],
>>> IIC_XADD_REG>, TB;
>>>
>>> What is IIC_XADD_REG here?
>>>
>>>
>>>
>>> On Sat, Jul 8, 2017 at 8:48 AM, Craig Topper <craig.topper at gmail.com>
>>> wrote:
>>>
>>>> Change the i32 in the store pattern to v64i32.
>>>>
>>>> On Fri, Jul 7, 2017 at 8:41 PM hameeza ahmed <hahmed2305 at gmail.com>
>>>> wrote:
>>>>
>>>>> Thank you. I understand how AVX-512 vector instructions are written in
>>>>> X86InstrAVX512.td. I need to define my own vector instructions, so I wrote:
>>>>>
>>>>> def VMOV_256B_RM : I<0x6F, MRMSrcMem, (outs VR2048:$dst), (ins
>>>>> i32mem:$src),
>>>>> "vmov_256B_rm\t{$src, $dst|$dst, $src}",
>>>>> [(set VR2048:$dst, (v64i32 (scalar_to_vector
>>>>> (loadi32 addr:$src))))],
>>>>> IIC_MOV_MEM>, EVEX;
>>>>>
>>>>> def VMOV_256B_MR : I<0x7F, MRMDestMem, (outs), (ins i32mem:$dst,
>>>>> VR2048:$src),
>>>>> "vmov_256B_mr\t{$src, $dst|$dst, $src}",
>>>>> [(store (i32 (bitconvert VR2048:$src)),
>>>>> addr:$dst)], IIC_MOV_MEM>, EVEX;
>>>>>
>>>>> in X86InstrInfo.td.
>>>>>
>>>>> When I build, these instructions appear in X86GenInstrInfo,
>>>>> but my instruction is still not selected when I run the input file in
>>>>> debug mode. I get the following errors:
>>>>>
>>>>>
>>>>> ===== Instruction selection begins: BB#1 'vector.body'
>>>>> Selecting: t9: ch = store<ST256[bitcast ([65 x i32]* @c to <64 x
>>>>> i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11, undef:i64
>>>>>
>>>>> ISEL: Starting pattern match on root node: t9: ch =
>>>>> store<ST256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)>
>>>>> t8, t7, t11, undef:i64
>>>>>
>>>>> Skipped scope entry (due to false predicate) at index 14, continuing
>>>>> at 81
>>>>> Skipped scope entry (due to false predicate) at index 82, continuing
>>>>> at 149
>>>>> Skipped scope entry (due to false predicate) at index 150,
>>>>> continuing at 217
>>>>> Skipped scope entry (due to false predicate) at index 218,
>>>>> continuing at 267
>>>>> Skipped scope entry (due to false predicate) at index 268,
>>>>> continuing at 317
>>>>> Skipped scope entry (due to false predicate) at index 318,
>>>>> continuing at 367
>>>>> Skipped scope entry (due to false predicate) at index 368,
>>>>> continuing at 394
>>>>> Skipped scope entry (due to false predicate) at index 395,
>>>>> continuing at 421
>>>>> Skipped scope entry (due to false predicate) at index 422,
>>>>> continuing at 471
>>>>> Skipped scope entry (due to false predicate) at index 472,
>>>>> continuing at 521
>>>>> Skipped scope entry (due to false predicate) at index 522,
>>>>> continuing at 571
>>>>> Skipped scope entry (due to false predicate) at index 572,
>>>>> continuing at 639
>>>>> Skipped scope entry (due to false predicate) at index 640,
>>>>> continuing at 707
>>>>> Skipped scope entry (due to false predicate) at index 708,
>>>>> continuing at 775
>>>>> Skipped scope entry (due to false predicate) at index 776,
>>>>> continuing at 804
>>>>> Skipped scope entry (due to false predicate) at index 805,
>>>>> continuing at 833
>>>>> Skipped scope entry (due to false predicate) at index 834,
>>>>> continuing at 862
>>>>> Skipped scope entry (due to false predicate) at index 863,
>>>>> continuing at 891
>>>>> Skipped scope entry (due to false predicate) at index 892,
>>>>> continuing at 920
>>>>> Skipped scope entry (due to false predicate) at index 921,
>>>>> continuing at 949
>>>>> Skipped scope entry (due to false predicate) at index 950,
>>>>> continuing at 987
>>>>> Skipped scope entry (due to false predicate) at index 988,
>>>>> continuing at 1025
>>>>> Match failed at index 12
>>>>> Continuing at 1026
>>>>> OpcodeSwitch from 1029 to 5725
>>>>> Match failed at index 5743
>>>>> Continuing at 5772
>>>>> Match failed at index 5776
>>>>> Continuing at 5805
>>>>> Match failed at index 5809
>>>>> Continuing at 5838
>>>>> Match failed at index 5842
>>>>> Continuing at 5911
>>>>> Match failed at index 5915
>>>>> Continuing at 5953
>>>>> Match failed at index 5957
>>>>> Continuing at 5995
>>>>> Match failed at index 5999
>>>>> Continuing at 6037
>>>>> Match failed at index 6041
>>>>> Continuing at 6084
>>>>> Match failed at index 6088
>>>>> Continuing at 6131
>>>>> Skipped scope entry (due to false predicate) at index 6138,
>>>>> continuing at 6181
>>>>> Skipped scope entry (due to false predicate) at index 6182,
>>>>> continuing at 6228
>>>>> Skipped scope entry (due to false predicate) at index 6235,
>>>>> continuing at 6384
>>>>> Match failed at index 6388
>>>>> Continuing at 6419
>>>>> Match failed at index 6423
>>>>> Continuing at 6454
>>>>> Match failed at index 6458
>>>>> Continuing at 6489
>>>>> Continuing at 6490
>>>>> Continuing at 6491
>>>>> Continuing at 6492
>>>>> Match failed at index 6514
>>>>> Continuing at 6545
>>>>> Match failed at index 6562
>>>>> Continuing at 6593
>>>>> Match failed at index 6610
>>>>> Continuing at 6641
>>>>> Continuing at 6642
>>>>> Match failed at index 6658
>>>>> Continuing at 6772
>>>>> Match failed at index 6788
>>>>> Continuing at 6902
>>>>> Continuing at 13636
>>>>> Match failed at index 13640
>>>>> Continuing at 14940
>>>>> Match failed at index 14943
>>>>> Continuing at 15415
>>>>> Match failed at index 15417
>>>>> Continuing at 15570
>>>>> Match failed at index 15571
>>>>> Continuing at 15598
>>>>> Match failed at index 15599
>>>>> Continuing at 15716
>>>>> Match failed at index 15719
>>>>> Continuing at 15837
>>>>> Match failed at index 15840
>>>>> Continuing at 16198
>>>>> Skipped scope entry (due to false predicate) at index 16203,
>>>>> continuing at 16285
>>>>> Skipped scope entry (due to false predicate) at index 16286,
>>>>> continuing at 16394
>>>>> Skipped scope entry (due to false predicate) at index 16395,
>>>>> continuing at 16464
>>>>> Skipped scope entry (due to false predicate) at index 16465,
>>>>> continuing at 16487
>>>>> Skipped scope entry (due to false predicate) at index 16488,
>>>>> continuing at 16510
>>>>> Skipped scope entry (due to false predicate) at index 16511,
>>>>> continuing at 16533
>>>>> Skipped scope entry (due to false predicate) at index 16534,
>>>>> continuing at 16556
>>>>> Skipped scope entry (due to false predicate) at index 16557,
>>>>> continuing at 16680
>>>>> Skipped scope entry (due to false predicate) at index 16681,
>>>>> continuing at 16804
>>>>> Skipped scope entry (due to false predicate) at index 16805,
>>>>> continuing at 16890
>>>>> Skipped scope entry (due to false predicate) at index 16891,
>>>>> continuing at 16976
>>>>> Skipped scope entry (due to false predicate) at index 16978,
>>>>> continuing at 17169
>>>>> Skipped scope entry (due to false predicate) at index 17171,
>>>>> continuing at 17342
>>>>> Skipped scope entry (due to false predicate) at index 17344,
>>>>> continuing at 17497
>>>>> Skipped scope entry (due to false predicate) at index 17499,
>>>>> continuing at 17632
>>>>> Skipped scope entry (due to false predicate) at index 17634,
>>>>> continuing at 17801
>>>>> Skipped scope entry (due to false predicate) at index 17803,
>>>>> continuing at 17944
>>>>> Skipped scope entry (due to false predicate) at index 17946,
>>>>> continuing at 18074
>>>>> Skipped scope entry (due to false predicate) at index 18075,
>>>>> continuing at 18178
>>>>> Skipped scope entry (due to false predicate) at index 18179,
>>>>> continuing at 18253
>>>>> Skipped scope entry (due to false predicate) at index 18254,
>>>>> continuing at 18278
>>>>> Skipped scope entry (due to false predicate) at index 18279,
>>>>> continuing at 18303
>>>>> Skipped scope entry (due to false predicate) at index 18304,
>>>>> continuing at 18328
>>>>> Skipped scope entry (due to false predicate) at index 18329,
>>>>> continuing at 18376
>>>>> Skipped scope entry (due to false predicate) at index 18377,
>>>>> continuing at 18424
>>>>> Skipped scope entry (due to false predicate) at index 18425,
>>>>> continuing at 18520
>>>>> Skipped scope entry (due to false predicate) at index 18521,
>>>>> continuing at 18636
>>>>> Skipped scope entry (due to false predicate) at index 18637,
>>>>> continuing at 18661
>>>>> Skipped scope entry (due to false predicate) at index 18662,
>>>>> continuing at 18711
>>>>> Skipped scope entry (due to false predicate) at index 18712,
>>>>> continuing at 18736
>>>>> Skipped scope entry (due to false predicate) at index 18737,
>>>>> continuing at 18770
>>>>> Skipped scope entry (due to false predicate) at index 18771,
>>>>> continuing at 18856
>>>>> Skipped scope entry (due to false predicate) at index 18857,
>>>>> continuing at 18942
>>>>> Skipped scope entry (due to false predicate) at index 18943,
>>>>> continuing at 19028
>>>>> Match failed at index 16201
>>>>> Continuing at 19029
>>>>> LLVM ERROR: Cannot select: t9: ch = store<ST256[bitcast ([65 x i32]*
>>>>> @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11,
>>>>> undef:i64
>>>>> t7: v64i32 = add t6, t4
>>>>> t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x
>>>>> i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> t0, t11,
>>>>> undef:i64
>>>>> t11: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
>>>>> @c> 0
>>>>> t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>> t3: i64 = undef
>>>>> t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x
>>>>> i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> t0, t13,
>>>>> undef:i64
>>>>> t13: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
>>>>> @b> 0
>>>>> t12: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
>>>>> t3: i64 = undef
>>>>> t11: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
>>>>> t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>> t3: i64 = undef
>>>>> In function: foo
>>>>>
>>>>>
>>>>>
>>>>> What could be the reason for this? Please correct me.
>>>>> I am stuck at this point.
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jul 7, 2017 at 10:59 PM, Friedman, Eli <
>>>>> efriedma at codeaurora.org> wrote:
>>>>>
>>>>>> The word "fold" is used all over LLVM. It generally refers to
>>>>>> transformations which delete an instruction.
>>>>>>
>>>>>> If you're asking about
>>>>>> http://llvm.org/docs/CodeGenerator.html#instruction-folding , it just
>>>>>> means an instruction which was produced by the "instruction folding"
>>>>>> transform; there isn't anything special about the instruction itself.
>>>>>>
>>>>>> -Eli
>>>>>>
>>>>>>
>>>>>> On 7/6/2017 10:51 PM, hameeza ahmed wrote:
>>>>>>
>>>>>> What is meant by folded instructions in LLVM?
>>>>>> How do they work?
>>>>>>
>>>>>> On Fri, Jul 7, 2017 at 10:19 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thank You.
>>>>>>>
>>>>>>> On Fri, Jul 7, 2017 at 10:03 AM, Craig Topper <
>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>
>>>>>>>> Yes, that error is from instruction selection. I think your
>>>>>>>> legalization changes worked fine.
>>>>>>>>
>>>>>>>> ~Craig
>>>>>>>>
>>>>>>>> On Thu, Jul 6, 2017 at 8:21 PM, hameeza ahmed via llvm-dev <
>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>
>>>>>>>>> I also ran the following command:
>>>>>>>>> llc -debug filer-knl_o3.ll
>>>>>>>>>
>>>>>>>>> Its output is attached here. Looking at the output, can we
>>>>>>>>> say that legalization runs fine and the error is due to instruction
>>>>>>>>> selection/pattern matching, which is not yet implemented?
>>>>>>>>>
>>>>>>>>> So do I need to worry and try to correct it at this stage, or
>>>>>>>>> should I move forward and implement instruction selection/pattern matching?
>>>>>>>>>
>>>>>>>>> Please guide me.
>>>>>>>>>
>>>>>>>>> Thank You
>>>>>>>>>
>>>>>>>>> On Fri, Jul 7, 2017 at 8:00 AM, hameeza ahmed <
>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Thank you. I have seen these links, but they don't cover the
>>>>>>>>>> problem that I mentioned. I am doing everything step by
>>>>>>>>>> step.
>>>>>>>>>>
>>>>>>>>>> I haven't yet worked on the instruction selection phase or its files.
>>>>>>>>>> Before that, I am trying to get legalization to allow vectors with
>>>>>>>>>> more than 16 elements, i.e. 64 x i32. So far I have mainly worked with
>>>>>>>>>> two files: X86RegisterInfo.td, to define the register class used in
>>>>>>>>>> legalization, and, most importantly, X86ISelLowering.cpp.
>>>>>>>>>>
>>>>>>>>>> Is there any relation between this and instruction selection?
>>>>>>>>>> Since instruction selection comes after combine and legalize, I haven't
>>>>>>>>>> worked on it yet.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Please correct me, I am stuck here.
>>>>>>>>>>
>>>>>>>>>> Thank You again
>>>>>>>>>>
>>>>>>>>>> On Fri, Jul 7, 2017 at 7:11 AM, Friedman, Eli <
>>>>>>>>>> efriedma at codeaurora.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> Have you read http://llvm.org/docs/WritingAnLLVMBackend.html
>>>>>>>>>>> and http://llvm.org/docs/CodeGenerator.html ?
>>>>>>>>>>> http://llvm.org/docs/WritingAnLLVMBackend.html#instruction-selector
>>>>>>>>>>> describes how to define a store instruction.
>>>>>>>>>>>
>>>>>>>>>>> -Eli
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 7/6/2017 6:51 PM, hameeza ahmed via llvm-dev wrote:
>>>>>>>>>>>
>>>>>>>>>>> Please correct me; I am stuck at this point.
>>>>>>>>>>>
>>>>>>>>>>> On Jul 6, 2017 5:18 PM, "hameeza ahmed" <hahmed2305 at gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hello,
>>>>>>>>>>> I am experimenting with increasing the register/vector width
>>>>>>>>>>> to 64 elements of 32 bits, instead of 16, in the x86 backend.
>>>>>>>>>>> For example, I have a loop with 65 iterations.
>>>>>>>>>>> If my IR generates a v64i32 and 1 scalar, the backend still breaks
>>>>>>>>>>> the v64i32 into 4 v16i32. I want it to retain v64i32. Likewise, if there
>>>>>>>>>>> are 128 elements in the loop, it should break them into 2 v64i32 instructions.
>>>>>>>>>>>
>>>>>>>>>>> To do this, I made the necessary changes in
>>>>>>>>>>> X86ISelLowering.cpp and rebuilt LLVM. When I use the
>>>>>>>>>>> option -view-dag-combine2-dags, I get the required output in the
>>>>>>>>>>> graph but the following error on the console:
>>>>>>>>>>>
>>>>>>>>>>> LLVM ERROR: Cannot select: t10: ch = store<ST256[bitcast ([65 x
>>>>>>>>>>> i32]* @a to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)> t9, t7,
>>>>>>>>>>> t12, undef:i64
>>>>>>>>>>> t7: v64i32 = add t6, t4
>>>>>>>>>>> t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x
>>>>>>>>>>> i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> t0, t14,
>>>>>>>>>>> undef:i64
>>>>>>>>>>> t14: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x
>>>>>>>>>>> i32]* @c> 0
>>>>>>>>>>> t13: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>>>>>>> t3: i64 = undef
>>>>>>>>>>> t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x
>>>>>>>>>>> i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> t0, t16,
>>>>>>>>>>> undef:i64
>>>>>>>>>>> t16: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x
>>>>>>>>>>> i32]* @b> 0
>>>>>>>>>>> t15: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
>>>>>>>>>>> t3: i64 = undef
>>>>>>>>>>> t12: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
>>>>>>>>>>> @a> 0
>>>>>>>>>>> t11: i64 = TargetGlobalAddress<[65 x i32]* @a> 0
>>>>>>>>>>> t3: i64 = undef
>>>>>>>>>>> In function: foo
>>>>>>>>>>>
>>>>>>>>>>> The DAG after legalization is also attached here.
>>>>>>>>>>>
>>>>>>>>>>> The source is a vector sum of 65 elements.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Kindly correct me.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>> --
>>>> ~Craig
>>>>
>>>
>>>
>>
>