[llvm-dev] Error in v64i32 type in x86 backend
Craig Topper via llvm-dev
llvm-dev at lists.llvm.org
Fri Jul 7 21:38:45 PDT 2017
IIC_XADD_REG is used to associate latency and other information for use by
the instruction scheduling pass.
You're missing a pattern in the square bracket to match an add node. You
also need two VR2048 registers in the 'ins'
~Craig
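
To illustrate the two changes described above, here is a rough TableGen
sketch, not a tested definition. It assumes VR2048 is a register class that
holds v64i32; the operand names $src1 and $src2 and the assembly string are
illustrative only. The opcode, format, and prefix are copied unchanged from
the question and have not been checked. A register-register form with two
sources may well need a different format (for example MRMSrcReg) and either
a "$src1 = $dst" constraint or a VEX/EVEX-style encoding.

```tablegen
// Sketch only: operand names and asm string are hypothetical.
// Encoding fields (0xFE, MRMDestReg, TB) are kept from the question, unverified.
def VADD_256B : RI<0xFE, MRMDestReg, (outs VR2048:$dst),
                   (ins VR2048:$src1, VR2048:$src2),
                   "VADD_256B\t{$src2, $src1, $dst|$dst, $src1, $src2}",
                   // Pattern that lets instruction selection match a v64i32 add node.
                   [(set VR2048:$dst,
                         (v64i32 (add VR2048:$src1, VR2048:$src2)))],
                   IIC_XADD_REG>, TB;
```

With an empty pattern list ([]), the instruction exists for the assembler but
instruction selection has nothing to match against, which is consistent with
the "Cannot select" errors quoted below.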
On Fri, Jul 7, 2017 at 9:29 PM, hameeza ahmed <hahmed2305 at gmail.com> wrote:
> Can you please tell me whether the following definition is correct for
> adding two 64 x i32 vectors?
>
> def VADD_256B : RI<0xFE, MRMDestReg, (outs VR2048:$dst), (ins VR2048:$src),
> "VADD_256B\t{$src, $dst|$dst, $src}", [],
> IIC_XADD_REG>, TB;
>
> What is IIC_XADD_REG here?
>
>
>
> On Sat, Jul 8, 2017 at 8:48 AM, Craig Topper <craig.topper at gmail.com>
> wrote:
>
>> Change the i32 in the store pattern to v64i32.
>>
>> On Fri, Jul 7, 2017 at 8:41 PM hameeza ahmed <hahmed2305 at gmail.com>
>> wrote:
>>
>>> Thank you. I understand how the AVX-512 vector instructions are written in
>>> X86InstrAVX512.td. I need to define my own vector instructions, so I wrote:
>>>
>>> def VMOV_256B_RM : I<0x6F, MRMSrcMem, (outs VR2048:$dst), (ins
>>> i32mem:$src),
>>> "vmov_256B_rm\t{$src, $dst|$dst, $src}",
>>> [(set VR2048:$dst, (v64i32 (scalar_to_vector
>>> (loadi32 addr:$src))))],
>>> IIC_MOV_MEM>, EVEX;
>>>
>>> def VMOV_256B_MR : I<0x7F, MRMDestMem, (outs), (ins i32mem:$dst,
>>> VR2048:$src),
>>> "vmov_256B_mr\t{$src, $dst|$dst, $src}",
>>> [(store (i32 (bitconvert VR2048:$src)), addr:$dst)],
>>> IIC_MOV_MEM>, EVEX;
>>>
>>> in X86InstrInfo.td.
>>>
>>> After building, these instructions appear in X86GenInstrInfo, but my
>>> instruction is still not selected when I run the input file in debug mode.
>>> I get the following errors:
>>>
>>>
>>> ===== Instruction selection begins: BB#1 'vector.body'
>>> Selecting: t9: ch = store<ST256[bitcast ([65 x i32]* @c to <64 x
>>> i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11, undef:i64
>>>
>>> ISEL: Starting pattern match on root node: t9: ch = store<ST256[bitcast
>>> ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7,
>>> t11, undef:i64
>>>
>>> Skipped scope entry (due to false predicate) at index 14, continuing
>>> at 81
>>> Skipped scope entry (due to false predicate) at index 82, continuing
>>> at 149
>>> Skipped scope entry (due to false predicate) at index 150, continuing
>>> at 217
>>> Skipped scope entry (due to false predicate) at index 218, continuing
>>> at 267
>>> Skipped scope entry (due to false predicate) at index 268, continuing
>>> at 317
>>> Skipped scope entry (due to false predicate) at index 318, continuing
>>> at 367
>>> Skipped scope entry (due to false predicate) at index 368, continuing
>>> at 394
>>> Skipped scope entry (due to false predicate) at index 395, continuing
>>> at 421
>>> Skipped scope entry (due to false predicate) at index 422, continuing
>>> at 471
>>> Skipped scope entry (due to false predicate) at index 472, continuing
>>> at 521
>>> Skipped scope entry (due to false predicate) at index 522, continuing
>>> at 571
>>> Skipped scope entry (due to false predicate) at index 572, continuing
>>> at 639
>>> Skipped scope entry (due to false predicate) at index 640, continuing
>>> at 707
>>> Skipped scope entry (due to false predicate) at index 708, continuing
>>> at 775
>>> Skipped scope entry (due to false predicate) at index 776, continuing
>>> at 804
>>> Skipped scope entry (due to false predicate) at index 805, continuing
>>> at 833
>>> Skipped scope entry (due to false predicate) at index 834, continuing
>>> at 862
>>> Skipped scope entry (due to false predicate) at index 863, continuing
>>> at 891
>>> Skipped scope entry (due to false predicate) at index 892, continuing
>>> at 920
>>> Skipped scope entry (due to false predicate) at index 921, continuing
>>> at 949
>>> Skipped scope entry (due to false predicate) at index 950, continuing
>>> at 987
>>> Skipped scope entry (due to false predicate) at index 988, continuing
>>> at 1025
>>> Match failed at index 12
>>> Continuing at 1026
>>> OpcodeSwitch from 1029 to 5725
>>> Match failed at index 5743
>>> Continuing at 5772
>>> Match failed at index 5776
>>> Continuing at 5805
>>> Match failed at index 5809
>>> Continuing at 5838
>>> Match failed at index 5842
>>> Continuing at 5911
>>> Match failed at index 5915
>>> Continuing at 5953
>>> Match failed at index 5957
>>> Continuing at 5995
>>> Match failed at index 5999
>>> Continuing at 6037
>>> Match failed at index 6041
>>> Continuing at 6084
>>> Match failed at index 6088
>>> Continuing at 6131
>>> Skipped scope entry (due to false predicate) at index 6138, continuing
>>> at 6181
>>> Skipped scope entry (due to false predicate) at index 6182, continuing
>>> at 6228
>>> Skipped scope entry (due to false predicate) at index 6235, continuing
>>> at 6384
>>> Match failed at index 6388
>>> Continuing at 6419
>>> Match failed at index 6423
>>> Continuing at 6454
>>> Match failed at index 6458
>>> Continuing at 6489
>>> Continuing at 6490
>>> Continuing at 6491
>>> Continuing at 6492
>>> Match failed at index 6514
>>> Continuing at 6545
>>> Match failed at index 6562
>>> Continuing at 6593
>>> Match failed at index 6610
>>> Continuing at 6641
>>> Continuing at 6642
>>> Match failed at index 6658
>>> Continuing at 6772
>>> Match failed at index 6788
>>> Continuing at 6902
>>> Continuing at 13636
>>> Match failed at index 13640
>>> Continuing at 14940
>>> Match failed at index 14943
>>> Continuing at 15415
>>> Match failed at index 15417
>>> Continuing at 15570
>>> Match failed at index 15571
>>> Continuing at 15598
>>> Match failed at index 15599
>>> Continuing at 15716
>>> Match failed at index 15719
>>> Continuing at 15837
>>> Match failed at index 15840
>>> Continuing at 16198
>>> Skipped scope entry (due to false predicate) at index 16203,
>>> continuing at 16285
>>> Skipped scope entry (due to false predicate) at index 16286,
>>> continuing at 16394
>>> Skipped scope entry (due to false predicate) at index 16395,
>>> continuing at 16464
>>> Skipped scope entry (due to false predicate) at index 16465,
>>> continuing at 16487
>>> Skipped scope entry (due to false predicate) at index 16488,
>>> continuing at 16510
>>> Skipped scope entry (due to false predicate) at index 16511,
>>> continuing at 16533
>>> Skipped scope entry (due to false predicate) at index 16534,
>>> continuing at 16556
>>> Skipped scope entry (due to false predicate) at index 16557,
>>> continuing at 16680
>>> Skipped scope entry (due to false predicate) at index 16681,
>>> continuing at 16804
>>> Skipped scope entry (due to false predicate) at index 16805,
>>> continuing at 16890
>>> Skipped scope entry (due to false predicate) at index 16891,
>>> continuing at 16976
>>> Skipped scope entry (due to false predicate) at index 16978,
>>> continuing at 17169
>>> Skipped scope entry (due to false predicate) at index 17171,
>>> continuing at 17342
>>> Skipped scope entry (due to false predicate) at index 17344,
>>> continuing at 17497
>>> Skipped scope entry (due to false predicate) at index 17499,
>>> continuing at 17632
>>> Skipped scope entry (due to false predicate) at index 17634,
>>> continuing at 17801
>>> Skipped scope entry (due to false predicate) at index 17803,
>>> continuing at 17944
>>> Skipped scope entry (due to false predicate) at index 17946,
>>> continuing at 18074
>>> Skipped scope entry (due to false predicate) at index 18075,
>>> continuing at 18178
>>> Skipped scope entry (due to false predicate) at index 18179,
>>> continuing at 18253
>>> Skipped scope entry (due to false predicate) at index 18254,
>>> continuing at 18278
>>> Skipped scope entry (due to false predicate) at index 18279,
>>> continuing at 18303
>>> Skipped scope entry (due to false predicate) at index 18304,
>>> continuing at 18328
>>> Skipped scope entry (due to false predicate) at index 18329,
>>> continuing at 18376
>>> Skipped scope entry (due to false predicate) at index 18377,
>>> continuing at 18424
>>> Skipped scope entry (due to false predicate) at index 18425,
>>> continuing at 18520
>>> Skipped scope entry (due to false predicate) at index 18521,
>>> continuing at 18636
>>> Skipped scope entry (due to false predicate) at index 18637,
>>> continuing at 18661
>>> Skipped scope entry (due to false predicate) at index 18662,
>>> continuing at 18711
>>> Skipped scope entry (due to false predicate) at index 18712,
>>> continuing at 18736
>>> Skipped scope entry (due to false predicate) at index 18737,
>>> continuing at 18770
>>> Skipped scope entry (due to false predicate) at index 18771,
>>> continuing at 18856
>>> Skipped scope entry (due to false predicate) at index 18857,
>>> continuing at 18942
>>> Skipped scope entry (due to false predicate) at index 18943,
>>> continuing at 19028
>>> Match failed at index 16201
>>> Continuing at 19029
>>> LLVM ERROR: Cannot select: t9: ch = store<ST256[bitcast ([65 x i32]* @c
>>> to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11, undef:i64
>>> t7: v64i32 = add t6, t4
>>> t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x
>>> i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> t0, t11, undef:i64
>>> t11: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c>
>>> 0
>>> t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>> t3: i64 = undef
>>> t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x
>>> i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> t0, t13, undef:i64
>>> t13: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @b>
>>> 0
>>> t12: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
>>> t3: i64 = undef
>>> t11: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
>>> t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>> t3: i64 = undef
>>> In function: foo
>>>
>>>
>>>
>>> What could be the reason for this? Please correct me.
>>> I am stuck at this point.
>>>
>>>
>>>
>>> On Fri, Jul 7, 2017 at 10:59 PM, Friedman, Eli <efriedma at codeaurora.org>
>>> wrote:
>>>
>>>> The word "fold" is used all over LLVM. It generally refers to
>>>> transformations which delete an instruction.
>>>>
>>>> If you're asking about
>>>> http://llvm.org/docs/CodeGenerator.html#instruction-folding , it just means
>>>> an instruction which was produced by the "instruction folding" transform;
>>>> there isn't anything special about the instruction itself.
>>>>
>>>> -Eli
>>>>
>>>>
>>>> On 7/6/2017 10:51 PM, hameeza ahmed wrote:
>>>>
>>>> What is meant by folded instructions in LLVM?
>>>> How do they work?
>>>>
>>>> On Fri, Jul 7, 2017 at 10:19 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>>> wrote:
>>>>
>>>>> Thank You.
>>>>>
>>>>> On Fri, Jul 7, 2017 at 10:03 AM, Craig Topper <craig.topper at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Yes, that error is from instruction selection. I think your
>>>>>> legalization changes worked fine.
>>>>>>
>>>>>> ~Craig
>>>>>>
>>>>>> On Thu, Jul 6, 2017 at 8:21 PM, hameeza ahmed via llvm-dev <
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>>> I also ran the following command:
>>>>>>> llc -debug filer-knl_o3.ll
>>>>>>>
>>>>>>> Its output is attached here. Looking at the output, can we say
>>>>>>> that legalization runs fine and that the error comes from instruction
>>>>>>> selection / pattern matching, which is not yet implemented?
>>>>>>>
>>>>>>> So do I need to worry about correcting it at this stage, or should
>>>>>>> I move forward and implement instruction selection / pattern matching?
>>>>>>>
>>>>>>> Please guide me.
>>>>>>>
>>>>>>> Thank You
>>>>>>>
>>>>>>> On Fri, Jul 7, 2017 at 8:00 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thank you. I have seen these links, but they don't cover the
>>>>>>>> problem that I mentioned. I am doing everything step by step.
>>>>>>>>
>>>>>>>> I haven't yet worked on the instruction selection phase or its files.
>>>>>>>> Before that, I am trying to get legalization to accept vectors with
>>>>>>>> more than 16 elements, i.e. 64 x i32. So far I have mainly worked with
>>>>>>>> two files: X86RegisterInfo.td, to define the register class used in
>>>>>>>> legalization, and, most importantly, X86ISelLowering.cpp.
>>>>>>>>
>>>>>>>> Is there any relation between this and instruction selection? Since
>>>>>>>> instruction selection comes after combine and legalize, I haven't yet
>>>>>>>> worked on it.
>>>>>>>>
>>>>>>>>
>>>>>>>> Please correct me, I am stuck here.
>>>>>>>>
>>>>>>>> Thank You again
>>>>>>>>
>>>>>>>> On Fri, Jul 7, 2017 at 7:11 AM, Friedman, Eli <
>>>>>>>> efriedma at codeaurora.org> wrote:
>>>>>>>>
>>>>>>>>> Have you read http://llvm.org/docs/WritingAnLLVMBackend.html and
>>>>>>>>> http://llvm.org/docs/CodeGenerator.html ?
>>>>>>>>> http://llvm.org/docs/WritingAnLLVMBackend.html#instruction-selector
>>>>>>>>> describes how to define a store instruction.
>>>>>>>>>
>>>>>>>>> -Eli
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 7/6/2017 6:51 PM, hameeza ahmed via llvm-dev wrote:
>>>>>>>>>
>>>>>>>>> Please correct me, I am stuck at this point.
>>>>>>>>>
>>>>>>>>> On Jul 6, 2017 5:18 PM, "hameeza ahmed" <hahmed2305 at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>> I am experimenting with increasing the register/vector width in the
>>>>>>>>> x86 backend to 64 elements of 32 bits instead of 16.
>>>>>>>>> For example, I have a loop with 65 iterations.
>>>>>>>>> If my IR generates a v64i32 and 1 scalar, the backend still breaks
>>>>>>>>> the v64i32 into 4 v16i32. I want it to retain v64i32. Likewise, if
>>>>>>>>> there are 128 elements in the loop, it should be split into
>>>>>>>>> 2 v64i32 instructions.
>>>>>>>>>
>>>>>>>>> To do this I made the necessary changes in X86ISelLowering.cpp and
>>>>>>>>> rebuilt LLVM. When I use the -view-dag-combine2-dags option I get
>>>>>>>>> the required output in the graph, but the following error on the
>>>>>>>>> console:
>>>>>>>>>
>>>>>>>>> LLVM ERROR: Cannot select: t10: ch = store<ST256[bitcast ([65 x
>>>>>>>>> i32]* @a to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)> t9, t7,
>>>>>>>>> t12, undef:i64
>>>>>>>>> t7: v64i32 = add t6, t4
>>>>>>>>> t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x
>>>>>>>>> i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> t0, t14,
>>>>>>>>> undef:i64
>>>>>>>>> t14: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x
>>>>>>>>> i32]* @c> 0
>>>>>>>>> t13: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>>>>> t3: i64 = undef
>>>>>>>>> t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x
>>>>>>>>> i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> t0, t16,
>>>>>>>>> undef:i64
>>>>>>>>> t16: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x
>>>>>>>>> i32]* @b> 0
>>>>>>>>> t15: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
>>>>>>>>> t3: i64 = undef
>>>>>>>>> t12: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
>>>>>>>>> @a> 0
>>>>>>>>> t11: i64 = TargetGlobalAddress<[65 x i32]* @a> 0
>>>>>>>>> t3: i64 = undef
>>>>>>>>> In function: foo
>>>>>>>>>
>>>>>>>>> The dag after legalization is also attached here.
>>>>>>>>>
>>>>>>>>> The source is a vector sum of 65 elements.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Kindly correct me.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> LLVM Developers mailing list
>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Employee of Qualcomm Innovation Center, Inc.
>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>
>>>>
>>> --
>> ~Craig
>>
>
>