[llvm-dev] Error in v64i32 type in x86 backend

Fri Jul 7 21:59:48 PDT 2017

You don't want RI. That's used for instructions that need a REX.W prefix. You
need to use $src1 and $src2 in the assembly string too. It also looks like
you have two closing ] brackets.

~Craig
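
The three fixes named in this reply (I instead of RI, $src1/$src2 in the
assembly string, a single closing bracket) might be applied as in the sketch
below. It is only a sketch under assumptions: VR2048 is the poster's own
register class, and the stub definitions at the top are hypothetical stand-ins
added so the file parses on its own with llvm-tblgen; in the real backend they
come from Target.td, TargetSelectionDAG.td and X86InstrFormats.td. The operand
order in the assembly string is an assumption, and whether MRMDestReg can
encode three register operands is not settled in this thread.

```tablegen
// Hypothetical stand-ins so this file parses without the X86 backend.
// In-tree, these names are provided by LLVM's own .td files.
def outs;                         // dag operator for output operands
def ins;                          // dag operator for input operands
def set;                          // dag operator: assign pattern result
def add;                          // dag operator: stands in for the add SDNode
class Format;
def MRMDestReg : Format;          // stand-in for the X86 instruction format
class RegClassStub;
def VR2048 : RegClassStub;        // the poster's 2048-bit register class

// Reduced stand-in for X86's I<> class: it only records its arguments.
class I<bits<8> o, Format f, dag outs, dag ins, string asm,
        list<dag> pattern> {
  bits<8> Opcode = o;
  Format Form = f;
  dag OutOperandList = outs;
  dag InOperandList = ins;
  string AsmString = asm;
  list<dag> Pattern = pattern;
}

// The definition from the thread with the three fixes applied:
//   1. class I rather than RI,
//   2. $src1 and $src2 named in the assembly string,
//   3. exactly one closing ']' on the pattern list.
def VADD_256B : I<0xFE, MRMDestReg, (outs VR2048:$dst),
                  (ins VR2048:$src1, VR2048:$src2),
                  "VADD_256B\t{$src2, $src1, $dst|$dst, $src1, $src2}",
                  [(set VR2048:$dst, (add VR2048:$src1, VR2048:$src2))]>;
```

Running llvm-tblgen on a file with these contents should print the VADD_256B
record, with both source operands in InOperandList and the add pattern in
Pattern. Inside the X86 backend only the final def would be needed.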

On Fri, Jul 7, 2017 at 9:55 PM, hameeza ahmed <hahmed2305 at gmail.com> wrote:

> Thank you.
> I have changed it as follows. Is it fine now?
>
> def  VADD_256B  : I<0xFE, MRMDestReg, (outs VR2048:$dst), (ins VR2048:$src1,
> VR2048:$src2),
>                    "VADD_256B\t{$src, $dst|$dst, $src}", [(set VR2048:$dst,
> (add VR2048:$src1, VR2048:$src2))]]>;
>
> Also, I have changed the class from RI to I here. Does it make any difference?
>
>
>
> On Sat, Jul 8, 2017 at 9:38 AM, Craig Topper <craig.topper at gmail.com>
> wrote:
>
>> IIC_XADD_REG is used to associate latency and other information for use
>> by the instruction scheduling pass.
>>
>> You're missing a pattern in the square brackets to match an add node. You
>> also need two VR2048 registers in the 'ins'.
>>
>> ~Craig
>>
>> On Fri, Jul 7, 2017 at 9:29 PM, hameeza ahmed <hahmed2305 at gmail.com>
>> wrote:
>>
>>> Can you please tell me whether the following definition is correct for
>>> adding two v64i32 vectors?
>>>
>>> def VADD_256B  : RI<0xFE, MRMDestReg, (outs VR2048:$dst), (ins VR2048
>>> :$src),
>>>                    "VADD_256B\t{$src, $dst|$dst, $src}", [],
>>> IIC_XADD_REG>, TB;
>>>
>>> What is IIC_XADD_REG here?
>>>
>>>
>>>
>>> On Sat, Jul 8, 2017 at 8:48 AM, Craig Topper <craig.topper at gmail.com>
>>> wrote:
>>>
>>>> Change the i32 in the store pattern to v64i32.
>>>>
>>>> On Fri, Jul 7, 2017 at 8:41 PM hameeza ahmed <hahmed2305 at gmail.com>
>>>> wrote:
>>>>
>>>>> Thank you. I now understand how AVX-512 vector instructions are written in
>>>>> X86InstrAVX512.td. I need to define my own vector instructions, so I wrote
>>>>> the following:
>>>>>
>>>>>  def VMOV_256B_RM : I<0x6F, MRMSrcMem, (outs VR2048:$dst), (ins
>>>>> i32mem:$src),
>>>>>                     "vmov_256B_rm\t{$src, $dst|$dst, $src}",
>>>>>                     [(set VR2048:$dst, (v64i32 (scalar_to_vector
>>>>> (loadi32 addr:$src))))],
>>>>>                     IIC_MOV_MEM>, EVEX;
>>>>>
>>>>> def VMOV_256B_MR : I<0x7F, MRMDestMem, (outs), (ins i32mem:$dst,
>>>>> VR2048:$src),
>>>>>                     "vmov_256B_mr\t{$src, $dst|$dst, $src}",
>>>>>                     [(store (i32 (bitconvert VR2048:$src)),
>>>>> addr:$dst)], IIC_MOV_MEM>, EVEX;
>>>>>
>>>>> in X86InstrInfo.td.
>>>>>
>>>>> When I build, these instructions appear in X86GenInstrInfo,
>>>>> but my instruction is still not selected when I run the input file in
>>>>> debug mode. I get the following errors:
>>>>>
>>>>>
>>>>> ===== Instruction selection begins: BB#1 'vector.body'
>>>>> Selecting: t9: ch = store<ST256[bitcast ([65 x i32]* @c to <64 x
>>>>> i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11, undef:i64
>>>>>
>>>>> ISEL: Starting pattern match on root node: t9: ch =
>>>>> store<ST256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)>
>>>>> t8, t7, t11, undef:i64
>>>>>
>>>>>   Skipped scope entry (due to false predicate) at index 14, continuing
>>>>> at 81
>>>>>   Skipped scope entry (due to false predicate) at index 82, continuing
>>>>> at 149
>>>>>   Skipped scope entry (due to false predicate) at index 150,
>>>>> continuing at 217
>>>>>   Skipped scope entry (due to false predicate) at index 218,
>>>>> continuing at 267
>>>>>   Skipped scope entry (due to false predicate) at index 268,
>>>>> continuing at 317
>>>>>   Skipped scope entry (due to false predicate) at index 318,
>>>>> continuing at 367
>>>>>   Skipped scope entry (due to false predicate) at index 368,
>>>>> continuing at 394
>>>>>   Skipped scope entry (due to false predicate) at index 395,
>>>>> continuing at 421
>>>>>   Skipped scope entry (due to false predicate) at index 422,
>>>>> continuing at 471
>>>>>   Skipped scope entry (due to false predicate) at index 472,
>>>>> continuing at 521
>>>>>   Skipped scope entry (due to false predicate) at index 522,
>>>>> continuing at 571
>>>>>   Skipped scope entry (due to false predicate) at index 572,
>>>>> continuing at 639
>>>>>   Skipped scope entry (due to false predicate) at index 640,
>>>>> continuing at 707
>>>>>   Skipped scope entry (due to false predicate) at index 708,
>>>>> continuing at 775
>>>>>   Skipped scope entry (due to false predicate) at index 776,
>>>>> continuing at 804
>>>>>   Skipped scope entry (due to false predicate) at index 805,
>>>>> continuing at 833
>>>>>   Skipped scope entry (due to false predicate) at index 834,
>>>>> continuing at 862
>>>>>   Skipped scope entry (due to false predicate) at index 863,
>>>>> continuing at 891
>>>>>   Skipped scope entry (due to false predicate) at index 892,
>>>>> continuing at 920
>>>>>   Skipped scope entry (due to false predicate) at index 921,
>>>>> continuing at 949
>>>>>   Skipped scope entry (due to false predicate) at index 950,
>>>>> continuing at 987
>>>>>   Skipped scope entry (due to false predicate) at index 988,
>>>>> continuing at 1025
>>>>>   Match failed at index 12
>>>>>   Continuing at 1026
>>>>>   OpcodeSwitch from 1029 to 5725
>>>>>   Match failed at index 5743
>>>>>   Continuing at 5772
>>>>>   Match failed at index 5776
>>>>>   Continuing at 5805
>>>>>   Match failed at index 5809
>>>>>   Continuing at 5838
>>>>>   Match failed at index 5842
>>>>>   Continuing at 5911
>>>>>   Match failed at index 5915
>>>>>   Continuing at 5953
>>>>>   Match failed at index 5957
>>>>>   Continuing at 5995
>>>>>   Match failed at index 5999
>>>>>   Continuing at 6037
>>>>>   Match failed at index 6041
>>>>>   Continuing at 6084
>>>>>   Match failed at index 6088
>>>>>   Continuing at 6131
>>>>>   Skipped scope entry (due to false predicate) at index 6138,
>>>>> continuing at 6181
>>>>>   Skipped scope entry (due to false predicate) at index 6182,
>>>>> continuing at 6228
>>>>>   Skipped scope entry (due to false predicate) at index 6235,
>>>>> continuing at 6384
>>>>>   Match failed at index 6388
>>>>>   Continuing at 6419
>>>>>   Match failed at index 6423
>>>>>   Continuing at 6454
>>>>>   Match failed at index 6458
>>>>>   Continuing at 6489
>>>>>   Continuing at 6490
>>>>>   Continuing at 6491
>>>>>   Continuing at 6492
>>>>>   Match failed at index 6514
>>>>>   Continuing at 6545
>>>>>   Match failed at index 6562
>>>>>   Continuing at 6593
>>>>>   Match failed at index 6610
>>>>>   Continuing at 6641
>>>>>   Continuing at 6642
>>>>>   Match failed at index 6658
>>>>>   Continuing at 6772
>>>>>   Match failed at index 6788
>>>>>   Continuing at 6902
>>>>>   Continuing at 13636
>>>>>   Match failed at index 13640
>>>>>   Continuing at 14940
>>>>>   Match failed at index 14943
>>>>>   Continuing at 15415
>>>>>   Match failed at index 15417
>>>>>   Continuing at 15570
>>>>>   Match failed at index 15571
>>>>>   Continuing at 15598
>>>>>   Match failed at index 15599
>>>>>   Continuing at 15716
>>>>>   Match failed at index 15719
>>>>>   Continuing at 15837
>>>>>   Match failed at index 15840
>>>>>   Continuing at 16198
>>>>>   Skipped scope entry (due to false predicate) at index 16203,
>>>>> continuing at 16285
>>>>>   Skipped scope entry (due to false predicate) at index 16286,
>>>>> continuing at 16394
>>>>>   Skipped scope entry (due to false predicate) at index 16395,
>>>>> continuing at 16464
>>>>>   Skipped scope entry (due to false predicate) at index 16465,
>>>>> continuing at 16487
>>>>>   Skipped scope entry (due to false predicate) at index 16488,
>>>>> continuing at 16510
>>>>>   Skipped scope entry (due to false predicate) at index 16511,
>>>>> continuing at 16533
>>>>>   Skipped scope entry (due to false predicate) at index 16534,
>>>>> continuing at 16556
>>>>>   Skipped scope entry (due to false predicate) at index 16557,
>>>>> continuing at 16680
>>>>>   Skipped scope entry (due to false predicate) at index 16681,
>>>>> continuing at 16804
>>>>>   Skipped scope entry (due to false predicate) at index 16805,
>>>>> continuing at 16890
>>>>>   Skipped scope entry (due to false predicate) at index 16891,
>>>>> continuing at 16976
>>>>>   Skipped scope entry (due to false predicate) at index 16978,
>>>>> continuing at 17169
>>>>>   Skipped scope entry (due to false predicate) at index 17171,
>>>>> continuing at 17342
>>>>>   Skipped scope entry (due to false predicate) at index 17344,
>>>>> continuing at 17497
>>>>>   Skipped scope entry (due to false predicate) at index 17499,
>>>>> continuing at 17632
>>>>>   Skipped scope entry (due to false predicate) at index 17634,
>>>>> continuing at 17801
>>>>>   Skipped scope entry (due to false predicate) at index 17803,
>>>>> continuing at 17944
>>>>>   Skipped scope entry (due to false predicate) at index 17946,
>>>>> continuing at 18074
>>>>>   Skipped scope entry (due to false predicate) at index 18075,
>>>>> continuing at 18178
>>>>>   Skipped scope entry (due to false predicate) at index 18179,
>>>>> continuing at 18253
>>>>>   Skipped scope entry (due to false predicate) at index 18254,
>>>>> continuing at 18278
>>>>>   Skipped scope entry (due to false predicate) at index 18279,
>>>>> continuing at 18303
>>>>>   Skipped scope entry (due to false predicate) at index 18304,
>>>>> continuing at 18328
>>>>>   Skipped scope entry (due to false predicate) at index 18329,
>>>>> continuing at 18376
>>>>>   Skipped scope entry (due to false predicate) at index 18377,
>>>>> continuing at 18424
>>>>>   Skipped scope entry (due to false predicate) at index 18425,
>>>>> continuing at 18520
>>>>>   Skipped scope entry (due to false predicate) at index 18521,
>>>>> continuing at 18636
>>>>>   Skipped scope entry (due to false predicate) at index 18637,
>>>>> continuing at 18661
>>>>>   Skipped scope entry (due to false predicate) at index 18662,
>>>>> continuing at 18711
>>>>>   Skipped scope entry (due to false predicate) at index 18712,
>>>>> continuing at 18736
>>>>>   Skipped scope entry (due to false predicate) at index 18737,
>>>>> continuing at 18770
>>>>>   Skipped scope entry (due to false predicate) at index 18771,
>>>>> continuing at 18856
>>>>>   Skipped scope entry (due to false predicate) at index 18857,
>>>>> continuing at 18942
>>>>>   Skipped scope entry (due to false predicate) at index 18943,
>>>>> continuing at 19028
>>>>>   Match failed at index 16201
>>>>>   Continuing at 19029
>>>>> LLVM ERROR: Cannot select: t9: ch = store<ST256[bitcast ([65 x i32]*
>>>>> @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11,
>>>>> undef:i64
>>>>>   t7: v64i32 = add t6, t4
>>>>>     t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x
>>>>> i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> t0, t11,
>>>>> undef:i64
>>>>>       t11: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
>>>>> @c> 0
>>>>>         t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>       t3: i64 = undef
>>>>>     t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x
>>>>> i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> t0, t13,
>>>>> undef:i64
>>>>>       t13: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
>>>>> @b> 0
>>>>>         t12: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
>>>>>       t3: i64 = undef
>>>>>   t11: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
>>>>>     t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>   t3: i64 = undef
>>>>> In function: foo
>>>>>
>>>>>
>>>>>
>>>>> What could be the reason for this? Please correct me.
>>>>> I am stuck at this point.
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jul 7, 2017 at 10:59 PM, Friedman, Eli <
>>>>> efriedma at codeaurora.org> wrote:
>>>>>
>>>>>> The word "fold" is used all over LLVM.  It generally refers to
>>>>>> transformations which delete an instruction.
>>>>>>
>>>>>> If you're asking about
>>>>>> http://llvm.org/docs/CodeGenerator.html#instruction-folding , it just means an instruction which
>>>>>> was produced by the "instruction folding" transform; there isn't anything
>>>>>> special about the instruction itself.
>>>>>>
>>>>>> -Eli
>>>>>>
>>>>>>
>>>>>> On 7/6/2017 10:51 PM, hameeza ahmed wrote:
>>>>>>
>>>>>> What is meant by folded instructions in LLVM?
>>>>>> How do they work?
>>>>>>
>>>>>> On Fri, Jul 7, 2017 at 10:19 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thank You.
>>>>>>>
>>>>>>> On Fri, Jul 7, 2017 at 10:03 AM, Craig Topper <
>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>
>>>>>>>> Yes, that error is from instruction selection. I think your
>>>>>>>> legalization changes worked fine.
>>>>>>>>
>>>>>>>> ~Craig
>>>>>>>>
>>>>>>>> On Thu, Jul 6, 2017 at 8:21 PM, hameeza ahmed via llvm-dev <
>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>
>>>>>>>>> I also ran the following command:
>>>>>>>>> llc -debug filer-knl_o3.ll
>>>>>>>>>
>>>>>>>>> Its output is attached here. Looking at the output, can we
>>>>>>>>> say that legalization runs fine and the error is due to instruction
>>>>>>>>> selection/pattern matching, which is not yet implemented?
>>>>>>>>>
>>>>>>>>> Do I need to worry about it and try to correct it at this stage, or
>>>>>>>>> should I move forward and implement instruction selection/pattern matching?
>>>>>>>>>
>>>>>>>>> Please guide me.
>>>>>>>>>
>>>>>>>>> Thank You
>>>>>>>>>
>>>>>>>>> On Fri, Jul 7, 2017 at 8:00 AM, hameeza ahmed <
>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Thank you. I have seen these links, but they don't cover the
>>>>>>>>>> problem that I mentioned. I am doing everything step by
>>>>>>>>>> step.
>>>>>>>>>>
>>>>>>>>>> I haven't yet worked on the instruction selection phase or its files.
>>>>>>>>>> Before that, I am trying to get legalization to allow vectors with more
>>>>>>>>>> than 16 elements, i.e. 64 x i32. So far I have mainly worked with two files:
>>>>>>>>>> X86RegisterInfo.td, to define the register class used in
>>>>>>>>>> legalization, and, most importantly,
>>>>>>>>>> X86ISelLowering.cpp.
>>>>>>>>>>
>>>>>>>>>> Is there any relation between this and instruction selection?
>>>>>>>>>> Since instruction selection comes after combine and legalize, I haven't
>>>>>>>>>> worked on it yet.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Please correct me, I am stuck here.
>>>>>>>>>>
>>>>>>>>>> Thank You again
>>>>>>>>>>
>>>>>>>>>> On Fri, Jul 7, 2017 at 7:11 AM, Friedman, Eli <
>>>>>>>>>> efriedma at codeaurora.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> Have you read http://llvm.org/docs/WritingAnLLVMBackend.html
>>>>>>>>>>> and http://llvm.org/docs/CodeGenerator.html ?
>>>>>>>>>>> http://llvm.org/docs/WritingAnLLVMBackend.html#instruction-selector
>>>>>>>>>>> describes how to define a store instruction.
>>>>>>>>>>>
>>>>>>>>>>> -Eli
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 7/6/2017 6:51 PM, hameeza ahmed via llvm-dev wrote:
>>>>>>>>>>>
>>>>>>>>>>> Please correct me; I am stuck at this point.
>>>>>>>>>>>
>>>>>>>>>>> On Jul 6, 2017 5:18 PM, "hameeza ahmed" <hahmed2305 at gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hello,
>>>>>>>>>>> I am experimenting with increasing the register/vector width
>>>>>>>>>>> in the x86 backend to 64 elements of 32 bits instead of 16.
>>>>>>>>>>> For example, I have a loop with 65 iterations.
>>>>>>>>>>> Even if my IR contains one v64i32 operation and one scalar, the backend
>>>>>>>>>>> still breaks the v64i32 into four v16i32. I want it to retain v64i32.
>>>>>>>>>>> Likewise, if the loop has 128 elements, it should be split into two
>>>>>>>>>>> v64i32 instructions.
>>>>>>>>>>>
>>>>>>>>>>> To do this, I made the necessary changes in
>>>>>>>>>>> X86ISelLowering.cpp and rebuilt LLVM. When I then use the
>>>>>>>>>>> option -view-dag-combine2-dags, I get the required output in the
>>>>>>>>>>> graph, but the following error on the console:
>>>>>>>>>>>
>>>>>>>>>>> LLVM ERROR: Cannot select: t10: ch = store<ST256[bitcast ([65 x
>>>>>>>>>>> i32]* @a to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)> t9, t7,
>>>>>>>>>>> t12, undef:i64
>>>>>>>>>>>   t7: v64i32 = add t6, t4
>>>>>>>>>>>     t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x
>>>>>>>>>>> i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> t0, t14,
>>>>>>>>>>> undef:i64
>>>>>>>>>>>       t14: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x
>>>>>>>>>>> i32]* @c> 0
>>>>>>>>>>>         t13: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>>>>>>>       t3: i64 = undef
>>>>>>>>>>>     t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x
>>>>>>>>>>> i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> t0, t16,
>>>>>>>>>>> undef:i64
>>>>>>>>>>>       t16: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x
>>>>>>>>>>> i32]* @b> 0
>>>>>>>>>>>         t15: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
>>>>>>>>>>>       t3: i64 = undef
>>>>>>>>>>>   t12: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
>>>>>>>>>>> @a> 0
>>>>>>>>>>>     t11: i64 = TargetGlobalAddress<[65 x i32]* @a> 0
>>>>>>>>>>>   t3: i64 = undef
>>>>>>>>>>> In function: foo
>>>>>>>>>>>
>>>>>>>>>>> The DAG after legalization is also attached here.
>>>>>>>>>>>
>>>>>>>>>>> The source is a vector sum of 65 elements.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Kindly correct me.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>> ~Craig
>>>>
>>>
>>>
>>
>