[llvm-dev] VBROADCAST Implementation Issues
hameeza ahmed via llvm-dev
llvm-dev at lists.llvm.org
Mon Aug 7 10:39:02 PDT 2017
Where to add this line?
Sorry I didnt understand it.
On Mon, Aug 7, 2017 at 10:37 PM, Craig Topper <craig.topper at gmail.com>
wrote:
> You need this line from AVX512 code to tell the register allocation system
> that $src1/$dst and $mask/$mask_wb to use the same register. And the early
> clobber tells it that $dst and $src2 cannot use the same register.
>
> let Constraints = "@earlyclobber $dst, $src1 = $dst, $mask = $mask_wb"
>
> ~Craig
>
> On Mon, Aug 7, 2017 at 10:19 AM, hameeza ahmed <hahmed2305 at gmail.com>
> wrote:
>
>> Thank You. Still getting errors.I have modified my instructions as you
>> said as follows:
>>
>>
>> def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst,
>> VK64WM:$mask_wb), (ins VR_2048:$src1, VK64WM:$mask, i2048mem:$src2),
>> "GATHER_256B\t{$src2, {$dst} {${mask}}|${dst}
>> {${mask}}, $src2}",
>> [(set VR_2048:$dst, VK64WM:$mask_wb, (v64i32
>> (masked_gather (VR_2048:$src1), VK64WM:$mask,
>> addr:$src2)))],
>> IIC_MOV_MEM>, TA;
>>
>> def: Pat<(v64f32 (masked_gather (VR_2048:$src1),
>> (VK64WM:$mask),(addr:$src2))), (GATHER_256B VR_2048:$src1, VK64WM:$mask,
>> addr:$src2)>;
>>
>>
>> Now getting this error:
>>
>> llvm-tblgen: /utils/TableGen/X86RecognizableInstr.cpp:687: void
>> llvm::X86Disassembler::RecognizableInstr::emitInstructionSpecifier():
>> Assertion `numPhysicalOperands >= 2 + additionalOperands &&
>> numPhysicalOperands <= 4 + additionalOperands && "Unexpected number of
>> operands for MRMSrcMemFrm"' failed.
>>
>>
>>
>>
>>
>>
>>
>>
>> On Mon, Aug 7, 2017 at 8:23 PM, Craig Topper <craig.topper at gmail.com>
>> wrote:
>>
>>> masked_gather takes 3 inputs. not just an address. See the AVX512
>>> pattern is pasted earlier
>>>
>>> ~Craig
>>>
>>> On Mon, Aug 7, 2017 at 1:54 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>> wrote:
>>>
>>>> Changed it to;
>>>>
>>>> def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst, VK64:$mask),
>>>> (ins i2048mem:$src),
>>>> "GATHER_256B\t{$src, {$dst}{${mask}}|${dst}
>>>> {${mask}}, $src}",
>>>> [(set VR_2048:$dst, VK64:$mask, (v64i32
>>>> (masked_gather addr:$src)))],
>>>> IIC_MOV_MEM>, TA;
>>>> def: Pat<(v64f32 (masked_gather addr:$src)), (GATHER_256B addr:$src)>;
>>>> Now getting following error:
>>>>
>>>> Unhandled memory encoding VK64
>>>> Unhandled memory encoding
>>>> UNREACHABLE executed at /utils/TableGen/X86RecognizableInstr.cpp:1347!
>>>>
>>>> What to do?
>>>>
>>>>
>>>> On Mon, Aug 7, 2017 at 1:20 PM, hameeza ahmed <hahmed2305 at gmail.com>
>>>> wrote:
>>>>
>>>>> i am getting this error
>>>>> error: Variable not defined: '_'
>>>>> for _.KRCWM
>>>>> what to do?
>>>>>
>>>>> On Mon, Aug 7, 2017 at 1:13 PM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hello,
>>>>>> I did as you said,
>>>>>>
>>>>>> Please tell me whether the following correct now??
>>>>>>
>>>>>> def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst,
>>>>>> _.KRCWM:$mask_wb), (VR_2048:$src1, _.KRCWM:$mask, ins i2048mem:$src2),
>>>>>> "GATHER_256B\t{$src2, {$dst}{${mask}}|${dst}
>>>>>> {${mask}}, $src2}"),
>>>>>> [(set VR_2048:$dst, _.KRCWM:$mask_wb, (v64i32
>>>>>> (GatherNode (VR_2048:$src1), _.KRCWM:$mask,
>>>>>> VR_2048:$src2))],
>>>>>> IIC_MOV_MEM>, TA;
>>>>>> def: Pat<(v64f32 (GatherNode addr:$src2)), (GATHER_256B addr:$src2)>;
>>>>>>
>>>>>> Thank You
>>>>>>
>>>>>> On Mon, Aug 7, 2017 at 2:57 AM, Craig Topper <craig.topper at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> masked_gather returns two results. The data and the modified mask.
>>>>>>> Note the $dst and the $mask_wb in the pattern below.
>>>>>>>
>>>>>>> multiclass avx512_gather<bits<8> opc, string OpcodeStr,
>>>>>>> X86VectorVTInfo _,
>>>>>>> X86MemOperand memop, PatFrag GatherNode> {
>>>>>>> let Constraints = "@earlyclobber $dst, $src1 = $dst, $mask =
>>>>>>> $mask_wb",
>>>>>>> ExeDomain = _.ExeDomain in
>>>>>>> def rm : AVX5128I<opc, MRMSrcMem, (outs _.RC:$dst,
>>>>>>> _.KRCWM:$mask_wb),
>>>>>>> (ins _.RC:$src1, _.KRCWM:$mask, memop:$src2),
>>>>>>> !strconcat(OpcodeStr#_.Suffix,
>>>>>>> "\t{$src2, ${dst} {${mask}}|${dst} {${mask}}, $src2}"),
>>>>>>> [(set _.RC:$dst, _.KRCWM:$mask_wb,
>>>>>>> (GatherNode (_.VT _.RC:$src1), _.KRCWM:$mask,
>>>>>>> vectoraddr:$src2))]>, EVEX, EVEX_K,
>>>>>>> EVEX_CD8<_.EltSize, CD8VT1>;
>>>>>>> }
>>>>>>>
>>>>>>> ~Craig
>>>>>>>
>>>>>>> On Sun, Aug 6, 2017 at 2:21 PM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> i want to implement gather for v64i32. i wrote following code.
>>>>>>>>
>>>>>>>> def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst), (ins
>>>>>>>> i2048mem:$src),
>>>>>>>> "GATHER_256B\t{$src, $dst|$dst, $src}",
>>>>>>>> [(set VR_2048:$dst, (v64i32 (masked_gather
>>>>>>>> addr:$src)))],
>>>>>>>> IIC_MOV_MEM>, TA;
>>>>>>>> def: Pat<(v64f32 (masked_gather addr:$src)),
>>>>>>>> (GATHER_256B addr:$src)>;
>>>>>>>>
>>>>>>>> Also i wrote this line in isellowering.h
>>>>>>>>
>>>>>>>> setOperationAction(ISD::MGATHER,
>>>>>>>> MVT::v64i32, Legal);
>>>>>>>>
>>>>>>>> But I am getting following error:
>>>>>>>>
>>>>>>>> llvm-tblgen: /utils/TableGen/CodeGenDAGPatterns.cpp:2134:
>>>>>>>> llvm::TreePatternNode *llvm::TreePattern::ParseTreePattern(llvm::Init
>>>>>>>> *, llvm::StringRef): Assertion `New->getNumTypes() == 1 && "FIXME:
>>>>>>>> Unhandled"' failed.
>>>>>>>>
>>>>>>>> What is my mistake?
>>>>>>>>
>>>>>>>> Please help me.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Aug 7, 2017 at 12:03 AM, hameeza ahmed <
>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I am trying to implement vector shuffle for v64i32. Is the
>>>>>>>>> following correct?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> def VSHUFFLE_256B : I<0xE8, MRMDestReg, (outs VR_2048:$dst),
>>>>>>>>> (ins VR_2048:$src1, VRPIM_2048:$src2),"VSHUFFLE_256B\t{$src1,
>>>>>>>>> $src2, $dst|$dst, $src1, $src2}",
>>>>>>>>> [(set VR_2048:$dst, (shufflevector (v64i32 VR_2048:$src1), (v64i32
>>>>>>>>> VR_2048:$src2)))]>, TA;
>>>>>>>>>
>>>>>>>>> Please help.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Aug 6, 2017 at 11:48 PM, hameeza ahmed <
>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> i managed to get rid of above error for VT.is2048BitVector()).
>>>>>>>>>>
>>>>>>>>>> this was implemented already.
>>>>>>>>>>
>>>>>>>>>> now will try define other vectors like VT.is4096BitVector()).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sun, Aug 6, 2017 at 11:11 PM, hameeza ahmed <
>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thank you. actually i have to implement both i32 and i64. so i
>>>>>>>>>>> implemented two instructions now one broadcastS other broadcastD. Although
>>>>>>>>>>> while doing broadcast from memory to register i was getting no such error
>>>>>>>>>>> with 1 instruction and other patterns i64, i32 etc. but then also i
>>>>>>>>>>> implemented its 2 versions single and double.
>>>>>>>>>>>
>>>>>>>>>>> Actually, i am trying to compile matrix multiplication code for
>>>>>>>>>>> greater size vector. There i need to include many new instructions in my
>>>>>>>>>>> backend like shuffle, gather etc. For now i am getting the following error.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Legalizing: t208: v64i32 = BUILD_VECTOR Constant:i32<-1>,
>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>
>>>>>>>>>>> llc: /lib/Target/X86/X86ISelLowering.cpp:5525: llvm::SDValue
>>>>>>>>>>> getOnesVector(llvm::EVT, const llvm::X86Subtarget &, llvm::SelectionDAG &,
>>>>>>>>>>> const llvm::SDLoc &): Assertion `(VT.is128BitVector() ||
>>>>>>>>>>> VT.is256BitVector() || VT.is512BitVector()) && "Expected a 128/256/512-bit
>>>>>>>>>>> vector type"' failed.
>>>>>>>>>>>
>>>>>>>>>>> i tried including is2048Bit Vector() and others. also in
>>>>>>>>>>> vectortype.h i included these types for EVT but was unable to compile
>>>>>>>>>>> backend and getting errors.
>>>>>>>>>>>
>>>>>>>>>>> Please help.
>>>>>>>>>>>
>>>>>>>>>>> Thank You
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Aug 6, 2017 at 8:42 PM, Craig Topper <
>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> You need a new instruction. And your scalar register size needs
>>>>>>>>>>>> to match your vector element size. So GR32 instead of GR64
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Aug 6, 2017 at 5:44 AM hameeza ahmed <
>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Sorry to disturb,
>>>>>>>>>>>>> Now i want to implement instruction to broadcast scalar
>>>>>>>>>>>>> register content to vector.
>>>>>>>>>>>>>
>>>>>>>>>>>>> like this;
>>>>>>>>>>>>> vpbroadcastq zmm0, rsi
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I tried implementing it as follows;
>>>>>>>>>>>>>
>>>>>>>>>>>>> def BROADCASTR_256B : I<0x21, MRMSrcReg, (outs VR_2048:$dst),
>>>>>>>>>>>>> (ins GR64:$src),
>>>>>>>>>>>>> "BROADCASTR_256B\t{$src, $dst|$dst, $src}",
>>>>>>>>>>>>> [(set VR_2048:$dst, (v64i32 (X86VBroadcast
>>>>>>>>>>>>> GR64:$src)))],
>>>>>>>>>>>>> IIC_MOV_MEM>, TA;
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast GR64:$src)),
>>>>>>>>>>>>> (BROADCASTR_256B GR64:$src)>;
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is it fine? Also do i need to define a new instruction for
>>>>>>>>>>>>> this like BROADCASTR_256B? can i use the previous instruction
>>>>>>>>>>>>> BROADCAST_256B (the one that broadcast memory scalar to vector) and just
>>>>>>>>>>>>> define new pattern?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please help.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank You
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 5:10 AM, hameeza ahmed <
>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank You so much.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Wao you are simply genius.
>>>>>>>>>>>>>> initially I didnt include load in both the main instruction
>>>>>>>>>>>>>> and pattern so i included in both as follows:
>>>>>>>>>>>>>> def BROADCAST_256B : I<0x31, MRMSrcMem, (outs VR_2048:$dst),
>>>>>>>>>>>>>> (ins i2048mem:$src),
>>>>>>>>>>>>>> "BROADCAST_256B\t{$src, $dst|$dst, $src}",
>>>>>>>>>>>>>> [(set VR_2048:$dst, (v64i32
>>>>>>>>>>>>>> (X86VBroadcast (loadi32 addr:$src))))],
>>>>>>>>>>>>>> IIC_MOV_MEM>, TA;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast (loadf32 addr:$src))),
>>>>>>>>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>>>>>>>> And it worked perfectly.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank You again.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 4:28 AM, Craig Topper <
>>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Your pattern needs to be
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast (loadf32 addr:$src))),
>>>>>>>>>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 2:47 PM, hameeza ahmed <
>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> it runs fine with v64i32. but with the following pattern
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast addr:$src)),
>>>>>>>>>>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> i am getting error.
>>>>>>>>>>>>>>>> What is wrong with this pattern?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 2:01 AM, hameeza ahmed <
>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> in x86 it is;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> def : Pat<(int_x86_avx512_vbroadcast_ss_512 addr:$src),
>>>>>>>>>>>>>>>>> (VBROADCASTSSZm addr:$src)>;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> mine is
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast addr:$src)),
>>>>>>>>>>>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 1:59 AM, hameeza ahmed <
>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> for v16f32 it is defined as;
>>>>>>>>>>>>>>>>>> : Pat<(v16f32 (X86VBroadcast (v16f32 VR512:$src))),
>>>>>>>>>>>>>>>>>> (VBROADCASTSSZr (EXTRACT_SUBREG (v16f32
>>>>>>>>>>>>>>>>>> VR512:$src), sub_xmm))>;
>>>>>>>>>>>>>>>>>> which is similar to mine.
>>>>>>>>>>>>>>>>>> Why its not working then?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 1:45 AM, Craig Topper <
>>>>>>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> You need a pattern for v64f32 too.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 1:37 PM, hameeza ahmed <
>>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> as you said; these are instructions that i defined in
>>>>>>>>>>>>>>>>>>>> instrinfo.td
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> def BROADCAST_256B : I<0x31, MRMSrcMem, (outs
>>>>>>>>>>>>>>>>>>>> VR_2048:$dst), (ins i2048mem:$src),
>>>>>>>>>>>>>>>>>>>> "BROADCAST_256B\t{$src, $dst|$dst,
>>>>>>>>>>>>>>>>>>>> $src}",
>>>>>>>>>>>>>>>>>>>> [(set VR_2048:$dst, (v64i32
>>>>>>>>>>>>>>>>>>>> (X86VBroadcast addr:$src)))],
>>>>>>>>>>>>>>>>>>>> IIC_MOV_MEM>, TA;
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast addr:$src)),
>>>>>>>>>>>>>>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 1:28 AM, hameeza ahmed <
>>>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I did as you said;
>>>>>>>>>>>>>>>>>>>>> now getting this error:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> LLVM ERROR: Cannot select: t63: v64f32 =
>>>>>>>>>>>>>>>>>>>>> X86ISD::VBROADCAST t62
>>>>>>>>>>>>>>>>>>>>> t62: f32,ch = load<LD4[ConstantPool]> t0, t65,
>>>>>>>>>>>>>>>>>>>>> undef:i64
>>>>>>>>>>>>>>>>>>>>> t65: i64 = X86ISD::Wrapper
>>>>>>>>>>>>>>>>>>>>> TargetConstantPool:i64<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>> t64: i64 = TargetConstantPool<float
>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>> t8: i64 = undef
>>>>>>>>>>>>>>>>>>>>> In function: stencil
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 1:14 AM, Craig Topper <
>>>>>>>>>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Add VT.is2048BitVector() to the assert?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 1:11 PM, hameeza ahmed <
>>>>>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> added the setoperationaction line in
>>>>>>>>>>>>>>>>>>>>>>> isellowering.cpp. now getting the following error.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> llc: /lib/Target/X86/X86ISelLowering.cpp:6801:
>>>>>>>>>>>>>>>>>>>>>>> llvm::SDValue LowerVectorBroadcast(llvm::BuildVectorSDNode
>>>>>>>>>>>>>>>>>>>>>>> *, const llvm::X86Subtarget &, llvm::SelectionDAG &): Assertion
>>>>>>>>>>>>>>>>>>>>>>> `(VT.is128BitVector() || VT.is256BitVector() || VT.is512BitVector()) &&
>>>>>>>>>>>>>>>>>>>>>>> "Unsupported vector type for broadcast."' failed.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> What should I do?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 12:36 AM, Craig Topper <
>>>>>>>>>>>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Well first have you done this for your type
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> setOperationAction(ISD::BUILD_VECTOR, v64i32,
>>>>>>>>>>>>>>>>>>>>>>>> Custom);
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 12:29 PM, hameeza ahmed <
>>>>>>>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> How to do this task??
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 12:24 AM, Craig Topper <
>>>>>>>>>>>>>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> It looks like X86TargetLowering::LowerBUILD_VECTOR
>>>>>>>>>>>>>>>>>>>>>>>>>> is not creating a broadcast node for your wider vector type.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 12:19 PM, hameeza ahmed <
>>>>>>>>>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank You.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> I made your mentioned changes and included
>>>>>>>>>>>>>>>>>>>>>>>>>>> broadcast instruction in instructioninfo.td.
>>>>>>>>>>>>>>>>>>>>>>>>>>> but i made no changes in isellowering.cpp file.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Still getting the following error.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> LLVM ERROR: Cannot select: t29: v64f32 =
>>>>>>>>>>>>>>>>>>>>>>>>>>> BUILD_VECTOR t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62
>>>>>>>>>>>>>>>>>>>>>>>>>>> t62: f32,ch = load<LD4[ConstantPool]> t0, t64,
>>>>>>>>>>>>>>>>>>>>>>>>>>> undef:i64
>>>>>>>>>>>>>>>>>>>>>>>>>>> t64: i64 = X86ISD::Wrapper
>>>>>>>>>>>>>>>>>>>>>>>>>>> TargetConstantPool:i64<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>>>>>>> t63: i64 = TargetConstantPool<float
>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>>>>>>> t8: i64 = undef
>>>>>>>>>>>>>>>>>>>>>>>>>>> t62: f32,ch = load<LD4[ConstantPool]> t0, t64,
>>>>>>>>>>>>>>>>>>>>>>>>>>> undef:i64
>>>>>>>>>>>>>>>>>>>>>>>>>>> t64: i64 = X86ISD::Wrapper
>>>>>>>>>>>>>>>>>>>>>>>>>>> TargetConstantPool:i64<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>>>>>>> t63: i64 = TargetConstantPool<float
>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>>>>>>> t8: i64 = undef
>>>>>>>>>>>>>>>>>>>>>>>>>>> t62: f32,ch = load<LD4[ConstantPool]> t0, t64,
>>>>>>>>>>>>>>>>>>>>>>>>>>> undef:i64
>>>>>>>>>>>>>>>>>>>>>>>>>>> t64: i64 = X86ISD::Wrapper
>>>>>>>>>>>>>>>>>>>>>>>>>>> TargetConstantPool:i64<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>>>>>>> t63: i64 = TargetConstantPool<float
>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>>>>>>> .................
>>>>>>>>>>>>>>>>>>>>>>>>>>> In function: stencil
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> How to resolve this?
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Please help..
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 11:19 PM, Craig Topper <
>>>>>>>>>>>>>>>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> You need to use X86VBroadcast not "vbroadcast"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 10:50 AM, hameeza ahmed
>>>>>>>>>>>>>>>>>>>>>>>>>>>> <hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> i have a c code which multiplies vector with
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> constant something like this;
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float con=0.2;
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for (k = 0; k < N; k++) {
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for (i = 1; i <= N-2; i++)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for (j = 1; j <= N-2; j++)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> b[i][j] = con * (a[i][j] + a[i-1][j]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> + a[i+1][j] + a[i][j-1] + a[i][j+1]);
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> now in LLVM IR I m getting;
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> %22 = fmul <64 x float> %21, <float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> but its assembly in x86 gives;
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .LCPI0_0:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .long 1045220557 # float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0.200000003
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> vbroadcastss zmm1, dword ptr [rip + .LCPI0_0]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> vmulps zmm2, zmm2, zmm1
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> how does it lowered the above IR code into
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> vbroadcastss?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> What would be the pattern here to match?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I want to implement similar broadcast for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> vector of 64 elements.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> i tried the following code;
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> def BROADCAST_DWORD : I<0x60, MRMSrcMem, (outs
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> VREGG:$dst), (ins immem:$src),
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "BROADCAST_DWORD\t{$src,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> $dst|$dst, $src}",
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [(set VREGG:$dst, (v64i32
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (vbroadcast addr:$src)))],
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> IIC_MOV_MEM>, TA;
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Please help me. I am stuck at this point.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank You
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170807/d557d8fd/attachment-0001.html>
More information about the llvm-dev
mailing list