[llvm-dev] VBROADCAST Implementation Issues

hameeza ahmed via llvm-dev llvm-dev at lists.llvm.org
Mon Aug 7 10:57:17 PDT 2017


Now getting this error:
/lib/Target/X86/X86InstrInfo.td:3318:1: error: In GATHER_256B: Unrecognized
node 'VR_2048'!




On Mon, Aug 7, 2017 at 10:53 PM, Craig Topper <craig.topper at gmail.com>
wrote:

> You need to add EVEX_K and EVEX_4V to the end of your instruction after TA.
>
> ~Craig
>
> On Mon, Aug 7, 2017 at 10:47 AM, hameeza ahmed <hahmed2305 at gmail.com>
> wrote:
>
>> Thank You. Now getting this error:
>>
>> Unhandled memory encoding VK64WM
>> Unhandled memory encoding
>>
>>
>> On Mon, Aug 7, 2017 at 10:43 PM, Craig Topper <craig.topper at gmail.com>
>> wrote:
>>
>>> Right before your "def GATHER_256B" add the 'let' line like so
>>>
>>> let Constraints = "@earlyclobber $dst, $src1 = $dst, $mask = $mask_wb" in
>>> def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst,
>>> VK64WM:$mask_wb), (ins VR_2048:$src1, VK64WM:$mask,  i2048mem:$src2),
>>>                     "GATHER_256B\t{$src2, {$dst} {${mask}}|${dst}
>>> {${mask}}, $src2}",
>>>                     [(set VR_2048:$dst, VK64WM:$mask_wb, (v64i32
>>> (masked_gather  (VR_2048:$src1), VK64WM:$mask,
>>>                      addr:$src2)))],
>>>                     IIC_MOV_MEM>, TA;
>>>
>>> def: Pat<(v64f32 (masked_gather (VR_2048:$src1),
>>> (VK64WM:$mask),(addr:$src2))), (GATHER_256B VR_2048:$src1, VK64WM:$mask,
>>> addr:$src2)>;
>>>
>>> ~Craig
>>>
>>> On Mon, Aug 7, 2017 at 10:39 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>> wrote:
>>>
>>>> Where should I add this line?
>>>> Sorry, I didn't understand it.
>>>>
>>>> On Mon, Aug 7, 2017 at 10:37 PM, Craig Topper <craig.topper at gmail.com>
>>>> wrote:
>>>>
>>>>> You need this line from the AVX512 code to tell the register allocator
>>>>> that $src1/$dst and $mask/$mask_wb must use the same register. The
>>>>> early clobber tells it that $dst and $src2 cannot use the same register.
>>>>>
>>>>> let Constraints = "@earlyclobber $dst, $src1 = $dst, $mask = $mask_wb"
>>>>>
>>>>> ~Craig
>>>>>
>>>>> On Mon, Aug 7, 2017 at 10:19 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thank you. I am still getting errors. I have modified my instruction
>>>>>> as you said, as follows:
>>>>>>
>>>>>>
>>>>>> def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst,
>>>>>> VK64WM:$mask_wb), (ins VR_2048:$src1, VK64WM:$mask,  i2048mem:$src2),
>>>>>>                     "GATHER_256B\t{$src2, {$dst} {${mask}}|${dst}
>>>>>> {${mask}}, $src2}",
>>>>>>                     [(set VR_2048:$dst, VK64WM:$mask_wb, (v64i32
>>>>>> (masked_gather  (VR_2048:$src1), VK64WM:$mask,
>>>>>>                      addr:$src2)))],
>>>>>>                     IIC_MOV_MEM>, TA;
>>>>>>
>>>>>> def: Pat<(v64f32 (masked_gather (VR_2048:$src1),
>>>>>> (VK64WM:$mask),(addr:$src2))), (GATHER_256B VR_2048:$src1, VK64WM:$mask,
>>>>>> addr:$src2)>;
>>>>>>
>>>>>>
>>>>>> Now getting this error:
>>>>>>
>>>>>> llvm-tblgen: /utils/TableGen/X86RecognizableInstr.cpp:687: void
>>>>>> llvm::X86Disassembler::RecognizableInstr::emitInstructionSpecifier():
>>>>>> Assertion `numPhysicalOperands >= 2 + additionalOperands &&
>>>>>> numPhysicalOperands <= 4 + additionalOperands && "Unexpected number of
>>>>>> operands for MRMSrcMemFrm"' failed.
>>>>>>
>>>>>>
>>>>>> On Mon, Aug 7, 2017 at 8:23 PM, Craig Topper <craig.topper at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> masked_gather takes 3 inputs, not just an address. See the AVX512
>>>>>>> pattern I pasted earlier.
>>>>>>>
>>>>>>> ~Craig
>>>>>>>
>>>>>>> On Mon, Aug 7, 2017 at 1:54 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Changed it to;
>>>>>>>>
>>>>>>>> def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst,
>>>>>>>> VK64:$mask), (ins i2048mem:$src),
>>>>>>>>                     "GATHER_256B\t{$src, {$dst}{${mask}}|${dst}
>>>>>>>> {${mask}}, $src}",
>>>>>>>>                     [(set VR_2048:$dst, VK64:$mask, (v64i32
>>>>>>>> (masked_gather addr:$src)))],
>>>>>>>>                     IIC_MOV_MEM>, TA;
>>>>>>>> def: Pat<(v64f32 (masked_gather addr:$src)),
>>>>>>>> (GATHER_256B addr:$src)>;
>>>>>>>> Now getting following error:
>>>>>>>>
>>>>>>>> Unhandled memory encoding VK64
>>>>>>>> Unhandled memory encoding
>>>>>>>> UNREACHABLE executed at /utils/TableGen/X86RecognizableInstr.cpp:1347!
>>>>>>>>
>>>>>>>> What to do?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Aug 7, 2017 at 1:20 PM, hameeza ahmed <hahmed2305 at gmail.com
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> I am getting this error:
>>>>>>>>> error: Variable not defined: '_'
>>>>>>>>> for _.KRCWM.
>>>>>>>>> What should I do?
>>>>>>>>>
>>>>>>>>> On Mon, Aug 7, 2017 at 1:13 PM, hameeza ahmed <
>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>> I did as you said,
>>>>>>>>>>
>>>>>>>>>> Please tell me whether the following is correct now.
>>>>>>>>>>
>>>>>>>>>> def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst,
>>>>>>>>>> _.KRCWM:$mask_wb), (VR_2048:$src1, _.KRCWM:$mask, ins i2048mem:$src2),
>>>>>>>>>>                     "GATHER_256B\t{$src2, {$dst}{${mask}}|${dst}
>>>>>>>>>> {${mask}}, $src2}"),
>>>>>>>>>>                     [(set VR_2048:$dst, _.KRCWM:$mask_wb, (v64i32
>>>>>>>>>> (GatherNode  (VR_2048:$src1), _.KRCWM:$mask,
>>>>>>>>>>                      VR_2048:$src2))],
>>>>>>>>>>                     IIC_MOV_MEM>, TA;
>>>>>>>>>> def: Pat<(v64f32 (GatherNode addr:$src2)),
>>>>>>>>>> (GATHER_256B addr:$src2)>;
>>>>>>>>>>
>>>>>>>>>> Thank You
>>>>>>>>>>
>>>>>>>>>> On Mon, Aug 7, 2017 at 2:57 AM, Craig Topper <
>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> masked_gather returns two results. The data and the modified
>>>>>>>>>>> mask. Note the $dst and the $mask_wb in the pattern below.
>>>>>>>>>>>
>>>>>>>>>>> multiclass avx512_gather<bits<8> opc, string OpcodeStr,
>>>>>>>>>>> X86VectorVTInfo _,
>>>>>>>>>>>                          X86MemOperand memop, PatFrag
>>>>>>>>>>> GatherNode> {
>>>>>>>>>>>   let Constraints = "@earlyclobber $dst, $src1 = $dst, $mask =
>>>>>>>>>>> $mask_wb",
>>>>>>>>>>>       ExeDomain = _.ExeDomain in
>>>>>>>>>>>   def rm  : AVX5128I<opc, MRMSrcMem, (outs _.RC:$dst,
>>>>>>>>>>> _.KRCWM:$mask_wb),
>>>>>>>>>>>             (ins _.RC:$src1, _.KRCWM:$mask, memop:$src2),
>>>>>>>>>>>             !strconcat(OpcodeStr#_.Suffix,
>>>>>>>>>>>             "\t{$src2, ${dst} {${mask}}|${dst} {${mask}},
>>>>>>>>>>> $src2}"),
>>>>>>>>>>>             [(set _.RC:$dst, _.KRCWM:$mask_wb,
>>>>>>>>>>>               (GatherNode  (_.VT _.RC:$src1), _.KRCWM:$mask,
>>>>>>>>>>>                      vectoraddr:$src2))]>, EVEX, EVEX_K,
>>>>>>>>>>>              EVEX_CD8<_.EltSize, CD8VT1>;
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> ~Craig
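
To make the operand list above concrete, here is a minimal scalar model of a masked gather, written as standalone C++ rather than LLVM code. It assumes the AVX-512 behaviour in which lanes with a clear mask bit keep the pass-through value and the mask bit of each completed lane is cleared. The names `maskedGather`, `GatherResult` and `kLanes` are made up for this sketch. It shows the three inputs (pass-through, mask, addresses) and the two results (data, written-back mask), which is the reason $src1 is tied to $dst and $mask to $mask_wb.

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of a 64-lane masked gather (not LLVM code).
constexpr std::size_t kLanes = 64;

struct GatherResult {
  std::array<std::int32_t, kLanes> data; // corresponds to $dst
  std::bitset<kLanes> maskWriteBack;     // corresponds to $mask_wb
};

// Inputs: pass-through vector ($src1), mask ($mask), and a base pointer plus
// per-lane indices standing in for the vector address ($src2).
inline GatherResult maskedGather(const std::array<std::int32_t, kLanes> &passThru,
                                 std::bitset<kLanes> mask,
                                 const std::int32_t *base,
                                 const std::array<std::size_t, kLanes> &index) {
  GatherResult r{passThru, mask};
  for (std::size_t lane = 0; lane < kLanes; ++lane) {
    if (mask[lane]) {
      r.data[lane] = base[index[lane]]; // load the selected element
      r.maskWriteBack.reset(lane);      // mark this lane as completed
    }
    // Lanes with a clear mask bit keep the pass-through value.
  }
  return r;
}
```

Because the result starts out as a copy of the pass-through input, the destination register has to hold $src1 on entry, which is what the "$src1 = $dst" constraint expresses.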
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Aug 6, 2017 at 2:21 PM, hameeza ahmed <
>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I want to implement gather for v64i32. I wrote the following code.
>>>>>>>>>>>>
>>>>>>>>>>>> def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst), (ins
>>>>>>>>>>>> i2048mem:$src),
>>>>>>>>>>>>                     "GATHER_256B\t{$src, $dst|$dst, $src}",
>>>>>>>>>>>>                     [(set VR_2048:$dst, (v64i32 (masked_gather
>>>>>>>>>>>> addr:$src)))],
>>>>>>>>>>>>                     IIC_MOV_MEM>, TA;
>>>>>>>>>>>> def: Pat<(v64f32 (masked_gather addr:$src)),
>>>>>>>>>>>> (GATHER_256B addr:$src)>;
>>>>>>>>>>>>
>>>>>>>>>>>> I also wrote this line in isellowering.h:
>>>>>>>>>>>>
>>>>>>>>>>>>               setOperationAction(ISD::MGATHER,
>>>>>>>>>>>> MVT::v64i32, Legal);
>>>>>>>>>>>>
>>>>>>>>>>>> But I am getting the following error:
>>>>>>>>>>>>
>>>>>>>>>>>> llvm-tblgen: /utils/TableGen/CodeGenDAGPatterns.cpp:2134:
>>>>>>>>>>>> llvm::TreePatternNode *llvm::TreePattern::ParseTreePattern(llvm::Init
>>>>>>>>>>>> *, llvm::StringRef): Assertion `New->getNumTypes() == 1 && "FIXME:
>>>>>>>>>>>> Unhandled"' failed.
>>>>>>>>>>>>
>>>>>>>>>>>> What is my mistake?
>>>>>>>>>>>>
>>>>>>>>>>>> Please help me.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Aug 7, 2017 at 12:03 AM, hameeza ahmed <
>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I am trying to implement vector shuffle for v64i32. Is the
>>>>>>>>>>>>> following correct?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> def VSHUFFLE_256B  : I<0xE8, MRMDestReg, (outs VR_2048:$dst),
>>>>>>>>>>>>> (ins VR_2048:$src1, VRPIM_2048:$src2),"VSHUFFLE_256B\t{$src1,
>>>>>>>>>>>>> $src2, $dst|$dst, $src1, $src2}",
>>>>>>>>>>>>> [(set VR_2048:$dst, (shufflevector (v64i32 VR_2048:$src1),
>>>>>>>>>>>>> (v64i32 VR_2048:$src2)))]>, TA;
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please help.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 11:48 PM, hameeza ahmed <
>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I managed to get rid of the above error with
>>>>>>>>>>>>>> VT.is2048BitVector().
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This was already implemented.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Now I will try to define other vectors, like VT.is4096BitVector().
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 11:11 PM, hameeza ahmed <
>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thank you. Actually, I have to implement both i32 and i64, so
>>>>>>>>>>>>>>> I have now implemented two instructions, one broadcastS and the other
>>>>>>>>>>>>>>> broadcastD. When broadcasting from memory to a register I got no such
>>>>>>>>>>>>>>> error with one instruction and separate patterns for i64, i32, etc., but
>>>>>>>>>>>>>>> I implemented two versions of that as well, single and double.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am trying to compile matrix multiplication code for larger
>>>>>>>>>>>>>>> vectors. For that I need to add many new instructions to my backend,
>>>>>>>>>>>>>>> such as shuffle and gather. For now I am getting the following
>>>>>>>>>>>>>>> error.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Legalizing: t208: v64i32 = BUILD_VECTOR Constant:i32<-1>,
>>>>>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>>>>>>>>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>
>>>>>>>>>>>>>>> llc: /lib/Target/X86/X86ISelLowering.cpp:5525:
>>>>>>>>>>>>>>> llvm::SDValue getOnesVector(llvm::EVT, const llvm::X86Subtarget &,
>>>>>>>>>>>>>>> llvm::SelectionDAG &, const llvm::SDLoc &): Assertion `(VT.is128BitVector()
>>>>>>>>>>>>>>> || VT.is256BitVector() || VT.is512BitVector()) && "Expected a
>>>>>>>>>>>>>>> 128/256/512-bit vector type"' failed.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I tried including is2048BitVector() and others. I also added
>>>>>>>>>>>>>>> these types for EVT in vectortype.h, but I was unable to compile the
>>>>>>>>>>>>>>> backend and got errors.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please help.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thank You
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 8:42 PM, Craig Topper <
>>>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> You need a new instruction, and your scalar register size
>>>>>>>>>>>>>>>> needs to match your vector element size, so use GR32 instead of GR64.
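
The size rule can be checked with a little arithmetic. This is only a sketch in standalone C++: `elementBits` is a made-up helper, and it assumes that VR_2048 holds 2048 bits, as its name suggests.

```cpp
// Hypothetical helper: width in bits of one element of a vector register.
constexpr unsigned elementBits(unsigned vectorBits, unsigned numElements) {
  return vectorBits / numElements;
}

// v64i32 in a 2048-bit register has 32-bit elements, so the scalar source
// register class should be a 32-bit one (GR32).
static_assert(elementBits(2048, 64) == 32, "v64i32 elements are 32 bits");

// A 64-bit scalar (GR64) would instead pair with a 32-element vector of
// 64-bit elements in the same 2048-bit register.
static_assert(elementBits(2048, 32) == 64, "32 x 64-bit elements fill 2048 bits");
```

Both checks are evaluated at compile time, so a mismatch between the scalar and element widths shows up as a build error in this sketch.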
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 5:44 AM hameeza ahmed <
>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Sorry to disturb you.
>>>>>>>>>>>>>>>>> Now I want to implement an instruction to broadcast the contents of
>>>>>>>>>>>>>>>>> a scalar register to a vector,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> like this:
>>>>>>>>>>>>>>>>> vpbroadcastq zmm0, rsi
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I tried implementing it as follows:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> def BROADCASTR_256B : I<0x21, MRMSrcReg, (outs
>>>>>>>>>>>>>>>>> VR_2048:$dst), (ins GR64:$src),
>>>>>>>>>>>>>>>>>                     "BROADCASTR_256B\t{$src, $dst|$dst,
>>>>>>>>>>>>>>>>> $src}",
>>>>>>>>>>>>>>>>>                     [(set VR_2048:$dst, (v64i32
>>>>>>>>>>>>>>>>> (X86VBroadcast  GR64:$src)))],
>>>>>>>>>>>>>>>>>                     IIC_MOV_MEM>, TA;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast GR64:$src)),
>>>>>>>>>>>>>>>>> (BROADCASTR_256B GR64:$src)>;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Is this fine? Also, do I need to define a new instruction for
>>>>>>>>>>>>>>>>> this, like BROADCASTR_256B, or can I use the previous instruction
>>>>>>>>>>>>>>>>> BROADCAST_256B (the one that broadcasts a scalar from memory to a vector)
>>>>>>>>>>>>>>>>> and just define a new pattern?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please help.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thank You
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 5:10 AM, hameeza ahmed <
>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thank you so much.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Wow, you are simply a genius.
>>>>>>>>>>>>>>>>>> Initially I didn't include the load in either the main
>>>>>>>>>>>>>>>>>> instruction or the pattern, so I added it to both as follows:
>>>>>>>>>>>>>>>>>> def BROADCAST_256B : I<0x31, MRMSrcMem, (outs
>>>>>>>>>>>>>>>>>> VR_2048:$dst), (ins i2048mem:$src),
>>>>>>>>>>>>>>>>>>                     "BROADCAST_256B\t{$src, $dst|$dst,
>>>>>>>>>>>>>>>>>> $src}",
>>>>>>>>>>>>>>>>>>                     [(set VR_2048:$dst, (v64i32
>>>>>>>>>>>>>>>>>> (X86VBroadcast (loadi32 addr:$src))))],
>>>>>>>>>>>>>>>>>>                     IIC_MOV_MEM>, TA;
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast (loadf32 addr:$src))),
>>>>>>>>>>>>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>>>>>>>>>>>> And it worked perfectly.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thank You again.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 4:28 AM, Craig Topper <
>>>>>>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Your pattern needs to be
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast (loadf32 addr:$src))),
>>>>>>>>>>>>>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 2:47 PM, hameeza ahmed <
>>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> it runs fine with v64i32. but with the following pattern
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast addr:$src)),
>>>>>>>>>>>>>>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> i am getting error.
>>>>>>>>>>>>>>>>>>>> What is wrong with this pattern?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 2:01 AM, hameeza ahmed <
>>>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> in x86 it is;
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> def : Pat<(int_x86_avx512_vbroadcast_ss_512
>>>>>>>>>>>>>>>>>>>>> addr:$src),
>>>>>>>>>>>>>>>>>>>>>           (VBROADCASTSSZm addr:$src)>;
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> mine is
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast addr:$src)),
>>>>>>>>>>>>>>>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 1:59 AM, hameeza ahmed <
>>>>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> For v16f32 it is defined as:
>>>>>>>>>>>>>>>>>>>>>> def : Pat<(v16f32 (X86VBroadcast (v16f32 VR512:$src))),
>>>>>>>>>>>>>>>>>>>>>>           (VBROADCASTSSZr (EXTRACT_SUBREG (v16f32
>>>>>>>>>>>>>>>>>>>>>> VR512:$src), sub_xmm))>;
>>>>>>>>>>>>>>>>>>>>>> which is similar to mine.
>>>>>>>>>>>>>>>>>>>>>> Why is it not working then?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 1:45 AM, Craig Topper <
>>>>>>>>>>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> You need a pattern for v64f32 too.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 1:37 PM, hameeza ahmed <
>>>>>>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> as you said; these are instructions that i defined
>>>>>>>>>>>>>>>>>>>>>>>> in instrinfo.td
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> def BROADCAST_256B : I<0x31, MRMSrcMem, (outs
>>>>>>>>>>>>>>>>>>>>>>>> VR_2048:$dst), (ins i2048mem:$src),
>>>>>>>>>>>>>>>>>>>>>>>>                     "BROADCAST_256B\t{$src,
>>>>>>>>>>>>>>>>>>>>>>>> $dst|$dst, $src}",
>>>>>>>>>>>>>>>>>>>>>>>>                     [(set VR_2048:$dst, (v64i32
>>>>>>>>>>>>>>>>>>>>>>>> (X86VBroadcast addr:$src)))],
>>>>>>>>>>>>>>>>>>>>>>>>                     IIC_MOV_MEM>, TA;
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast addr:$src)),
>>>>>>>>>>>>>>>>>>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 1:28 AM, hameeza ahmed <
>>>>>>>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I did as you said;
>>>>>>>>>>>>>>>>>>>>>>>>> now getting this error:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> LLVM ERROR: Cannot select: t63: v64f32 =
>>>>>>>>>>>>>>>>>>>>>>>>> X86ISD::VBROADCAST t62
>>>>>>>>>>>>>>>>>>>>>>>>>   t62: f32,ch = load<LD4[ConstantPool]> t0, t65,
>>>>>>>>>>>>>>>>>>>>>>>>> undef:i64
>>>>>>>>>>>>>>>>>>>>>>>>>     t65: i64 = X86ISD::Wrapper
>>>>>>>>>>>>>>>>>>>>>>>>> TargetConstantPool:i64<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>>>>>       t64: i64 = TargetConstantPool<float
>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>>>>>     t8: i64 = undef
>>>>>>>>>>>>>>>>>>>>>>>>> In function: stencil
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 1:14 AM, Craig Topper <
>>>>>>>>>>>>>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Add VT.is2048BitVector() to the assert?
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> ~Craig
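
A minimal sketch of what widening that assertion looks like, in standalone C++. `VecTy` is a stand-in written for this example and is not LLVM's EVT; only the shape of the check and the assertion message are taken from the error quoted in this thread.

```cpp
#include <cassert>

// Stand-in for the few EVT queries used in the assertion (not LLVM's EVT).
struct VecTy {
  unsigned bits;
  bool is128BitVector() const { return bits == 128; }
  bool is256BitVector() const { return bits == 256; }
  bool is512BitVector() const { return bits == 512; }
  bool is2048BitVector() const { return bits == 2048; }
};

// Same shape as the check in LowerVectorBroadcast, with the extra
// is2048BitVector() term added at the end.
inline bool isBroadcastableWidth(const VecTy &VT) {
  return VT.is128BitVector() || VT.is256BitVector() || VT.is512BitVector() ||
         VT.is2048BitVector();
}

inline void checkBroadcastType(const VecTy &VT) {
  assert(isBroadcastableWidth(VT) && "Unsupported vector type for broadcast.");
  (void)VT; // keeps release builds free of unused-parameter warnings
}
```

Widening the assertion only lets the wider type through the check; the code after it still has to produce a node that the instruction patterns can match.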
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 1:11 PM, hameeza ahmed <
>>>>>>>>>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> I added the setOperationAction line in
>>>>>>>>>>>>>>>>>>>>>>>>>>> isellowering.cpp. Now I am getting the following error.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> llc: /lib/Target/X86/X86ISelLowering.cpp:6801:
>>>>>>>>>>>>>>>>>>>>>>>>>>> llvm::SDValue LowerVectorBroadcast(llvm::BuildVectorSDNode
>>>>>>>>>>>>>>>>>>>>>>>>>>> *, const llvm::X86Subtarget &, llvm::SelectionDAG &): Assertion
>>>>>>>>>>>>>>>>>>>>>>>>>>> `(VT.is128BitVector() || VT.is256BitVector() || VT.is512BitVector()) &&
>>>>>>>>>>>>>>>>>>>>>>>>>>> "Unsupported vector type for broadcast."' failed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> What should I do?
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 12:36 AM, Craig Topper <
>>>>>>>>>>>>>>>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Well, first, have you done this for your type?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> setOperationAction(ISD::BUILD_VECTOR, MVT::v64i32, Custom);
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 12:29 PM, hameeza ahmed
>>>>>>>>>>>>>>>>>>>>>>>>>>>> <hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> How to do this task??
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 12:24 AM, Craig Topper
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> <craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It looks like X86TargetLowering::LowerBUILD_VECTOR
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> is not creating a broadcast node for your wider vector type.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ~Craig
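
As a rough illustration of what the lowering has to notice, the sketch below checks whether every operand of a build vector is the same value, which is the case that can become a broadcast. It is standalone C++ with a made-up name (`getSplatValue`), not the code in X86ISelLowering.cpp.

```cpp
#include <optional>
#include <vector>

// Hypothetical helper: returns the repeated value if all elements are equal,
// or an empty optional if the vector is empty or not a splat.
template <typename T>
std::optional<T> getSplatValue(const std::vector<T> &elements) {
  if (elements.empty())
    return std::nullopt;
  for (const T &e : elements)
    if (!(e == elements.front()))
      return std::nullopt; // not a splat, so one broadcast cannot cover it
  return elements.front();
}
```

In the error dump further down, all 64 operands of the BUILD_VECTOR are the same node t62, so it is exactly this splat case.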
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 12:19 PM, hameeza
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ahmed <hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank You.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I made the changes you mentioned and added the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> broadcast instruction to instructioninfo.td,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> but I made no changes to the isellowering.cpp file.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am still getting the following error.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> LLVM ERROR: Cannot select: t29: v64f32 =
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> BUILD_VECTOR t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>   t62: f32,ch = load<LD4[ConstantPool]> t0,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> t64, undef:i64
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>     t64: i64 = X86ISD::Wrapper
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> TargetConstantPool:i64<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>       t63: i64 = TargetConstantPool<float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>     t8: i64 = undef
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>   t62: f32,ch = load<LD4[ConstantPool]> t0,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> t64, undef:i64
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>     t64: i64 = X86ISD::Wrapper
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> TargetConstantPool:i64<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>       t63: i64 = TargetConstantPool<float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>     t8: i64 = undef
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>   t62: f32,ch = load<LD4[ConstantPool]> t0,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> t64, undef:i64
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>     t64: i64 = X86ISD::Wrapper
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> TargetConstantPool:i64<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>       t63: i64 = TargetConstantPool<float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>    .................
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In function: stencil
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> How to resolve this?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Please help..
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 11:19 PM, Craig
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Topper <craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> You need to use X86VBroadcast not
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "vbroadcast"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 10:50 AM, hameeza
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ahmed <hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I have C code that multiplies a vector
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> by a constant, something like this:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float con=0.2;
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>    for (k = 0; k < N; k++) {
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>        for (i = 1; i <= N-2; i++)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>            for (j = 1; j <= N-2; j++)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>                b[i][j] = con * (a[i][j] +
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>                    a[i-1][j] + a[i+1][j] + a[i][j-1] + a[i][j+1]);
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Now in the LLVM IR I am getting:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>  %22 = fmul <64 x float> %21, <float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> but its assembly in x86 gives;
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .LCPI0_0:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .long 1045220557              # float
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0.200000003
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> vbroadcastss zmm1, dword ptr [rip +
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .LCPI0_0]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> vmulps zmm2, zmm2, zmm1
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> How did it lower the above IR code into
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> vbroadcastss?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> What would be the pattern to match here?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I want to implement a similar broadcast for a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> vector of 64 elements.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I tried the following code:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> def BROADCAST_DWORD : I<0x60, MRMSrcMem,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (outs VREGG:$dst), (ins immem:$src),
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "BROADCAST_DWORD\t{$src, $dst|$dst, $src}",
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>                     [(set VREGG:$dst,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (v64i32 (vbroadcast addr:$src)))],
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>                     IIC_MOV_MEM>, TA;
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Please help me. I am stuck at this point.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank You
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regards