[llvm-dev] VBROADCAST Implementation Issues

hameeza ahmed via llvm-dev llvm-dev at lists.llvm.org
Mon Aug 7 01:13:39 PDT 2017


Hello,
I did as you said,

Please tell me whether the following correct now??

def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst, _.KRCWM:$mask_wb),
(VR_2048:$src1, _.KRCWM:$mask, ins i2048mem:$src2),
                    "GATHER_256B\t{$src2, {$dst}{${mask}}|${dst} {${mask}},
$src2}"),
                    [(set VR_2048:$dst, _.KRCWM:$mask_wb, (v64i32
(GatherNode  (VR_2048:$src1), _.KRCWM:$mask,
                     VR_2048:$src2))],
                    IIC_MOV_MEM>, TA;
def: Pat<(v64f32 (GatherNode addr:$src2)), (GATHER_256B addr:$src2)>;

Thank You

On Mon, Aug 7, 2017 at 2:57 AM, Craig Topper <craig.topper at gmail.com> wrote:

> masked_gather returns two results. The data and the modified mask. Note
> the $dst and the $mask_wb in the pattern below.
>
> multiclass avx512_gather<bits<8> opc, string OpcodeStr, X86VectorVTInfo _,
>                          X86MemOperand memop, PatFrag GatherNode> {
>   let Constraints = "@earlyclobber $dst, $src1 = $dst, $mask = $mask_wb",
>       ExeDomain = _.ExeDomain in
>   def rm  : AVX5128I<opc, MRMSrcMem, (outs _.RC:$dst, _.KRCWM:$mask_wb),
>             (ins _.RC:$src1, _.KRCWM:$mask, memop:$src2),
>             !strconcat(OpcodeStr#_.Suffix,
>             "\t{$src2, ${dst} {${mask}}|${dst} {${mask}}, $src2}"),
>             [(set _.RC:$dst, _.KRCWM:$mask_wb,
>               (GatherNode  (_.VT _.RC:$src1), _.KRCWM:$mask,
>                      vectoraddr:$src2))]>, EVEX, EVEX_K,
>              EVEX_CD8<_.EltSize, CD8VT1>;
> }
>
> ~Craig
>
> On Sun, Aug 6, 2017 at 2:21 PM, hameeza ahmed <hahmed2305 at gmail.com>
> wrote:
>
>> i want to implement gather for v64i32. i wrote following code.
>>
>> def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst), (ins
>> i2048mem:$src),
>>                     "GATHER_256B\t{$src, $dst|$dst, $src}",
>>                     [(set VR_2048:$dst, (v64i32 (masked_gather
>> addr:$src)))],
>>                     IIC_MOV_MEM>, TA;
>> def: Pat<(v64f32 (masked_gather addr:$src)), (GATHER_256B addr:$src)>;
>>
>> Also i wrote this line in isellowering.h
>>
>>               setOperationAction(ISD::MGATHER,             MVT::v64i32,
>> Legal);
>>
>> But I am getting following error:
>>
>> llvm-tblgen: /utils/TableGen/CodeGenDAGPatterns.cpp:2134:
>> llvm::TreePatternNode *llvm::TreePattern::ParseTreePattern(llvm::Init *,
>> llvm::StringRef): Assertion `New->getNumTypes() == 1 && "FIXME: Unhandled"'
>> failed.
>>
>> What is my mistake?
>>
>> Please help me.
>>
>>
>> On Mon, Aug 7, 2017 at 12:03 AM, hameeza ahmed <hahmed2305 at gmail.com>
>> wrote:
>>
>>> I am trying to implement vector shuffle for v64i32. Is the following
>>> correct?
>>>
>>>
>>> def VSHUFFLE_256B  : I<0xE8, MRMDestReg, (outs VR_2048:$dst),
>>> (ins VR_2048:$src1, VRPIM_2048:$src2),"VSHUFFLE_256B\t{$src1, $src2,
>>> $dst|$dst, $src1, $src2}",
>>> [(set VR_2048:$dst, (shufflevector (v64i32 VR_2048:$src1), (v64i32
>>> VR_2048:$src2)))]>, TA;
>>>
>>> Please help.
>>>
>>>
>>>
>>>
>>> On Sun, Aug 6, 2017 at 11:48 PM, hameeza ahmed <hahmed2305 at gmail.com>
>>> wrote:
>>>
>>>> i managed to get rid of above error for VT.is2048BitVector()).
>>>>
>>>> this was implemented already.
>>>>
>>>> now will try define other vectors like VT.is4096BitVector()).
>>>>
>>>>
>>>>
>>>> On Sun, Aug 6, 2017 at 11:11 PM, hameeza ahmed <hahmed2305 at gmail.com>
>>>> wrote:
>>>>
>>>>> Thank you. actually i have to implement both i32 and i64. so i
>>>>> implemented two instructions now one broadcastS other broadcastD. Although
>>>>> while doing broadcast from memory to register i was getting no such error
>>>>> with 1 instruction and other patterns i64, i32 etc. but then also i
>>>>> implemented its 2 versions single and double.
>>>>>
>>>>> Actually, i am trying to compile matrix multiplication code for
>>>>> greater size vector. There i need to include many new instructions in my
>>>>> backend like shuffle, gather etc. For now i am getting the following error.
>>>>>
>>>>>
>>>>> Legalizing: t208: v64i32 = BUILD_VECTOR Constant:i32<-1>,
>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>
>>>>> llc: /lib/Target/X86/X86ISelLowering.cpp:5525: llvm::SDValue
>>>>> getOnesVector(llvm::EVT, const llvm::X86Subtarget &, llvm::SelectionDAG &,
>>>>> const llvm::SDLoc &): Assertion `(VT.is128BitVector() ||
>>>>> VT.is256BitVector() || VT.is512BitVector()) && "Expected a 128/256/512-bit
>>>>> vector type"' failed.
>>>>>
>>>>>  i tried including is2048Bit Vector() and others. also in vectortype.h
>>>>> i included these types for EVT but was unable to compile backend and
>>>>> getting errors.
>>>>>
>>>>> Please help.
>>>>>
>>>>> Thank You
>>>>>
>>>>>
>>>>> On Sun, Aug 6, 2017 at 8:42 PM, Craig Topper <craig.topper at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> You need a new instruction. And your scalar register size needs to
>>>>>> match your vector element size. So GR32 instead of GR64
>>>>>>
>>>>>> On Sun, Aug 6, 2017 at 5:44 AM hameeza ahmed <hahmed2305 at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Sorry to disturb,
>>>>>>> Now i want to implement instruction to broadcast scalar register
>>>>>>> content to vector.
>>>>>>>
>>>>>>> like this;
>>>>>>> vpbroadcastq zmm0, rsi
>>>>>>>
>>>>>>>
>>>>>>> I tried implementing it as follows;
>>>>>>>
>>>>>>> def BROADCASTR_256B : I<0x21, MRMSrcReg, (outs VR_2048:$dst), (ins
>>>>>>> GR64:$src),
>>>>>>>                     "BROADCASTR_256B\t{$src, $dst|$dst, $src}",
>>>>>>>                     [(set VR_2048:$dst, (v64i32 (X86VBroadcast
>>>>>>>  GR64:$src)))],
>>>>>>>                     IIC_MOV_MEM>, TA;
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> def: Pat<(v64f32 (X86VBroadcast GR64:$src)),
>>>>>>> (BROADCASTR_256B GR64:$src)>;
>>>>>>>
>>>>>>>
>>>>>>> Is it fine? Also do i need to define a new instruction for this like
>>>>>>> BROADCASTR_256B? can i use the previous instruction BROADCAST_256B (the one
>>>>>>> that broadcast memory scalar to vector) and just define new pattern?
>>>>>>>
>>>>>>> Please help.
>>>>>>>
>>>>>>> Thank You
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Aug 6, 2017 at 5:10 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thank You so much.
>>>>>>>>
>>>>>>>> Wao you are simply genius.
>>>>>>>> initially I didnt include load in both the main instruction and
>>>>>>>> pattern so i included in both as follows:
>>>>>>>> def BROADCAST_256B : I<0x31, MRMSrcMem, (outs VR_2048:$dst), (ins
>>>>>>>> i2048mem:$src),
>>>>>>>>                     "BROADCAST_256B\t{$src, $dst|$dst, $src}",
>>>>>>>>                     [(set VR_2048:$dst, (v64i32 (X86VBroadcast (
>>>>>>>> loadi32 addr:$src))))],
>>>>>>>>                     IIC_MOV_MEM>, TA;
>>>>>>>>
>>>>>>>> def: Pat<(v64f32 (X86VBroadcast (loadf32 addr:$src))),
>>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>> And it worked perfectly.
>>>>>>>>
>>>>>>>> Thank You again.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Aug 6, 2017 at 4:28 AM, Craig Topper <
>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Your pattern needs to be
>>>>>>>>>
>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast (loadf32 addr:$src))),
>>>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>>>
>>>>>>>>> ~Craig
>>>>>>>>>
>>>>>>>>> On Sat, Aug 5, 2017 at 2:47 PM, hameeza ahmed <
>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> it runs fine with v64i32. but with the following pattern
>>>>>>>>>>
>>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast addr:$src)),
>>>>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>>>>
>>>>>>>>>> i am getting error.
>>>>>>>>>> What is wrong with this pattern?
>>>>>>>>>>
>>>>>>>>>> On Sun, Aug 6, 2017 at 2:01 AM, hameeza ahmed <
>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> in x86 it is;
>>>>>>>>>>>
>>>>>>>>>>> def : Pat<(int_x86_avx512_vbroadcast_ss_512 addr:$src),
>>>>>>>>>>>           (VBROADCASTSSZm addr:$src)>;
>>>>>>>>>>>
>>>>>>>>>>> mine is
>>>>>>>>>>>
>>>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast addr:$src)),
>>>>>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Aug 6, 2017 at 1:59 AM, hameeza ahmed <
>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> for v16f32 it is defined as;
>>>>>>>>>>>> : Pat<(v16f32 (X86VBroadcast (v16f32 VR512:$src))),
>>>>>>>>>>>>           (VBROADCASTSSZr (EXTRACT_SUBREG (v16f32 VR512:$src),
>>>>>>>>>>>> sub_xmm))>;
>>>>>>>>>>>> which is similar to mine.
>>>>>>>>>>>> Why its not working then?
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Aug 6, 2017 at 1:45 AM, Craig Topper <
>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> You need a pattern for v64f32 too.
>>>>>>>>>>>>>
>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 1:37 PM, hameeza ahmed <
>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> as you said; these are instructions that i defined in
>>>>>>>>>>>>>> instrinfo.td
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> def BROADCAST_256B : I<0x31, MRMSrcMem, (outs VR_2048:$dst),
>>>>>>>>>>>>>> (ins i2048mem:$src),
>>>>>>>>>>>>>>                     "BROADCAST_256B\t{$src, $dst|$dst, $src}",
>>>>>>>>>>>>>>                     [(set VR_2048:$dst, (v64i32
>>>>>>>>>>>>>> (X86VBroadcast addr:$src)))],
>>>>>>>>>>>>>>                     IIC_MOV_MEM>, TA;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast addr:$src)),
>>>>>>>>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 1:28 AM, hameeza ahmed <
>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I did as you said;
>>>>>>>>>>>>>>> now getting this error:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> LLVM ERROR: Cannot select: t63: v64f32 = X86ISD::VBROADCAST
>>>>>>>>>>>>>>> t62
>>>>>>>>>>>>>>>   t62: f32,ch = load<LD4[ConstantPool]> t0, t65, undef:i64
>>>>>>>>>>>>>>>     t65: i64 = X86ISD::Wrapper TargetConstantPool:i64<float
>>>>>>>>>>>>>>> 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>       t64: i64 = TargetConstantPool<float
>>>>>>>>>>>>>>> 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>     t8: i64 = undef
>>>>>>>>>>>>>>> In function: stencil
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 1:14 AM, Craig Topper <
>>>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Add VT.is2048BitVector() to the assert?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 1:11 PM, hameeza ahmed <
>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> added the setoperationaction line in isellowering.cpp. now
>>>>>>>>>>>>>>>>> getting the following error.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> llc: /lib/Target/X86/X86ISelLowering.cpp:6801:
>>>>>>>>>>>>>>>>> llvm::SDValue LowerVectorBroadcast(llvm::BuildVectorSDNode
>>>>>>>>>>>>>>>>> *, const llvm::X86Subtarget &, llvm::SelectionDAG &): Assertion
>>>>>>>>>>>>>>>>> `(VT.is128BitVector() || VT.is256BitVector() || VT.is512BitVector()) &&
>>>>>>>>>>>>>>>>> "Unsupported vector type for broadcast."' failed.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> What should I do?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 12:36 AM, Craig Topper <
>>>>>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Well first have you done this for your type
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> setOperationAction(ISD::BUILD_VECTOR, v64i32, Custom);
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 12:29 PM, hameeza ahmed <
>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> How to do this task??
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 12:24 AM, Craig Topper <
>>>>>>>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> It looks like X86TargetLowering::LowerBUILD_VECTOR is
>>>>>>>>>>>>>>>>>>>> not creating a broadcast node for your wider vector type.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 12:19 PM, hameeza ahmed <
>>>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thank You.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I made your mentioned changes and included broadcast
>>>>>>>>>>>>>>>>>>>>> instruction in instructioninfo.td. but i made no
>>>>>>>>>>>>>>>>>>>>> changes in isellowering.cpp file.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Still getting the following error.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> LLVM ERROR: Cannot select: t29: v64f32 = BUILD_VECTOR
>>>>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>>>>> t62, t62, t62, t62
>>>>>>>>>>>>>>>>>>>>>   t62: f32,ch = load<LD4[ConstantPool]> t0, t64,
>>>>>>>>>>>>>>>>>>>>> undef:i64
>>>>>>>>>>>>>>>>>>>>>     t64: i64 = X86ISD::Wrapper
>>>>>>>>>>>>>>>>>>>>> TargetConstantPool:i64<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>       t63: i64 = TargetConstantPool<float
>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>     t8: i64 = undef
>>>>>>>>>>>>>>>>>>>>>   t62: f32,ch = load<LD4[ConstantPool]> t0, t64,
>>>>>>>>>>>>>>>>>>>>> undef:i64
>>>>>>>>>>>>>>>>>>>>>     t64: i64 = X86ISD::Wrapper
>>>>>>>>>>>>>>>>>>>>> TargetConstantPool:i64<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>       t63: i64 = TargetConstantPool<float
>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>     t8: i64 = undef
>>>>>>>>>>>>>>>>>>>>>   t62: f32,ch = load<LD4[ConstantPool]> t0, t64,
>>>>>>>>>>>>>>>>>>>>> undef:i64
>>>>>>>>>>>>>>>>>>>>>     t64: i64 = X86ISD::Wrapper
>>>>>>>>>>>>>>>>>>>>> TargetConstantPool:i64<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>       t63: i64 = TargetConstantPool<float
>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>>>>    .................
>>>>>>>>>>>>>>>>>>>>> In function: stencil
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> How to resolve this?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Please help..
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 11:19 PM, Craig Topper <
>>>>>>>>>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> You need to use X86VBroadcast not "vbroadcast"
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 10:50 AM, hameeza ahmed <
>>>>>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> i have a c code which multiplies vector with
>>>>>>>>>>>>>>>>>>>>>>> constant something like this;
>>>>>>>>>>>>>>>>>>>>>>> float con=0.2;
>>>>>>>>>>>>>>>>>>>>>>>    for (k = 0; k < N; k++) {
>>>>>>>>>>>>>>>>>>>>>>>        for (i = 1; i <= N-2; i++)
>>>>>>>>>>>>>>>>>>>>>>>            for (j = 1; j <= N-2; j++)
>>>>>>>>>>>>>>>>>>>>>>>         b[i][j] = con * (a[i][j] + a[i-1][j] +
>>>>>>>>>>>>>>>>>>>>>>> a[i+1][j] + a[i][j-1] + a[i][j+1]);
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> now in LLVM IR I m getting;
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>  %22 = fmul <64 x float> %21, <float
>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> but its assembly in x86 gives;
>>>>>>>>>>>>>>>>>>>>>>> .LCPI0_0:
>>>>>>>>>>>>>>>>>>>>>>> .long 1045220557              # float 0.200000003
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> vbroadcastss zmm1, dword ptr [rip + .LCPI0_0]
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> vmulps zmm2, zmm2, zmm1
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> how does it lowered the above IR code into
>>>>>>>>>>>>>>>>>>>>>>> vbroadcastss?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> What would be the pattern here to match?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I want to implement similar broadcast for vector of
>>>>>>>>>>>>>>>>>>>>>>> 64 elements.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> i tried the following code;
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> def BROADCAST_DWORD : I<0x60, MRMSrcMem, (outs
>>>>>>>>>>>>>>>>>>>>>>> VREGG:$dst), (ins immem:$src),
>>>>>>>>>>>>>>>>>>>>>>>                     "BROADCAST_DWORD\t{$src,
>>>>>>>>>>>>>>>>>>>>>>> $dst|$dst, $src}",
>>>>>>>>>>>>>>>>>>>>>>>                     [(set VREGG:$dst, (v64i32
>>>>>>>>>>>>>>>>>>>>>>> (vbroadcast addr:$src)))],
>>>>>>>>>>>>>>>>>>>>>>>                     IIC_MOV_MEM>, TA;
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Please help me. I am stuck at this point.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thank You
>>>>>>>>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>> --
>>>>>> ~Craig
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170807/a5db2fe6/attachment.html>


More information about the llvm-dev mailing list