[llvm-dev] VBROADCAST Implementation Issues
hameeza ahmed via llvm-dev
llvm-dev at lists.llvm.org
Sun Aug 6 14:21:59 PDT 2017
i want to implement gather for v64i32. i wrote following code.
def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst), (ins
i2048mem:$src),
"GATHER_256B\t{$src, $dst|$dst, $src}",
[(set VR_2048:$dst, (v64i32 (masked_gather
addr:$src)))],
IIC_MOV_MEM>, TA;
def: Pat<(v64f32 (masked_gather addr:$src)), (GATHER_256B addr:$src)>;
Also i wrote this line in isellowering.h
setOperationAction(ISD::MGATHER, MVT::v64i32,
Legal);
But I am getting following error:
llvm-tblgen: /utils/TableGen/CodeGenDAGPatterns.cpp:2134:
llvm::TreePatternNode *llvm::TreePattern::ParseTreePattern(llvm::Init *,
llvm::StringRef): Assertion `New->getNumTypes() == 1 && "FIXME: Unhandled"'
failed.
What is my mistake?
Please help me.
On Mon, Aug 7, 2017 at 12:03 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote:
> I am trying to implement vector shuffle for v64i32. Is the following
> correct?
>
>
> def VSHUFFLE_256B : I<0xE8, MRMDestReg, (outs VR_2048:$dst),
> (ins VR_2048:$src1, VRPIM_2048:$src2),"VSHUFFLE_256B\t{$src1, $src2,
> $dst|$dst, $src1, $src2}",
> [(set VR_2048:$dst, (shufflevector (v64i32 VR_2048:$src1), (v64i32
> VR_2048:$src2)))]>, TA;
>
> Please help.
>
>
>
>
> On Sun, Aug 6, 2017 at 11:48 PM, hameeza ahmed <hahmed2305 at gmail.com>
> wrote:
>
>> i managed to get rid of above error for VT.is2048BitVector()).
>>
>> this was implemented already.
>>
>> now will try define other vectors like VT.is4096BitVector()).
>>
>>
>>
>> On Sun, Aug 6, 2017 at 11:11 PM, hameeza ahmed <hahmed2305 at gmail.com>
>> wrote:
>>
>>> Thank you. actually i have to implement both i32 and i64. so i
>>> implemented two instructions now one broadcastS other broadcastD. Although
>>> while doing broadcast from memory to register i was getting no such error
>>> with 1 instruction and other patterns i64, i32 etc. but then also i
>>> implemented its 2 versions single and double.
>>>
>>> Actually, i am trying to compile matrix multiplication code for greater
>>> size vector. There i need to include many new instructions in my backend
>>> like shuffle, gather etc. For now i am getting the following error.
>>>
>>>
>>> Legalizing: t208: v64i32 = BUILD_VECTOR Constant:i32<-1>,
>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>
>>> llc: /lib/Target/X86/X86ISelLowering.cpp:5525: llvm::SDValue
>>> getOnesVector(llvm::EVT, const llvm::X86Subtarget &, llvm::SelectionDAG &,
>>> const llvm::SDLoc &): Assertion `(VT.is128BitVector() ||
>>> VT.is256BitVector() || VT.is512BitVector()) && "Expected a 128/256/512-bit
>>> vector type"' failed.
>>>
>>> i tried including is2048Bit Vector() and others. also in vectortype.h i
>>> included these types for EVT but was unable to compile backend and getting
>>> errors.
>>>
>>> Please help.
>>>
>>> Thank You
>>>
>>>
>>> On Sun, Aug 6, 2017 at 8:42 PM, Craig Topper <craig.topper at gmail.com>
>>> wrote:
>>>
>>>> You need a new instruction. And your scalar register size needs to
>>>> match your vector element size. So GR32 instead of GR64
>>>>
>>>> On Sun, Aug 6, 2017 at 5:44 AM hameeza ahmed <hahmed2305 at gmail.com>
>>>> wrote:
>>>>
>>>>> Sorry to disturb,
>>>>> Now i want to implement instruction to broadcast scalar register
>>>>> content to vector.
>>>>>
>>>>> like this;
>>>>> vpbroadcastq zmm0, rsi
>>>>>
>>>>>
>>>>> I tried implementing it as follows;
>>>>>
>>>>> def BROADCASTR_256B : I<0x21, MRMSrcReg, (outs VR_2048:$dst), (ins
>>>>> GR64:$src),
>>>>> "BROADCASTR_256B\t{$src, $dst|$dst, $src}",
>>>>> [(set VR_2048:$dst, (v64i32 (X86VBroadcast
>>>>> GR64:$src)))],
>>>>> IIC_MOV_MEM>, TA;
>>>>>
>>>>>
>>>>>
>>>>> def: Pat<(v64f32 (X86VBroadcast GR64:$src)),
>>>>> (BROADCASTR_256B GR64:$src)>;
>>>>>
>>>>>
>>>>> Is it fine? Also do i need to define a new instruction for this like
>>>>> BROADCASTR_256B? can i use the previous instruction BROADCAST_256B (the one
>>>>> that broadcast memory scalar to vector) and just define new pattern?
>>>>>
>>>>> Please help.
>>>>>
>>>>> Thank You
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Aug 6, 2017 at 5:10 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thank You so much.
>>>>>>
>>>>>> Wao you are simply genius.
>>>>>> initially I didnt include load in both the main instruction and
>>>>>> pattern so i included in both as follows:
>>>>>> def BROADCAST_256B : I<0x31, MRMSrcMem, (outs VR_2048:$dst), (ins
>>>>>> i2048mem:$src),
>>>>>> "BROADCAST_256B\t{$src, $dst|$dst, $src}",
>>>>>> [(set VR_2048:$dst, (v64i32 (X86VBroadcast (
>>>>>> loadi32 addr:$src))))],
>>>>>> IIC_MOV_MEM>, TA;
>>>>>>
>>>>>> def: Pat<(v64f32 (X86VBroadcast (loadf32 addr:$src))),
>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>> And it worked perfectly.
>>>>>>
>>>>>> Thank You again.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sun, Aug 6, 2017 at 4:28 AM, Craig Topper <craig.topper at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Your pattern needs to be
>>>>>>>
>>>>>>> def: Pat<(v64f32 (X86VBroadcast (loadf32 addr:$src))),
>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>
>>>>>>> ~Craig
>>>>>>>
>>>>>>> On Sat, Aug 5, 2017 at 2:47 PM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> it runs fine with v64i32. but with the following pattern
>>>>>>>>
>>>>>>>> def: Pat<(v64f32 (X86VBroadcast addr:$src)),
>>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>>
>>>>>>>> i am getting error.
>>>>>>>> What is wrong with this pattern?
>>>>>>>>
>>>>>>>> On Sun, Aug 6, 2017 at 2:01 AM, hameeza ahmed <hahmed2305 at gmail.com
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> in x86 it is;
>>>>>>>>>
>>>>>>>>> def : Pat<(int_x86_avx512_vbroadcast_ss_512 addr:$src),
>>>>>>>>> (VBROADCASTSSZm addr:$src)>;
>>>>>>>>>
>>>>>>>>> mine is
>>>>>>>>>
>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast addr:$src)),
>>>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Aug 6, 2017 at 1:59 AM, hameeza ahmed <
>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> for v16f32 it is defined as;
>>>>>>>>>> : Pat<(v16f32 (X86VBroadcast (v16f32 VR512:$src))),
>>>>>>>>>> (VBROADCASTSSZr (EXTRACT_SUBREG (v16f32 VR512:$src),
>>>>>>>>>> sub_xmm))>;
>>>>>>>>>> which is similar to mine.
>>>>>>>>>> Why its not working then?
>>>>>>>>>>
>>>>>>>>>> On Sun, Aug 6, 2017 at 1:45 AM, Craig Topper <
>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> You need a pattern for v64f32 too.
>>>>>>>>>>>
>>>>>>>>>>> ~Craig
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Aug 5, 2017 at 1:37 PM, hameeza ahmed <
>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> as you said; these are instructions that i defined in
>>>>>>>>>>>> instrinfo.td
>>>>>>>>>>>>
>>>>>>>>>>>> def BROADCAST_256B : I<0x31, MRMSrcMem, (outs VR_2048:$dst),
>>>>>>>>>>>> (ins i2048mem:$src),
>>>>>>>>>>>> "BROADCAST_256B\t{$src, $dst|$dst, $src}",
>>>>>>>>>>>> [(set VR_2048:$dst, (v64i32 (X86VBroadcast
>>>>>>>>>>>> addr:$src)))],
>>>>>>>>>>>> IIC_MOV_MEM>, TA;
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast addr:$src)),
>>>>>>>>>>>> (BROADCAST_256B addr:$src)>;
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Aug 6, 2017 at 1:28 AM, hameeza ahmed <
>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I did as you said;
>>>>>>>>>>>>> now getting this error:
>>>>>>>>>>>>>
>>>>>>>>>>>>> LLVM ERROR: Cannot select: t63: v64f32 = X86ISD::VBROADCAST t62
>>>>>>>>>>>>> t62: f32,ch = load<LD4[ConstantPool]> t0, t65, undef:i64
>>>>>>>>>>>>> t65: i64 = X86ISD::Wrapper TargetConstantPool:i64<float
>>>>>>>>>>>>> 0x3FC99999A0000000> 0
>>>>>>>>>>>>> t64: i64 = TargetConstantPool<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>> t8: i64 = undef
>>>>>>>>>>>>> In function: stencil
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 1:14 AM, Craig Topper <
>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Add VT.is2048BitVector() to the assert?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 1:11 PM, hameeza ahmed <
>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> added the setoperationaction line in isellowering.cpp. now
>>>>>>>>>>>>>>> getting the following error.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> llc: /lib/Target/X86/X86ISelLowering.cpp:6801:
>>>>>>>>>>>>>>> llvm::SDValue LowerVectorBroadcast(llvm::BuildVectorSDNode
>>>>>>>>>>>>>>> *, const llvm::X86Subtarget &, llvm::SelectionDAG &): Assertion
>>>>>>>>>>>>>>> `(VT.is128BitVector() || VT.is256BitVector() || VT.is512BitVector()) &&
>>>>>>>>>>>>>>> "Unsupported vector type for broadcast."' failed.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What should I do?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 12:36 AM, Craig Topper <
>>>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Well first have you done this for your type
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> setOperationAction(ISD::BUILD_VECTOR, v64i32, Custom);
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 12:29 PM, hameeza ahmed <
>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> How to do this task??
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 12:24 AM, Craig Topper <
>>>>>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> It looks like X86TargetLowering::LowerBUILD_VECTOR is
>>>>>>>>>>>>>>>>>> not creating a broadcast node for your wider vector type.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 12:19 PM, hameeza ahmed <
>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thank You.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I made your mentioned changes and included broadcast
>>>>>>>>>>>>>>>>>>> instruction in instructioninfo.td. but i made no
>>>>>>>>>>>>>>>>>>> changes in isellowering.cpp file.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Still getting the following error.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> LLVM ERROR: Cannot select: t29: v64f32 = BUILD_VECTOR
>>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>>> t62, t62, t62, t62
>>>>>>>>>>>>>>>>>>> t62: f32,ch = load<LD4[ConstantPool]> t0, t64,
>>>>>>>>>>>>>>>>>>> undef:i64
>>>>>>>>>>>>>>>>>>> t64: i64 = X86ISD::Wrapper
>>>>>>>>>>>>>>>>>>> TargetConstantPool:i64<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>> t63: i64 = TargetConstantPool<float
>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>> t8: i64 = undef
>>>>>>>>>>>>>>>>>>> t62: f32,ch = load<LD4[ConstantPool]> t0, t64,
>>>>>>>>>>>>>>>>>>> undef:i64
>>>>>>>>>>>>>>>>>>> t64: i64 = X86ISD::Wrapper
>>>>>>>>>>>>>>>>>>> TargetConstantPool:i64<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>> t63: i64 = TargetConstantPool<float
>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>> t8: i64 = undef
>>>>>>>>>>>>>>>>>>> t62: f32,ch = load<LD4[ConstantPool]> t0, t64,
>>>>>>>>>>>>>>>>>>> undef:i64
>>>>>>>>>>>>>>>>>>> t64: i64 = X86ISD::Wrapper
>>>>>>>>>>>>>>>>>>> TargetConstantPool:i64<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>> t63: i64 = TargetConstantPool<float
>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>> .................
>>>>>>>>>>>>>>>>>>> In function: stencil
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> How to resolve this?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Please help..
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 11:19 PM, Craig Topper <
>>>>>>>>>>>>>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> You need to use X86VBroadcast not "vbroadcast"
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 10:50 AM, hameeza ahmed <
>>>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> i have a c code which multiplies vector with constant
>>>>>>>>>>>>>>>>>>>>> something like this;
>>>>>>>>>>>>>>>>>>>>> float con=0.2;
>>>>>>>>>>>>>>>>>>>>> for (k = 0; k < N; k++) {
>>>>>>>>>>>>>>>>>>>>> for (i = 1; i <= N-2; i++)
>>>>>>>>>>>>>>>>>>>>> for (j = 1; j <= N-2; j++)
>>>>>>>>>>>>>>>>>>>>> b[i][j] = con * (a[i][j] + a[i-1][j] +
>>>>>>>>>>>>>>>>>>>>> a[i+1][j] + a[i][j-1] + a[i][j+1]);
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> now in LLVM IR I m getting;
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> %22 = fmul <64 x float> %21, <float
>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000, float 0x3FC99999A0000000, float
>>>>>>>>>>>>>>>>>>>>> 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>> float 0x3FC99999A0000000>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> but its assembly in x86 gives;
>>>>>>>>>>>>>>>>>>>>> .LCPI0_0:
>>>>>>>>>>>>>>>>>>>>> .long 1045220557 # float 0.200000003
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> vbroadcastss zmm1, dword ptr [rip + .LCPI0_0]
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> vmulps zmm2, zmm2, zmm1
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> how does it lowered the above IR code into
>>>>>>>>>>>>>>>>>>>>> vbroadcastss?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> What would be the pattern here to match?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I want to implement similar broadcast for vector of 64
>>>>>>>>>>>>>>>>>>>>> elements.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> i tried the following code;
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> def BROADCAST_DWORD : I<0x60, MRMSrcMem, (outs
>>>>>>>>>>>>>>>>>>>>> VREGG:$dst), (ins immem:$src),
>>>>>>>>>>>>>>>>>>>>> "BROADCAST_DWORD\t{$src,
>>>>>>>>>>>>>>>>>>>>> $dst|$dst, $src}",
>>>>>>>>>>>>>>>>>>>>> [(set VREGG:$dst, (v64i32
>>>>>>>>>>>>>>>>>>>>> (vbroadcast addr:$src)))],
>>>>>>>>>>>>>>>>>>>>> IIC_MOV_MEM>, TA;
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Please help me. I am stuck at this point.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thank You
>>>>>>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>> --
>>>> ~Craig
>>>>
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170807/6a188f2b/attachment-0001.html>
More information about the llvm-dev
mailing list