<div dir="ltr">It looks like X86TargetLowering::LowerBUILD_VECTOR is not creating a broadcast node for your wider vector type.</div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature" data-smartmail="gmail_signature">~Craig</div></div>
<br><div class="gmail_quote">On Sat, Aug 5, 2017 at 12:19 PM, hameeza ahmed <span dir="ltr"><<a href="mailto:hahmed2305@gmail.com" target="_blank">hahmed2305@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Thank You.</div><div><br></div><div>I made your mentioned changes and included broadcast instruction in <a href="http://instructioninfo.td" target="_blank">instructioninfo.td</a>. but i made no changes in isellowering.cpp file.</div><div><br></div><div>Still getting the following error.</div><div><br></div><div><br></div><div><br></div><div><br></div><div><div>LLVM ERROR: Cannot select: t29: v64f32 = BUILD_VECTOR t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62, t62</div><div> t62: f32,ch = load<LD4[ConstantPool]> t0, t64, undef:i64</div><div> t64: i64 = X86ISD::Wrapper TargetConstantPool:i64<float 0x3FC99999A0000000> 0</div><div> t63: i64 = TargetConstantPool<float 0x3FC99999A0000000> 0</div><div> t8: i64 = undef</div><div> t62: f32,ch = load<LD4[ConstantPool]> t0, t64, undef:i64</div><div> t64: i64 = X86ISD::Wrapper TargetConstantPool:i64<float 0x3FC99999A0000000> 0</div><div> t63: i64 = TargetConstantPool<float 0x3FC99999A0000000> 0</div><div> t8: i64 = undef</div><div> t62: f32,ch = load<LD4[ConstantPool]> t0, t64, undef:i64</div><div> t64: i64 = X86ISD::Wrapper TargetConstantPool:i64<float 0x3FC99999A0000000> 0</div><div> t63: i64 = TargetConstantPool<float 0x3FC99999A0000000> 0</div><div> .................</div><div>In function: stencil</div></div><div><br></div><div><br></div><div><br></div><div><br></div><div>How to resolve this?</div><div><br></div><div>Please help..</div><div><div 
class="h5"><br><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Aug 5, 2017 at 11:19 PM, Craig Topper <span dir="ltr">&lt;<a href="mailto:craig.topper@gmail.com" target="_blank">craig.topper@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">You need to use <span class="m_-928832443093832218gmail-m_2615204192921729368gmail-s1">X86VBroadcast</span><span class="m_-928832443093832218gmail-m_2615204192921729368gmail-s2">, not "vbroadcast".</span></div><div class="gmail_extra"><span class="m_-928832443093832218gmail-HOEnZb"><font color="#888888"><br clear="all"><div><div class="m_-928832443093832218gmail-m_2615204192921729368gmail_signature">~Craig</div></div></font></span><div><div class="m_-928832443093832218gmail-h5">
<br><div class="gmail_quote">On Sat, Aug 5, 2017 at 10:50 AM, hameeza ahmed <span dir="ltr">&lt;<a href="mailto:hahmed2305@gmail.com" target="_blank">hahmed2305@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hello,<div><br></div><div>I have C code that multiplies a vector by a constant, something like this:</div><div><div>float con=0.2;</div><div> for (k = 0; k &lt; N; k++) {</div><div> for (i = 1; i &lt;= N-2; i++)</div><div> for (j = 1; j &lt;= N-2; j++)</div><div> <span style="white-space:pre-wrap"> </span> b[i][j] = con * (a[i][j] + a[i-1][j] + a[i+1][j] + a[i][j-1] + a[i][j+1]);</div><div> }</div></div><div><br></div><div>In the LLVM IR I am getting:</div><div><br></div><div> %22 = fmul &lt;64 x float&gt; %21, &lt;float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000&gt;<br></div><div><br></div><div>but the x86 assembly generated for it is:</div><div><div>.LCPI0_0:</div><div><span style="white-space:pre-wrap"> </span>.long<span style="white-space:pre-wrap"> </span>1045220557 # float 0.200000003</div></div><div><br></div><div><span style="white-space:pre-wrap"> </span>vbroadcastss<span style="white-space:pre-wrap"> </span>zmm1, dword ptr [rip + .LCPI0_0]<br></div><div><br></div><div>vmulps<span style="white-space:pre-wrap"> </span>zmm2, zmm2, zmm1<br></div><div><br></div><div>How was the above IR lowered into vbroadcastss?</div><div><br></div><div>What would be the pattern to match here?</div><div><br></div><div>I want to implement a similar broadcast for a vector of 64 elements.</div><div><br></div><div>I tried the following code:</div><div><br></div><div><div>def BROADCAST_DWORD : I&lt;0x60, MRMSrcMem, (outs VREGG:$dst), (ins immem:$src),</div><div> "BROADCAST_DWORD\t{$src, $dst|$dst, $src}",</div><div> [(set VREGG:$dst, (v64i32 (vbroadcast addr:$src)))],</div><div> IIC_MOV_MEM&gt;, TA;</div></div><div><br></div><div>Please help me. I am stuck at this point.</div><div><br></div><div>Thank you.</div><div>Regards</div><div><br></div></div>
</blockquote></div><br></div></div></div>
</blockquote></div><br></div></div></div></div>
</blockquote></div><br></div>