How does foldMemoryOpImpl work for those patterns that are using subreg? Won't it only create a v4i32 constant pool entry?<br><br><div class="gmail_quote">On Fri, Jan 13, 2012 at 10:42 AM, Jakob Stoklund Olesen <span dir="ltr"><<a href="mailto:stoklund@2pi.dk">stoklund@2pi.dk</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im"><br>

On Jan 13, 2012, at 10:29 AM, Craig Topper wrote:<br>

<br>

> Well, I didn't understand the FIXME comment about the JIT issue around AVX_SET0PSY and AVX_SET0PDY, so I did it this way. If you can shed any light on that FIXME, that would be great.<br>

<br>

</div>The (non-MC) JIT requires encoding bits on any instruction that reaches the end of codegen. The ExpandPostRAPseudos pass avoids this problem by expanding pseudos before they reach the JIT. Such pseudos don't need encoding information.<br>


<br>

That's why V_SET0 can be marked as a proper pseudo-instruction:<br>

<br>

def V_SET0 : I<0, Pseudo, (outs VR128:$dst), (ins), "", []>;<br>

<div class="im"><br>

> What about V_SET0 being defined on VR128? Do we need a separate one for VR256, or is there some way to reuse it?<br>

<br>

</div>You can use (SUBREG_TO_REG (i32 0), (V_SET0), sub_xmm).<br>

<br>

There are already patterns like that:<br>

<div class="im"><br>

// AVX has no support for 256-bit integer instructions, but since the 128-bit<br>

// VPXOR instruction writes zero to its upper part, it's safe to build zeros.<br>

</div>def : Pat<(v8i32 immAllZerosV), (SUBREG_TO_REG (i32 0), (V_SET0), sub_xmm)>;<br>

def : Pat<(bc_v8i32 (v8f32 immAllZerosV)),<br>

          (SUBREG_TO_REG (i32 0), (V_SET0), sub_xmm)>;<br>

<br>

def : Pat<(v4i64 immAllZerosV), (SUBREG_TO_REG (i64 0), (V_SET0), sub_xmm)>;<br>

def : Pat<(bc_v4i64 (v8f32 immAllZerosV)),<br>

          (SUBREG_TO_REG (i64 0), (V_SET0), sub_xmm)>;<br>
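<br>
A minimal, untested sketch of how the same trick might extend to the 256-bit floating-point zero vectors that AVX_SET0PSY and AVX_SET0PDY appear to cover. It assumes these patterns would sit next to the integer ones above, under the existing HasAVX predicate, and that V_SET0 is expanded to a VEX-encoded 128-bit xor when AVX is enabled. It does not settle the foldMemoryOpImpl question.<br>

```tablegen
// Assumption: under AVX, V_SET0 becomes a VEX-encoded 128-bit xor, which
// also clears bits 255:128 of the containing YMM register. SUBREG_TO_REG
// then only asserts that the upper half is already zero.
let Predicates = [HasAVX] in {
def : Pat<(v8f32 immAllZerosV), (SUBREG_TO_REG (i32 0), (V_SET0), sub_xmm)>;
def : Pat<(v4f64 immAllZerosV), (SUBREG_TO_REG (i32 0), (V_SET0), sub_xmm)>;
}
```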

<br>

/jakob<br>

<br>

</blockquote></div><br><br clear="all"><br>-- <br>~Craig<br>