<html><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><br><div><div>On Sep 24, 2008, at 8:44 AM, Dan Gohman wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div><blockquote type="cite"><blockquote type="cite"><font class="Apple-style-span" color="#006312"><br></font><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">So given two patterns that match the same thing, what's the<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">tiebreaker?<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">I thought it was order in the .td file but that doesn't appear to be<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">the<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">case. I put my pattern first and it isn't selected. I change the<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">other<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">pattern slightly so it won't match anything and then my pattern gets<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">used (so I know my pattern is valid).<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">Also, I really wanted to express this pattern as transforming from one<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">DAG to another, not down to machine instructions. 
I saw this in<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">x86InstSSE.td:<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">// FIXME: may not be able to eliminate this movss with coalescing the<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">src and<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">// dest register classes are different. We really want to write this<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">pattern<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">// like this:<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">// def : Pat<(f32 (vector_extract (v4f32 VR128:$src), (iPTR 0))),<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">// (f32 FR32:$src)>;<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">(this is actually a very useful and important pattern, I wish it was<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">available!)<br></blockquote></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Right. It would be nice to be able to eliminate the unnecessary<br></blockquote><blockquote type="cite">movss. It hasn't shown up on my radar so I haven't really thought out<br></blockquote><blockquote type="cite">the right way to model this. I can see a couple of options:<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">1. Treat these instructions as cross register class copies. The src<br></blockquote><blockquote type="cite">and dst classes are different (VR128 and FR32) but "compatible".<br></blockquote><blockquote type="cite">2. 
Model it as extract_subreg which coalescer can eliminate.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">#2 is conceptually correct. The problem is 128 bit XMM0 is the same<br></blockquote><blockquote type="cite">register as 32 bit (or 64 bit) XMM0. So it's not possible to define<br></blockquote><blockquote type="cite">the super-register / sub-register relationship.<br></blockquote><br>I don't understand the problem with subregs here. Is it just a<br>naming issue? That can be solved by introducing alternate names,<br>like XMM0_32 and XMM0_64, for each of the subregs. They could<br>still be printed as "xmm0" in the assembly output of course.</div></blockquote><div><br></div>Right. That's a workable solution. However, it still adds complexity:</div><div><br></div><div>XMM0_32 = MOVPS2SSrr XMM0</div><div><br></div><div>We need to teach the allocator that the two registers are the "same" and that this is an identity copy.</div><div><br></div><div>Evan</div><div><br><blockquote type="cite"><div><br><br>Dan<br><br><br>_______________________________________________<br>LLVM Developers mailing list<br><a href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a> <a href="http://llvm.cs.uiuc.edu">http://llvm.cs.uiuc.edu</a><br><a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br></div></blockquote></div><br></body></html>