<div dir="ltr"><div><div>Thanks all. Between Owen's GPU description and Mehdi's test cases, I can see how this patch went off the rails.<br><br></div>I'm back to wondering if we can still do this as a DAG combine with the help of a target hook:<br><br>TLI.getReassociationLimit(Opcode, EVT)<br><br></div><div>The hook answers one question: for a given operation on a given data type, does it make sense to try to extract some ILP? By default, we'd make this 0. For a machine that has no exposed superscalar / pipelining ILP opportunities, it would always return 0. A non-zero value would be based on the number of registers and/or issue width and/or pipe stages for the given operation. Something like the 'vectorization factor' or 'interleave factor' used by the vectorizers?<br><br></div><div>unsigned CombineCount = 0;<br></div><div>while (CombineCount < TLI.getReassociationLimit(Opcode, EVT)) {<br></div><div>  if (!tryTheCombine(Opcode, EVT))<br></div><div>    break; // nothing left to combine, so stop instead of spinning<br></div><div>  CombineCount++;<br></div><div>}<br></div><div>  <br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Apr 30, 2015 at 1:25 PM, Eric Christopher <span dir="ltr"><<a href="mailto:echristo@gmail.com" target="_blank">echristo@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><br><div class="gmail_quote"><div><div class="h5">On Thu, Apr 30, 2015 at 12:24 PM Mehdi Amini <<a href="mailto:mehdi.amini@apple.com" target="_blank">mehdi.amini@apple.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><blockquote type="cite"><div>On Apr 30, 2015, at 12:04 PM, Owen Anderson <<a href="mailto:resistor@mac.com" target="_blank">resistor@mac.com</a>> wrote:</div><br><div><div style="word-wrap:break-word"><br><div><blockquote type="cite"><div>On Apr 30, 2015, at 8:41 AM, Sanjay Patel <<a href="mailto:spatel@rotateright.com" 
target="_blank">spatel@rotateright.com</a>> wrote:</div><br><div><span style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important">So to me, an in-order machine is still superscalar and pipelined. You have to expose ILP or you die a high-frequency death.</span></div></blockquote><br></div><div>Many (most?) GPUs hide latencies via massive hyper threading rather than exploiting per-thread ILP.  The hardware presents a model where every instruction has unit latency, because the real latency is entirely hidden by hyper threading.  Using more registers eats up the finite pool of storage in the chip, limiting the number of threads that can run concurrently, and ultimately reducing the hardware’s ability to hyper thread, killing performance.</div><div><br></div><div>This isn’t just a concern for GPUs, though.  Even superscalar CPUs are not necessarily uniformly superscalar.  I’m aware of plenty of lower-power designs that can multi-issue integer instructions but not floating point, for instance.</div></div></div></blockquote><div><br></div></div></div><div style="word-wrap:break-word"><div><div>How would OOO change anything with respect to this transformation?</div><div><br></div></div></div></blockquote><div><br></div></div></div><div>Basically, a simplifying assumption is that OoO is "really large multiple issue".</div><div><br></div><div>-eric</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><div></div><div>— </div></div></div><div style="word-wrap:break-word"><div><div>Mehdi</div><div><br></div></div></div><span class="">_______________________________________________<br>

llvm-commits mailing list<br>

<a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>

<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>

</span></blockquote></div></div>

<br></blockquote></div><br></div>
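<div>To make the TLI.getReassociationLimit(Opcode, EVT) idea at the top of this message a bit more concrete, here is a minimal, self-contained sketch. It is not LLVM code: the Opcode and ValueType enums, the TargetHooks and IntOnlySuperscalar classes, and runCombines are all hypothetical stand-ins, and the "combine" is modeled as just consuming one of a fixed number of opportunities. The example target follows Owen's description of a design that can multi-issue integer instructions but not floating point.</div>

```cpp
#include <cassert>

// Hypothetical stand-ins for the real opcode and value-type enums.
enum class Opcode { Add, FAdd };
enum class ValueType { I32, F32 };

// Hypothetical target hook interface. The default limit is 0, meaning
// "never attempt to reassociate for ILP on this target".
struct TargetHooks {
  virtual ~TargetHooks() = default;
  virtual unsigned getReassociationLimit(Opcode, ValueType) const { return 0; }
};

// Assumed example target: integer adds can be multi-issued (limit 2),
// floating-point adds cannot (limit 0).
struct IntOnlySuperscalar : TargetHooks {
  unsigned getReassociationLimit(Opcode Op, ValueType) const override {
    return Op == Opcode::Add ? 2u : 0u;
  }
};

// Models the driver loop: perform combines until the target's limit is
// reached or no opportunity remains. 'Opportunities' stands in for the
// number of times a real combine would succeed. Returns the combine count.
unsigned runCombines(const TargetHooks &TLI, Opcode Op, ValueType VT,
                     unsigned Opportunities) {
  unsigned CombineCount = 0;
  while (CombineCount < TLI.getReassociationLimit(Op, VT)) {
    if (Opportunities == 0)
      break; // the combine failed; stop rather than loop forever
    --Opportunities;
    ++CombineCount;
  }
  return CombineCount;
}
```

<div>With these assumptions, a target that keeps the default hook never combines, while the example target combines integer adds at most twice and leaves floating-point adds alone, however many opportunities exist.</div>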