<div dir="ltr"><br><br><div class="gmail_quote">On Thu, Apr 30, 2015 at 12:24 PM Mehdi Amini <<a href="mailto:mehdi.amini@apple.com">mehdi.amini@apple.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><blockquote type="cite"><div>On Apr 30, 2015, at 12:04 PM, Owen Anderson <<a href="mailto:resistor@mac.com" target="_blank">resistor@mac.com</a>> wrote:</div><br><div><div style="word-wrap:break-word"><br><div><blockquote type="cite"><div>On Apr 30, 2015, at 8:41 AM, Sanjay Patel <<a href="mailto:spatel@rotateright.com" target="_blank">spatel@rotateright.com</a>> wrote:</div><br><div><span style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important">So to me, an in-order machine is still superscalar and pipelined. You have to expose ILP or you die a high-frequency death.</span></div></blockquote><br></div><div>Many (most?) GPUs hide latencies via massive hyper threading rather than exploiting per-thread ILP.  The hardware presents a model where every instruction has unit latency, because the real latency is entirely hidden by hyper threading.  Using more registers eats up the finite pool of storage in the chip, limiting the number of threads that can run concurrently, and ultimately reducing the hardware’s ability to hyper thread, killing performance.</div><div><br></div><div>This isn’t just a concern for GPUs, though.  Even superscalar CPUs are not necessarily uniformly superscalar.  I’m aware of plenty of lower power designs that can multi-issue integer instructions but not floating point, for instance.</div></div></div></blockquote><div><br></div></div></div><div style="word-wrap:break-word"><div><div>How would OOO change anything with respect to this transformation?</div><div><br></div></div></div></blockquote><div><br></div><div>Basically using a simplifying assumption of OoO is "really large multiple issue".</div><div><br></div><div>-eric</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><div></div><div>— </div></div></div><div style="word-wrap:break-word"><div><div>Mehdi</div><div><br></div></div></div>_______________________________________________<br>

llvm-commits mailing list<br>

<a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>

<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>

</blockquote></div></div>