[llvm] r236031 - transform fadd chains to increase parallelism

Thu Apr 30 12:25:31 PDT 2015

On Thu, Apr 30, 2015 at 12:24 PM Mehdi Amini <mehdi.amini at apple.com> wrote:

> On Apr 30, 2015, at 12:04 PM, Owen Anderson <resistor at mac.com> wrote:
>
>
> On Apr 30, 2015, at 8:41 AM, Sanjay Patel <spatel at rotateright.com> wrote:
>
> So to me, an in-order machine is still superscalar and pipelined. You have
> to expose ILP or you die a high-frequency death.
>
>
> Many (most?) GPUs hide latencies via massive hyper threading rather than
> exploiting per-thread ILP.  The hardware presents a model where every
> instruction has unit latency, because the real latency is entirely hidden
> by hyper threading.  Using more registers eats up the finite pool of
> storage in the chip, limiting the number of threads that can run
> concurrently, and ultimately reducing the hardware’s ability to hyper
> thread, killing performance.
>
> This isn’t just a concern for GPUs, though.  Even superscalar CPUs are not
> necessarily uniformly superscalar.  I’m aware of plenty of lower power
> designs that can multi-issue integer instructions but not floating point,
> for instance.
>
>
> How would OOO change anything with respect to this transformation?
>
>
Basically, a useful simplifying assumption is that OoO behaves like
"really large multiple issue".
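
To make the trade-off in this thread concrete, here is a rough sketch of what
reassociating an fadd chain buys on a multiple-issue machine. It is not the
code from r236031; the function names below are hypothetical and exist only
for illustration. It also assumes reassociation is permitted at all: floating
point addition is not associative, so the two forms can round differently
unless relaxed FP semantics are in effect.

```cpp
// Serial chain: every add consumes the previous result, so for four
// operands the critical path is three dependent adds.
float sum_chain(float a, float b, float c, float d) {
    return ((a + b) + c) + d;
}

// Reassociated form: (a + b) and (c + d) do not depend on each other, so a
// machine that can issue two FP adds at once finishes in two dependent
// steps. The cost is a second partial sum that must stay live in a register.
float sum_balanced(float a, float b, float c, float d) {
    return (a + b) + (c + d);
}

// Length of the dependent-add critical path for n operands (hypothetical
// helpers, not part of LLVM).
unsigned chain_depth(unsigned n) {
    return n > 0 ? n - 1 : 0;
}

unsigned balanced_depth(unsigned n) {
    // Equivalent to ceil(log2(n)) for n >= 1: count how many times the
    // number of combined operands must double to cover n.
    unsigned depth = 0;
    for (unsigned width = 1; width < n; width *= 2)
        ++depth;
    return depth;
}
```

For eight operands the chain has depth 7 and the balanced tree depth 3, but
the balanced tree only helps if the hardware can actually overlap the
independent adds, and it keeps more partial sums live. On a target that hides
latency with threads, or that cannot multi-issue FP, the extra register
pressure may be all that is left.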

-eric

> —
> Mehdi