[llvm] r236031 - transform fadd chains to increase parallelism

Thu Apr 30 12:14:58 PDT 2015

On Thu, Apr 30, 2015 at 12:08 PM Owen Anderson <resistor at mac.com> wrote:

>
> On Apr 30, 2015, at 8:41 AM, Sanjay Patel <spatel at rotateright.com> wrote:
> >
> > So to me, an in-order machine is still superscalar and pipelined. You have
> > to expose ILP or you die a high-frequency death.
>
>
> Many (most?) GPUs hide latencies via massive hyperthreading rather than
> exploiting per-thread ILP.  The hardware presents a model where every
> instruction has unit latency, because the real latency is entirely hidden
> by hyperthreading.  Using more registers eats up the finite pool of
> register storage on the chip, limiting the number of threads that can run
> concurrently, and ultimately reducing the hardware’s ability to
> hyperthread, killing performance.
>
> This isn’t just a concern for GPUs, though.  Even superscalar CPUs are not
> necessarily uniformly superscalar.  I’m aware of plenty of lower-power
> designs that can multi-issue integer instructions but not floating-point
> ones, for instance.
>
>
This is a good point. We might want TLI to grow knowledge of instruction
issue rates and out-of-order capability so that we can decide how much
parallelism to aim for here. I definitely agree that this should be
target dependent.
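
To make the trade-off concrete, here is a rough sketch of the arithmetic
involved. None of these names exist in LLVM; the functions are hypothetical,
and the "latency times issue width" rule is just one plausible heuristic for
a pipelined unit, not what the patch or any target actually does.

```cpp
#include <cassert>

// Hypothetical illustration only; none of these names exist in LLVM.

// Dependent fadds on the critical path when n operands are summed serially:
// ((a + b) + c) + d ...
unsigned serialDepth(unsigned n) { return n > 0 ? n - 1 : 0; }

// Dependent fadds on the critical path when the same sum is reassociated
// into a balanced tree: (a + b) + (c + d) ...
unsigned treeDepth(unsigned n) {
  unsigned depth = 0;
  // Each level of the tree halves (rounding up) the number of partial sums.
  for (unsigned remaining = n; remaining > 1;
       remaining = remaining / 2 + remaining % 2)
    ++depth;
  return depth;
}

// Independent operations needed in flight to keep a pipelined FP unit busy,
// assuming it can start issueWidth fadds per cycle and each takes
// latencyCycles to complete. A target that models unit latency and single
// issue gets 1, i.e. no benefit from extra parallelism.
unsigned independentOpsToSaturate(unsigned issueWidth, unsigned latencyCycles) {
  return issueWidth * latencyCycles;
}

// Threads a register file can keep resident if every thread needs
// regsPerThread registers (Owen's register-pressure concern).
unsigned residentThreads(unsigned registerFileSize, unsigned regsPerThread) {
  assert(regsPerThread > 0 && "each thread needs at least one register");
  return registerFileSize / regsPerThread;
}
```

With made-up numbers: summing 8 operands has a critical path of 7 fadds
serially and 3 as a tree, which helps a 2-wide unit with 4-cycle latency
(8 independent ops wanted). On a machine that hides latency with threads,
the extra temporaries are pure cost: with a 65536-entry register file, going
from 32 to 40 registers per thread drops resident threads from 2048 to 1638.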

-eric

> —Owen