[PATCH] Break dependencies in large loops containing reductions (LoopVectorize)

Wed Feb 11 13:58:16 PST 2015

In http://reviews.llvm.org/D7514#122201, @ohsallen wrote:

> > I think we might want to separate the current single number into two numbers: one for ILP and one for latency. But I'm not exactly sure what you're suggesting.
>
>
> It seems we don't really care about the latency alone, we just want to know about ILP. As I see it, the current number takes into account ILP (that is, latency and number of functional units), and **somehow** register pressure, in that we don't want to put a big number like 12 in there.

There is a separate register-pressure heuristic, and it already uses a different TTI interface to get the number of available registers. Look at the calculateRegisterUsage() function.

> What I need is a number which only takes into account ILP, and that is because the cost function I propose already takes register pressure into account. Typically I want to have 12 for P7/P8. What I am suggesting is to add a new TTI function to return the max ILP available.

Maybe, but even if we usually care about latency combined with ILP, we sometimes care only about ILP independent of latency. Thus, I think having interfaces that return (average) instruction latency, (average) ILP, whether or not the processor can speculate future loop iterations without dependencies (true for OoO, false for in-order, generically), etc. is the most straightforward from the backend-modeling perspective.
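
To make the idea of separate interfaces concrete, here is a minimal C++ sketch. Everything in it is hypothetical: `CoreModel`, its fields, and the three helper functions are illustrative names, not existing TTI hooks, and the latency and ILP values in the usage note are assumed numbers chosen only so that the product comes out to the 12 discussed above. The register cap is taken as an input because the register-pressure heuristic is computed separately.

```cpp
#include <algorithm>

// Hypothetical per-target description; none of these names exist in LLVM's TTI.
struct CoreModel {
  unsigned AvgLatency;             // average instruction latency, in cycles
  unsigned AvgILP;                 // average number of instructions issued per cycle
  bool SpeculatesAcrossIterations; // true for OoO cores, false for in-order
};

// Number of independent dependency chains needed to keep the pipelines busy:
// each of the AvgILP issue slots needs AvgLatency operations in flight.
unsigned chainsToSaturate(const CoreModel &M) {
  return std::max(1u, M.AvgLatency * M.AvgILP);
}

// Assumption for this sketch: a core that speculates across iterations hides
// latency on its own unless a loop-carried dependence (such as a reduction)
// serializes the iterations.
bool needsInterleavingForLatency(const CoreModel &M, bool HasLoopCarriedDep) {
  return HasLoopCarriedDep || !M.SpeculatesAcrossIterations;
}

// Interleave count for a loop, capped by a register budget computed elsewhere.
unsigned pickInterleaveCount(const CoreModel &M, bool HasLoopCarriedDep,
                             unsigned MaxByRegisters) {
  if (!needsInterleavingForLatency(M, HasLoopCarriedDep))
    return 1;
  return std::max(1u, std::min(chainsToSaturate(M), MaxByRegisters));
}
```

With an assumed model of `CoreModel{6, 2, true}`, a reduction loop would ask for 12 chains, and `pickInterleaveCount` would lower that to whatever the register budget allows. Keeping latency, ILP, and speculation as separate fields lets a client that cares only about ILP read that field alone.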

http://reviews.llvm.org/D7514
