[PATCH] Break dependencies in large loops containing reductions (LoopVectorize)

Wed Feb 11 14:46:09 PST 2015

In http://reviews.llvm.org/D7514#122228, @ohsallen wrote:

> > There is a separate register-pressure heuristic, and it already uses a different TTI interface to get the number of available registers. Look at the calculateRegisterUsage() function.
>
> Right, but then why not set 12 as the max interleave factor for POWER7/POWER8? From our previous discussion (http://reviews.llvm.org/D7503), I understood that you didn't want to put 12 because of potential spilling. I feel like register pressure and available ILP should be completely separate concerns.

No, I simply didn't want to put 12 without benchmarking it first, because if you put 12, you're really depending on that heuristic to do a good job (because the number of allocatable registers is only a small multiple of that).
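
To make the concern concrete, here is a minimal sketch of how a requested interleave factor interacts with a register budget. It is not what calculateRegisterUsage() actually does; the function names, the parameters, and the register counts in the usage note are all hypothetical and chosen only for illustration.

```cpp
// Hypothetical sketch, NOT LLVM's actual register-pressure heuristic.
// Assumption: each interleaved copy of the loop body keeps LiveRegsPerCopy
// values live, on top of LoopInvariantRegs values shared by all copies.

// Largest interleave factor whose live values still fit in NumRegs
// registers (never less than 1).
constexpr unsigned maxInterleaveForRegisters(unsigned NumRegs,
                                             unsigned LoopInvariantRegs,
                                             unsigned LiveRegsPerCopy) {
  return (LiveRegsPerCopy == 0 || NumRegs <= LoopInvariantRegs)
             ? 1
             : ((NumRegs - LoopInvariantRegs) / LiveRegsPerCopy == 0
                    ? 1
                    : (NumRegs - LoopInvariantRegs) / LiveRegsPerCopy);
}

// The target's maximum interleave factor only takes effect when the
// register budget allows it; otherwise the register limit wins.
constexpr unsigned clampInterleave(unsigned TargetMax, unsigned RegLimit) {
  return TargetMax < RegLimit ? TargetMax : RegLimit;
}
```

With made-up numbers, clampInterleave(12, maxInterleaveForRegisters(32, 2, 3)) yields 10: a target maximum of 12 is cut back as soon as each copy needs three registers. The larger the target maximum, the more the final result depends on how accurate the per-copy estimate is.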

> > Maybe, but even if we only care about latency combined with ILP, we sometimes care only about ILP, independent of latency. Thus, I think having interfaces that return (average) instruction latency, (average) ILP, and whether or not the processor can speculate future loop iterations without dependencies (true for OoO, false for in-order, generically) etc. is the most straightforward from the backend modeling perspective.
>
> This sounds great, I would be happy to implement that. I feel like when you talk about ILP, you're talking about the number of functional units, am I right?

Yes.
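
As a rough sketch of what such interfaces could look like (nothing below exists in TTI; the struct, its fields, and both helper functions are hypothetical names, and the policy they encode is only one possible reading of the proposal):

```cpp
// Hypothetical backend model, NOT an existing TTI interface.
struct CoreModel {
  unsigned AvgLatency;             // average instruction latency, in cycles
  unsigned NumFunctionalUnits;     // "ILP" in the sense discussed above
  bool SpeculatesAcrossIterations; // true for OoO, false for in-order
};

// Assumption: a reduction forms a loop-carried dependency chain, so keeping
// every unit busy needs roughly latency * units independent chains.
constexpr unsigned reductionInterleave(const CoreModel &M) {
  return M.AvgLatency * M.NumFunctionalUnits;
}

// Assumption: when iterations are already independent, an OoO core overlaps
// them in hardware, while an in-order core needs one copy per unit. This is
// the case where only ILP matters, independent of latency.
constexpr unsigned independentLoopInterleave(const CoreModel &M) {
  return M.SpeculatesAcrossIterations ? 1 : M.NumFunctionalUnits;
}
```

For a made-up core with latency 6 and 2 units, reductionInterleave gives 12, whereas independentLoopInterleave gives 1 if the core is out-of-order and 2 if it is in-order. Any such result would still have to pass through the separate register-pressure check.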

> I will try to put something together without changing the current policy too much.
>
> Thanks

Sounds good, thanks!

http://reviews.llvm.org/D7514
