[PATCH] Break dependencies in large loops containing reductions (LoopVectorize)
hfinkel at anl.gov
Wed Feb 11 14:46:09 PST 2015
In http://reviews.llvm.org/D7514#122228, @ohsallen wrote:
> > There is a separate register-pressure heuristic, and it already uses a different TTI interface to get the number of available registers. Look at the calculateRegisterUsage() function.
>
>
> Right, but then why not set 12 as the max interleave factor for POWER7/POWER8? From our previous discussion (http://reviews.llvm.org/D7503), I understood that you didn't want to use 12 because of potential spilling. I feel like register pressure and available ILP should be completely separate concerns.
No, I simply didn't want to use 12 without benchmarking it first, because with 12 you're really depending on the register-pressure heuristic to do a good job (the number of allocatable registers is only a small multiple of 12).
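To make the dependence on the register-pressure heuristic concrete, here is a minimal sketch of how a target's maximum interleave factor might be clamped by register availability. The struct, the function names, and the power-of-two rounding are assumptions for illustration; this is not the code in LoopVectorize.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical summary of a loop's register needs (names are illustrative,
// not taken from LLVM).
struct LoopRegisterUsage {
  unsigned loopInvariantRegs; // values that stay live across the whole loop
  unsigned maxLocalUsers;     // peak number of simultaneously live values per copy
};

// Largest power of two that is <= x; returns 0 when x == 0.
unsigned powerOf2Floor(unsigned x) {
  unsigned p = 0;
  for (unsigned bit = 1; bit != 0 && bit <= x; bit <<= 1)
    p = bit;
  return p;
}

// Clamp the target's preferred interleave factor by what the register file
// can hold without spilling (assumed policy: round down to a power of two).
unsigned clampInterleave(unsigned targetMaxInterleave, unsigned numRegs,
                         const LoopRegisterUsage &usage) {
  assert(usage.maxLocalUsers > 0 && "a loop body uses at least one register");
  if (numRegs <= usage.loopInvariantRegs)
    return 1; // no room for extra copies of the loop body
  unsigned byPressure =
      powerOf2Floor((numRegs - usage.loopInvariantRegs) / usage.maxLocalUsers);
  return std::max(1u, std::min(targetMaxInterleave, byPressure));
}
```

With an illustrative register file of 32 registers and a target maximum of 12, a loop body needing 10 live values per copy is limited to an interleave factor of 2, so the result is decided almost entirely by the pressure estimate rather than by the target's maximum.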
>
>
> > Maybe, but even if we only care about latency combined with ILP, we sometimes care only about ILP, independent of latency. Thus, I think having interfaces that return (average) instruction latency, (average) ILP, and whether or not the processor can speculate future loop iterations without dependencies (true for OoO, false for in-order, generically), etc., is the most straightforward from the backend modeling perspective.
>
>
> This sounds great, I would be happy to implement that. I feel like when you talk about ILP, you're talking about the number of functional units, am I right?
Yes.
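As a rough sketch of the three queries discussed above (average latency, average ILP, and whether the core can speculate future iterations), something like the following could be a starting point. The struct, its field names, and the latency-times-ILP policy are all hypothetical; none of this is an existing TTI interface.

```cpp
#include <algorithm>

// Hypothetical target description mirroring the three queries above.
struct TargetIlpInfo {
  unsigned avgLatency;       // average instruction latency, in cycles
  unsigned avgIlp;           // average ILP, i.e. number of functional units
  bool speculatesIterations; // true for OoO cores, false for in-order
};

// Suggest how many independent copies of the loop body to interleave.
// Assumed policy: enough copies to keep every unit busy for one latency.
unsigned suggestInterleave(const TargetIlpInfo &target, bool hasReduction,
                           unsigned maxInterleave) {
  // A reduction is a loop-carried dependency, so the hardware cannot overlap
  // iterations on its own; an in-order core cannot overlap them in any case.
  bool needsSoftwareIlp = hasReduction || !target.speculatesIterations;
  if (!needsSoftwareIlp)
    return 1;
  unsigned wanted = std::max(1u, target.avgLatency * target.avgIlp);
  return std::max(1u, std::min(wanted, maxInterleave));
}
```

Keeping latency and ILP as separate fields lets a client that cares only about ILP, independent of latency, read that one value directly.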
> I will try to put something together without changing the current policy too much.
>
> Thanks
Sounds good, thanks!
http://reviews.llvm.org/D7514