[PATCH/RFC] Pre-increment preparation pass

Mon Feb 4 20:52:30 PST 2013

On Feb 4, 2013, at 7:33 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> If you don't mind, I'd appreciate some more specific advice. First, is the current implementation of LSR capable of performing this transformation:
>>> for (int i = 0; i < N; ++i) {
>>> x[i] = y[i];
>>> }
>>> needs to be transformed to look more like this:
>>> T *a = &x[-1], *b = &y[-1];
>>> for (int i = 0; i < N; ++i) {
>>> *++a = *++b;
>>> }
> or is this what the "straight-line address-chain formation pass" you imagine would do? If LSR can do this, what contributes to the decision of whether or not it should be done? In some sense, this is the most important part because this is what enables using the pre-increment forms in the first place. Convincing LSR to otherwise form the chains in unrolled loops seems to be a function of the chain cost functions. Where should I start looking to see how to modify those?

Now, regarding modifying the LSR pass itself: the chain heuristics are in isProfitableIncrement and isProfitableChain.

It should be able to handle the above loop. It might be more likely to handle an unrolled version, but that's a matter of heuristics.
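
To make the target shape concrete, here is a minimal C sketch of the two loop forms from the quoted message. It only illustrates the source-level shape and is not what LSR actually emits. The function names (copy_indexed, copy_preinc, forms_agree) are made up for this example, and the test arrays carry one padding element in front so that the "one before the first element" pointer stays valid C.

```c
#include <assert.h>
#include <stddef.h>

/* Indexed form: the address is recomputed from i on every iteration. */
void copy_indexed(int *x, const int *y, size_t n) {
    for (size_t i = 0; i < n; ++i)
        x[i] = y[i];
}

/* Pre-increment form. The caller passes pointers to the element just
 * BEFORE the first one to copy (assumption: that element exists, so the
 * pointer arithmetic is well defined). Each access bumps the pointer
 * first and then uses it. */
void copy_preinc(int *x_before, const int *y_before, size_t n) {
    int *a = x_before;
    const int *b = y_before;
    for (size_t i = 0; i < n; ++i)
        *++a = *++b;
}

/* Hypothetical self-check: returns 1 when both forms write the same data. */
int forms_agree(void) {
    int src[5] = {0, 10, 20, 30, 40}; /* src[0] is padding */
    int d1[5] = {0};                  /* d1[0] is padding */
    int d2[5] = {0};                  /* d2[0] is padding */

    copy_indexed(d1 + 1, src + 1, 4); /* copies src[1..4] into d1[1..4] */
    copy_preinc(d2, src, 4);          /* starts one element earlier */

    for (size_t i = 0; i < 5; ++i)
        if (d1[i] != d2[i])
            return 0;
    return d1[4] == 40;
}
```

On a target with pre-increment (update-form) loads and stores, each *++p in the second loop can map to a single instruction that both updates the pointer and accesses memory, which is why the loop has to be put in this shape first.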

I haven't looked at how LSR handles vectorized loops yet… you may find some interesting behavior.

-Andy