[PATCH] D18537: Don't vectorize loops when everything will be scalarized

Tue Mar 29 14:15:13 PDT 2016

hfinkel added a comment.

In http://reviews.llvm.org/D18537#386178, @nadav wrote:

> Hal, I am not sure I understand the problem. Is the problem register pressure or the fact that store <8 x i32> is more expensive than 8 times store i32?

It is really just register pressure. Since there is no legal vector type with i32 elements in this configuration, everything is just scalarized (or perhaps I should say "expanded", to avoid overloading terminology here; the point is that this is type legalization, not operation legalization).

> This looks like a problem with the PPC cost model that does not take into account the cost of scalarization.

No, there is no scalarization cost, because nothing is ever a vector. In fact, if you take my test case and turn off interleaving, you get pretty nice-looking code (which is even interleaved in practice, because that's what type legalization gives us). However, between the vector expansion (type legalization) and the interleaving that targets generally request, the register pressure is too high.
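
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch, not code from the patch. The helper name `expandedScalarValues` and its parameters are hypothetical, and it assumes, as a simplification, that a legal vector value occupies one register while an illegal one is expanded into VF scalars by type legalization:

```cpp
// Hypothetical estimate of how many values are live at once after
// type legalization. VF is the vectorization factor, IC the interleave
// count, and LiveValues the number of vector values live simultaneously.
constexpr unsigned expandedScalarValues(unsigned VF, unsigned IC,
                                        unsigned LiveValues,
                                        bool HasLegalVectorType) {
  // Assumption: with a legal vector type each value needs one register;
  // otherwise type legalization expands it into VF scalar pieces.
  const unsigned PartsPerValue = HasLegalVectorType ? 1 : VF;
  return PartsPerValue * IC * LiveValues;
}

// Hypothetical check against the number of available scalar registers.
constexpr bool exceedsRegisterFile(unsigned Needed, unsigned NumRegs) {
  return Needed > NumRegs;
}
```

For example, with VF = 8 (as in the <8 x i32> case above), an assumed interleave count of 2, and an assumed 3 live values, the estimate is 8 * 2 * 3 = 48 scalar values, which does not fit in 32 GPRs. With interleaving turned off (IC = 1) the same loop needs 24, which does fit.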

Also, FWIW, Sanjay says that this patch also fixes PR26837 (which applies to SSE).

http://reviews.llvm.org/D18537