[PATCH] D41096: [X86] Initial support for prefer-vector-width function attribute

Mon Dec 11 17:57:03 PST 2017

hfinkel added a comment.

In https://reviews.llvm.org/D41096#951928, @craig.topper wrote:

> The loop vectorizer definitely creates wider vectors even though it's told not to.
>
> For one, it only considers the scalar types of loads, stores, and phis when determining the VF. So if all your loads/stores use i32, but some operations like compares or address calculations use i64 types due to zext/sext, the vectorizer doesn't see them when determining VF. I don't know enough about the vectorizer to say if that should be fixed or not.

Interesting. Under normal circumstances, vectorizing for the smallest/smaller type can make sense. This way you maximally use the vector lanes at all points in the calculation, and the larger types just take more than one underlying register. If using wider vectors affects the clock rate, however, there's a large unaccounted-for cost (it's not a splitting cost, but an overall, potentially large, penalty).
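
To make the arithmetic concrete, here is a rough sketch of the situation described above. It is a simplified model, not the vectorizer's actual logic: the helper names (vfFromMemoryTypes, registersNeeded, clampStore) are made up for illustration, and the 256-bit width in the note below is just an assumed preferred vector width.

```cpp
#include <cstdint>

// Hypothetical model: VF chosen only from the widest load/store/phi
// element type, ignoring wider intermediate values.
constexpr unsigned vfFromMemoryTypes(unsigned vectorBits,
                                     unsigned widestMemElemBits) {
    return vectorBits / widestMemElemBits;
}

// Hypothetical model: how many vectorBits-wide registers are needed to
// hold VF elements of elemBits each (rounded up).
constexpr unsigned registersNeeded(unsigned vf, unsigned elemBits,
                                   unsigned vectorBits) {
    return (vf * elemBits + vectorBits - 1) / vectorBits;
}

// Loop shape in question: every load and store is i32, but the compare
// is done in i64 because of the sign extension.
void clampStore(int32_t *dst, const int32_t *src, int64_t limit, int n) {
    for (int i = 0; i < n; ++i) {
        int64_t wide = src[i];               // sext i32 -> i64
        dst[i] = wide > limit ? 0 : src[i];  // i64 compare, i32 store
    }
}
```

Under this model, a 256-bit width with i32 memory types gives VF = 8, so each i64 intermediate needs two 256-bit registers (or, if nothing stops it, one 512-bit vector). That is fine when the only cost is splitting, but not when the wider vector itself carries a penalty.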

> There's also the interleaved load/store optimization in the vectorizer that very deliberately creates large loads, stores, and shuffles.
> 
> Good point on the ABI requirements.

https://reviews.llvm.org/D41096