[PATCH] D39575: [X86] Add subtarget features prefer-avx256 and prefer-avx128 and use them to limit vector width presented by TTI

Tue Nov 7 05:45:29 PST 2017

RKSimon added a comment.

In https://reviews.llvm.org/D39575#916093, @hfinkel wrote:

> Not to get too far off topic, but could you elaborate somewhere? Are there bug reports? If it's 4x the cost, and only 2x the width, I'm surprised that we'd get that wrong (assuming that's true for most of the instructions in the loop). I'm curious whether this is a deficiency in the register-pressure estimation heuristic in the vectorizer (which matters only for interleaving, but perhaps that's part of the problem?).

I'll create a proper bug for this, but an example is in some umin reduction code I'm working on: https://godbolt.org/g/Cb1Crb - there are as many subvector extract/insert ops as there are vpmin calls.

https://reviews.llvm.org/D39575