[PATCH] D39575: [X86] Add subtarget features prefer-avx256 and prefer-avx128 and use them to limit vector width presented by TTI
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 7 05:45:29 PST 2017
RKSimon added a comment.
In https://reviews.llvm.org/D39575#916093, @hfinkel wrote:
> Not to get too far off topic, but could you elaborate somewhere? Are there bug reports? If it's 4x the cost, and only 2x the width, I'm surprised that we'd get that wrong (assuming that's true for most of the instructions in the loop). I'm curious whether this is a deficiency with the register-pressure estimation heuristic in the vectorizer (which matters only for interleaving, but perhaps that's part of the problem?).
I'll create a proper bug for this, but an example is in some umin reduction code I'm working on: https://godbolt.org/g/Cb1Crb - the generated code has as many subvector extract/insert ops as there are vpmin instructions.
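For anyone reading without the link open, a umin reduction has roughly the shape below. This is only a hypothetical sketch to show the pattern, not the code behind the godbolt link: the function name `umin_reduce`, the 16-bit element type, and the loop form are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical illustration only: an unsigned-min (umin) reduction over
// 16-bit elements. The name and element type are assumptions, not taken
// from the linked example.
uint16_t umin_reduce(const uint16_t *data, std::size_t n) {
    // Start from the largest representable value so any element can lower it.
    uint16_t result = std::numeric_limits<uint16_t>::max();
    for (std::size_t i = 0; i < n; ++i) {
        // Unsigned compare-and-select; this is the umin pattern a
        // vectorizer can turn into packed unsigned-min instructions.
        result = data[i] < result ? data[i] : result;
    }
    return result;
}
```

If a loop like this is vectorized at a width wider than the target handles natively, each min can end up surrounded by subvector extract/insert ops, which is the kind of overhead described above.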
https://reviews.llvm.org/D39575