[PATCH] D27692: [x86] use a single shufps when it can save instructions
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 13 14:30:24 PST 2016
RKSimon added a comment.
I'd like to propose the following:
1 - we get this patch and https://reviews.llvm.org/D27684 approved and committed, providing v4i32 lowering to shufps and avoiding some of the more unnecessary domain switches.
2 - get shufps lowering added to target shuffle combining, I added shufpd recently and it's just been the domain issues that I wanted to tidyup up before adding shufps as well
3 - add support for v8i32 (and v16i32?) lowering to shufps
4 - other missing domain switch patterns (scalar stores and vpermilps/vpshufd come to mind)
5 - add support for domain switching to target shuffle combine when the shuffle depth is 3 or more - this will allow pshufd use on pre-AVX targets and seems to introduce some good uses of insertps as well.
That seems within scope for 4.0 and doesn't involve anything too exotic. After 4.0 we should be in a better position to begin work on moving some of this work to MC combines to better make use of specific scheduler models
https://reviews.llvm.org/D27692
More information about the llvm-commits
mailing list