[PATCH] D27692: [x86] use a single shufps when it can save instructions
Andrea Di Biagio via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Dec 14 04:41:14 PST 2016
andreadb added a comment.
In https://reviews.llvm.org/D27692#621550, @RKSimon wrote:
> I'd like to propose the following:
> 1 - we get this patch and https://reviews.llvm.org/D27684 approved and committed, providing v4i32 lowering to shufps and avoiding some of the more unnecessary domain switches.
> 2 - get shufps lowering added to target shuffle combining; I added shufpd recently, and it's just been the domain issues that I wanted to tidy up before adding shufps as well
> 3 - add support for v8i32 (and v16i32?) lowering to shufps
> 4 - other missing domain switch patterns (scalar stores and vpermilps/vpshufd come to mind)
> 5 - add support for domain switching to target shuffle combine when the shuffle depth is 3 or more - this will allow pshufd use on pre-AVX targets and seems to introduce some good uses of insertps as well.
> That seems within scope for 4.0 and doesn't involve anything too exotic. After 4.0 we should be in a better position to begin moving some of this work to MC combines to make better use of specific scheduler models.
FWIW, this sounds like a very good plan to me too.