[PATCH] [X86, SSE] instcombine common cases of insertps intrinsics into shuffles

Simon Pilgrim llvm-dev at redking.me.uk
Mon Apr 6 07:09:18 PDT 2015


In http://reviews.llvm.org/D8833#151845, @spatel wrote:

> In http://reviews.llvm.org/D8833#151824, @RKSimon wrote:
>
> > This is looking pretty good. If the reason you haven't used the zmask more is to avoid the need for multiple shuffle stages, is it worthwhile checking whether the zmask (only) overrides the insertion destination, or cases where the 2 operands point to the same variable?
>
>
> I thought about the case where the zmask overrides the insert, but I figured that was pretty far-fetched. I didn't consider the case where both inputs are the same. Let me know if you think those are worth chasing as stand-alone cases or if it's better to just solve the zmask case in general in the backend. Even if the oddball cases are worthy, I'd prefer to solve them in a follow-on patch just for the sake of patch minimalism.


I have ended up using both in downcoding, mainly due to register pressure problems:

- zeroing out lanes using insertps as we don't have a spare register for the xorps/blendps pattern
- reuse of the first operand - shuffle X00X

It's the second case that I think would be particularly useful here, but I'd be happy to see any improvements in a follow-up patch.
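
For reference, here is a rough software model of the cases discussed above. It is only a sketch: the `insertps` helper and the immediates are illustrative choices of mine, not taken from the patch, and the lane layout for "X00X" is one possible reading (lanes 0 and 3 taken from the same register, lanes 1 and 2 zeroed). The imm8 decoding follows the documented INSERTPS register form: bits 7:6 select the source lane, bits 5:4 select the destination lane, bits 3:0 are the zmask.

```python
def insertps(a, b, imm8):
    """Hypothetical software model of SSE4.1 INSERTPS (register form).

    a and b are 4-element lists standing in for the two xmm operands.
    """
    count_s = (imm8 >> 6) & 0x3   # lane of b to read
    count_d = (imm8 >> 4) & 0x3   # lane of the result to overwrite
    zmask = imm8 & 0xF            # bit i set => lane i is forced to 0.0
    result = list(a)
    result[count_d] = b[count_s]
    # The zmask is applied after the insertion, so it can override it.
    return [0.0 if (zmask >> i) & 1 else result[i] for i in range(4)]


a = [1.0, 2.0, 3.0, 4.0]
b = [5.0, 6.0, 7.0, 8.0]

# Zeroing lanes with no spare zero register: the insert copies lane 0
# onto itself, and the zmask clears lanes 1 and 2.
zeroed = insertps(a, a, 0x06)

# Reuse of the first operand: lane 1 of a goes to lane 3, lanes 1 and 2
# are zeroed, giving an "X00X"-style result built from a single input.
reused = insertps(a, a, 0x76)

# zmask overriding the insertion: lane 2 is both the destination and
# zeroed, so the second operand has no effect on the result.
overridden = insertps(a, b, 0x24)
```

In the last case the result does not depend on the second operand at all, which is why it could be expressed as a single shuffle of the first operand with a zero vector.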


http://reviews.llvm.org/D8833







More information about the llvm-commits mailing list