[PATCH] D31960: [InstSimplify] fold identity shuffles (recursing if needed)

Filipe Cabecinhas via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 19 06:53:00 PDT 2017


filcab accepted this revision.
filcab added a comment.

Changes LGTM. Thanks!
(Sorry for not replying earlier)



================
Comment at: lib/Analysis/InstructionSimplify.cpp:4098
+  if (MaskVal == -1)
+    return nullptr;
+
----------------
spatel wrote:
> filcab wrote:
> > But we might end up with shuffles we'd like to remove if we always skip shuffles with undef, no?
> > Do you want to add a shuffle similar to your `identity_mask` tests which shows that we don't remove if there's an undef (assuming it's as easy as making one of the last shuffle mask's elements an undef. No need to go out of your way for a negative test)?
> > 
> Yes, this is conservative as a first step, so it will miss some patterns. I wanted to make sure that I have my undef story straight before enhancing this (eg, D31980). 
> 
> Also, what is the expected behavior in this case?
> 
>   define <4 x float> @elts_test_vpermilvar_ps(<4 x float> %a0, i32 %a1) {
>   ; CHECK-LABEL: @elts_test_vpermilvar_ps(
>   ; CHECK-NEXT:    ret <4 x float> %a0
>   ;
>   %1 = insertelement <4 x i32> <i32 0, i32 1, i32 2, i32 3>, i32 %a1, i32 3
>   %2 = tail call <4 x float> @llvm.x86.avx.vpermilvar.ps(<4 x float> %a0, <4 x i32> %1)
>   %3 = shufflevector <4 x float> %2, <4 x float> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 undef>
>   ret <4 x float> %3
>   }
> 
> If we simplify the shuffle to %2, we can no longer eliminate the x86 vperm?
Being in IR, I'd guess one way is to add target hooks like we have on SDAG to ask the target "is this a shuffle-like operation? If so, what's the mask?".

It will not be fun, though :(

Dealing with `undef` has a few edge cases, so being conservative here and getting that sorted out in a different patch sounds good.


https://reviews.llvm.org/D31960





More information about the llvm-commits mailing list