[PATCH][x86] Teach how to combine a vselect into a movss/movsd.
Juergen Ributzka
juergen at apple.com
Fri Jan 17 11:43:49 PST 2014
Hi Andrea,
the patch LGTM (but I am not the code owner).
+ SDValue NewLHS = InvertOperands ? RHS : LHS;
+ SDValue NewRHS = InvertOperands ? LHS : RHS;
^ std::swap would be nicer here
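
To illustrate the suggestion, here is a minimal standalone sketch. `Operand` and `OperandPair` are hypothetical stand-ins for `SDValue` so that the snippet compiles without LLVM; only `InvertOperands`, `LHS`, `RHS`, `NewLHS` and `NewRHS` come from the patch. It assumes `LHS` and `RHS` are local copies that may be modified in place.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for SDValue; not an LLVM type.
struct Operand {
  int Id;
};

// Hypothetical helper type used only to return both operands.
struct OperandPair {
  Operand LHS;
  Operand RHS;
};

// Style used in the patch: two ternaries pick the new operand order.
OperandPair orderWithTernaries(Operand LHS, Operand RHS, bool InvertOperands) {
  Operand NewLHS = InvertOperands ? RHS : LHS;
  Operand NewRHS = InvertOperands ? LHS : RHS;
  return {NewLHS, NewRHS};
}

// Suggested style: swap the operands in place when they must be inverted.
OperandPair orderWithSwap(Operand LHS, Operand RHS, bool InvertOperands) {
  if (InvertOperands)
    std::swap(LHS, RHS);
  return {LHS, RHS};
}
```

Both functions produce the same operand order; the swap version states the intent once instead of repeating the condition.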
Cheers,
Juergen
On Jan 16, 2014, at 5:42 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
> Hi,
>
> this patch teaches the x86 backend how to combine vselect dag nodes
> into movss/movsd when possible.
>
> If the vector type of the operands of the vselect is either
> MVT::v4i32 or MVT::v4f32, then we can fold according to the following rules:
>
> 1. fold (vselect (build_vector (0, -1, -1, -1)), A, B) -> (movss A, B)
> 2. fold (vselect (build_vector (-1, 0, 0, 0)), A, B) -> (movss B, A)
>
> If the vector type of the operands of the vselect is either
> MVT::v2i64 or MVT::v2f64 (and we have SSE2), then we can fold
> according to the following rules:
>
> 3. fold (vselect (build_vector (0, -1)), A, B) -> (movsd A, B)
> 4. fold (vselect (build_vector (-1, 0)), A, B) -> (movsd B, A)
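
The four rules above can be checked lane by lane with a small standalone model. This is only a sketch: `vselect` and `moveLow` below are hypothetical helpers over `std::array`, not LLVM APIs. It assumes the usual semantics, namely that vselect takes a lane from the first value operand where the mask lane is all-ones (-1) and from the second where it is 0, and that the movss/movsd node takes lane 0 from its second operand and the remaining lanes from its first.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Model of vselect: lane i is A[i] when Mask[i] is all-ones (-1), else B[i].
template <typename T, std::size_t N>
std::array<T, N> vselect(const std::array<T, N> &Mask,
                         const std::array<T, N> &A,
                         const std::array<T, N> &B) {
  std::array<T, N> R{};
  for (std::size_t I = 0; I < N; ++I)
    R[I] = (Mask[I] == T(-1)) ? A[I] : B[I];
  return R;
}

// Model of the movss/movsd node: lane 0 comes from the second operand,
// all other lanes come from the first operand.
template <typename T, std::size_t N>
std::array<T, N> moveLow(const std::array<T, N> &A,
                         const std::array<T, N> &B) {
  std::array<T, N> R = A;
  R[0] = B[0];
  return R;
}

using V4 = std::array<std::int32_t, 4>; // models v4i32
using V2 = std::array<std::int64_t, 2>; // models v2i64

// Rules 1 and 2: four-lane case (movss).
bool movssRulesHold(const V4 &A, const V4 &B) {
  return vselect(V4{0, -1, -1, -1}, A, B) == moveLow(A, B) &&
         vselect(V4{-1, 0, 0, 0}, A, B) == moveLow(B, A);
}

// Rules 3 and 4: two-lane case (movsd).
bool movsdRulesHold(const V2 &A, const V2 &B) {
  return vselect(V2{0, -1}, A, B) == moveLow(A, B) &&
         vselect(V2{-1, 0}, A, B) == moveLow(B, A);
}
```

The model uses integer lanes only for simplicity; the same lane selection applies to the v4f32 and v2f64 cases.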
>
> I added extra test cases to file 'test/CodeGen/X86/vselect.ll' in
> order to verify that we correctly select movss/movsd instructions.
>
> Before this change, the backend only knew how to lower a shufflevector
> into an X86Movss/X86Movsd, but not how to do the same with vselect dag
> nodes.
> For that reason, all the ISel patterns introduced at r197145
> http://llvm.org/viewvc/llvm-project?view=revision&revision=197145
> were only matched if the X86Movss/X86Movsd were obtained from the
> custom lowering of a shufflevector.
>
> With this change, the backend is now able to combine vselect into
> X86Movss/X86Movsd, and therefore it can reuse the patterns from
> revision 197145 to further simplify packed vector arithmetic operations.
>
> I added new test-cases in 'test/CodeGen/X86/sse-scalar-fp-arith-2.ll'
> to verify that now we correctly select SSE/AVX scalar fp instructions
> from a packed arithmetic instruction followed by a vselect.
>
> After this change, the following tests started failing because they
> always expected blendvps/blendvpd instructions in the output assembly:
> test/CodeGen/X86/sse2-blend.ll
> test/CodeGen/X86/avx-blend.ll
> test/CodeGen/X86/blend-msb.ll
> test/CodeGen/X86/sse41-blend.ll
>
> Now that the backend knows how to efficiently emit movss/movsd, all
> of these failures are expected: it no longer selects only
> blendvps/blendvpd for these cases.
>
> I modified those failing tests so that, where possible, the generated
> assembly still contains the expected blendvps/blendvpd (see for example
> how I changed avx-blend.ll).
> In all other cases I just changed the CHECK lines to verify that we
> produce a movss/movsd.
>
> Please let me know if this is OK to submit.
>
> Thanks,
> Andrea Di Biagio
> SN Systems - Sony Computer Entertainment Group.
> <patch-vselect.diff>