[PATCH][X86] Fix for a poor code generation bug affecting addss/mulss and other SSE scalar fp arithmetic instructions
Jim Grosbach
grosbach at apple.com
Fri Dec 6 15:52:25 PST 2013
Would it not work to just recognize the insert directly in the patterns and avoid the DAGcombine entirely? I’d prefer to avoid adding extra target ISD nodes if we can help it. They inhibit optimization (other DAGcombines) and make other codegen patterns harder to generalize.
-Jim
On Dec 6, 2013, at 1:40 PM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
> Hi,
>
> This patch fixes a poor code generation bug affecting SSE scalar fp
> instructions like addss/mulss.
> The problem was originally reported here:
> http://comments.gmane.org/gmane.comp.compilers.llvm.devel/68542
>
> At the moment, the x86 backend tends to emit unnecessary vector insert
> instructions immediately after SSE scalar fp instructions.
>
> Example:
> /////////////////////////////////
> #include <xmmintrin.h>
>
> __m128 foo(__m128 A, __m128 B) {
>   A[0] += B[0];
>   return A;
> }
> /////////////////////////////////
>
> produces the following sequence:
> addss %xmm0, %xmm1
> movss %xmm1, %xmm0
>
> Instead of:
> addss %xmm1, %xmm0
>
> This patch addresses the problem at the ISel stage by introducing a
> target-specific combine rule to fold patterns like this one:
>
> a0 : f32 = extract_vector_elt ( A, 0)
> b0 : f32 = extract_vector_elt ( B, 0)
> r0 : f32 = fadd a0, b0
> result : v4f32 = insert_vector_elt ( A, r0, 0 )
>
> into a single 'addss' instruction.
>
> Please let me know what you think.
>
> Thanks,
> Andrea Di Biagio
> SN Systems - Sony Computer Entertainment Group
> <patch.diff>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
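For readers following the thread, the extract / fadd / insert pattern quoted above has the same semantics as a single addss: lane 0 becomes A[0] + B[0] and lanes 1..3 are taken unchanged from A. The sketch below is not part of the patch. It assumes an x86 target with SSE and a compiler that supports GCC/Clang vector subscripting, and the helper names (add_lane0_subscript, add_lane0_intrinsic, same_bits, check_fold_equivalence) are made up for illustration. It only checks that the two forms compute the same value; whether a given compiler emits one addss for the first form depends on the compiler version.

```c
#include <assert.h>
#include <string.h>
#include <xmmintrin.h>  /* SSE: __m128, _mm_add_ss, _mm_set_ps */

/* The pattern from the report: extract lane 0 of each vector, add the
   scalars, insert the sum back into lane 0 of A.
   Vector subscripting is a GCC/Clang extension. */
static __m128 add_lane0_subscript(__m128 A, __m128 B) {
    A[0] += B[0];
    return A;
}

/* The same operation written with the intrinsic that corresponds to addss:
   lane 0 = A[0] + B[0], lanes 1..3 copied from A. */
static __m128 add_lane0_intrinsic(__m128 A, __m128 B) {
    return _mm_add_ss(A, B);
}

/* Bitwise comparison of all four lanes. */
static int same_bits(__m128 x, __m128 y) {
    return memcmp(&x, &y, sizeof x) == 0;
}

/* Hypothetical self-check (illustration only, not from the patch). */
static void check_fold_equivalence(void) {
    /* _mm_set_ps takes lanes in the order 3, 2, 1, 0. */
    __m128 A = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);      /* lanes 0..3 = 1, 2, 3, 4 */
    __m128 B = _mm_set_ps(40.0f, 30.0f, 20.0f, 10.0f);  /* lanes 0..3 = 10, 20, 30, 40 */
    __m128 expected = _mm_set_ps(4.0f, 3.0f, 2.0f, 11.0f);

    assert(same_bits(add_lane0_subscript(A, B), expected));
    assert(same_bits(add_lane0_intrinsic(A, B), expected));
}
```

Calling check_fold_equivalence() from a test driver exercises both forms. Comparing the assembly for the two functions (for example with -O2 -S) shows whether the redundant movss is still emitted for the subscript form.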