[PATCH][X86] Improve the lowering of ISD::BITCAST from MVT::f64 to MVT::v4i16 / MVT::v8i8.

Thu May 22 09:30:40 PDT 2014

Hi Nadav,

On Thu, May 22, 2014 at 4:55 PM, Nadav Rotem <nrotem at apple.com> wrote:
> Hi Andrea,
>
> Sorry for the delay in code review. I did not review the patch very carefully, but it looks good and you should commit it.

No problem at all! :-)
Thanks for the feedback.

This patch is now committed at revision 209451.

Cheers,
Andrea

>
> Thanks,
> Nadav
>
> On May 22, 2014, at 8:38 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
>
>> ping x2.
>>
>>
>> On Thu, May 15, 2014 at 3:46 PM, Andrea Di Biagio
>> <andrea.dibiagio at gmail.com> wrote:
>>> ping.
>>>
>>> Thanks,
>>> Andrea
>>>
>>> On Thu, May 8, 2014 at 5:00 PM, Andrea Di Biagio
>>> <andrea.dibiagio at gmail.com> wrote:
>>>> Hi,
>>>>
>>>> This patch teaches the x86 backend how to efficiently lower
>>>> ISD::BITCAST dag nodes from MVT::f64 to MVT::v4i16 (and vice versa)
>>>> and from MVT::f64 to MVT::v8i8 (and vice versa).
>>>>
>>>> This improves the patch committed at revision 208107
>>>> (http://llvm.org/viewvc/llvm-project?view=revision&revision=208107 ).
>>>> Revision 208107 taught the backend how to efficiently lower a bitcast
>>>> dag node from f64 to v2i32 without introducing the redundant
>>>> store+reload sequence to bitconvert f64 to i64.
>>>>
>>>> This patch expands the logic from revision 208107 to also handle
>>>> MVT::v4i16 and MVT::v8i8. Also, this patch correctly propagates Undef
>>>> values when performing the widening of a vector (example: when
>>>> widening from v2i32 to v4i32, the upper 64 bits of the resulting vector
>>>> are 'undef').
>>>>
>>>> I had to modify test ret-mmx.ll because this new patch correctly
>>>> propagates undef values in the resulting vector when widening from
>>>> v2i32 to v4i32.
>>>> The effect is that now function @t4 in 'test/CodeGen/X86/ret-mmx.ll'
>>>> produces the sequence:
>>>>
>>>>  movl $1, %eax
>>>>  movd %eax, %xmm0
>>>>
>>>> rather than:
>>>>
>>>>  movsd .LCPI3_0(%rip), %xmm0
>>>>
>>>> With
>>>> .LCPI3_0:
>>>>    .long 1
>>>>    .long 0
>>>>    .long 1
>>>>    .long 1
>>>>
>>>> For consistency, I moved all the test cases from test
>>>> 'lower-bitcast-v2i32.ll' to a new test file called
>>>> 'CodeGen/X86/lower-bitcast.ll'. This new test adds extra test cases to
>>>> verify that we don't emit redundant stack store+reload of f64 values.
>>>>
>>>>
>>>> Please let me know if ok to submit.
>>>>
>>>> Thanks,
>>>> Andrea Di Biagio
>> <patch-lower-bitcast.diff>
>
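
For readers of the archive: the bitcasts discussed in the quoted patch description (f64 to v4i16, f64 to v8i8, and back) only reinterpret the same 64 bits as a different type. The sketch below models that semantics in plain C++. It is not code from the patch and says nothing about how the X86 backend lowers the node; the function names are hypothetical and chosen only for illustration.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helpers that model what a bitcast means: the 64 bits of a
// double are copied unchanged into a vector-like container of narrower lanes.

// f64 -> v4i16: four 16-bit lanes.
std::array<std::uint16_t, 4> bitcast_f64_to_v4i16(double d) {
    static_assert(sizeof(double) == 4 * sizeof(std::uint16_t),
                  "this sketch assumes a 64-bit double");
    std::array<std::uint16_t, 4> lanes;
    std::memcpy(lanes.data(), &d, sizeof d);  // bits are copied, not converted
    return lanes;
}

// v4i16 -> f64: the reverse direction.
double bitcast_v4i16_to_f64(const std::array<std::uint16_t, 4>& lanes) {
    double d;
    std::memcpy(&d, lanes.data(), sizeof d);
    return d;
}

// f64 -> v8i8: eight 8-bit lanes.
std::array<std::uint8_t, 8> bitcast_f64_to_v8i8(double d) {
    static_assert(sizeof(double) == 8 * sizeof(std::uint8_t),
                  "this sketch assumes a 64-bit double");
    std::array<std::uint8_t, 8> lanes;
    std::memcpy(lanes.data(), &d, sizeof d);
    return lanes;
}
```

Which lane holds which bits depends on the byte order of the host; on a little-endian x86 machine the sign and exponent bits of the double end up in the highest-numbered lane. Casting to lanes and back returns the original value bit for bit.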