[PATCH][X86] Improve the lowering of ISD::BITCAST from MVT::f64 to MVT::v4i16 / MVT::v8i8.

Thu May 22 08:55:34 PDT 2014

Hi Andrea, 

Sorry for the delay in code review. I did not review the patch very carefully, but it looks good and you should commit it.

Thanks,
Nadav

On May 22, 2014, at 8:38 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:

> ping x2.
> 
> 
> On Thu, May 15, 2014 at 3:46 PM, Andrea Di Biagio
> <andrea.dibiagio at gmail.com> wrote:
>> ping.
>> 
>> Thanks,
>> Andrea
>> 
>> On Thu, May 8, 2014 at 5:00 PM, Andrea Di Biagio
>> <andrea.dibiagio at gmail.com> wrote:
>>> Hi,
>>> 
>>> This patch teaches the x86 backend how to efficiently lower
>>> ISD::BITCAST dag nodes from MVT::f64 to MVT::v4i16 (and vice versa)
>>> and from MVT::f64 to MVT::v8i8 (and vice versa).
>>> 
>>> This improves the patch committed at revision 208107
>>> (http://llvm.org/viewvc/llvm-project?view=revision&revision=208107 ).
>>> Revision 208107 taught the backend how to efficiently lower a bitcast
>>> dag node from f64 to v2i32 without introducing the redundant
>>> store+reload sequence to bitconvert f64 to i64.
>>> 
>>> This patch expands the logic from revision 208107 to also handle
>>> MVT::v4i16 and MVT::v8i8. Also, this patch correctly propagates Undef
>>> values when performing the widening of a vector (example: when
>>> widening from v2i32 to v4i32, the upper 64 bits of the resulting vector
>>> are 'undef').
>>> 
>>> I had to modify test ret-mmx.ll because this new patch correctly
>>> propagates undef values in the resulting vector when widening from
>>> v2i32 to v4i32.
>>> The effect is that now function @t4 in 'test/CodeGen/X86/ret-mmx.ll'
>>> produces the sequence:
>>> 
>>>  movl $1, %eax
>>>  movd %eax, %xmm0
>>> 
>>> rather than:
>>> 
>>>  movsd .LCPI3_0(%rip), %xmm0
>>> 
>>> With
>>> .LCPI3_0:
>>>    .long 1
>>>    .long 0
>>>    .long 1
>>>    .long 1
>>> 
>>> For consistency, I moved all the test cases from test
>>> 'lower-bitcast-v2i32.ll' to a new test file called
>>> 'CodeGen/X86/lower-bitcast.ll'. This new test adds extra test cases to
>>> verify that we don't emit redundant stack store+reload of f64 values.
>>> 
>>> 
>>> Please let me know if it is ok to submit.
>>> 
>>> Thanks,
>>> Andrea Di Biagio
> <patch-lower-bitcast.diff>
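
For readers unfamiliar with what a bitcast between f64 and v4i16 means at the value level, here is a minimal C++ sketch. It is not taken from the patch and does not use any LLVM API; the helper names bitcast_f64_to_v4i16 and bitcast_v4i16_to_f64 are hypothetical. It only illustrates that the 64 bits of the double are reinterpreted as four 16-bit lanes without changing any bit, which is why a store+reload through the stack is redundant when the value can stay in a register. The sketch assumes a 64-bit double.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; not part of the patch or LLVM.

// Reinterpret the 64 bits of a double as four 16-bit lanes (f64 -> v4i16).
// No bit is changed; only the type used to view the bits differs.
std::array<std::uint16_t, 4> bitcast_f64_to_v4i16(double value) {
    static_assert(sizeof(double) == 8, "sketch assumes a 64-bit double");
    std::array<std::uint16_t, 4> lanes{};
    std::memcpy(lanes.data(), &value, sizeof(value));
    return lanes;
}

// The reverse direction (v4i16 -> f64): same 64 bits, viewed as a double.
double bitcast_v4i16_to_f64(const std::array<std::uint16_t, 4>& lanes) {
    double value = 0.0;
    std::memcpy(&value, lanes.data(), sizeof(value));
    return value;
}
```

Casting a double to lanes and back with these helpers returns the original bit pattern. The order of the lanes within the 64 bits depends on the target's byte order, so the sketch makes no claim about which lane holds which bits.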