[PATCH] D22840: [InstCombine] Split int64 store into separate int32 stores

Tue Jul 26 23:27:52 PDT 2016

On Tue, Jul 26, 2016 at 11:19 PM Xinliang David Li <davidxl at google.com>
wrote:

> On Tue, Jul 26, 2016 at 11:11 PM, Chandler Carruth <chandlerc at google.com>
> wrote:
>
>> On Tue, Jul 26, 2016 at 11:08 PM Xinliang David Li via llvm-commits <
>> llvm-commits at lists.llvm.org> wrote:
>>
>>> Hmm, why does Clang change the return type from from struct to i64?  Set
>>> aside whether it is always beneficial to do it or not, it certainly feels
>>> like the wrong place to make such a change.
>>>
>>
>> It's necessary to get the right ABI out of LLVM.
>>
>>
> what ABI issue and what can be wrong without using i64? I might have
> missed something obvious here.
>

The ABI specifies that a struct such as this is returned in a single 64-bit
integer register.

Consider this C++ test case:
long long a();

struct S { int i; float f; };
S b();

void f(long long *ptr1, S *ptr2) {
  *ptr1 = a();
  *ptr2 = b();
}

It compiles to:
        callq   _Z1av
        movq    %rax, (%rbx)
        callq   _Z1bv
        movq    %rax, (%r14)

Both need to be in the single register.

But if you consider the following IR:
declare i64 @a()
declare {i32, float} @b()

define void @f(i64* %ptr1, {i32, float}* %ptr2) {
entry:
  %calla = call i64 @a()
  store i64 %calla, i64* %ptr1
  %callb = call {i32, float} @b()
  store {i32, float} %callb, {i32, float}* %ptr2
  ret void
}

The LLVM lowering of the call to @b isn't what you want:
        callq   a
        movq    %rax, (%r14)
        callq   b
        movl    %eax, (%rbx)
        movss   %xmm0, 4(%rbx)

In order to get the ABI's semantics, we have to map the type into something
that will actually get lowered in that exact manner.

-Chandler

>
> David
>
>
>
>> Lot's of folks have hypothesized about how to separate the ABI concerns
>> from the LLVM IR representation but it has so far proved a really tricky
>> problem and its not clear there is a good solution.
>>
>>
>
>
>
>>
>>> David
>>>
>>> On Tue, Jul 26, 2016 at 10:40 PM, Wei Mi <wmi at google.com> wrote:
>>>
>>>> wmi added a comment.
>>>>
>>>> Matt, David and Chandler, thanks for the review.
>>>>
>>>> I only looked at the code for x86-64, powerpc and aarch64 targets. For
>>>> those targets, separate stores seemed better. I am not familiar with target
>>>> like AMDGPU. I guess wider store is generally preferred than multiple
>>>> narrower stores on AMDGPU? Since it may be undesirable on some targets, I
>>>> agree it is more appropriate to implement it in SDAG pass. I will update
>>>> the patch.
>>>>
>>>>
>>>> Repository:
>>>>   rL LLVM
>>>>
>>>> https://reviews.llvm.org/D22840
>>>>
>>>>
>>>>
>>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160727/8ad21ea4/attachment.html>