[PATCH] D22840: [InstCombine] Split int64 store into separate int32 stores

Wed Jul 27 00:15:41 PDT 2016

On Tue, Jul 26, 2016 at 11:27 PM, Chandler Carruth <chandlerc at google.com>
wrote:

> On Tue, Jul 26, 2016 at 11:19 PM Xinliang David Li <davidxl at google.com>
> wrote:
>
>> On Tue, Jul 26, 2016 at 11:11 PM, Chandler Carruth <chandlerc at google.com>
>> wrote:
>>
>>> On Tue, Jul 26, 2016 at 11:08 PM Xinliang David Li via llvm-commits <
>>> llvm-commits at lists.llvm.org> wrote:
>>>
>>>> Hmm, why does Clang change the return type from from struct to i64?
>>>> Set aside whether it is always beneficial to do it or not, it certainly
>>>> feels like the wrong place to make such a change.
>>>>
>>>
>>> It's necessary to get the right ABI out of LLVM.
>>>
>>>
>> what ABI issue and what can be wrong without using i64? I might have
>> missed something obvious here.
>>
>
> The ABI specifies that a struct such as this is returned in a single
> 64-bit integer register.
>

yes, at least for system V ABI for x86_64.

>
> Consider this C++ test case:
> long long a();
>
> struct S { int i; float f; };
> S b();
>
> void f(long long *ptr1, S *ptr2) {
>   *ptr1 = a();
>   *ptr2 = b();
> }
>
> It compiles to:
>         callq   _Z1av
>         movq    %rax, (%rbx)
>         callq   _Z1bv
>         movq    %rax, (%r14)
>
> Both need to be in the single register.
>
> But if you consider the following IR:
> declare i64 @a()
> declare {i32, float} @b()
>
> define void @f(i64* %ptr1, {i32, float}* %ptr2) {
> entry:
>   %calla = call i64 @a()
>   store i64 %calla, i64* %ptr1
>   %callb = call {i32, float} @b()
>   store {i32, float} %callb, {i32, float}* %ptr2
>   ret void
> }
>
> The LLVM lowering of the call to @b isn't what you want:
>         callq   a
>         movq    %rax, (%r14)
>         callq   b
>         movl    %eax, (%rbx)
>         movss   %xmm0, 4(%rbx)
>
> In order to get the ABI's semantics, we have to map the type into
> something that will actually get lowered in that exact manner.
>

Still, the FE does not seem to be the right place to fix the bug
(considering multiple different FEs targeting LLVM). It seems to me this
bug is either due to the call return lowering either happens too early or
needs to be target dependent.

David

>
> -Chandler
>
>
>>
>> David
>>
>>
>>
>>> Lot's of folks have hypothesized about how to separate the ABI concerns
>>> from the LLVM IR representation but it has so far proved a really tricky
>>> problem and its not clear there is a good solution.
>>>
>>>
>>
>>
>>
>>>
>>>> David
>>>>
>>>> On Tue, Jul 26, 2016 at 10:40 PM, Wei Mi <wmi at google.com> wrote:
>>>>
>>>>> wmi added a comment.
>>>>>
>>>>> Matt, David and Chandler, thanks for the review.
>>>>>
>>>>> I only looked at the code for x86-64, powerpc and aarch64 targets. For
>>>>> those targets, separate stores seemed better. I am not familiar with target
>>>>> like AMDGPU. I guess wider store is generally preferred than multiple
>>>>> narrower stores on AMDGPU? Since it may be undesirable on some targets, I
>>>>> agree it is more appropriate to implement it in SDAG pass. I will update
>>>>> the patch.
>>>>>
>>>>>
>>>>> Repository:
>>>>>   rL LLVM
>>>>>
>>>>> https://reviews.llvm.org/D22840
>>>>>
>>>>>
>>>>>
>>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>
>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160727/00bfb0fb/attachment.html>