[llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
Daniel Sanders via llvm-dev
llvm-dev at lists.llvm.org
Fri Nov 17 13:38:22 PST 2017
It seems we've hit the list's 100KB size limit. Re-sending my email with the quotes trimmed.
> On 17 Nov 2017, at 10:34, Daniel Sanders <daniel_l_sanders at apple.com> wrote:
>
> Does the MIR for this one have a G_BITCAST from one vector type to another? It sounds like we haven't implemented the fact that bitcasts are sometimes shuffles on big-endian.
>
>> On 17 Nov 2017, at 06:56, Oliver Stannard via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi Quentin,
>>
>> It seems that we also get the calling convention wrong for vector types on big-endian:
>> #include <arm_neon.h>
>> int32x2_t load_vector(int32x2_t *p) {
>>   return *p;
>> }
>>
>> Global-isel generates this:
>> // armclang --target=aarch64-arm-none-eabi -march=armv8-a -c callees.cpp -O0 -Wall -std=c++11 -mllvm -global-isel=true -mllvm -global-isel-abort=0 -mbig-endian -o - -S
>> _Z11load_vectorP11__Int32x2_t: // @_Z11load_vectorP11__Int32x2_t
>> // BB#0: // %entry
>>   sub sp, sp, #16 // =16
>>   str x0, [sp, #8]
>>   ldr x0, [sp, #8]
>>   ld1 { v0.2s }, [x0]
>>   add sp, sp, #16 // =16
>>   ret
>>
>> With global-isel off, there is a rev64 instruction between the ld1 and the add, which fixes up the endianness of the vector.
>>
>> Oliver
>>
>> From: Oliver Stannard
>> Sent: 17 November 2017 13:32
>> To: 'qcolombet at apple.com'
>> Cc: llvm-dev at lists.llvm.org; nd; Kristof Beyls
>> Subject: RE: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
>>
>> Hi Quentin,
>>
>> At Kristof’s suggestion, I tried running our ABI test suite for a big-endian AArch64 target, and this found an ABI mismatch between global-isel and regular -O0. Here’s a reproducer for the first one I’ve investigated:
>>
>> struct foo {
>>   float first;
>>   float second;
>> };
>> float get_first(foo p) {
>>   return p.first;
>> }
>>
>> This is the code that global-isel currently generates:
>> // /work/llvm/build/bin/clang --target=aarch64--none-eabi -march=armv8-a -c callees.cpp -O0 -mllvm -global-isel=true -mllvm -global-isel-abort=0 -mbig-endian -o - -S
>>
>> _Z9get_first3foo: // @_Z9get_first3foo
>> // BB#0: // %entry
>>   sub sp, sp, #16 // =16
>>   // implicit-def: %X8
>>   fmov w9, s0
>>   mov w10, w9
>>   bfxil x8, x10, #0, #32
>>   fmov w9, s1
>>   mov w10, w9
>>   bfi x8, x10, #32, #32
>>   add x10, sp, #8 // =8
>>   str x8, [sp, #8]
>>   ldr w9, [x10]
>>   fmov s0, w9
>>   add sp, sp, #16 // =16
>>   ret
>>
>> When run on a big-endian target, this incorrectly returns the second member of the struct, instead of the first.
>>
>> Oliver
>>
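For anyone following along, here is a rough sketch of why the get_first reproducer above returns the wrong member on big-endian. It models the quoted sequence (bfxil/bfi into x8, str of the 64-bit value, then a 32-bit ldr from the same address) on a plain byte array with an explicit byte order. The helper names (store64, load32, getFirstAsGenerated) are made up for illustration and are not LLVM APIs; the model is an assumption based only on the assembly shown in the quote.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: store a 64-bit value into a byte array using the
// requested byte order (models "str x8, [sp, #8]").
static void store64(uint8_t *mem, uint64_t v, bool bigEndian) {
  for (int i = 0; i < 8; ++i) {
    int shift = bigEndian ? 8 * (7 - i) : 8 * i;
    mem[i] = static_cast<uint8_t>(v >> shift);
  }
}

// Hypothetical helper: load 32 bits from the lowest address of the slot
// using the requested byte order (models "ldr w9, [x10]").
static uint32_t load32(const uint8_t *mem, bool bigEndian) {
  uint32_t v = 0;
  for (int i = 0; i < 4; ++i) {
    int shift = bigEndian ? 8 * (3 - i) : 8 * i;
    v |= static_cast<uint32_t>(mem[i]) << shift;
  }
  return v;
}

static uint32_t bitsOf(float f) {
  uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

static float floatOf(uint32_t u) {
  float f;
  std::memcpy(&f, &u, sizeof f);
  return f;
}

// Models the quoted code: 'first' goes into bits 0..31 of x8 and 'second'
// into bits 32..63, regardless of endianness.
float getFirstAsGenerated(float first, float second, bool bigEndian) {
  uint64_t x8 = 0;
  x8 |= static_cast<uint64_t>(bitsOf(first));         // bfxil x8, x10, #0, #32
  x8 |= static_cast<uint64_t>(bitsOf(second)) << 32;  // bfi   x8, x10, #32, #32
  uint8_t slot[8];
  store64(slot, x8, bigEndian);                       // str x8, [sp, #8]
  return floatOf(load32(slot, bigEndian));            // ldr w9, [x10]
}
```

In this model, getFirstAsGenerated(1.0f, 2.0f, false) yields 1.0f, while getFirstAsGenerated(1.0f, 2.0f, true) yields 2.0f: on big-endian the lowest address of the stored 64-bit value holds bits 32..63, which is where 'second' was inserted.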