[Patch] [AArch64] Implement allowsUnalignedMemoryAccesses()
Zhaoshi
zhaoshiz at codeaurora.org
Wed Apr 9 11:27:56 PDT 2014
Jiangning,
I believe "ld1/st1 + rev" or "ldr" on big-endian should be controlled by
pattern matching and addressed by a separate patch... Can you point me to
the review thread you mentioned?
I looked into the ARM64 backend. All vector types are promoted to v2i64/v2i32
there, and the pattern matcher emits ldr/str instructions.
My assumption is that Linux is configured to support unaligned accesses by
default. Do you mean I should add -aarch64-no-strict-align to the command
line in my test? Can you elaborate a little on "the ld1/st1 CHECK
should be guaranteed for both little-endian and big-endian"?
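If that is what you mean, I suppose the RUN lines would end up looking roughly
like this (just a sketch, using the option name as it appears in this thread
and assuming the little-endian triple aarch64-none-linux-gnu for the first line):

; RUN: llc < %s -mtriple=aarch64-none-linux-gnu -mattr=+neon -aarch64-no-strict-align -o - | FileCheck %s
; RUN: llc < %s -mtriple=aarch64_be-none-linux-gnu -mattr=+neon -aarch64-no-strict-align -o - | FileCheck %s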
Thanks,
Zhaoshi
From: Jiangning Liu [mailto:liujiangning1 at gmail.com]
Sent: Tuesday, April 08, 2014 23:36
To: Zhaoshi
Cc: llvm-commits at cs.uiuc.edu for LLVM
Subject: Re: [Patch] [AArch64] Implement allowsUnalignedMemoryAccesses()
Hi Zhaoshi,
One more thing...
Currently the pattern match only generates ld1/st1, which is inefficient for
big-endian, and I think it is going to be changed to ldr/str soon by another
big-endian support patch. You may have noticed there is another patch review
thread related to big-endian. I would prefer that your patch stays with ld1/st1
and leaves the fix to that other patch, because that work is all about big-endian.
Performance-wise, you can hack the following patterns in AArch64InstrNEON.td in
advance to verify ldr/str.
def : Pat<(v2f64 (load GPR64xsp:$addr)), (LD1_2D GPR64xsp:$addr)>;
def : Pat<(v2i64 (load GPR64xsp:$addr)), (LD1_2D GPR64xsp:$addr)>;
def : Pat<(v4f32 (load GPR64xsp:$addr)), (LD1_4S GPR64xsp:$addr)>;
def : Pat<(v4i32 (load GPR64xsp:$addr)), (LD1_4S GPR64xsp:$addr)>;
def : Pat<(v8i16 (load GPR64xsp:$addr)), (LD1_8H GPR64xsp:$addr)>;
def : Pat<(v16i8 (load GPR64xsp:$addr)), (LD1_16B GPR64xsp:$addr)>;
def : Pat<(v1f64 (load GPR64xsp:$addr)), (LD1_1D GPR64xsp:$addr)>;
def : Pat<(v1i64 (load GPR64xsp:$addr)), (LD1_1D GPR64xsp:$addr)>;
def : Pat<(v2f32 (load GPR64xsp:$addr)), (LD1_2S GPR64xsp:$addr)>;
def : Pat<(v2i32 (load GPR64xsp:$addr)), (LD1_2S GPR64xsp:$addr)>;
def : Pat<(v4i16 (load GPR64xsp:$addr)), (LD1_4H GPR64xsp:$addr)>;
def : Pat<(v8i8 (load GPR64xsp:$addr)), (LD1_8B GPR64xsp:$addr)>;
This is just in case there is a performance impact after fully enabling
big-endian on AArch64.
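For reference, those patterns match plain vector loads like the one below (an
illustrative IR snippet only, not taken from your patch):

define <2 x i64> @load_v2i64(<2 x i64>* %p) {
  ; The (v2i64 (load GPR64xsp:$addr)) pattern above selects LD1_2D for this load.
  %v = load <2 x i64>* %p
  ret <2 x i64> %v
}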
Thanks,
-Jiangning
2014-04-09 12:45 GMT+08:00 Jiangning Liu <liujiangning1 at gmail.com>:
Hi Zhaoshi,
For both little-endian and big-endian, we need to generate instructions that
behave as ldr/str would. For little-endian, ldr/str is the same as ld1/st1, while for
big-endian they have different behaviours. So if we generate ld1/st1 for
big-endian, we also have to generate a "rev" following them to account for the
reversed in-register layout.
Your code implies that Linux should always be configured in non-strict-alignment
mode for the hardware, so for big-endian the choice between "ldr/str"
and "ld1/st1 + rev" will depend on a cost estimate. Therefore, for your test
case unaligned-vector-ld1-st1.ll:
+; RUN: llc < %s -mtriple=aarch64_be-none-linux-gnu -mattr=+neon -o - | FileCheck %s
It would be better if you could add -aarch64-strict-align to the command line,
so that the ld1/st1 CHECK is guaranteed for both little-endian and
big-endian. Also, I don't see tests covering the three opts you newly added.
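Roughly, I mean something like the following (just a sketch of the shape I have
in mind; the function name and the exact CHECK pattern are placeholders):

; RUN: llc < %s -mtriple=aarch64-none-linux-gnu -mattr=+neon -aarch64-strict-align -o - | FileCheck %s
; RUN: llc < %s -mtriple=aarch64_be-none-linux-gnu -mattr=+neon -aarch64-strict-align -o - | FileCheck %s

define <4 x i32> @load_v4i32(<4 x i32>* %p) {
; CHECK-LABEL: load_v4i32:
; CHECK: ld1
  %v = load <4 x i32>* %p, align 1
  ret <4 x i32> %v
}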
Thanks,
-Jiangning
2014-04-08 9:24 GMT+08:00 <zhaoshiz at codeaurora.org>:
Hi,
This patch should enable unaligned accesses of vector types on AArch64.
Please help review.
Thanks,
Zhaoshi