[llvm-commits] [PATCH] Use vld1/vst1 for unaligned load/store
Evan Cheng
evan.cheng at apple.com
Mon Sep 17 11:16:35 PDT 2012
On Sep 17, 2012, at 11:11 AM, Evan Cheng <evan.cheng at apple.com> wrote:
> Hi David,
>
> Thanks for working on this. This is a big omission that I was planning to look at. It's good you got to it first. Some comments though:
>
> bool ARMTargetLowering::allowsUnalignedMemoryAccesses(EVT VT) const {
> - if (!Subtarget->allowsUnalignedMem())
> - return false;
> + // The AllowsUnaliged flag models the SCTLR.A setting in ARM cpus
> + bool AllowsUnaligned = Subtarget->allowsUnalignedMem();
>
> switch (VT.getSimpleVT().SimpleTy) {
> default:
> @@ -9034,10 +9034,15 @@ bool ARMTargetLowering::allowsUnalignedMemoryAccesses(EVT VT) const {
> case MVT::i8:
> case MVT::i16:
> case MVT::i32:
> - return true;
> + // Unaligned access can use (for example) LRDB, LRDH, LDR
> + return AllowsUnaligned;
> case MVT::f64:
> - return Subtarget->hasNEON();
> - // FIXME: VLD1 etc with standard alignment is legal.
> + case MVT::v2f64:
> + // For any little-endian targets with neon, we can support unaligned ld/st
> + // of D and Q (e.g. {D0,D1}) registers by using vld1.i8/vst1.i8.
> + // A big-endian target may also explictly support unaligned accesses
> + return Subtarget->hasNEON() &&
> + (getTargetData()->isLittleEndian() || AllowsUnaligned);
> }
> }
>
> This part is not quite right:
> + return Subtarget->hasNEON() &&
> + (getTargetData()->isLittleEndian() || AllowsUnaligned);
>
> vld1 / vst1 requires alignment of element size. If not, then it's a fault unless SCTLR.A is 1. This should not require true for all little endian cpus with NEON. It should still be controlled by the subtarget feature.
Nevermind. I see what you mean. Yes, it is always possible to use vld1.8 / vst1.8 for unaligned access on little endian machines.
>
> I'll fix up your patch and commit it for you. Thanks.
>
> Evan
>
>
>
> On Sep 13, 2012, at 5:53 PM, David Peixotto <dpeixott at codeaurora.org> wrote:
>
>> This patch is the result of a discussion of unaligned vector loads/store on llvmdev: http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-September/053082.html.
>>
>> The vld1 and vst1 variants in armv7 neon only require memory
>> alignment to the element size of the vector. Because of this
>> property, we can use a vld1.8 and vst1.8 to load/store f64 and v2f64
>> vectors to unaligned addresses on little-endian targets. This should
>> be faster than the target-independent codegen lowering that does an
>> aligned load/store to the stack and unaligned load/store of each
>> element of the vector.
>>
>> This patch includes two changes:
>> 1. Add new patterns for selecting vld1/vst1 for byte and half-word
>> aligned vector stores for v2f64 vectors.
>> 2. Allow unaligned load/store using vld1/vst1 for little-endian
>> arm targets that support NEON. The vld1/vst1 instructions will
>> be used to load/store f64 and v2f64 types aligned along byte
>> and half-word memory accesses.
>>
>> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
>>
>>
>> <0001-Use-vld1-vst1-for-unaligned-load-store.patch>_______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120917/e1c6e22d/attachment.html>
More information about the llvm-commits
mailing list