[llvm-commits] [PATCH] Use vld1/vst1 for unaligned load/store

Evan Cheng evan.cheng at apple.com
Mon Sep 17 11:16:35 PDT 2012


On Sep 17, 2012, at 11:11 AM, Evan Cheng <evan.cheng at apple.com> wrote:

> Hi David,
> 
> Thanks for working on this. This is a big omission that I was planning to look at. It's good you got to it first. Some comments though:
> 
>  bool ARMTargetLowering::allowsUnalignedMemoryAccesses(EVT VT) const {
> -  if (!Subtarget->allowsUnalignedMem())
> -    return false;
> +  // The AllowsUnaligned flag models the SCTLR.A setting in ARM cpus
> +  bool AllowsUnaligned = Subtarget->allowsUnalignedMem();
> 
>    switch (VT.getSimpleVT().SimpleTy) {
>    default:
> @@ -9034,10 +9034,15 @@ bool ARMTargetLowering::allowsUnalignedMemoryAccesses(EVT VT) const {
>    case MVT::i8:
>    case MVT::i16:
>    case MVT::i32:
> -    return true;
> +    // Unaligned access can use (for example) LDRB, LDRH, LDR
> +    return AllowsUnaligned;
>    case MVT::f64:
> -    return Subtarget->hasNEON();
> -  // FIXME: VLD1 etc with standard alignment is legal.
> +  case MVT::v2f64:
> +    // For any little-endian target with NEON, we can support unaligned ld/st
> +    // of D and Q (e.g. {D0,D1}) registers by using vld1.i8/vst1.i8.
> +    // A big-endian target may also explicitly support unaligned accesses
> +    return Subtarget->hasNEON() &&
> +           (getTargetData()->isLittleEndian() || AllowsUnaligned);
>    }
>  }
> 
> This part is not quite right:
> +    return Subtarget->hasNEON() &&
> +           (getTargetData()->isLittleEndian() || AllowsUnaligned);
> 
> vld1 / vst1 require alignment to the element size. If not, it's a fault unless SCTLR.A is 1. This should not return true for all little-endian cpus with NEON. It should still be controlled by the subtarget feature.

Nevermind. I see what you mean. Yes, it is always possible to use vld1.8 / vst1.8 for unaligned access on little endian machines.
> 
> I'll fix up your patch and commit it for you. Thanks.
> 
> Evan
> 
> 
> 
> On Sep 13, 2012, at 5:53 PM, David Peixotto <dpeixott at codeaurora.org> wrote:
> 
>> This patch is the result of a discussion of unaligned vector loads/stores on llvmdev: http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-September/053082.html.
>>  
>> The vld1 and vst1 variants in ARMv7 NEON only require memory
>> alignment to the element size of the vector. Because of this
>> property, we can use a vld1.8 and vst1.8 to load/store f64 and v2f64
>> vectors to unaligned addresses on little-endian targets. This should
>> be faster than the target-independent codegen lowering that does an
>> aligned load/store to the stack and unaligned load/store of each
>> element of the vector.
>>  
>> This patch includes two changes:
>>   1. Add new patterns for selecting vld1/vst1 for byte and half-word
>>      aligned vector stores for v2f64 vectors.
>>   2. Allow unaligned load/store using vld1/vst1 for little-endian
>>      arm targets that support NEON.  The vld1/vst1 instructions will
>>      be used to load/store f64 and v2f64 types aligned along byte
>>      and half-word memory accesses.
>>  
>> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
>>  
>>  
>> <0001-Use-vld1-vst1-for-unaligned-load-store.patch>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 
