[llvm-dev] Aligned vector spills and variably sized stack frames

Fri Aug 28 16:21:00 PDT 2015


On 08/28/2015 04:00 PM, Philip Reames via llvm-dev wrote:
> I've run into a problem that I'm trying to figure out how to address 
> and would welcome ideas and feedback.
>
> Today, the vectorizer will nicely vectorize loops using the widest 
> legal vector type for the target.  On a reasonably recent machine,
> this will often end up using AVX2 registers which are 32 bytes wide.
>
> If during register allocation, we decide to spill one of these 
> registers, we use the vmovaps instruction which requires the address 
> in memory accessed to be 32 byte aligned.  So far, so good.
>
> However, the C ABI generally only provides 16 bytes of alignment for 
> the stack on entry to the function.  To work around this, the backend 
> will create a variable sized frame with a dynamic amount of padding 
> inserted if required to ensure that a 32 byte aligned spill slot is 
> available.
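
To make the "dynamic amount of padding" concrete, here is a minimal sketch of the address arithmetic only. The helper names are hypothetical, and it ignores the return address and any saved registers. It shows why the padding, and therefore the frame size, depends on the incoming stack pointer:

```c
#include <stdint.h>

/* Round an address down to a power-of-two alignment (the stack grows down). */
static uintptr_t align_down(uintptr_t addr, uintptr_t align) {
    return addr & ~(align - 1);
}

/* Bytes of padding a realigning prologue would have to insert to reach
   `align` from an incoming stack pointer `sp`.  Hypothetical helper,
   for illustration only; not taken from the LLVM backend. */
static uintptr_t realign_padding(uintptr_t sp, uintptr_t align) {
    return sp - align_down(sp, align);
}
```

With a 16 byte aligned incoming stack pointer, realign_padding(sp, 32) is either 0 or 16 depending on the caller, so the frame size is not known at compile time.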
>
> The problem I have is that my runtime's ABI really doesn't like 
> variably sized frames.  In particular, the assumption that stack 
> frames are fixed size - except during prolog and epilogue - is fairly 
> baked in.
>
> I'm weighing a couple of options for addressing this and want to 
> gather feedback on the perceived difficulty of each.  If someone has 
> another approach, I'm also very open to that.
>
> Option 1 - Fix my runtime to not expect mostly fixed size frames. This 
> isn't a small change to make, but given it's a strictly internal ABI, 
> I can probably get away with doing it.  Given things like 
> shrink-wrapping are coming down the pipe, it might also have secondary 
> benefits.  However, this is a relatively risky change to make for
> what is a fairly narrow corner case.
>
> Option 1a - I could change my ABI to use a 32 byte aligned frame. This 
> has many of the same problems as (1).
>
> Option 2 - Don't compile things which need to spill vector registers.  
> This is actually what we do today and has worked out fairly well in 
> practice.  This is what I'm hoping to move away from.
>
> Option 3 - Add an option in the x86 backend to not require aligned 
> spill slots for AVX2 registers.  In particular, the VMOVUPS 
> instruction can be used to spill vector registers into an 8 or 16 byte 
> aligned spill slot and not require dynamic frame realignment. This 
> seems like it might be useful in other contexts as well, but I can't
> name any at the moment.
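
As a rough model of option 3, the sketch below spills and reloads a 32 byte value through a slot that is only 16 byte aligned. It is portable C, not backend code: the `vec256` type, the `frame` buffer and the function names are all assumptions made for illustration, and `memcpy` stands in for an unaligned move because it carries no alignment requirement on its arguments.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for a 32 byte vector register value. */
typedef struct { float lane[8]; } vec256;

/* The backing storage is 32 byte aligned so that (frame + 16) is
   deliberately 16 byte aligned but NOT 32 byte aligned. */
static _Alignas(32) unsigned char frame[64];

/* Fixed-offset spill slot, as in a fixed size frame. */
static unsigned char *spill_slot(void) { return frame + 16; }

/* memcpy makes no alignment assumption, so these are safe on the
   under-aligned slot; an aligned 32 byte store here would not be. */
static void spill(const vec256 *v) { memcpy(spill_slot(), v, sizeof *v); }
static void reload(vec256 *v)      { memcpy(v, spill_slot(), sizeof *v); }
```

The slot sits at a fixed offset, so no realignment and no variable padding is needed; the cost is whatever penalty the hardware charges for the unaligned access.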
>
> One thing that occurs to me is that many spills are down rare paths.  
> Maybe it would make sense to only do dynamic alignment for hot 
> spill/reloads?  We could then simply override the heuristic to always
> use unaligned spills.
>
> I don't really have a sense for how hard (3) would be to implement. 
> Anyone have an intuition?
After sending this, I did another search and promptly discovered the 
existing "no-realign-stack" function attribute which seems to do exactly 
what I need.  Anyone know if this is robust?
>
> Philip