[llvm-dev] Aligned vector spills and variably sized stack frames

Fri Aug 28 16:23:38 PDT 2015

----- Original Message -----
> From: "Philip Reames via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Friday, August 28, 2015 6:00:50 PM
> Subject: [llvm-dev] Aligned vector spills and variably sized stack frames
> 
> I've run into a problem that I'm trying to figure out how to address and
> would welcome ideas and feedback.
> 
> Today, the vectorizer will nicely vectorize loops using the widest legal
> vector type for the target.  On a reasonably recent machine, this will
> often end up using AVX2 registers, which are 32 bytes wide.
> 
> If, during register allocation, we decide to spill one of these
> registers, we use the vmovaps instruction, which requires the memory
> address accessed to be 32 byte aligned.  So far, so good.
> 
> However, the C ABI generally only provides 16 bytes of alignment for the
> stack on entry to the function.  To work around this, the backend will
> create a variable sized frame with a dynamic amount of padding inserted
> if required to ensure that a 32 byte aligned spill slot is available.
> 
> The problem I have is that my runtime's ABI really doesn't like variably
> sized frames.  In particular, the assumption that stack frames are fixed
> size - except during prologue and epilogue - is fairly baked in.
> 
> I'm weighing a couple of options for addressing this and want to gather
> feedback on the perceived difficulty of each.  If someone has another
> approach, I'm also very open to that.
> 
> Option 1 - Fix my runtime to not expect mostly fixed size frames.  This
> isn't a small change to make, but given it's a strictly internal ABI, I
> can probably get away with doing it.  Given things like shrink-wrapping
> are coming down the pipe, it might also have secondary benefits.
> However, this is a relatively risky change to make for a fairly
> corner case.
> 
> Option 1a - I could change my ABI to use a 32 byte aligned frame.  This
> has many of the same problems as (1).
> 
> Option 2 - Don't compile things which need to spill vector registers.
> This is actually what we do today and has worked out fairly well in
> practice.  This is what I'm hoping to move away from.
> 
> Option 3 - Add an option in the x86 backend to not require aligned spill
> slots for AVX2 registers.  In particular, the VMOVUPS instruction can be
> used to spill vector registers into an 8 or 16 byte aligned spill slot
> and not require dynamic frame realignment.  This seems like it might be
> useful in other contexts as well, but I can't name any at the moment.
> 
> One thing that occurs to me is that many spills are down rare paths.
> Maybe it would make sense to only do dynamic alignment for hot
> spill/reloads?  We could then simply override the heuristic to always use
> unaligned spills.
> 
> I don't really have a sense for how hard (3) would be to implement.
> Anyone have an intuition?

I suspect that implementing this would not be too difficult. There are essentially two things that need to be changed:

 1. Change the code in X86InstrInfo::storeRegToStackSlot / X86InstrInfo::loadRegFromStackSlot to do the right thing for underaligned stack slots (or, in general, under the control of some target feature, option, etc.). Specifically, you need to change the code in those functions to pass false for the isStackAligned parameter of getStoreRegOpcode and getLoadRegOpcode.

 2. The alignment necessary for register spills is generically specified in the target's *RegisterInfo.td file (it's the third parameter of the RegisterClass TableGen type). You'd need to specify a way to override that based on some target feature, option, etc., if one does not already exist.
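
To make the interaction between those two changes concrete, here is a small standalone model of the decision. It is only a sketch and not LLVM code: SpillOpcode, SpillPlan, planVectorSpill, realignPadding and the forceUnaligned flag are all hypothetical names invented for illustration. It assumes the ABI guarantees some stack alignment on entry (16 bytes in the case discussed) and that the register class declares a 32-byte spill alignment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the aligned and unaligned 256-bit move forms
// (think vmovaps vs. vmovups). These are not LLVM opcode names.
enum class SpillOpcode { AlignedMove, UnalignedMove };

struct SpillPlan {
  SpillOpcode opcode;     // instruction form used for the spill and reload
  unsigned slotAlign;     // alignment requested for the spill slot, in bytes
  bool needsRealignment;  // true if the frame must be dynamically realigned
};

// Models the choice described above; this is not how LLVM spells it.
//   abiStackAlign      - stack alignment guaranteed on function entry
//   regClassSpillAlign - spill alignment declared for the register class
//   forceUnaligned     - hypothetical option: always use unaligned spills
inline SpillPlan planVectorSpill(unsigned abiStackAlign,
                                 unsigned regClassSpillAlign,
                                 bool forceUnaligned) {
  assert(abiStackAlign != 0 && (abiStackAlign & (abiStackAlign - 1)) == 0);
  assert(regClassSpillAlign != 0 &&
         (regClassSpillAlign & (regClassSpillAlign - 1)) == 0);

  if (forceUnaligned) {
    // Both changes together: cap the slot alignment at what the ABI already
    // guarantees, and pick the unaligned move so the access is still legal.
    unsigned slotAlign = std::min(abiStackAlign, regClassSpillAlign);
    return {SpillOpcode::UnalignedMove, slotAlign, false};
  }

  // Default behaviour: honour the register class alignment, realigning the
  // frame whenever the ABI guarantee is too weak.
  bool needsRealignment = regClassSpillAlign > abiStackAlign;
  return {SpillOpcode::AlignedMove, regClassSpillAlign, needsRealignment};
}

// Bytes of padding a prologue would insert to round a downward-growing
// stack pointer down to `align` (the "dynamic amount of padding").
inline std::uint64_t realignPadding(std::uint64_t sp, std::uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return sp & (align - 1);
}
```

In this model, planVectorSpill(16, 32, false) reports that realignment is needed, while planVectorSpill(16, 32, true) yields a 16-byte slot, the unaligned move, and a fixed-size frame. Changing only one of the two pieces would be a problem: an aligned move into a 16-byte slot could fault, and an unaligned move with a 32-byte slot would still force realignment.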

 -Hal

> 
> Philip
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory