[llvm-dev] Aligned vector spills and variably sized stack frames
Philip Reames via llvm-dev
llvm-dev at lists.llvm.org
Fri Aug 28 16:21:00 PDT 2015
On 08/28/2015 04:00 PM, Philip Reames via llvm-dev wrote:
> I've run into a problem that I'm trying to figure out how to address
> and would welcome ideas and feedback.
>
> Today, the vectorizer will nicely vectorize loops using the widest
> legal vector type for the target. On a reasonably recent machine,
> this will often end up using AVX2 registers which are 32 bytes wide.
>
> If, during register allocation, we decide to spill one of these
> registers, we use the vmovaps instruction, which requires the memory
> address it accesses to be 32 byte aligned. So far, so good.
>
> However, the C ABI generally only provides 16 bytes of alignment for
> the stack on entry to the function. To work around this, the backend
> will create a variable sized frame with a dynamic amount of padding
> inserted if required to ensure that a 32 byte aligned spill slot is
> available.
>
> The problem I have is that my runtime's ABI really doesn't like
> variably sized frames. In particular, the assumption that stack
> frames are fixed size - except during prolog and epilogue - is fairly
> baked in.
>
> I'm weighing a couple of options for addressing this and want to
> gather feedback on the perceived difficulty of each. If someone has
> another approach, I'm also very open to that.
>
> Option 1 - Fix my runtime to not expect mostly fixed size frames. This
> isn't a small change to make, but given it's a strictly internal ABI,
> I can probably get away with doing it. Given things like
> shrink-wrapping are coming down the pipe, it might also have secondary
> benefits. However, this is a relatively risky change to make for what
> is a fairly rare corner case.
>
> Option 1a - I could change my ABI to use a 32 byte aligned frame. This
> has many of the same problems as (1).
>
> Option 2 - Don't compile things which need to spill vector registers.
> This is actually what we do today and has worked out fairly well in
> practice. This is what I'm hoping to move away from.
>
> Option 3 - Add an option in the x86 backend to not require aligned
> spill slots for AVX2 registers. In particular, the VMOVUPS
> instruction can be used to spill vector registers into an 8 or 16 byte
> aligned spill slot and not require dynamic frame realignment. This
> seems like it might be useful in other contexts as well, but I can't
> name any at the moment.
>
> One thing that occurs to me is that many spills are down rare paths.
> Maybe it would make sense to only do dynamic alignment for hot
> spill/reloads? We could then simply override the heuristic to always
> use unaligned spills.
>
> I don't really have a sense for how hard (3) would be to implement.
> Anyone have an intuition?
After sending this, I did another search and promptly discovered the
existing "no-realign-stack" function attribute which seems to do exactly
what I need. Anyone know if this is robust?
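
For anyone skimming, here is a rough sketch in C of why realignment makes the frame size variable. It only models the arithmetic and is not what the backend actually emits; the helper names realign_down and realign_padding are made up for illustration. It assumes a downward-growing stack and a power-of-two alignment.

```c
#include <stdint.h>

/* Hypothetical helper: round a stack pointer value down to `align`,
 * which must be a power of two. This mirrors the effect of masking
 * the low bits of the stack pointer on a downward-growing stack. */
static uintptr_t realign_down(uintptr_t sp, uintptr_t align) {
    return sp & ~(align - 1);
}

/* Hypothetical helper: number of padding bytes that rounding down
 * introduces. This amount depends on the incoming stack pointer,
 * which is what makes the frame size variable. */
static uintptr_t realign_padding(uintptr_t sp, uintptr_t align) {
    return sp - realign_down(sp, align);
}
```

With an incoming stack pointer that is only guaranteed to be 16 byte aligned, realign_padding(sp, 32) comes out as either 0 or 16 depending on the caller, so the same function can end up with two different frame sizes at run time.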
>
> Philip