[llvm-dev] Aligned vector spills and variably sized stack frames
Hal Finkel via llvm-dev
llvm-dev at lists.llvm.org
Fri Aug 28 16:29:28 PDT 2015
----- Original Message -----
> From: "Philip Reames via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Friday, August 28, 2015 6:21:00 PM
> Subject: Re: [llvm-dev] Aligned vector spills and variably sized stack frames
> On 08/28/2015 04:00 PM, Philip Reames via llvm-dev wrote:
> > I've run into a problem that I'm trying to figure out how to
> > address
> > and would welcome ideas and feedback.
> > Today, the vectorizer will nicely vectorize loops using the widest
> > legal vector type for the target. On a reasonably recent machine,
> > this will often end up using AVX2 registers which are 32 bytes
> > wide.
> > If during register allocation, we decide to spill one of these
> > registers, we use the vmovaps instruction which requires the
> > address
> > in memory accessed to be 32 byte aligned. So far, so good.
> > However, the C ABI generally only provides 16 bytes of alignment
> > for
> > the stack on entry to the function. To work around this, the
> > backend
> > will create a variable sized frame with a dynamic amount of padding
> > inserted if required to ensure that a 32 byte aligned spill slot is
> > available.
> > The problem I have is that my runtime's ABI really doesn't like
> > variably sized frames. In particular, the assumption that stack
> > frames are fixed size - except during prolog and epilogue - is
> > fairly
> > baked in.
> > I'm weighing a couple of options for addressing this and want to
> > gather feedback on the perceived difficulty of each. If someone
> > has
> > another approach, I'm also very open to that.
> > Option 1 - Fix my runtime to not expect mostly fixed size frames.
> > This
> > isn't a small change to make, but given it's a strictly internal
> > ABI,
> > I can probably get away with doing it. Given things like
> > shrink-wrapping are coming down the pipe, it might also have
> > secondary
> > benefits. However, this is a relatively risky change to make for
> > what is largely a corner case.
> > Option 1a - I could change my ABI to use a 32 byte aligned frame.
> > This
> > has many of the same problems as (1).
> > Option 2 - Don't compile things which need to spill vector
> > registers.
> > This is actually what we do today and has worked out fairly well in
> > practice. This is what I'm hoping to move away from.
> > Option 3 - Add an option in the x86 backend to not require aligned
> > spill slots for AVX2 registers. In particular, the VMOVUPS
> > instruction can be used to spill vector registers into an 8 or 16
> > byte
> > aligned spill slot and not require dynamic frame realignment. This
> > seems like it might be useful in other contexts as well, but I can't
> > name any at the moment.
> > One thing that occurs to me is that many spills are down rare
> > paths.
> > Maybe it would make sense to only do dynamic alignment for hot
> > spill/reloads? We could then simply override the heuristic to always
> > use unaligned spills.
> > I don't really have a sense for how hard (3) would be to implement.
> > Anyone have an intuition?
> After sending this, I did another search and promptly discovered the
> existing "no-realign-stack" function attribute which seems to do
> what I need. Anyone know if this is robust?
I believe this works correctly, but it is not a targeted fix for the AVX spilling problem. ;) I can certainly imagine such a targeted feature being generally desirable. Specifically, with this attribute all overaligned locals will simply fail to be overaligned (and, thus, the resulting code will likely be broken). In your case, I imagine you can simply promise never to create such things, and you'll be fine.
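
For illustration only, here is a minimal sketch in C of the arithmetic behind the dynamic realignment discussed above. It assumes a downward-growing stack and a power-of-two alignment; the helper names (realign_padding, realign_sp) are hypothetical and are not anything in LLVM. It shows why the frame ends up variably sized: the padding depends on the stack pointer's value at entry, which the ABI only guarantees to be 16-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not LLVM code: number of padding bytes that must be
 * skipped below `sp` (stack grows down) to reach an address aligned to
 * `align`, where `align` is a power of two. */
static uintptr_t realign_padding(uintptr_t sp, uintptr_t align) {
    assert(align != 0 && (align & (align - 1)) == 0); /* power of two */
    return sp & (align - 1);
}

/* Hypothetical helper: the realigned stack pointer, i.e. `sp` rounded down
 * to the nearest multiple of `align`. */
static uintptr_t realign_sp(uintptr_t sp, uintptr_t align) {
    return sp - realign_padding(sp, align);
}
```

With a 16-byte-aligned entry stack pointer and a 32-byte requirement, realign_padding returns either 0 or 16 depending on the incoming value, so two calls to the same function can have frames of different sizes. A spill slot that only needs 16-byte (or 8-byte) alignment, as with an unaligned VMOVUPS store, would need no such padding.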
> > Philip
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory