[cfe-dev] Default stack alignment for x86 changed

Smith, Kevin B kevin.b.smith at intel.com
Thu Jan 15 12:01:15 PST 2015


> It's interesting that it's such a performance issue though

I don't think it really is much of a performance issue, except perhaps on Quark.  All recent IA32 processors handle unaligned accesses at effectively the same
speed as aligned accesses unless the access crosses a cache-line boundary.  And since alignment within structs is 8 bytes, an object that is created dynamically will often be "well aligned" as well, because the memory allocators usually return suitably aligned memory.
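
To make the cache-line condition concrete, here is a small sketch of the check.  The 64 byte line size and the name crosses_cache_line are my assumptions for illustration; the real line size depends on the processor.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache-line size for illustration only; it varies by processor. */
#define CACHE_LINE_BYTES 64u

/* Returns true when an access of `size` bytes starting at `addr`
   touches more than one cache line.  `size` must be nonzero. */
bool crosses_cache_line(uintptr_t addr, size_t size) {
    uintptr_t first_line = addr / CACHE_LINE_BYTES;
    uintptr_t last_line = (addr + size - 1) / CACHE_LINE_BYTES;
    return first_line != last_line;
}
```

Under that assumption, a double at an address that is only 4 byte aligned stays within one line in most positions; only the placement that starts 4 bytes before the end of a line splits the access.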

So, that leaves potential penalties for things allocated on the stack.  Again, if the compiler thinks there is a performance reason for doing so, it can use extra
instructions in the prolog and epilog of a routine to ensure a higher than minimum stack alignment.
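
As a rough sketch of what that looks like from the source side, a C11 local can request more alignment than the ABI minimum.  The function name aligned_local_ok is made up for this example, and whether the compiler actually emits realignment code (typically something like "and esp, -16" in the prolog) depends on the target and options, so treat that part as an assumption.

```c
#include <stdalign.h>
#include <stdint.h>

/* Returns 1 if a local that requests 16-byte alignment really lands on
   a 16-byte boundary.  On a target whose incoming stack alignment is
   lower than 16, the compiler has to arrange this itself. */
int aligned_local_ok(void) {
    alignas(16) double slot[2];
    return ((uintptr_t)slot % 16u) == 0;
}
```

The check should hold regardless of the incoming stack alignment; what changes between targets is how much prolog and epilog code is needed to make it hold.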

But this discussion was about the ABIs, and what is guaranteed.  The ABI for IA32 windows only guarantees 4 byte alignment of the stack
upon entry to a function.  The ABI (that gcc is assuming) for IA32 linux guarantees 16 byte alignment, but that can be controlled by the
-mpreferred-stack-boundary option described in my earlier message below.  For example, the linux kernel is built with gcc using the 4 byte alignment version of that option.
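
To illustrate what a 4 byte entry guarantee means in practice, the crud/crud1 example quoted further down can be modeled with plain arithmetic on esp.  This is only a sketch: the function names are invented here, and the byte counts are read off the VS 2012 assembly in that quoted message.

```c
#include <stdint.h>

/* Model of esp at entry to crud1, given esp at entry to crud. */
uint32_t esp_at_crud1_entry(uint32_t esp_at_crud_entry) {
    uint32_t esp = esp_at_crud_entry;
    esp -= 8;  /* sub esp, 8  : space for dummy        */
    esp -= 4;  /* push eax    : second argument (&dummy) */
    esp -= 4;  /* push 0      : first argument          */
    esp -= 4;  /* call _crud1 : return address          */
    return esp;
}

/* Model of the address of dummy: _dummy$[esp+8] with _dummy$ = -8,
   evaluated after the sub, is simply the new esp. */
uint32_t dummy_address(uint32_t esp_at_crud_entry) {
    return esp_at_crud_entry - 8;
}
```

With an 8 byte aligned esp at entry to crud, the model gives an esp at entry to crud1 that is 4 modulo 8.  With an entry esp that is only 4 byte aligned, dummy itself sits at an address that is 4 modulo 8, so the double * passed to crud1 is not 8 byte aligned.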

Kevin

From: John Sully [mailto:john at csquare.ca]
Sent: Thursday, January 15, 2015 11:52 AM
To: mats petersson
Cc: Smith, Kevin B; cfe-dev at cs.uiuc.edu
Subject: Re: [cfe-dev] Default stack alignment for x86 changed

Clang is really no different than MSVC here (I just double checked).  For SSE you always had to specify the alignment required because it was never guaranteed by the compiler (especially when you get into mandatory 16-byte alignment).  It's interesting that it's such a performance issue though; unless you're really memory constrained, it seems the size/speed trade-off is clearly in favour of 8 byte alignment even though it's not technically necessary.

On Thu, Jan 15, 2015 at 11:16 AM, mats petersson <mats at planetcatfish.com> wrote:
To be clear: double does not REQUIRE 8 byte alignment, but on
(reasonably modern, like "Pentium onwards", so ca 1994-5 ish) x86
processors would "prefer" 8-byte alignment for "double" values, since
they can then be read in ONE cycle on a 64-bit bus.

And of course, SSE instructions that aren't specifically designed for
unaligned loads will require a 16-byte alignment. Or does SSE code
automatically modify the alignment criteria for the function?

Further, shouldn't the stack be aligned to "LargestAlignment" or
whatever it is called? Otherwise, any structure alignment will surely
be "lost"?

--
Mats

On 15 January 2015 at 18:38, Smith, Kevin B <kevin.b.smith at intel.com> wrote:
> Although alignof(double) on windows returns 8, the actual minimum stack alignment is still 4.  Here is a source example illustrating
> this.
>
> #include <stdlib.h>
>
> int a = __alignof(double);
>
> extern void crud1(int i, double *p);
>
> void crud(void) {
>   double dummy;
>   crud1(0, &dummy);
> }
>
> Assembly code produced from VS 2012, compiling with cl -Fa -c -O2 crud.c
> _DATA   SEGMENT
> _a      DD      08H
> _DATA   ENDS
> PUBLIC  _crud
> EXTRN   _crud1:PROC
> ; Function compile flags: /Ogtpy
> ;       COMDAT _crud
> _TEXT   SEGMENT
> _dummy$ = -8
> _crud   PROC
> ; File d:\users\kbsmith1\tc_tmp1\crud.c
> ; Line 7
>         sub     esp, 8
> ; Line 9
>         lea     eax, DWORD PTR _dummy$[esp+8]
>         push    eax
>         push    0
>         call    _crud1
> ; Line 10
>         add     esp, 16
>         ret     0
>
> You can see that __alignof(double) produced 8 by the initialization value of a.  You can also see that there is no code at the beginning of function crud to
> align the stack.  So, if it comes in on a 4 byte boundary, it will remain on a 4 byte boundary, and since it subs 8 from esp, if it comes in on an 8 byte boundary
> it will stay on an 8 byte boundary.  Now consider the call to crud1.  This pushes two parameters, and then the call pushes the return address.  So, if the stack
> comes in 8 byte aligned, at the entry to crud1, the stack is now only 4 byte aligned.
>
> For this reason, in windows, although __alignof(double) is 8, it doesn't follow that the value of every double * must be such that the pointer value is 8 byte aligned.
>
> Also, for IA32 on linux, a 4 byte minimum stack alignment used to be specified by the Sys V ABI, which is pretty much the only one you can find references to on the web.  However, for quite a number of years, gcc's default on linux has been to ensure 16 byte stack alignment at function entry, so that every function that uses SSE/SSE2 instructions (and might possibly need to spill) doesn't have to perform dynamic stack alignment.  In gcc this is controlled by the -mpreferred-stack-boundary=num option.  https://gcc.gnu.org/onlinedocs/gcc-3.2/gcc/i386-and-x86-64-Options.html says the default for this option is 4, implying 16 byte (2^4) stack alignment.
>
> Kevin Smith
>
> -----Original Message-----
> From: cfe-dev-bounces at cs.uiuc.edu [mailto:cfe-dev-bounces at cs.uiuc.edu] On Behalf Of palparni
> Sent: Thursday, January 15, 2015 9:34 AM
> To: cfe-dev at cs.uiuc.edu
> Subject: Re: [cfe-dev] Default stack alignment for x86 changed
>
> I understand, so the change was made with Unix-based systems in mind.
> Unfortunately the win32 x86 ABI seems to require doubles to be 64-bit
> aligned. Could we perhaps keep the 8-byte alignment only for win32 targets?
>
> Thanks,
> Alpar
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev