[PATCH] Add a "probe-stack" attribute

John Kåre Alsaker john.mailinglists at gmail.com
Mon Aug 18 12:40:29 PDT 2014


On Mon, Aug 18, 2014 at 8:31 PM, Philip Reames <listmail at philipreames.com>
wrote:

>
> On 08/15/2014 01:55 PM, John Kåre Alsaker wrote:
>
>  On Fri, Aug 15, 2014 at 6:47 PM, Philip Reames <listmail at philipreames.com
> > wrote:
>
>> Sorry for the delay.
>>
>> Patch 0002 LGTM.  Please submit
>> Patch 0003 LGTM.  Please submit.
>>
> Yes, please submit.
>
> Do you have commit access?  Or do you want me to submit on your behalf?
>
I'd like you to submit them.


>
>
>
>>
>> Patch 0001 LGTM, but doesn't do anything by itself.  Please hold it until
>> either 0004 or 0006 are ready to land.
>>
>> Patch 0004
>> - It seems like you're strill mixing several functional changes in 0004.
>> Can you enumerate (in email) what this patch is trying to accomplish?  The
>> trivial implementation I expected is to add MF.shouldProbeStack in a couple
>> places and call it done.  Why is the rest needed?
>>
> It's just generalizing the code for Win64 stack probes to x86-32 and
> x86-64. The __probestack calling convention on both x86-32 and x86-64 is
> the same as __chkstk on Win64.
>
> Why?  Is there a good reason now to use the x86-32 chkstk convention on
> x86-32 probestack?  What's the motivation for being different?
>
I'm being consistent across x86 while MSVC is not.


>    In addition it doesn't assume RAX never is live in.
>
> Why?  Is this a bug fix?
>
That depends on if it ever can be live in. There were already logic for EAX
which I generalized.

>   It also disables the redzone (which never is used on Windows) since the
> probing code doesn't account for that.
>
> Ok.  I see the value in this.
>
> This sounds like at least three distinct changes. Please separate them and
> we'll review each in turn.
>
> Sorry for being so pedantic about separation here.  I'm trying to review
> code I don't really know well.  As a result, I'm trying to be extremely
> carefully about making sure I understand each piece.  If you can find
> another reviewer who knows this code better, I'd be happy to step aside.
>
>
>
>>
>> Patch 0005
>> - 0x5000 - Please write this as " 5 * PageSize".  I won't ask you to make
>> page size configurable right now, but it should be clear to someone who
>> might want to extend it what this is coming from.  Also, *why* 5?
>>
> 5 is what GCC uses before turning the code into a loop.
>
> Document that motivation please.
>
>
>
>> - "0x1000" should be replaced with "PageSize" to make it more
>> understandable what's happening here.
>>
> Ok
>
>
>> - I don't think your addressing is quite right.  If RSP before adjustment
>> was exactly on a page boundary, wouldn't you miss touching the first page
>> entirely?
>>
> No, we'd write to the first word of the page.
>
> Assume the original RSP was page aligned (after pushing the RetPC),
> NumBytes is exactly one page.  Your code would compute an offset which is
> exactly one page off from the new RSP.  This would be in the same page as
> the old RSP, not the new one.  In theory, you could then push another frame
> and miss touching the page.
>
> 0x1000 - ret PC, touch location
> 0x2000 - new RSP
>
> I may be wrong here, but if so, why?
>

The call instruction wrote to 0x4000...0x4008, so the page 0x4000...0x5000
is written to.
On function entry:
rsp = 0x4000

We allocate a page.
rsp -= 0x1000
rsp = 0x3000
We touch [rsp + 0] (NumBytes - (0 + 1) * PageSize is 0). 0x3000...0x3008 is
then written to, so the page 0x3000...0x4000 is also written to.


>
>
>
>> - To make the loop more readable, I might restructure it as: for(int
>> Offset = NumBytes - PageSize + 1; Offset >= 0; Offset -= PageSize) { touch
>> RSP+Offset }
>>
> I like the clarity on the amount of probes done.
>
>
>> - After reading up a bit about what chkstk does internally on windows
>> (i.e. internal probe stack), the optimization seems clearly correct.  It
>> would be nice to explain that in a comment.  Don't assume everyone who
>> reads the code knows how chkstk manages stack expansion.
>>
> Sure
>
>> - With these changes, LGTM.   I believe this is independent of 0004 right?
>>
> It is.
>
>
>>
>> Patch 0006 LGTM.
>>
>> Patch 0007 LGTM, but you should get someone who's familiar with the ARM
>> backend to review.
>>
>> Patch 0008 I don't understand.  Why is this correct?
>
> chkstk_ms on Cygwin/MinGW is the same as chkstk on MSVC, so we can reuse
> the code for the call to the MSVC version.
>
>
>>
>
>
>>
>> Philip
>>
>>
>> On 08/07/2014 02:23 AM, John Kåre Alsaker wrote:
>>
>>> This is a continuation of http://reviews.llvm.org/D4717
>>>
>>> It seems Phabricator is useless with multiple patches.
>>>
>>> I've fixed the points raised in D4717 and separated out whitespaces
>>> changes to make review easier.
>>>
>>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140818/a96d4f8e/attachment.html>


More information about the llvm-commits mailing list