[stackprotector] Add the llvm.stackprotectorcheck intrinsic

Eli Friedman eli.friedman at gmail.com
Tue Jul 23 16:14:11 PDT 2013


On Tue, Jul 23, 2013 at 3:39 PM, Michael Gottesman <mgottesman at apple.com> wrote:
> The attached patch adds the llvm.stackprotectorcheck intrinsic. First I am going to describe the problem the intrinsic is meant to solve and how it fits into said problem's final solution. I am fine with any comments about the general algorithm being in response to this.
>
> *NOTE* I was purposely trying to not rewrite the stack protector pass itself.
> *NOTE* The following analysis ignores OSes which do not support the normal LLVM stack protector check implementation (i.e. OpenBSD). In such cases, the normal stack protector check pass will still be used.
>
> ==========
> The Problem
> ==========
>
> For those unfamiliar, a sibling call is a specific type of safe tail call optimization which LLVM automatically recognizes and performs. Currently the stack protector pass disrupts sibling calls, because we insert the stack protector check before calls are lowered from LLVM IR to SelectionDAG nodes, which is where we decide whether a sibling call optimization is safe. Thus if originally we had the following IR:
>
> ----
> %x = tail call i8* @fun()
> ret %x
> ----
>
> The stack protector pass will modify said code like so:
> ----
> %x = tail call i8* @fun()
> %guard = load …
> %stackslot = load …
> %cmp = icmp eq i8* %guard, %stackslot
> br i1 %cmp, label %success, label %fail
>
> success:
>   ret %x
>
> fail:
>   call void @__stack_chk_fail()
>   unreachable
> ----
>
> Since “fun” is no longer in the “tail position”, no sibling call can be generated when we lower the LLVM IR call to a SelectionDAG node. Notice that if we could determine that the sibling call would actually occur without the stack protector, then, since we are reusing the stack, it would be safe to commute the call to “fun” and the stack protector check, i.e.:
>
> ----
> %guard = load …
> %stackslot = load …
> %cmp = icmp eq i8* %guard, %stackslot
> br i1 %cmp, label %success, label %fail
>
> success:
>   %x = tail call i8* @fun()
>   ret %x
>
> fail:
>   call void @__stack_chk_fail()
>   unreachable
> ----
>
> giving us the tail call *AND* the protection of the stack protector. Sadly, we cannot tell at the IR level whether “fun” will actually be tail called, since on certain platforms the sibling call decision requires target-dependent knowledge of which registers are in use (see X86ISelLowering::IsEligibleForTailCallOptimization). Thus a different scheme is needed.
>
> ==========
> The Solution
> ==========
>
> The key realization that occurred to me is that at the MI level tail calls are represented as a form of special return statement. If we could delay the expansion of the stack protector check, we could just allow for normal CodeGen to occur and always insert the stack protector check (and the branches to its basic blocks) right before the return statement.
>
> Thus consider the following solution to the stated problem:
>
> 1. Introduce an llvm.stackprotectorcheck intrinsic (this patch). To code-gen, this represents the comparison, the branch, and the two basic blocks/etc. Have the stack protector pass insert it as appropriate. Thus at the IR level one would see:
>
> ----
> %x = tail call i8* @fun()
> %guard = load …
> %stackslot = load …
> call void @llvm.stackprotectorcheck(i8* %guard, i8* %stackslot)
> ret %x
> ----
>
> 2. If in the stack protector pass we can prove that there is a call right before the return statement which satisfies all the platform independent requirements for a tail call, we swap the order of the stack protector and the call, i.e.:
>
> ----
> %guard = load …
> %stackslot = load …
> call void @llvm.stackprotectorcheck(i8* %guard, i8* %stackslot)
> %x = tail call i8* @fun()
> ret %x
> ----
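
For concreteness, steps 1 and 2 as quoted can be modelled as a toy transformation over a list of instruction strings. This is only a sketch in Python, not LLVM code: the helper names insert_check and hoist_check_above_tail_call are hypothetical, and the meets_generic_tail_call_rules predicate is a stand-in for whatever platform-independent test the pass would actually apply.

```python
# Toy model only: instructions are plain strings, not real LLVM IR objects.
GUARD_LOADS = ["%guard = load …", "%stackslot = load …"]
CHECK = "call void @llvm.stackprotectorcheck(i8* %guard, i8* %stackslot)"


def insert_check(block):
    """Step 1: place the guard loads and the check right before the final ret."""
    assert block and block[-1].startswith("ret"), "toy model expects a trailing ret"
    return block[:-1] + GUARD_LOADS + [CHECK] + [block[-1]]


def hoist_check_above_tail_call(block, meets_generic_tail_call_rules):
    """Step 2: if a tail call sits directly before the check sequence and the
    caller-supplied (hypothetical) predicate accepts it, move the call below
    the check so it ends up immediately before the ret."""
    n = len(GUARD_LOADS) + 1              # guard loads plus the check itself
    if len(block) < n + 2:
        return block                      # too short to hold call + check + ret
    ret = block[-1]
    check_seq = block[-1 - n:-1]
    prefix = block[:-1 - n]
    if check_seq != GUARD_LOADS + [CHECK]:
        return block
    call = prefix[-1]
    if "tail call" in call and meets_generic_tail_call_rules(call):
        return prefix[:-1] + check_seq + [call, ret]
    return block


if __name__ == "__main__":
    block = ["%x = tail call i8* @fun()", "ret %x"]
    step1 = insert_check(block)
    step2 = hoist_check_above_tail_call(step1, lambda call: True)
    print("\n".join(step2))
```

Passing a predicate that rejects the call leaves the step 1 output unchanged, which corresponds to the case where the check stays between the call and the ret.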

What happens if the call satisfies all the platform-independent
requirements, but fails some platform-specific requirement?  Does it
just not matter because the buffers the stack protector is protecting
can't be referenced?

Also, I don't understand why we need to introduce an intrinsic: if
steps 1 and 2 are both in the same pass, can't you just insert the
compare+branch before the tail call?
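
To make the second question concrete, here is a toy sketch in Python of doing the expansion directly in one pass, with no intrinsic. The function name expand_check_before_tail_call and the block names are hypothetical, the instructions are plain strings rather than real IR, and the sketch deliberately says nothing about the platform-specific case raised in the first question.

```python
def expand_check_before_tail_call(block):
    """Toy model: split a block ending in 'tail call; ret' into three blocks,
    with the guard comparison and branch placed before the call."""
    if len(block) < 2:
        return {"entry": list(block)}
    *prefix, call, ret = block
    if "tail call" not in call or not ret.startswith("ret"):
        return {"entry": list(block)}     # nothing to do in this toy model
    return {
        "entry": prefix + [
            "%guard = load …",
            "%stackslot = load …",
            "%cmp = icmp eq i8* %guard, %stackslot",
            "br i1 %cmp, label %success, label %fail",
        ],
        # The call stays immediately before the ret, i.e. in tail position.
        "success": [call, ret],
        "fail": ["call void @__stack_chk_fail()", "unreachable"],
    }


if __name__ == "__main__":
    blocks = expand_check_before_tail_call(["%x = tail call i8* @fun()", "ret %x"])
    for name, instructions in blocks.items():
        print(name + ":")
        for instruction in instructions:
            print("  " + instruction)
```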

-Eli




More information about the llvm-commits mailing list