[LLVMdev] How to get the return address on the stack on LLVM

John Criswell criswell at illinois.edu
Tue Jul 26 15:48:09 PDT 2011


On 7/26/11 5:37 PM, Xueying ZHANG wrote:
> Hi John,
>
> Thanks for your reply!

I'm CC'ing this to the list in case anyone knows why you're seeing this 
behavior.

>
> Now, I know the difference between llvm.returnaddress(0) and 
> llvm.returnaddress(1). I modified StackProtector.cpp and I just want 
> to get the value of the return address stored on the stack.
>
> But I have a problem when I call llvm.returnaddress(0) twice. The first 
> time, it reads from the correct address, the one storing the return address.


Fascinating.  The first thing to do is to see if one of the LLVM IR 
optimizations is eliminating the second call to llvm.returnaddress(0).  
To do that, run your instrumentation pass and then run opt with the 
-std-compile-opts option.  Disassemble the output to LLVM assembly 
code and see whether there are one or two calls to llvm.returnaddress(0).

If there are still two calls, then it's probably a code generator 
optimization and not a mid-level optimization that's causing this behavior.
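
A small C reproduction may make this check easier than working through the 
modified pass.  The sketch below assumes the GCC/Clang builtin 
__builtin_return_address, which Clang lowers to a call to 
llvm.returnaddress; the function name is made up for illustration.

```c
#include <stdint.h>

/* Reads the return address twice inside one function.
 * __builtin_return_address is a GCC/Clang builtin (an assumption of this
 * sketch); Clang lowers it to the llvm.returnaddress intrinsic.
 * noinline keeps the function in its own frame, so level 0 refers to
 * this function's caller. */
__attribute__((noinline))
int return_address_reads_match(void)
{
    uintptr_t first  = (uintptr_t)__builtin_return_address(0);
    uintptr_t second = (uintptr_t)__builtin_return_address(0);

    /* Returns 1 when both reads yield the same non-null address. */
    return first != 0 && first == second;
}
```

Compiling this with clang -S -emit-llvm and searching the output for 
llvm.returnaddress shows how many calls the front end emitted; running the 
result through opt and searching again shows how many survive.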

> Then I guess it saves the value it read into a local stack slot, 
> because when I call llvm.returnaddress(0) again, it just reads from 
> that slot instead of re-reading the return address.
>
> Here is a simple sample code and assembly language below:
>
> int main()
> {
>    return 0;
> }
>
> asm generated:
>
> main:                                   # @main
> # BB#0:
>     pushl    %ebp
>     movl    %esp, %ebp
>     subl    $8, %esp
>     movl    4(%ebp), %eax           //call llvm.returnaddress(0) first
>     movl    %eax, -4(%ebp)
>     movl    $0, -8(%ebp)
>     movl    -4(%ebp), %ecx         //I want to read 4(%ebp), the return
>                                        //address, again, but the compiler
>                                        //saves it as a temporary variable
>                                        //instead of reading the return
>                                        //address again
>     cmpl    %ecx, %eax
>     jne    .LBB0_2
>
>
> How can I solve this problem?

If it's an optimization that's causing the problem, you'll have to 
figure out which optimization it is and disable it.

That said, I think implementing a stack canary system at the LLVM IR 
level is the wrong way to go.  As I've suggested before, you should look 
into writing a pass that modifies the prologue/epilogue code at the 
level of MachineInstructions (i.e., you should write a 
MachineFunctionPass instead of a FunctionPass or ModulePass).  A 
MachineFunctionPass should allow you to directly control code generation 
and give you a reliable way of fetching and examining the function 
return address and inserting the stack canary.
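
For reference, the check itself is small wherever it ends up being 
inserted.  Below is a rough user-level C sketch of an XOR canary, not a 
MachineFunctionPass.  It assumes the GCC/Clang builtin 
__builtin_return_address; the secret value and all function names are 
hypothetical, and a real scheme would pick the secret at random at program 
start.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fixed secret, for illustration only; a real scheme would
 * fill this from a random source at program start. */
uintptr_t canary_secret = (uintptr_t)0x5bd1e995u;

/* Prologue step: combine the secret with the return address. */
uintptr_t make_canary(uintptr_t secret, uintptr_t ret_addr)
{
    return secret ^ ret_addr;
}

/* Epilogue step: XOR the stored canary with the current return address
 * and compare against the secret; returns 1 if they still agree. */
int canary_intact(uintptr_t stored, uintptr_t secret, uintptr_t ret_addr)
{
    return (stored ^ ret_addr) == secret;
}

/* Example function carrying a hand-written "prologue" and "epilogue". */
__attribute__((noinline))
int guarded_function(void)
{
    uintptr_t ret = (uintptr_t)__builtin_return_address(0);
    volatile uintptr_t canary = make_canary(canary_secret, ret);

    /* ... function body would go here ... */

    ret = (uintptr_t)__builtin_return_address(0);
    if (!canary_intact(canary, canary_secret, ret))
        abort();   /* canary or return address was modified */
    return 0;
}
```

Because the canary depends on the return address, overwriting either one 
without knowing the secret makes the epilogue comparison fail.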

-- John T.

>
> Thank you very much.
> Ying
>
>
>
>
> Quoting John Criswell <criswell at illinois.edu> on Tue, 26 Jul 2011 
> 16:44:10 -0500:
>
>> On 7/26/11 9:49 AM, Xueying ZHANG wrote:
>>> Hi all,
>>>
>>> I want to implement the Xor random canary, so I have to get the return
>>> address in the prologue and epilogue of the function.
>>
>> First, two clarifications on the llvm.returnaddress() intrinsic to make
>> sure you understand what your code is doing:
>>
>> 1) If I understand correctly, the llvm.returnaddress intrinsic returns
>> the value of the return address stored on the stack.  It does not
>> return the location of the return address within the stack.  In other
>> words, the llvm.returnaddress intrinsic can tell you the program
>> counter to which control flow will jump on function return, but it
>> doesn't give you a way to modify the return address.
>>
>> 2) The llvm.returnaddress intrinsic is fragile.  Optimizations can
>> prevent it from returning the correct value, especially when you give
>> it a non-zero argument.  For example, frame pointer elimination may
>> cause it to return incorrect results.
>>
>>>
>>> In the prologue of the function, before I insert the canary onto
>>> the stack, I can get the return address by:
>>>
>>> ConstantInt* ci =
>>>     llvm::ConstantInt::get(Type::getInt32Ty(RI->getContext()), 0);
>>> Value* Args1[] = {ci};
>>> CallInst* callInst =
>>>     CallInst::Create(Intrinsic::getDeclaration(M, Intrinsic::returnaddress),
>>>                      &Args1[0], array_endof(Args1),
>>>                      "Call Return  Address", InsPt);
>>
>> This generates a call to llvm.returnaddress(0).  This returns the
>> program counter of the call site that called the currently active
>> function.
>>
>>>
>>> CallInst will get the return address and it works.
>>>
>>> In the epilogue of the function, since the canary has already been
>>> inserted, I write similar code:
>>>
>>> ConstantInt* ci2 =
>>>     llvm::ConstantInt::get(Type::getInt32Ty(RI->getContext()), 1);
>>> Value* Args3[] = {ci2};
>>> CallInst* callInst1 =
>>>     CallInst::Create(Intrinsic::getDeclaration(M, Intrinsic::returnaddress),
>>>                      &Args3[0], array_endof(Args3),
>>>                      "Caaall Return Address", BB);
>>
>> This code generates a call to llvm.returnaddress(1).  This returns the
>> program counter of the call site that called the function that called
>> the currently active function.
>>
>> -- John T.
>>>
>>> But it does not work this time. I cannot get the return address.
>>>
>>> What is the problem? How can I get the return address? Thank you!
>>>
>>> Ying
>>>
>>>
>
>
>



