[LLVMdev] Pointers in Load and Store

Surinder surifilms at gmail.com
Sat Jan 22 17:19:13 PST 2011


John,

I have looked at the real code (instead of the obsolete one) and it
appears to be easy to find out whether an operand is a getelementptr
constant expression.

  if (ConstantExpr *CE = dyn_cast<ConstantExpr>(I.getOperand(0)))
    { Out << "*** operand 0 is a constant Expr******";
      if (CE->getOpcode() == Instruction::GetElementPtr)
        { Out << "*** operand 0 is a gep instruction ********";
          // Operand 0 of a GEP constant expression is the base pointer.
          if (const PointerType *pt =
                dyn_cast<PointerType>(CE->getOperand(0)->getType()))
            if (const ArrayType *ar =
                  dyn_cast<ArrayType>(pt->getElementType()))
              hi = ar->getNumElements();
        }
    }
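
As a side note on why the dyn_cast<Instruction> attempt quoted below never matched: a constant-expression GEP is a Constant, not an Instruction. The following is a minimal standalone sketch of that relationship. The classes are toy stand-ins written in plain C++ (they are not the LLVM headers, and classify() is a hypothetical helper), with dynamic_cast playing the role of dyn_cast.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for the relevant part of the LLVM class hierarchy.
// These are NOT the LLVM classes; only the inheritance shape is mirrored.
struct Value { virtual ~Value() {} };
struct User : Value {};
struct Constant : User {};
struct Instruction : User {};
struct GetElementPtrInst : Instruction {};

// A constant expression carries an opcode but is a Constant,
// not an Instruction.
enum Opcode { OpGetElementPtr, OpOther };
struct ConstantExpr : Constant {
  Opcode opcode;
  explicit ConstantExpr(Opcode op) : opcode(op) {}
  Opcode getOpcode() const { return opcode; }
};

// Hypothetical helper: describe what kind of pointer operand a load has.
std::string classify(const Value *operand) {
  assert(operand && "operand must not be null");
  if (dynamic_cast<const GetElementPtrInst *>(operand))
    return "gep instruction";
  if (const ConstantExpr *ce = dynamic_cast<const ConstantExpr *>(operand))
    return ce->getOpcode() == OpGetElementPtr ? "constant-expression gep"
                                              : "other constant expression";
  return "other value";
}
```

With these stand-ins, a ConstantExpr built with OpGetElementPtr is classified as a constant-expression gep, and casting it to Instruction yields a null pointer, which mirrors what the first attempt ran into.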

Thank you for that.

I would like to use the SAFECode passes rather than write my own code.
However, the SAFECode website says that it works only with LLVM
2.6 or 2.7, whereas I use LLVM 2.8.

To get around the problem, I plan to do as follows:

(1)  Do not install SAFECode with LLVM 2.8 (as it may or may not work).

(2)  Create a new pass named "unGep", described as "Breaks Constant GEPs".

(3)  The new pass derives from FunctionPass (because SAFECode does so;
if I had to write it myself, ModulePass would have been good enough).

(4)  The runOnFunction method of the unGep pass invokes
addPoolChecks(F), passing it the function F.  I will modify
addGetElementPtrChecks so that it produces array bounds in the way I
need.  (For my research on detecting overflows, I need a check that
array bounds are being violated.)
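
The check mentioned in step (4) comes down to comparing an index against the element count that ArrayType::getNumElements() returns (the `hi` value in the snippet above). A minimal sketch of that comparison in plain C++ follows; indexInBounds is a hypothetical helper name, and it assumes the index is already known as a signed integer, as GEP indices are.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if accessing element `index` of an array
// with `numElements` elements stays inside the array.
bool indexInBounds(std::uint64_t numElements, std::int64_t index) {
  if (index < 0)
    return false;  // negative indices are always out of bounds
  return static_cast<std::uint64_t>(index) < numElements;
}
```

For the [4 x i8] string in this thread, indices 0 through 3 pass and index 4 or -1 fails. When the index is only known at run time, the same comparison would have to be emitted as instructions rather than evaluated inside the pass.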

I will then run opt as

opt -load ../unGep.so

to produce LLVM code without constant-expression GEPs as operands.
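
To make the intended effect concrete, here is a rough standalone model of the rewrite in plain C++. It does not use the LLVM API: ToyInst, isConstantGEP and breakConstantGEPs are hypothetical names, instructions are reduced to strings, and types on the load are omitted. It is only meant to show the shape of the transformation, namely that each constant GEP operand becomes its own instruction placed just before its user.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction of the form "result = opcode operand". Not an LLVM type.
struct ToyInst {
  std::string result;
  std::string opcode;
  std::string operand;  // a register such as "%bp", or a constant GEP
};

static const std::string kGepPrefix = "getelementptr ";

// An operand counts as a constant-expression GEP if it starts with the
// keyword rather than with a register or global name.
bool isConstantGEP(const std::string &operand) {
  return operand.compare(0, kGepPrefix.size(), kGepPrefix) == 0;
}

// Hypothetical model of the rewrite: give each constant GEP operand its
// own instruction immediately before the user, then point the user at it.
std::vector<ToyInst> breakConstantGEPs(const std::vector<ToyInst> &body) {
  std::vector<ToyInst> out;
  std::size_t counter = 0;
  for (const ToyInst &inst : body) {
    ToyInst user = inst;
    if (isConstantGEP(user.operand)) {
      std::string tmp = "%tmp" + std::to_string(counter++);
      // Drop the keyword; what remains are the GEP's own operands.
      out.push_back({tmp, "getelementptr",
                     user.operand.substr(kGepPrefix.size())});
      user.operand = tmp;
    }
    out.push_back(user);
  }
  return out;
}
```

Fed the two loads from the original question, the model leaves the load of %bp untouched and splits the second load into a getelementptr that defines %tmp0 followed by a load of %tmp0.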

Please advise if this will work or if there is an easier way.

Thanks.

Surinder Kumar Jain


On Sat, Jan 22, 2011 at 4:08 PM, John Criswell <criswell at illinois.edu> wrote:
> On 1/21/2011 10:46 PM, Surinder wrote:
>>
>> John,
>>
>> I have looked at the SAFECode source and thought the following should work:
>>
>>        if (isa<Constant>(I.getOperand(0)))
>>        { Out<<  "*** operand 0 is a constant ******";
>>          if (Instruction *operandI =
>> dyn_cast<Instruction>(I.getOperand(0)))
>>            { Out<<  "********operand is an instruction ****";
>>              if (GetElementPtrInst *gepI =
>> dyn_cast<GetElementPtrInst>(operandI))
>>                { Out<<  "*** operand is a gep instruction ********";
>>                  if (const ArrayType *ar =
>> dyn_cast<ArrayType>(gepI->getPointerOperandType()->getElementType()))
>>                    hi=ar->getNumElements();
>
>> But this does not recognize that operand(0) of instruction I is even
>> an instruction, let alone a get element pointer instruction.  I have
>> taken the code from line 632 and line 757 of
>> safecode/lib/ArrayBoundsChecks/ArrayBoundCheck.cpp
>>
>> I must be doing something wrong, what is it?
>
> The problem is simple: you're looking at the wrong source file.
> :)
>
> More specifically, you're looking at the very antiquated static array bounds
> checking pass (it hasn't compiled in several years now).  The file you want
> to look at is in lib/InsertPoolChecks/insert.cpp.  This file contains the
> InsertPoolChecks pass which, in mainline SAFECode, is responsible for
> inserting array bounds checks and indirect function call checks.  In
> particular, you want to look at the addGetElementPtrChecks() method.
>
> As for Constant Expression GEPs, you want to look at the BreakConstantGEPs
> pass, located in lib/ArrayBoundsChecks/BreakConstantGEPs.cpp.
>
> The BreakConstantGEPs pass is run first.  All it does is find instructions
> that use constant expression GEPs and replace each constant expression GEP
> with a GEP instruction.  All of the other SAFECode passes that worry about
> array bounds checks (i.e., the static array bounds checking passes in
> lib/ArrayBoundsCheck and the run-time instrumentation pass in
> lib/InsertPoolChecks/insert.cpp) only look for GEP instructions.
>
> -- John T.
>
>
>> Surinder Kumar Jain
>>
>>
>> PS: Yes, I will be using SAFECode, but I still want to know why the above
>> code does not work.  I am posting a separate mail with the title "OPT
>> optimizations"
>
>
>>
>> On Fri, Jan 21, 2011 at 3:12 PM, John Criswell<criswell at illinois.edu>
>>  wrote:
>>>
>>> On 1/20/2011 10:02 PM, Surinder wrote:
>>>>
>>>> When I compile C programs into llvm, it produces load instructions in
>>>> two different flavours.
>>>>
>>>> (1)    %8 = load i8** %bp, align 8
>>>>
>>>> (2)    %1 = load i8* getelementptr inbounds ([4 x i8]* @.str, i64 0,
>>>> i64 0), align 1
>>>>
>>>> I know that %bp in first case and the entire "getelementptr inbounds
>>>> ([4 x i8]* @.str, i64 0, i64 0)" in second case can be obtained by
>>>> dump'ing I.getOperand(0)
>>>>
>>>> However, I want to find out which of the two forms of load have been
>>>> produced because in the second case, I want to insert checks for array
>>>> bounds.
>>>>
>>>> How can I find out when I am in Instruction object I and I.getOpcode()
>>>> == 29 whether I am dealing with type (1) or type (2) above.
>>>
>>> The second load instruction is not really a "special" load instruction.
>>> Rather, its pointer argument is an LLVM constant expression (class
>>> llvm::ConstantExpr).  The getelementptr (GEP), which would normally be a
>>> GEP instruction, is instead a constant expression that will be converted
>>> into a constant numeric value at code generation time.
>>>
>>> So, what you need to do is to examine the LoadInst's operand and see if
>>> it's a ConstantExpr, and then see whether the ConstantExpr's opcode is a
>>> GEP opcode.
>>>
>>> However, there's an easier way to handle this.  SAFECode
>>> (http://safecode.cs.illinois.edu) has an LLVM pass which converts
>>> constant expression GEPs into GEP instructions.  If you run it on your
>>> code first, you'll get the following instruction sequence:
>>>
>>> %tmp = getelementptr inbounds [4 x i8]* @.str, i64 0, i64 0
>>> %1 = load i8* %tmp, align 1
>>>
>>> You then merely search for GEP instructions and put run-time checks on
>>> those (which you have to do anyway if you're adding array bounds
>>> checking).  The only ConstantExpr GEPs that aren't converted, I think,
>>> are those in global variable initializers.
>>>
>>> Now, regarding the insertion of array bounds checks, SAFECode does that,
>>> too (it is a memory safety compiler for C code).  It also provides a
>>> simple static array bounds checker and some array bounds check
>>> optimization passes.
>>>
>>> I can direct you to the relevant portions of the source code if you're
>>> interested.
>>>
>>> -- John T.
>>>
>>>> Thanks.
>>>>
>>>> Surinder Kumar Jain
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>>
>
>



