[llvm-commits] [PATCH] New pass: pointer (bounds) tracking!
Török Edwin
edwintorok at gmail.com
Mon Jul 13 13:59:56 PDT 2009
On 2009-07-08 23:59, Török Edwin wrote:
> On 2009-07-08 23:43, Dan Gohman wrote:
>
>> Hi Edwin,
>>
>> On Jul 2, 2009, at 12:32 PM, Török Edwin wrote:
>>
>>
>>
>>
>>> Hi,
>>>
>>> The attached patch introduces a new pass that can be queried for
>>> allocation size of pointers,
>>> and contains a simple solver [*] for (a ult b) inequalities useful in
>>> bounds checking. This is based on the bounds checker tool I developed
>>> for my undergraduate thesis.
>>>
>>>
>> This sounds cool. Thanks for your patience; I'm finally getting around
>> to taking a look.
>>
>>
>>
>
> Hi Dan,
>
> I am replying to part of your email now, next part will come tomorrow.
>
Please review the attached is pointertracking1.patch, which should be
ready to be applied to LLVM.
I also attached pointertracking_notready.patch, which is a WIP for
checkLimits, I'll submit it again once pointertracking1.patch is applied
to LLVM.
This is not ready to be reviewed at this time, I still have some
unimplemented parts there, and it needs more testing.
>>
>>
>>> [*]
>>> The solver uses isLoopGuardedByCond(), but is able to handle more
>>> complicated situations by using
>>> properties of SCEV expressions, and transitivity of ULT/ULE:
>>> - if a SCEVAddRecExpr is monotonic it checks whether the start and
>>> exit
>>> value is known to be ULT Limit.
>>>
>>>
>> This is cool :-).
>>
>>
I moved this to pointertracking_notready.patch, I have some more
(unimplemented) ideas for checking
monotonicity, and limits, based on derivating AddRecExpr ;)
>>
>>
>>> - for PHI nodes that are SCEVUnknowns it checks whether all parts of
>>> the PHI are known to be ULT
>>> - if a value is a constant offset away from a PHI: C + X <u B -> X
>>> <u B-C
>>> It also looks for predicates among the basic blocks that dominate the
>>> memory access, trying to
>>> find a predicate that is sufficient for the validity of the memory
>>> access.
>>>
>>> Best regards,
>>> --Edwin
>>>
>>> <pointertracking.patch>
>>>
>>>
>> Here are some comments on the patch.
>>
>> > namespace llvm {
>> > char PointerTracking::ID=0;
>> >
>> > [...]
>>
>> Don't indent namespace contents.
>>
>> > void PointerTracking::getAnalysisUsage(AnalysisUsage &AU) const {
>> > AU.addRequired<DominatorTree>();
>> > AU.addRequired<LoopInfo>();
>> > AU.addRequired<ScalarEvolution>();
>>
>> These should be addRequiredTransitive instead of addRequired, so that
>> the
>> required analyses are preserved throughout the lifetime of the
>> PointerTracking analysis.
>>
>> > callocFunc = M.getFunction("calloc");
>> [...]
>> > reallocFunc = M.getFunction("realloc");
>>
>> I don't know what the current prevailing fasion is with respect to
>> passes knowing special things about functions with standard C library
>> names. It would be good to mention that this pass knows about
>> "calloc" and "realloc" in a high-level comment though.
>>
>>
>
> Ok.
>
>
>> > const SCEV *PointerTracking::getAllocationSize(Value *V)
>>
>> This name is a little confusing. Can you rename this to
>> getAllocationElementCount or something?
>>
>>
>
> Ok.
>
I also refactored the internals of this method. The implementation
should be more straight forward now:
- we request to know the element count of a given Type (not necessarely
same as allocated type)
- we explicitly get the type of allocated element
- we check the size of the 2 type, if possible we convert the count
without using TargetData
- if necessary I use TargetData to convert count
Now getAllocationSizeInBytes simply asks for element count with
Type::Int8Ty.
>
>> > {
>> > if (AllocationInst *AI = dyn_cast<AllocationInst>(V)) {
>> > Value *arraySize = AI->getArraySize();
>> > if (ConstantInt *C = dyn_cast<ConstantInt>(arraySize)) {
>> > return SE->getConstant(C);
>> > }
>> > return SE->getSCEVAtScope(arraySize, LI->getLoopFor(AI-
>> >getParent()));
>>
>> It isn't actually necessary to check for a ConstantInt specially
>> here. getSCEVAtScope will effectively do this automatically.
>>
Fixed, I don't use SCEVAtScope here anymore either.
>> > }
>> > if (GlobalVariable *GV = dyn_cast<GlobalVariable>(V)) {
>> > if (GV->hasInitializer()) {
>> > Constant *C = GV->getInitializer();
>> > if (const ArrayType *ATy = dyn_cast<ArrayType>(C->getType
>> ())) {
>> > return SE->getConstant(Type::Int32Ty, ATy->getNumElements
>> ());
>>
>> Hard-coding Int32Ty here may be problematic in some cases.
>>
I think in general I can't guarantee what the type of the returned SCEV
is, the client
should convert it to a compatible type when using it (and I'm doing
some conversion in compareSCEV too).
>> > }
>> > }
>> > return SE->getConstant(Type::Int32Ty, 1);
>> > }
>> > // TODO: implement more complicated pointer size tracking
>> > return 0;
>> > }
>>
>> > const SCEV *PointerTracking::getAllocationSizeInBytes(Value *V)
>> > {
>> > assert(TD && "TargetData must be available to calculate size in
>> bytes!");
>>
>> Would it make sense to postpone this check until TD is actually
>> needed, and then return CouldNotCompute if TD is unavailable?
>>
>>
>
> Yes.
>
Done, I also calculate counts using getPrimitiveSizeInBits() if
available, and only after that try TargetData (if available).
>> > const SCEV *S = getAllocationSize(V);
>> > if (isa<SCEVCouldNotCompute>(S)) {
>> > if (CallInst *CI = dyn_cast<CallInst>(V)) {
>> > CallSite CS(CI);
>> > Function *F = dyn_cast<Function>(CS.getCalledValue()-
>> >stripPointerCasts());
>>
>> This line exceeds 80 columns.
>>
>> > const Loop *L = LI->getLoopFor(CI->getParent());
>> > if (F == callocFunc) {
>> > return
>> > SE->getSCEVAtScope(SE->getMulExpr(SE->getSCEV
>> (CS.getArgument(0)),
>> > SE->getSCEV
>> (CS.getArgument(1))),
>> > L);
>> > } else if (F == reallocFunc) {
>> > return SE->getSCEVAtScope(CS.getArgument(1), L);
>> > }
>> > }
>> > return S;
>> > }
>> > uint64_t elementsize = TD->getTypeStoreSize(V->getType());
>> > return SE->getMulExpr(SE->getConstant(S->getType(),
>> elementsize), S);
>>
>> The size of the allocation will be the number of elements multiplied
>> by the "Alloc" size, not the "Store" size.
>>
>>
>
> Thanks, will fix.
>
>
>> > const SCEV *getStart(const SCEV *A, const Loop *L)
>> > {
>>
>> Here and elsewhere, the brace for a function definition goes on the
>> same line as the close paren.
>>
>>
>
> Ok.
>
>> > if (isa<SCEVConstant>(A))
>> > return A;
>>
>> This would be a little stronger as A->isLoopInvariant(L).
>>
>> > DomTreeNodeBase<BasicBlock> *N = DT->getNode(BB);
>> > DomTreeNodeBase<BasicBlock> *D = N->getIDom();
>> > while (D) { // D dominates N
>> > BasicBlock *dBB = D->getBlock();
>> > bool negated;
>> > Value *V = getConditionToReach(dBB, BB, negated);
>> > if (conditionSufficient(V, negated, I, ICmpInst::ICMP_ULT,
>> Limit))
>> > return AlwaysTrue;
>> > BB = dBB;
>> > D = D->getIDom();
>> > }
>>
>> This doesn't appear correct. Even if D dominates N and has a
>> conditional branch with a successor that reaches N, it's possible
>> that the other successor of the conditional branch reaches N also.
>> See ScalarEvolution's getPredecessorWithUniqueSuccessorForBB for
>> code that solves a similar problem.
>>
>>
>
> That is quite a serious bug in my dominator predicate handling then,
> I'll have to be more careful.
> I'll have a look at this tomorrow, and try to post an updated patch.
>
I fixed this in pointertracking_notready.patch by checking that:
- there is one and only one successor of the branch that dominates the
target BB, excluding backedges
- all predecessors of the targetBB are dominated by that successor
I still need to add testcases for this, and review all the methods in
pointertracking_notready.patch once more.
Best regards,
--Edwin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pointertracking1.patch
Type: text/x-diff
Size: 20620 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20090713/fb086b3f/attachment.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pointertracking_notready.patch
Type: text/x-diff
Size: 12264 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20090713/fb086b3f/attachment-0001.patch>
More information about the llvm-commits
mailing list