[cfe-dev] Weak function pointers (was "SymbolRef and SVal confusion")

Richard tarka.t.otter at googlemail.com
Sun Jan 20 09:00:12 PST 2013


Hey Ted,

On 17 Jan 2013, at 19:16, Ted Kremenek <kremenek at apple.com> wrote:

> Hi Richard,
> 
> Sorry for joining this so late.  I have mixed thoughts about this direction, but I think we should give it a try and see how well it works out.  Thanks so much for working in this!  I have a few specific comments on the patch itself:
> 
>> Index: include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h
>> ===================================================================
>> --- include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h	(revision 172612)
>> +++ include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h	(working copy)
>> @@ -532,9 +532,10 @@
>>  /// FunctionTextRegion - A region that represents code texts of function.
>>  class FunctionTextRegion : public CodeTextRegion {
>>    const NamedDecl *FD;
>> +  const SymbolExtent *WeakSym;
>>  public:
>>    FunctionTextRegion(const NamedDecl *fd, const MemRegion* sreg)
>> -    : CodeTextRegion(sreg, FunctionTextRegionKind), FD(fd) {
>> +    : CodeTextRegion(sreg, FunctionTextRegionKind), FD(fd), WeakSym(NULL) {
>>      assert(isa<ObjCMethodDecl>(fd) || isa<FunctionDecl>(fd));
>>    }
>>    
>> @@ -555,7 +556,17 @@
>>    const NamedDecl *getDecl() const {
>>      return FD;
>>    }
>> -    
>> +
>> +  const SymbolExtent *getWeakSymbol() const {
>> +    return WeakSym;
>> +  }
> 
> Please include a doxygen comment for this method, since this is new functionality that needs some explanation.
> 

Done.

>> +
>> +  void setWeakSymbol(const SymbolExtent *S) {
>> +    // Only weak functions can have symbols.
>> +    assert(cast<ValueDecl>(FD)->isWeak());
>> +    WeakSym = S;
>> +  }
> 
> MemRegions need to be immutable, so the setWeakSymbol() is not going to work.  Can we just have the symbol associated with the FunctionTextRegion when it gets created?

I would like to do this, but I don't see how it is possible easily. The SymbolExtent requires the FunctionTextRegion in its constructor, and I don't see a nice way to get hold of the SymbolManager in the FunctionTextRegions constructor to create it. Any suggestions here?

> 
>> +
>>    virtual void dumpToStream(raw_ostream &os) const;
>>    
>>    void Profile(llvm::FoldingSetNodeID& ID) const;
>> Index: lib/StaticAnalyzer/Core/SVals.cpp
>> ===================================================================
>> --- lib/StaticAnalyzer/Core/SVals.cpp	(revision 172612)
>> +++ lib/StaticAnalyzer/Core/SVals.cpp	(working copy)
>> @@ -73,6 +73,9 @@
>>      const MemRegion *R = X->stripCasts();
>>      if (const SymbolicRegion *SymR = dyn_cast<SymbolicRegion>(R))
>>        return SymR->getSymbol();
>> +
>> +    else if (const FunctionTextRegion *FnR = dyn_cast<FunctionTextRegion>(R))
>> +      return FnR->getWeakSymbol();
>>    }
> 
> Minor not: please no empty line between if the 'if' and the 'else if'.  Also please include a comment here, within the else if clause, describing what this is doing.
> 

Done.

>>    return 0;
>>  }
>> Index: lib/StaticAnalyzer/Core/SimpleConstraintManager.h
>> ===================================================================
>> --- lib/StaticAnalyzer/Core/SimpleConstraintManager.h	(revision 172612)
>> +++ lib/StaticAnalyzer/Core/SimpleConstraintManager.h	(working copy)
>> @@ -96,6 +96,10 @@
>>    ProgramStateRef assumeAuxForSymbol(ProgramStateRef State,
>>                                           SymbolRef Sym,
>>                                           bool Assumption);
>> +
>> +  ProgramStateRef assumeLocSymbol(ProgramStateRef State,
>> +                                  SymbolRef S,
>> +                                  bool Assumption);
>>  };
> 
> Please include a doxygen comment.  Gradually over time we want to get all of these methods doxygenified, and new code should definitely include comments.
> 

Done, not too much to say about this one.

>>  
>>  } // end GR namespace
>> Index: lib/StaticAnalyzer/Core/SymbolManager.cpp
>> ===================================================================
>> --- lib/StaticAnalyzer/Core/SymbolManager.cpp	(revision 172612)
>> +++ lib/StaticAnalyzer/Core/SymbolManager.cpp	(working copy)
>> @@ -334,6 +334,14 @@
>>  
>>  QualType SymbolExtent::getType() const {
>>    ASTContext &Ctx = R->getMemRegionManager()->getContext();
>> +
>> +  // If this symbol is wrapping a weak function region
>> +  // then the type should be a function pointer.
>> +  if (const FunctionTextRegion *FTR = dyn_cast<FunctionTextRegion>(R)) 
>> +    if (const FunctionDecl *FD = dyn_cast<FunctionDecl>(FTR->getDecl()))
>> +      if (FD->isWeak())
>> +        return Ctx.getPointerType(FD->getType());
>> +
>>    return Ctx.getSizeType();
>>  }
>>  
>> Index: lib/StaticAnalyzer/Core/SValBuilder.cpp
>> ===================================================================
>> --- lib/StaticAnalyzer/Core/SValBuilder.cpp	(revision 172612)
>> +++ lib/StaticAnalyzer/Core/SValBuilder.cpp	(working copy)
>> @@ -190,7 +190,17 @@
>>  }
>>  
>>  DefinedSVal SValBuilder::getFunctionPointer(const FunctionDecl *func) {
>> -  return loc::MemRegionVal(MemMgr.getFunctionTextRegion(func));
>> +  const FunctionTextRegion *FTR = MemMgr.getFunctionTextRegion(func);
>> +
>> +  // If this is a weak function it needs an associated symbol
>> +  // for binding a possible NULL value to.
>> +  if (func->isWeak()) {
>> +    const SymbolExtent *S = SymMgr.getExtentSymbol(FTR);
>> +    FunctionTextRegion *F = const_cast<FunctionTextRegion *>(FTR);
>> +    F->setWeakSymbol(S);
>> +  }
> 
> I don't think we need 'setWeakSymbol()'.  We can just associate the symbol with the region when it gets created.  SValBuilder shouldn't' be tampering with the definition of a MemRegion after it has been created.
> 

As above.

>> +
>> +  return loc::MemRegionVal(FTR);
>>  }
>>  
> 
> 
>>  DefinedSVal SValBuilder::getBlockPointer(const BlockDecl *block,
>> Index: lib/StaticAnalyzer/Core/SimpleConstraintManager.cpp
>> ===================================================================
>> --- lib/StaticAnalyzer/Core/SimpleConstraintManager.cpp	(revision 172612)
>> +++ lib/StaticAnalyzer/Core/SimpleConstraintManager.cpp	(working copy)
>> @@ -72,6 +72,16 @@
>>    return state;
>>  }
>>  
>> +ProgramStateRef SimpleConstraintManager::assumeLocSymbol(ProgramStateRef State,
>> +                                                         SymbolRef S,
>> +                                                         bool Assumption) {
>> +  const llvm::APSInt &Zero = getBasicVals().getZeroWithPtrWidth();
>> +  if (Assumption)
>> +    return assumeSymNE(State, S, Zero, Zero);
>> +  else
>> +    return assumeSymEQ(State, S, Zero, Zero);
>> +}
>> +
>>  ProgramStateRef SimpleConstraintManager::assumeAux(ProgramStateRef state,
>>                                                    Loc Cond, bool Assumption) {
>>    switch (Cond.getSubKind()) {
>> @@ -81,19 +91,20 @@
>>  
>>    case loc::MemRegionKind: {
>>      // FIXME: Should this go into the storemanager?
>> -
>>      const MemRegion *R = cast<loc::MemRegionVal>(Cond).getRegion();
>>      const SubRegion *SubR = dyn_cast<SubRegion>(R);
>> -
>> +
>>      while (SubR) {
>>        // FIXME: now we only find the first symbolic region.
>> -      if (const SymbolicRegion *SymR = dyn_cast<SymbolicRegion>(SubR)) {
>> -        const llvm::APSInt &zero = getBasicVals().getZeroWithPtrWidth();
>> -        if (Assumption)
>> -          return assumeSymNE(state, SymR->getSymbol(), zero, zero);
>> -        else
>> -          return assumeSymEQ(state, SymR->getSymbol(), zero, zero);
>> -      }
>> +      if (const SymbolicRegion *SymR = dyn_cast<SymbolicRegion>(SubR))
>> +        return assumeLocSymbol(state, SymR->getSymbol(), Assumption);
>> +
> 
> Please no empty line.
> 
>> +      // If this region is modelling a weak function
> 
> "modelling" is has only one "l".
> 

Done.

>> +      // then we want to assume based on its associated symbol.
>> +      else if (const FunctionTextRegion *FTR = dyn_cast<FunctionTextRegion>(SubR))
>> +        if (const SymbolExtent *S = FTR->getWeakSymbol())
>> +          return assumeLocSymbol(state, S, Assumption);
>> +
>>        SubR = dyn_cast<SubRegion>(SubR->getSuperRegion());
>>      }
>>  
> 
> The patch also need some kind of test case to exercise this functionality, although I'm not certain exactly what yet you can test.  We don't want to add code that isn't tested in some way.
> 
> Thanks for working on this!
> 
> Ted


I have included a test case with the patch, it tests for values being assumed for weak functions, and checks that global function behaviour remains the same. Sufficient?

> 
> 
> On Jan 16, 2013, at 6:14 AM, Richard <tarka.t.otter at googlemail.com> wrote:
> 
>> Hey,
>> 
>> On 15 Jan 2013, at 19:30, Jordan Rose <jordan_rose at apple.com> wrote:
>> 
>>> 
>>> On Jan 15, 2013, at 7:21 , Richard <tarka.t.otter at googlemail.com> wrote:
>>> 
>>>> Hi Jordan,
>>>> 
>>>> On 15 Jan 2013, at 03:44, Jordan Rose <jordan_rose at apple.com> wrote:
>>>> 
>>>>> That's not quite going to work -- what if I explicitly spell out my comparison?
>>>>> 
>>>>>> if (MyWeakFunction == NULL) {
>>>>>> }
>>>>> 
>>>>> In this case, SimpleConstraintManager won't go through your new assumeLocSymbol.
>>>>> 
>>>>> ...although actually, it won't make it there at all, because SimpleSValBuilder::evalBinOpLL folds null comparisons against non-symbolic regions. That's actually easy enough to fix, though, and if you do that this might actually work.
>>>> 
>>>> I am not sure I follow you here. Why will this not work? On line 646 of SimpleSValBuilder we have:
>>>> 
>>>> if (SymbolRef lSym = lhs.getAsLocSymbol())
>>>>     return MakeSymIntVal(lSym, op, rInt->getValue(), resultTy);
>>>> 
>>>> The metadata symbol will be returned here by getAsLocSymbol() and everything proceeds as expected. Testing this on the following trivial code shows a divide by zero warning for both branches of the IfStmt:
>>>> 
>>>> int myFunc() __attribute__((weak_import));
>>>> int main(int argc, char *argv[])
>>>> {
>>>>   if (myFunc == NULL) {
>>>>     1 / 0;
>>>>   } else {
>>>>     1 / 0;
>>>>   }
>>>>   return 0;
>>>> }
>>> 
>>> Oops! I missed your changed to SVals.cpp. (This is what I get for reviewing patches by sight only.)
>>> 
>>> I am a little concerned about getAsLocSymbol always returning the metadata symbol, but maybe it's okay. I need to go trace through the ramifications of that. My idea was to only produce the metadata symbol when we are sure we are doing a comparison (or a cast to bool, I suppose), since right now there's no way to go from the symbol back to the region (wrapping the symbol in a SymbolicRegion would be incorrect).
>>> 
>> 
>> Is it not possible to get the region back from the symbol using getRegion()? 
>> 
>>> 
>>>>> Oh, and metadata symbols also die unless some checker specifically requests to keep them alive. I'm wondering now if the current behavior of SymbolExtent (immutable, lasts as long as the base region, only as path-specific as their region) is, in fact, closer to the desired behavior here than metadata symbols (invalidatable, path-sensitive, only lasts as long as someone is interested). If this is the case, maybe SymbolExtent should be made more generic.
>>>>> 
>>>>> (Sorry for giving you a runaround here. I'm not sure what the right thing to do is...I just want to avoid "new symbols" being the solution to every problem, and at the same time make sure that the existing symbols have well-defined semantics.)
>>>>> 
>>>> 
>>>> OK, I was under the impression that a SymbolMetadata would stay alive as long as the MemRegion was, but I see that is incorrect in the docs. I would confess to not exactly being an expert on the Symbol class hierarchy, but it seems to me that SymbolExtent is not the right thing to use here. It has the properties we want, but as you mentioned before, it represents the size of a region. This seems an odd symbol to be using to represent a possible NULL pointer to a function. What about using a SymbolRegionValue here?
>>> 
>>> Heh. A bit of history: I added both SymbolExtent and SymbolMetadata. Originally I had hoped they'd be able to use the same kind of symbol, but it turned out that extents being non-invalidatable and metadata being somewhat volatile made it hard to unify the two. If we changed the symbol hierarchy now, we'd rename SymbolExtent -- it'd be something like SymbolRegionMetadata and SymbolRegionContentsMetadata, except those names are ugly.
>>> 
>>> The other symbols have single, well-defined purposes, and I'd rather not overload them:
>>> - SymbolRegionValue represents the contents of a MemRegion, i.e. the value of a parameter or global variable.
>>> - SymbolDerived represents the contents of a subregion of a SymbolicRegion, e.g. the third field in a struct passed by value.
>>> - SymbolConjured represents the value of an expression that the analyzer can't evaluate (usually a call expression).
>>> 
>>> I/we sometimes cheat and reuse SymbolConjured because it's so flexible, but mostly it's good to stick to these uses. And...
>>> 
>>> - SymbolExtent is (currently) a single-purpose symbol used to represent the size (in bytes) of a region whose size is not statically known (i.e. by the compiler).
>>> - SymbolMetadata is a general-purpose symbol used to represent invalidatable metadata associated with a particular region.
>>> 
>>> (I realize all of this should be added to / clarified in our docs.)
>>> 
>>> 
>> 
>> SymbolExtent it is then, what do you think of the attached patch? I have changed the QualType of the symbol when it is wrapping a weak FunctionTextRegion to a function pointer type, is this sufficient?
>> 
>>>>> - The initialization of FunctionTextRegion's WeakSym can happen in the constructor initializers rather than the body.
>>>> 
>>>> Ja, I wanted to do this, but it seemed like a bit of a chicken and egg thing. To create the symbol requires knowing the FunctionTextRegion in its constructor, so how does the FunctionTextRegion have the symbol in its constructor, without passing it the SymbolManager so it can create the symbol itself, which also seemed a bit odd. Or am I being stupid?
>>> 
>>> Ha, I just meant the initialization to NULL. I didn't think through whether it was possible to push the creation of the weak symbol into the constructor.
>>> 
>>> Jordan
>>> 
>> 
>> Then I was being stupid, no easy fix for that unfortunately :|
>> 
>> Richard
>> 
>> <weak.diff>
>> 
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20130120/8dffa6da/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cl.diff
Type: application/octet-stream
Size: 6264 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20130120/8dffa6da/attachment.obj>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20130120/8dffa6da/attachment-0001.html>


More information about the cfe-dev mailing list