[cfe-commits] r164339 - in /cfe/trunk: include/clang/Analysis/ include/clang/StaticAnalyzer/Core/ include/clang/StaticAnalyzer/Core/BugReporter/ include/clang/StaticAnalyzer/Core/PathSensitive/ lib/Analysis/ lib/StaticAnalyzer/Core/ test/Analysis/

Thu Sep 20 23:00:51 PDT 2012

On Sep 20, 2012, at 7:24 PM, Jordan Rose <jordan_rose at apple.com> wrote:

> Cool! Some comments:
> 
> On Sep 20, 2012, at 17:09 , Ted Kremenek <kremenek at apple.com> wrote:
> 
>> 
>> Modified: cfe/trunk/include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h
>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h?rev=164339&r1=164338&r2=164339&view=diff
>> ==============================================================================
>> --- cfe/trunk/include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h (original)
>> +++ cfe/trunk/include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h Thu Sep 20 19:09:11 2012
>> @@ -20,6 +20,7 @@
>> #include "clang/AST/DeclCXX.h"
>> #include "clang/AST/ExprCXX.h"
>> #include "clang/AST/ExprObjC.h"
>> +#include "clang/Analysis/AnalysisContext.h"
>> #include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h"
>> #include "clang/StaticAnalyzer/Core/PathSensitive/SVals.h"
>> #include "llvm/ADT/PointerIntPair.h"
>> @@ -191,7 +192,8 @@
>> 
>>  /// \brief Returns the definition of the function or method that will be
>>  /// called.
>> -  virtual RuntimeDefinition getRuntimeDefinition() const = 0;
>> +  virtual RuntimeDefinition
>> +          getRuntimeDefinition(AnalysisDeclContextManager &M) const = 0; 
> 
> This feels ugly to me. Is it okay if the CallEvent grovels through the state -> state manager -> etc chain to get it instead?

Works for me.  I forgot about that access path.

> 
> 
>> +static BodyFarm &getBodyFarm(ASTContext &C) {
>> +  static BodyFarm *BF = new BodyFarm(C);
>> +  return *BF;
>> +}
> 
> This seems like a disaster waiting to happen if we ever use the same process to run the analyzer over multiple translation units (i.e. multiple ASTContexts). Can this just live on the AnalysisDeclContextManager? That lasts for the entire analysis action and probably already implicitly depends on being per-ASTContext.

Yes, that makes perfect sense.

For some reason I got it into my head that AnalysisDeclContextManager had a lifetime limited to the ExprEngine, but that's not the case.  It's owned by AnalysisManager.

> 
> 
>> Stmt *AnalysisDeclContext::getBody() const {
>> -  if (const FunctionDecl *FD = dyn_cast<FunctionDecl>(D))
>> -    return FD->getBody();
>> +  if (const FunctionDecl *FD = dyn_cast<FunctionDecl>(D)) {
>> +    Stmt *Body = FD->getBody();
>> +    if (!Body && Manager && Manager->synthesizeBodies())
>> +      return getBodyFarm(getASTContext()).getBody(FD);
>> +    return Body;
>> +  }
>>  else if (const ObjCMethodDecl *MD = dyn_cast<ObjCMethodDecl>(D))
>>    return MD->getBody();
>>  else if (const BlockDecl *BD = dyn_cast<BlockDecl>(D))
> 
> Is there any reason why this doesn't just use the BodyFarm for all bodies? We may not be using it now, but that's no reason not to make it flexible from the beginning.

My thought was to add overloaded entry points to BodyForm for each declaration type.  We'd just add them on demand.  But we could just sink this all down into BodyFarm.

> 
> 
>> }
>> 
>> AnalysisDeclContext *AnalysisDeclContextManager::getContext(const Decl *D) {
>> +  if (const FunctionDecl *FD = dyn_cast<FunctionDecl>(D)) {
>> +    FD->hasBody(FD);
>> +    D = FD;
>> +  }
> 
> So there is apparently a Decl::getBody that does the right thing (i.e. returns 0 if there is no body). Can we just use that?

We're not trying to get the body here.  We're trying to update the D to be the one that has the body (if one exists), and use that as the Decl* we use to construct the AnalysisDeclContext.  Maybe getBody() looks cleaner, but it functionality isn't really any different.

Note that apparently not updating 'D' here correctly broke a ton of stuff.

> 
> 
>> +  // Everything checks out.  Create a fake body that just calls the block.
>> +  // This is basically just an AST dump of:
>> +  //
>> +  // void dispatch_sync(dispatch_queue_t queue, void (^block)(void)) {
>> +  //   block();
>> +  // }
>> +  //
>> +  DeclRefExpr *DR = DeclRefExpr::CreateEmpty(C, false, false, false, false);
>> +  DR->setDecl(const_cast<ParmVarDecl*>(PV));
>> +  DR->setValueKind(VK_LValue);
>> +  ImplicitCastExpr *ICE = ImplicitCastExpr::Create(C, Ty, CK_LValueToRValue,
>> +                                                   DR, 0, VK_RValue);
>> +  CallExpr *CE = new (C) CallExpr(C, ICE, ArrayRef<Expr*>(), C.VoidTy,
>> +                                  VK_RValue, SourceLocation());
>> +  return CE;
>> +}
> 
> The name for ArrayRef<Expr*>() is 

Hmm?

> 
> It is probably not a good idea to make a function body that isn't a CompoundStmt or CXXTryStmt.

Why?  I know how the CFGBuilder works.  This isn't an issue.  The CFGBuilder just flattens out the CompoundStmt.  That concept  doesn't even exist in the CFG.

> Right now I don't think anything depends on it besides the diagnostics, which don't apply here, but still.
> 
> 
>> +Stmt *BodyFarm::getBody(const FunctionDecl *D) {
>> +  D = D->getCanonicalDecl();
>> +  
>> +  llvm::Optional<Stmt *> &Val = Bodies[D];
>> +  if (Val.hasValue())
>> +    return Val.getValue();
>> +  
>> +  Val = 0;
> 
> Rather than using llvm::Optional<Stmt *>, which you immediately set to NULL, you can just use a map of Stmts and DenseMap::find(). That will still tell you if a value is present or not, and lets you insert the value efficiently if it isn't. It's a little clunkier code-wise, but it saves a bool in every value.

Sure, makes sense.

> 
> 
>> +  if (D->getIdentifier() == 0)
>> +    return 0;
>> +
>> +  StringRef Name = D->getName();
>> +  if (Name.empty())
>> +    return 0;
> 
> This check isn't really necessary. Any function with a simple identifier name will actually have a non-empty name. Even if an empty function name happens to slip past, it still won't match in the StringSwitch.

Right.  This was code I added late last night before I added the preceding code, which was necessary when I hit a crash.

> 
> 
>> +  FunctionFarmer FF =
>> +    llvm::StringSwitch<FunctionFarmer>(Name)
>> +      .Case("dispatch_sync", create_dispatch_sync)
>> +      .Default(NULL);
>> +  
>> +  if (FF) {
>> +    Val = FF(C, D);
>> +  }
> 
> Another early return here?
> 
> if (!FF)
>  return 0;

Sure.

> 
> 
>> Modified: cfe/trunk/lib/StaticAnalyzer/Core/BugReporter.cpp
>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/StaticAnalyzer/Core/BugReporter.cpp?rev=164339&r1=164338&r2=164339&view=diff
>> ==============================================================================
>> --- cfe/trunk/lib/StaticAnalyzer/Core/BugReporter.cpp (original)
>> +++ cfe/trunk/lib/StaticAnalyzer/Core/BugReporter.cpp Thu Sep 20 19:09:11 2012
>> @@ -121,7 +121,8 @@
>> /// Recursively scan through a path and prune out calls and macros pieces
>> /// that aren't needed.  Return true if afterwards the path contains
>> /// "interesting stuff" which means it should be pruned from the parent path.
>> -bool BugReporter::RemoveUneededCalls(PathPieces &pieces, BugReport *R) {
>> +bool BugReporter::RemoveUneededCalls(PathPieces &pieces, BugReport *R,
>> +                                     PathDiagnosticCallPiece *CallWithLoc) {
>>  bool containsSomethingInteresting = false;
>>  const unsigned N = pieces.size();
>> 
>> @@ -131,6 +132,11 @@
>>    IntrusiveRefCntPtr<PathDiagnosticPiece> piece(pieces.front());
>>    pieces.pop_front();
>> 
>> +    // Throw away pieces with invalid locations.
>> +    if (piece->getKind() != PathDiagnosticPiece::Call &&
>> +        piece->getLocation().asLocation().isInvalid())
>> +      continue;
> 
> This .asLocation().isInvalid() would be nice to have in a helper method. This isn't the last time you use it.

Right now 'isValid()' means something different for PathDiagnosticLocations, but I agree that a helper would be useful.

> 
> 
>>    switch (piece->getKind()) {
>>      case PathDiagnosticPiece::Call: {
>>        PathDiagnosticCallPiece *call = cast<PathDiagnosticCallPiece>(piece);
>> @@ -142,8 +148,17 @@
>>        }
>>        // Recursively clean out the subclass.  Keep this call around if
>>        // it contains any informative diagnostics.
>> -        if (!RemoveUneededCalls(call->path, R))
>> +        PathDiagnosticCallPiece *NewCallWithLoc =
>> +          call->getLocation().asLocation().isValid()
>> +            ? call : CallWithLoc;
> 
> I think there should actually be an assertion here -- only inlined functions can be synthesized, so there will always be a non-synthesized call with a valid location.

Not true.  The synthesized function itself can call other functions.  That call doesn't have a valid location.

> 
> 
>> +        if (!RemoveUneededCalls(call->path, R, NewCallWithLoc))
>>          continue;
>> +
>> +        if (NewCallWithLoc == CallWithLoc && CallWithLoc) {
>> +          call->callEnter = CallWithLoc->callEnter;
>> +        }
> 
> Extra braces? Also, when does callExit get set?

Good point.  We'll need a way to propagate back the call exit location from the bottom.

> 
> 
>> Modified: cfe/trunk/test/Analysis/unix-fns.c
>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Analysis/unix-fns.c?rev=164339&r1=164338&r2=164339&view=diff
>> ==============================================================================
>> --- cfe/trunk/test/Analysis/unix-fns.c (original)
>> +++ cfe/trunk/test/Analysis/unix-fns.c Thu Sep 20 19:09:11 2012
>> @@ -1,4 +1,5 @@
>> -// RUN: %clang_cc1 -triple x86_64-apple-darwin10 -analyze -analyzer-checker=unix.API,osx.API %s -analyzer-store=region -fblocks -verify
>> +// RUN: %clang_cc1 -triple x86_64-apple-darwin10 -analyze -analyzer-checker=core,unix.API,osx.API %s -analyzer-store=region -analyzer-output=plist -analyzer-ipa=inlining -analyzer-eagerly-assume -analyzer-config faux-bodies=true -fblocks -verify -o %t.plist
>> +// RUN: FileCheck --input-file=%t.plist %s 
> 
> Please add an -analyzer-output=text version of this test, and use that for -verify. I know it's imprecise, but it's easier to tell what changes when we change things.

Sure.  I'm not convinced it really adds any value.  I find the plist output fairly easy to read, and I can tell exactly where the arrows are.

> 
> 
>> struct _opaque_pthread_once_t {
>>  long __sig;
>> @@ -15,9 +16,27 @@
>> void *alloca(size_t);
>> void *valloc(size_t);
>> 
>> +typedef union {
>> + struct _os_object_s *_os_obj;
>> + struct dispatch_object_s *_do;
>> + struct dispatch_continuation_s *_dc;
>> + struct dispatch_queue_s *_dq;
>> + struct dispatch_queue_attr_s *_dqa;
>> + struct dispatch_group_s *_dg;
>> + struct dispatch_source_s *_ds;
>> + struct dispatch_source_attr_s *_dsa;
>> + struct dispatch_semaphore_s *_dsema;
>> + struct dispatch_data_s *_ddata;
>> + struct dispatch_io_s *_dchannel;
>> + struct dispatch_operation_s *_doperation;
>> + struct dispatch_disk_s *_ddisk;
>> +} dispatch_object_t __attribute__((__transparent_union__));
>> +
>> typedef void (^dispatch_block_t)(void);
>> typedef long dispatch_once_t;
>> +typedef struct dispatch_queue_s *dispatch_queue_t;
>> void dispatch_once(dispatch_once_t *predicate, dispatch_block_t block);
>> +void dispatch_sync(dispatch_queue_t queue, dispatch_block_t block);
>> 
>> #ifndef O_CREAT
>> #define O_CREAT 0x0200
>> @@ -151,3 +170,1419 @@
>>  dispatch_once_t pred = 0;
>>  dispatch_once(&pred, ^(){});  // expected-warning {{Call to 'dispatch_once' uses the local variable 'pred' for the predicate value}}
>> }
>> +
>> +// Test inlining of dispatch_sync.
>> +void test_dispatch_sync(dispatch_queue_t queue, int *q) {
>> +  int *p = 0;
>> +  dispatch_sync(queue, ^(void){ 
>> +	  if (q) {
>> +		*p = 1; // expected-warning {{null pointer}}
>> +	   }
>> +  });
>> +}
> 
> Very cool. :-)

Thanks for the review.