[PATCH][Review request] unix.Malloc checker improvement: +handling new/delete, +memory.MismatchedFree, +memory.MismatchedDelete, +improved display names for allocators/deallocators

Jordan Rose jordan_rose at apple.com
Mon Feb 25 18:31:32 PST 2013


Okay, starting over. Anna went and explained to me that we really do want to be planning for the future case, where we allow people to specify their own custom allocation families, and that we should start doing that now. (Sorry for sending you in the wrong direction.) The design we hashed out is similar to what you came up with, but with a change in focus towards what we're calling "families", which may have multiple allocators and deallocators.

class RefState {
  const Stmt *S;
  unsigned State : 2; // Kind enum, but stored as a bitfield
  unsigned Family : 30; // Rest of 32-bit word, currently just an ID

  // ...
};

enum AllocationFamilies {
  AF_None,
  AF_Malloc,
  AF_CXXNew,
  AF_CXXNewArray,
  AF_FirstUserFamily
};

That's all for now, but if/when we support user annotations for allocations/deallocations, we can add a table to MallocChecker (not globally):

mutable llvm::SmallDenseMap<IdentifierInfo *, > FamilyIDTable;
mutable unsigned NextUserFamily = AF_FirstUserFamily;

and prepopulate it with any families we want, such as, say, "malloc". (See the old malloc-annotations test case.) When we come across a family we haven't seen before, we add it to the table and increment the "NextUserFamily" counter. (We don't need to design most of this now.)

This does have the downside that you pointed out: in the error message at the end, we can't say "memory was allocated by malloc()" if malloc is called indirectly. Anna convinced me that this is acceptable, especially if we do rephrase the diagnostic when we don't have a good message to print, because we'll still have the path note that shows where the allocation happened. We can still rederive the allocator name from the source statement.

We'll also have trouble printing out an archetypical allocator or deallocator in the "user-defined family" case, but we can design that later. Anna and I tossed around a few ideas but didn't settle on anything.

On the other hand, this keeps RefState a little more compact, and avoids string-based lookups for function names when many functions don't have names.


Other issues: I now understand why you decided to check new/delete within classes; excluding all cases with placement args is probably a good balance. (LLVM uses placement-arg-based allocators for POD types on a BumpPtrAllocator, for example.) Sorry for not getting it sooner.

>> Stepping back, there are many problems with this, the main one being that just keeping track of the function name isn't really good enough. MacOSKeychainAPIChecker can get away with it because its functions are 
> 
> Could you, please, send the remainder of the sentence? Tried to guess, but failed.


Since I have been convinced otherwise, this is less relevant, but I think I was saying that MacOSKeychainAPIChecker can get away with this because its functions are all simple C functions in the global namespace, whereas a general alloc/dealloc checker needs to handle functions in namespaces, C++ instance methods, ObjC methods...all sorts of things where a simple name can't disambiguate.

> Returning DK_free for Objective-C messages just mean that they belong to the 'malloc/free' family. Consider the following example:
> void test() {
>   int *p = (int *)malloc(sizeof(int));
>   [NSData dataWithBytesNoCopy:bytes length:sizeof(int)];
> }

Ha, I forgot about free-when-done; carry on. Not entirely happy with assuming all of ObjC belongs to malloc, but we can revisit this when we have other interesting families.


I'm deliberately not going to comment on your latest patch because I'd rather have your feedback on this design. What do you think?
Jordan



More information about the cfe-commits mailing list