[PATCH][Review request] unix.Malloc checker improvement: +handling new/delete, +memory.MismatchedFree, +memory.MismatchedDelete, +improved display names for allocators/deallocators

Anton Yartsev anton.yartsev at gmail.com
Wed Feb 27 03:59:29 PST 2013


On 26.02.2013 6:31, Jordan Rose wrote:
> Okay, starting over. Anna went and explained to me that we really do want to be planning for the future case, where we allow people to specify their own custom allocation families, and that we should start doing that now. (Sorry for sending you in the wrong direction.) The design we hashed out is similar to what you came up with, but with a change in focus towards what we're calling "families", which may have multiple allocators and deallocators.
>
> class RefState {
>    const Stmt *S;
>    unsigned State : 2; // Kind enum, but stored as a bitfield
>    unsigned Family : 30; // Rest of 32-bit word, currently just an ID
>
>    // ...
> };
>
> enum AllocationFamilies {
>    AF_None,
>    AF_Malloc,
>    AF_CXXNew,
>    AF_CXXNewArray,
>    AF_FirstUserFamily
> };
>
> That's all for now, but if/when we support user annotations for allocations/deallocations, we can add a table to MallocChecker (not globally):
>
> mutable llvm::SmallDenseMap<IdentifierInfo *, > FamilyIDTable;
> mutable unsigned NextUserFamily = AF_FirstUserFamily;
>
> and prepopulate it with any families we want, such as, say, "malloc". (See the old malloc-annotations test case.) When we come across a family we haven't seen before, we add it to the table and increment the "NextUserFamily" counter. (We don't need to design most of this now.)
>
> This does have the downside that you pointed out: in the error message at the end, we can't say "memory was allocated by malloc()" if malloc is called indirectly. Anna convinced me that this is acceptable, especially if we do rephrase the diagnostic when we don't have a good message to print, because we'll still have the path note that shows where the allocation happened. We can still rederive the allocator name from the source statement.
>
> We'll also have trouble printing out an archetypical allocator or deallocator in the "user-defined family" case, but we can design that later. Anna and I tossed around a few ideas but didn't settle on anything.
>
> On the other hand, this keeps RefState a little more compact, and avoids string-based lookups for function names when many functions don't have names.
>
>
> Other issues: I now understand why you decided to check new/delete within classes; excluding all cases with placement args is probably a good balance. (LLVM uses placement-arg-based allocators for POD types on a BumpPtrAllocator, for example.) Sorry for not getting it sooner.
>
>>> Stepping back, there are many problems with this, the main one being that just keeping track of the function name isn't really good enough. MacOSKeychainAPIChecker can get away with it because its functions are
>> Could you, please, send the remainder of the sentence? Tried to guess, but failed.
>
> Since I have been convinced otherwise, this is less relevant, but I think I was saying that MacOSKeychainAPIChecker can get away with this because its functions are all simple C functions in the global namespace, whereas a general alloc/dealloc checker needs to handle functions in namespaces, C++ instance methods, ObjC methods...all sorts of things where a simple name can't disambiguate.
>
>> Returning DK_free for Objective-C messages just mean that they belong to the 'malloc/free' family. Consider the following example:
>> void test() {
>>    int *p = (int *)malloc(sizeof(int));
>>    [NSData dataWithBytesNoCopy:bytes length:sizeof(int)];
>> }
> Ha, I forgot about free-when-done; carry on. Not entirely happy with assuming all of ObjC belongs to malloc, but we can revisit this when we have other interesting families.
>
>
> I'm deliberately not going to comment on your latest patch because I'd rather have your feedback on this design. What do you think?
> Jordan
Families are just what I thought about, but postponed the refactoring.

I am slightly suspicious of bit-fields as too much is 
implementation-defined (again, when I used the enum type 'Kind' instead 
of the 'unsigned' in 'unsigned State : 2' the State failed to hold enum 
values for some reason. Haven't dig this, maybe the case is related to 
"A bit-field shall have integral or enumeration type (3.9.1). It is 
implementation-defined whether a plain (neither explicitly signed nor 
unsigned) char, short, int, long, or long long bit-field is signed or 
unsigned."), but this approach is much more convenient and readable.

The current solution handles the majority of cases. It is also a good 
step to approach with a dynamic table that will eliminate the downsides 
of a current approach and enable support of user-defined families.
I am for getting this in and moving towards the approach with dynamic 
tables as a next step.

Fresh patch is attached.

-- 
Anton

-------------- next part --------------
A non-text attachment was scrubbed...
Name: MallocChecker_v6.patch
Type: text/x-diff
Size: 31446 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130227/4be3e9d2/attachment.patch>


More information about the cfe-commits mailing list