<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 23.02.2013 6:47, Jordan Rose wrote:<br>
</div>
<blockquote
cite="mid:1BCDB3E6-F065-458F-9A83-E93381CD9BFC@apple.com"
type="cite"><br>
<div>
<div>On Feb 22, 2013, at 7:24 , Anton Yartsev <<a
moz-do-not-send="true" href="mailto:anton.yartsev@gmail.com">anton.yartsev@gmail.com</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<blockquote type="cite">
<div text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 22.02.2013 9:30, Anna Zaks
wrote:<br>
</div>
<blockquote
cite="mid:254908E4-011B-4100-BE1A-48A980CDD204@apple.com"
type="cite"> <br>
<div>
<div>On Feb 21, 2013, at 6:26 PM, Anna Zaks <<a
moz-do-not-send="true" href="mailto:ganna@apple.com">ganna@apple.com</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<blockquote type="cite">
<div style="word-wrap: break-word; -webkit-nbsp-mode:
space; -webkit-line-break: after-white-space;"><br>
<div>
<div>On Feb 21, 2013, at 6:00 PM, Anton Yartsev
<<a moz-do-not-send="true"
href="mailto:anton.yartsev@gmail.com">anton.yartsev@gmail.com</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<blockquote type="cite">
<div text="#000000" bgcolor="#FFFFFF">
<blockquote
cite="mid:2CAEFC0E-8B7A-4301-9143-CC1EC37A582E@apple.com"
type="cite"> <br>
<div>
<div>On Feb 19, 2013, at 10:18 PM, Anton
Yartsev <<a moz-do-not-send="true"
href="mailto:anton.yartsev@gmail.com">anton.yartsev@gmail.com</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<blockquote type="cite">Hi, Jordan. Thanx
for the review!<br>
<br>
Attached is the new version of the patch
with all the comments addressed. Also
added support for directly called
operator new()/new[]() and operator
delete()<br>
<br>
There is currently one problem with
handling of operator delete(). The
following test<br>
<br>
void testDeleteOp1() {<br>
int *p = (int *)malloc(sizeof(int));<br>
operator delete(p); // FIXME: should
complain "Argument to operator delete()
was allocated by malloc(), not operator
new"<br>
}<br>
<br>
produce no warnings as attached RefState
seem to be missing at the point when
checkPostStmt(const CallExpr *CE,
CheckerContext &C) callback is
called for operator delete(p).<br>
I haven't investigated the problem
deeply yet, intend to address it
parallel with the review.<br>
<br>
<blockquote type="cite">+ if
(NE->getNumPlacementArgs())<br>
+ return;<br>
+ // skip placement new operators as
they may not allocate memory<br>
<br>
Two comments here:<br>
- Please make sure all comments are
complete, capitalized, and punctuated
sentences. (This has the important
one, "complete"....just missing
capitalization and punctuation.)<br>
</blockquote>
I'll try. Unfortunately I am not as good
in English as I want to be, so sorry for
my grammar, syntax, and punctuation.<br>
<br>
-- <br>
Anton<br>
<br>
<span><MallocChecker_v2.patch></span></blockquote>
<div><br>
</div>
</div>
</blockquote>
<br>
Hi Anna. Thanks for your comments! Attached is
the new patch.<br>
<br>
<blockquote
cite="mid:2CAEFC0E-8B7A-4301-9143-CC1EC37A582E@apple.com"
type="cite">
<div>Just adding another kind as extra
enumeration values does not seem right.
Another option is to make Kind be a
pointer to a static array, which contains
objects recording all necessary info about
each kind (<span style="color: rgb(0, 97,
65); font-family: Monaco; font-size:
11px; ">MacOSKeychainAPIChecker</span> uses
this approach). This is probably an
overkill for now, but is another option.</div>
</blockquote>
Not sure that I have got an idea.<br>
The memory and deallocator kind are both set
for a RefState. Do you mean creating a static
array with 'memory kinds' * 'deallocator kind'
elements for all possible combinations? Also
there is no necessary info other then the kind
itself.<br>
Left for now.<br>
</div>
</blockquote>
<div><br>
</div>
<div>I think of ToBeReleasedWith* as being
different types of Allocate; I don't think they
should be separate values in the same enum. It's
also unfortunate to have to copy the constant
values in both places - DeallocatorKind and
RefState::Kind. Maybe you could restructure this
similarly to how this is done in SVals.h?<br>
<blockquote type="cite">
<div text="#000000" bgcolor="#FFFFFF"><br>
<blockquote
cite="mid:2CAEFC0E-8B7A-4301-9143-CC1EC37A582E@apple.com"
type="cite">
<div>+ const FunctionDecl
*getCalleeDecl() const { return
CalleeDecl; }</div>
<div>Do we only store the name of the
allocator declaration here? </div>
</blockquote>
</div>
</blockquote>
<div><br>
</div>
<div>If the Decl is always an allocator Decl, we
should change the name of the method to be
more descriptive.</div>
</div>
</div>
</div>
</blockquote>
</div>
</blockquote>
<blockquote
cite="mid:254908E4-011B-4100-BE1A-48A980CDD204@apple.com"
type="cite">
<div>
<blockquote type="cite">
<div style="word-wrap: break-word; -webkit-nbsp-mode:
space; -webkit-line-break: after-white-space;">
<div>
<blockquote type="cite">
<div text="#000000" bgcolor="#FFFFFF">
<blockquote
cite="mid:2CAEFC0E-8B7A-4301-9143-CC1EC37A582E@apple.com"
type="cite">
<div>Do we need to store this in the state?
(Growing states implies memory overhead.)
Can't this be easily implied from the
kind?</div>
</blockquote>
Kind can't give us information about the name
of an allocator that can be malloc(),
realloc(), a user function with
ownership_takes attribute, etc.</div>
</blockquote>
<blockquote type="cite">
<div text="#000000" bgcolor="#FFFFFF"> One
solution to avoid keeping a CalleeDecl in
RefState is to rollback to
CallExpr::getDirectCallee() from
CheckerContext::getCalleeDec() and to print
"malloc()" in case of indirect calls.<br>
</div>
</blockquote>
<div><br>
</div>
Ok, I see.<br>
</div>
</div>
</blockquote>
<div><br>
</div>
After thinking about it some more, I do not think we
should add an extra pointer to the state to
differentiate between few allocator functions. In the
case when we do not have ownership attributes, we only
have few possible allocators, whose names we know ahead
of time. In case we support ownership attributes, we are
likely to have few allocator functions whose names we
could just store in the checker state on the first
encounter (like we store the IdentifierInfo). </div>
<div><br>
</div>
<div>In addition, we could change the ownership attributes
in such a way that each allocator would have a
corresponding deallocator; for example, if we wanted to
check matching allocators and deallocators. Annotated
deallocators won't necessarily be one of the functions
you know at compile time, so the DeallocatorKind enum
would not cover it. I think, it's best if we kept a
table on a side that would store this info (allocation
function name, deallocator) and refer to the entries in
the table from the state. (Take a look at
MacOSKeychainAPIChecker - it's very similar to malloc,
but it handles different allocator/deallocator pairs. I
think a similar solution could work in this case as
well. Other solutions that address these issues are
welcome as well!)</div>
</blockquote>
Attached is the patch that uses approach with a dynamic
table that holds both allocator name and expected
deallocator kind. This approach allows to keep any allocator
names, either known or new ones. The table could be easily
extended to hold additional data such as info about special
deallocators, etc.<br>
</div>
</blockquote>
</div>
<br>
<div>Please do not do this. The analyzer is not thread-safe today,
but that's no reason why we should make it harder to make it
thread-safe tomorrow.</div>
<div><br>
</div>
<div>Stepping back, there are many problems with this, the main
one being that just keeping track of the function name isn't
really good enough. MacOSKeychainAPIChecker can get away with it
because its functions are <br>
</div>
</blockquote>
Could you, please, send the remainder of the sentence? Tried to
guess, but failed.<br>
<br>
<blockquote
cite="mid:1BCDB3E6-F065-458F-9A83-E93381CD9BFC@apple.com"
type="cite">
<div><br>
</div>
<div>I think all Anna meant is to have a <i>static</i> table,
like MacOSKeychainAPIChecker. If/when we support allocator
families defined at runtime, we can do this properly, with a
mutable table as a field of the checker, but for now this is
both overkill and error-prone.</div>
</blockquote>
<blockquote
cite="mid:1BCDB3E6-F065-458F-9A83-E93381CD9BFC@apple.com"
type="cite">
<div>Separately, I don't understand why you chose to <i>inherit</i> from
SmallVector, rather than just using a SmallVector member.
AllocDeallocDataVec isn't conceptually a vector, it's a type
that maps family IDs to names and DeallocatorKinds.</div>
<div><br>
</div>
<div><br>
</div>
<div>Other comments:</div>
<div><br>
</div>
<div>
<div>+enum DeallocatorKind {</div>
<div>+ D_free,</div>
<div>+ D_delete,</div>
<div>+ D_deleteArray,</div>
<div>+ D_unknown</div>
<div>+};</div>
</div>
<div><br>
</div>
<div>Nitpicking: single-character enum prefixes feel strange to
me. DK_?</div>
<div><br>
</div>
<div><br>
</div>
<div>
<div>- enum Kind { // Reference to allocated memory.</div>
<div>- Allocated,</div>
<div>- // Reference to released/freed memory.</div>
<div>- Released,</div>
<div>- // The responsibility for freeing resources
has transfered from</div>
<div>- // this reference. A relinquished symbol
should not be freed.</div>
<div>- Relinquished } K;</div>
<div>+ // First two bits of Kind represent memory kind</div>
<div>+ static const unsigned K_MASK = 0x3;</div>
<div>+ // The rest bits keep an index to the AllocDeallocData</div>
<div>+ // that hold information about the allocator and
deallocator functions</div>
<div>+ static const unsigned I_BASE = 2;</div>
<div>+ static const unsigned I_MASK = ~0 << I_BASE;</div>
<div>+ // A storage for memory kind and index to the
AllocDeallocData</div>
<div>+ enum KindPlusData { // Reference to allocated memory.</div>
<div>+ Allocated,</div>
<div>+ // Reference to released/freed
memory.</div>
<div>+ Released,</div>
<div>+ // The responsibility for freeing
resources has transfered</div>
<div>+ // from this reference. A
relinquished symbol should not be</div>
<div>+ // freed.</div>
<div>+ Relinquished</div>
<div>+ } K;</div>
<div> const Stmt *S;</div>
</div>
<div><br>
</div>
<div>This should be implemented using bitfields, and the bitfields
should go after the Stmt pointer so that the total object size
is smaller.</div>
</blockquote>
<blockquote
cite="mid:1BCDB3E6-F065-458F-9A83-E93381CD9BFC@apple.com"
type="cite">
<div><br>
</div>
<div><br>
</div>
<div>
<div>
<div>+ /// Auxiliary functions that return kind and print
names of </div>
<div>+ /// allocators/deallocators</div>
<div>+ DeallocatorKind GetDeallocKind(CheckerContext &C,
const Expr *E) const;</div>
<div>+ void PrintAllocDeallocName(raw_ostream &os,
CheckerContext &C, </div>
<div>+ const Expr *E) const;</div>
<div>+ void PrintAllocDeallocName(raw_ostream &os,
CheckerContext &C, </div>
<div>+ const RefState *RS) const;</div>
<div>+ void PrintExpectedAllocName(raw_ostream &os,
CheckerContext &C, </div>
<div>+ const Expr *DeallocExpr)
const;</div>
</div>
</div>
<div><br>
</div>
<div>Hm. Doxygen will attach the comment to the first declaration
and ignore the others. You can use <a moz-do-not-send="true"
href="http://www.stack.nl/%7Edimitri/doxygen/manual/grouping.html#memgroup">groups</a> to
sort of get the effect you want, but it might be worth just
being clearer anyway.</div>
<div><br>
</div>
<div>Also, the convention for function/method names in LLVM is now
lowerCamelCase.</div>
<div><br>
</div>
<div>Thank you for switching to raw_ostream. :-)</div>
<div><br>
</div>
<div><br>
</div>
<div>
<div>+ bool isDefaultNonPtrPlacementNewDelete(const
FunctionDecl *FD,</div>
<div>+ CheckerContext
&C) const;</div>
</div>
<div><br>
</div>
<div>Very confusing method name. isStandardNewDelete is probably
fine. There's nothing about "pointers" or "placement" that's
strange here -- what you want to know is that this is the global
declaration of operator new, and that it's not provided by the
user.</div>
<div><br>
</div>
<div><br>
</div>
<div>
<div>+ if (FD->isReservedGlobalPlacementOperator())</div>
</div>
<div>
<div>+ return false;</div>
</div>
<div><br>
</div>
<div>This is <i>still</i> the wrong check. You need to check if
there are <i>any</i> placement arguments, and you need to check
that the Decl is not part of a class. Both are easy:</div>
</blockquote>
This check is done to throw out standard placement new/deletes with
an additional pointer parameter but to handle standard placement
nothrow new/deletes. <br>
As an option we can detect nothrow new/deletes and skip all other
placement versions.<br>
<br>
<blockquote
cite="mid:1BCDB3E6-F065-458F-9A83-E93381CD9BFC@apple.com"
type="cite">
<div>if (FD->getNumParams() != 1 || FD->isVariadic())</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>return
false;</div>
<div>if (isa<CXXMethodDecl>(FD))</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>return
false;</div>
<div><br>
</div>
<div><br>
</div>
<div>
<div>+ // Process direct calls to operator
new/new[]/delete/delete[] functions</div>
<div>+ // as distinct from new/new[]/delete/delete[]
expressions that are </div>
<div>+ // processed by "checkPostStmt(const CXXNewExpr *"
and </div>
<div>+ // "checkPostStmt(const CXXDeleteExpr *"
respectively</div>
</div>
<div><br>
</div>
<div>Missing a period, and also looks rather odd with the
unbalanced parens. Maybe something like "...processed by the
checkPostStmt callbacks for CXXNewExpr and CXXDeleteExpr"? Also,
you can't really use "respectively" here because you're matching
two items up with four items, although it is pretty clear what
you mean.</div>
<div><br>
</div>
<div>Good thought here, though—I wouldn't have remembered to do
this. Hopefully someday it will be unimportant anyway when we
use CallEvent to model the allocator and deallocator calls
inside CXXNewExpr and CXXDeleteExpr.</div>
<div><br>
</div>
<div><br>
</div>
<div>
<div>+ assert(0 && "not a new/delete oparator");</div>
</div>
<div><br>
</div>
<div>This is better written llvm_unreachable("not a new/delete
operator");</div>
<div><br>
</div>
<div><br>
</div>
<div>
<div>+ // the return value from operator new is already bound
and we don't want to </div>
<div>+ // break this binding so we call MallocUpdateRefState
instead of MallocMemAux</div>
</div>
<div><br>
</div>
<div>Capitalization, period.</div>
<div><br>
</div>
<div><br>
</div>
<div>
<div>+ StringRef Name = "";</div>
<div>+ if (const CallExpr *CE = dyn_cast<CallExpr>(E)) {</div>
<div>+ if (const FunctionDecl *FD = C.getCalleeDecl(CE)) {</div>
<div>+ if (FD->getDeclName().isIdentifier()) {</div>
<div>+ Name = FD->getName();</div>
<div>+ }</div>
<div>+ }</div>
<div>+ }</div>
<div>+</div>
<div>+ unsigned Idx = AllocDeallocData.GetOrInsert(Name,
dKind);</div>
<div> // Set the symbol's state to Allocated.</div>
<div>- return state->set<RegionState>(Sym,
RefState::getAllocated(CE));</div>
<div>-</div>
<div>+ return state->set<RegionState>(Sym,
RefState::getAllocated(E, Idx));</div>
</div>
<div><br>
</div>
<div>Yeah, honestly it makes a lot more sense to me to just store
the dKind in the RefState, and lazily derive the name when we
need to print it at the end. (And not do it so specifically.)</div>
</blockquote>
<blockquote
cite="mid:1BCDB3E6-F065-458F-9A83-E93381CD9BFC@apple.com"
type="cite">
<div><br>
</div>
<div><br>
</div>
<div>
<div>+ if (const CXXDeleteExpr *DE =
dyn_cast_or_null<CXXDeleteExpr>(E))</div>
<div>+ return DE->isArrayForm() ? D_deleteArray :
D_delete;</div>
</div>
<div><br>
</div>
<div>You already tested for null, so you can just use dyn_cast
here.</div>
<div><br>
</div>
<div><br>
</div>
<div>
<div>+ if (isa<ObjCMessageExpr>(E))</div>
<div>+ return D_free;</div>
</div>
<div><br>
</div>
<div>There are no Objective-C messages that free memory
immediately; just take this out.</div>
</blockquote>
Should be fixed somehow for situations when an ObjCMessageExpr is
passed as a deallocator to FreeMemAux(). Just removing the above
code will cause GetDeallocKind() to return D_unknown for an
ObjCMessageExpr and we end up with assert(0 && "unknown or
unhandled DeallocatorKind"); triggering in
PrintExpectedAllocName(). malloc.mm test contain testcases
illustrating this.<br>
<br>
<blockquote
cite="mid:1BCDB3E6-F065-458F-9A83-E93381CD9BFC@apple.com"
type="cite">
<div><br>
</div>
<div><br>
</div>
<div>
<div>+ // get the exact name of an allocation function</div>
</div>
<div><br>
</div>
<div>Capitalization, spelling.</div>
<div><br>
</div>
<div><br>
</div>
<div>
<div>+ if (FD->getKind() == Decl::Function) {</div>
</div>
<div><br>
</div>
<div>Please don't do this; it rules out C++ methods. Actually, how
about this, for the whole method?</div>
<div><br>
</div>
<div>
<div><font face="Courier">bool
MallocChecker::PrintAllocDeallocName(raw_ostream &os,
CheckerContext &C, </font></div>
<div><font face="Courier">
const Expr *E) const {</font></div>
<div><font face="Courier"> if (const CallExpr *CE =
dyn_cast<CallExpr>(E)) {</font></div>
<div><font face="Courier"> // FIXME: This doesn't handle
indirect calls.</font></div>
<div><font face="Courier"> const FunctionDecl *FD =
CE->getDirectCallee();</font></div>
<div><font face="Courier"> if (!FD)</font></div>
<div><font face="Courier"> return false;</font></div>
<div><font face="Courier"> </font></div>
<div><font face="Courier"> os << *FD;</font></div>
<div><font face="Courier"> if
(!FD->isOverloadedOperator())</font></div>
<div><font face="Courier"> os << "()";</font></div>
<div><font face="Courier"> return true;</font></div>
<div><font face="Courier"> }</font></div>
<div><font face="Courier"><br>
</font></div>
<div><font face="Courier"> if (const ObjCMessageExpr *Msg =
dyn_cast<ObjCMessageExpr>(E)) {</font></div>
<div><font face="Courier"> if (Msg->isInstanceMessage())</font></div>
<div><font face="Courier"> os << "-";</font></div>
<div><font face="Courier"> else</font></div>
<div><font face="Courier"> os << "+";</font></div>
<div><font face="Courier"> os <<
Msg->getSelector().getAsString();</font></div>
<div><font face="Courier"> return true;</font></div>
<div><font face="Courier"> }</font></div>
<div><font face="Courier"><br>
</font></div>
<div><font face="Courier"> if (const CXXNewExpr *NE =
dyn_cast<CXXNewExpr>(E)) {</font></div>
<div><font face="Courier"> os <<
*NE->getOperatorNew();</font></div>
<div><font face="Courier"> return true;</font></div>
<div><font face="Courier"> }</font></div>
<div><font face="Courier"><br>
</font></div>
<div><font face="Courier"> if (const CXXDeleteExpr *DE =
dyn_cast<CXXDeleteExpr>(E)) {</font></div>
<div><font face="Courier"> os <<
*DE->getOperatorDelete();</font></div>
<div><font face="Courier"> return true;</font></div>
<div><font face="Courier"> }</font></div>
<div><font face="Courier"><br>
</font></div>
<div><font face="Courier"> return false;</font></div>
<div><font face="Courier">}</font></div>
</div>
<div><font face="Courier"><br>
</font></div>
<div>This actually seems generally useful and could even go on
CheckerContext as a convenience helper. Note the bool return, so
that rather than come up with some placeholder text like
"unknown means", we can just rephrase the message to not mention
the allocator.</div>
</blockquote>
Great!<br>
<br>
<blockquote
cite="mid:1BCDB3E6-F065-458F-9A83-E93381CD9BFC@apple.com"
type="cite">
<div><br>
</div>
<div><br>
</div>
<div>
<div>+ switch(dKind) {</div>
<div>+ case D_free: os << "malloc()"; return;</div>
<div>+ case D_delete: os << "operator new"; return;</div>
<div>+ case D_deleteArray: os << "operator new[]";
return;</div>
<div>+ case D_unknown:</div>
<div>+ default: assert(0 && "unknown or unhandled
DeallocatorKind");</div>
<div>+ }</div>
</div>
<div><br>
</div>
<div>LLVM style is to not include default cases for an
enum-covered switch statement. Also llvm_unreachable again.</div>
<div><br>
</div>
<div>Thanks for your patience in iterating on this!</div>
<div>Jordan</div>
</blockquote>
So we rollback to the approach with a single DeallocatorKind
implicitly stored inside RefState::Kind as subkinds are stored
inside SVal::BaseKind in Sval.h, right?<br>
<br>
<pre class="moz-signature" cols="72">--
Anton</pre>
</body>
</html>