<html>
  <head>
    <meta content="text/html; charset=windows-1252"
      http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">On 23.02.2013 6:47, Jordan Rose wrote:<br>
    </div>
    <blockquote
      cite="mid:1BCDB3E6-F065-458F-9A83-E93381CD9BFC@apple.com"
      type="cite"><br>
      <div>
        <div>On Feb 22, 2013, at 7:24 , Anton Yartsev <<a
            moz-do-not-send="true" href="mailto:anton.yartsev@gmail.com">anton.yartsev@gmail.com</a>>
          wrote:</div>
        <br class="Apple-interchange-newline">
        <blockquote type="cite">
          <div text="#000000" bgcolor="#FFFFFF">
            <div class="moz-cite-prefix">On 22.02.2013 9:30, Anna Zaks
              wrote:<br>
            </div>
            <blockquote
              cite="mid:254908E4-011B-4100-BE1A-48A980CDD204@apple.com"
              type="cite"> <br>
              <div>
                <div>On Feb 21, 2013, at 6:26 PM, Anna Zaks <<a
                    moz-do-not-send="true" href="mailto:ganna@apple.com">ganna@apple.com</a>>

                  wrote:</div>
                <br class="Apple-interchange-newline">
                <blockquote type="cite">
                  <div style="word-wrap: break-word; -webkit-nbsp-mode:
                    space; -webkit-line-break: after-white-space;"><br>
                    <div>
                      <div>On Feb 21, 2013, at 6:00 PM, Anton Yartsev
                        <<a moz-do-not-send="true"
                          href="mailto:anton.yartsev@gmail.com">anton.yartsev@gmail.com</a>>

                        wrote:</div>
                      <br class="Apple-interchange-newline">
                      <blockquote type="cite">
                        <div text="#000000" bgcolor="#FFFFFF">
                          <blockquote
                            cite="mid:2CAEFC0E-8B7A-4301-9143-CC1EC37A582E@apple.com"
                            type="cite"> <br>
                            <div>
                              <div>On Feb 19, 2013, at 10:18 PM, Anton
                                Yartsev <<a moz-do-not-send="true"
                                  href="mailto:anton.yartsev@gmail.com">anton.yartsev@gmail.com</a>>


                                wrote:</div>
                              <br class="Apple-interchange-newline">
                              <blockquote type="cite">Hi, Jordan. Thanx
                                for the review!<br>
                                <br>
                                Attached is the new version of the patch
                                with all the comments addressed. Also
                                added support for directly called
                                operator new()/new[]() and operator
                                delete()<br>
                                <br>
                                There is currently one problem with
                                handling of operator delete(). The
                                following test<br>
                                <br>
                                void testDeleteOp1() {<br>
                                 int *p = (int *)malloc(sizeof(int));<br>
                                 operator delete(p); // FIXME: should
                                complain "Argument to operator delete()
                                was allocated by malloc(), not operator
                                new"<br>
                                }<br>
                                <br>
                                produce no warnings as attached RefState
                                seem to be missing at the point when
                                checkPostStmt(const CallExpr *CE,
                                CheckerContext &C) callback is
                                called for operator delete(p).<br>
                                I haven't investigated the problem
                                deeply yet, intend to address it
                                parallel with the review.<br>
                                <br>
                                <blockquote type="cite">+  if
                                  (NE->getNumPlacementArgs())<br>
                                  +    return;<br>
                                  +  // skip placement new operators as
                                  they may not allocate memory<br>
                                  <br>
                                  Two comments here:<br>
                                  - Please make sure all comments are
                                  complete, capitalized, and punctuated
                                  sentences. (This has the important
                                  one, "complete"....just missing
                                  capitalization and punctuation.)<br>
                                </blockquote>
                                I'll try. Unfortunately I am not as good
                                in English as I want to be, so sorry for
                                my grammar, syntax, and punctuation.<br>
                                <br>
                                -- <br>
                                Anton<br>
                                <br>
                                <span><MallocChecker_v2.patch></span></blockquote>
                              <div><br>
                              </div>
                            </div>
                          </blockquote>
                          <br>
                          Hi Anna. Thanks for your comments! Attached is
                          the new patch.<br>
                          <br>
                          <blockquote
                            cite="mid:2CAEFC0E-8B7A-4301-9143-CC1EC37A582E@apple.com"
                            type="cite">
                            <div>Just adding another kind as extra
                              enumeration values does not seem right.
                              Another option is to make Kind be a
                              pointer to a static array, which contains
                              objects recording all necessary info about
                              each kind (<span style="color: rgb(0, 97,
                                65); font-family: Monaco; font-size:
                                11px; ">MacOSKeychainAPIChecker</span> uses


                              this approach). This is probably an
                              overkill for now, but is another option.</div>
                          </blockquote>
                          Not sure that I have got an idea.<br>
                          The memory and deallocator kind are both set
                          for a RefState. Do you mean creating a static
                          array with 'memory kinds' * 'deallocator kind'
                          elements for all possible combinations? Also
                          there is no necessary info other then the kind
                          itself.<br>
                          Left for now.<br>
                        </div>
                      </blockquote>
                      <div><br>
                      </div>
                      <div>I think of ToBeReleasedWith* as being
                        different types of Allocate; I don't think they
                        should be separate values in the same enum. It's
                        also unfortunate to have to copy the constant
                        values in both places - DeallocatorKind and
                        RefState::Kind. Maybe you could restructure this
                        similarly to how this is done in SVals.h?<br>
                        <blockquote type="cite">
                          <div text="#000000" bgcolor="#FFFFFF"><br>
                            <blockquote
                              cite="mid:2CAEFC0E-8B7A-4301-9143-CC1EC37A582E@apple.com"
                              type="cite">
                              <div>+  const FunctionDecl
                                *getCalleeDecl() const { return
                                CalleeDecl; }</div>
                              <div>Do we only store the name of the
                                allocator declaration here? </div>
                            </blockquote>
                          </div>
                        </blockquote>
                        <div><br>
                        </div>
                        <div>If the Decl is always an allocator Decl, we
                          should change the name of the method to be
                          more descriptive.</div>
                      </div>
                    </div>
                  </div>
                </blockquote>
              </div>
            </blockquote>
            <blockquote
              cite="mid:254908E4-011B-4100-BE1A-48A980CDD204@apple.com"
              type="cite">
              <div>
                <blockquote type="cite">
                  <div style="word-wrap: break-word; -webkit-nbsp-mode:
                    space; -webkit-line-break: after-white-space;">
                    <div>
                      <blockquote type="cite">
                        <div text="#000000" bgcolor="#FFFFFF">
                          <blockquote
                            cite="mid:2CAEFC0E-8B7A-4301-9143-CC1EC37A582E@apple.com"
                            type="cite">
                            <div>Do we need to store this in the state?
                              (Growing states implies memory overhead.)
                              Can't this be easily implied from the
                              kind?</div>
                          </blockquote>
                          Kind can't give us information about the name
                          of an allocator that can be malloc(),
                          realloc(), a user function with
                          ownership_takes attribute, etc.</div>
                      </blockquote>
                      <blockquote type="cite">
                        <div text="#000000" bgcolor="#FFFFFF"> One
                          solution to avoid keeping a CalleeDecl in
                          RefState is to rollback to
                          CallExpr::getDirectCallee() from
                          CheckerContext::getCalleeDec() and to print
                          "malloc()" in case of indirect calls.<br>
                        </div>
                      </blockquote>
                      <div><br>
                      </div>
                      Ok, I see.<br>
                    </div>
                  </div>
                </blockquote>
                <div><br>
                </div>
                After thinking about it some more, I do not think we
                should add an extra pointer to the state to
                differentiate between few allocator functions. In the
                case when we do not have ownership attributes, we only
                have few possible allocators, whose names we know ahead
                of time. In case we support ownership attributes, we are
                likely to have few allocator functions whose names we
                could just store in the checker state on the first
                encounter (like we store the IdentifierInfo).  </div>
              <div><br>
              </div>
              <div>In addition, we could change the ownership attributes
                in such a way that each allocator would have a
                corresponding deallocator; for example, if we wanted to
                check matching allocators and deallocators. Annotated
                deallocators won't necessarily be one of the functions
                you know at compile time, so the DeallocatorKind enum
                would not cover it. I think, it's best if we kept a
                table on a side that would store this info (allocation
                function name, deallocator) and refer to the entries in
                the table from the state. (Take a look at
                MacOSKeychainAPIChecker - it's very similar to malloc,
                but it handles different allocator/deallocator pairs. I
                think a similar solution could work in this case as
                well. Other solutions that address these issues are
                welcome as well!)</div>
            </blockquote>
            Attached is the patch that uses approach with a dynamic
            table that holds both allocator name and expected
            deallocator kind. This approach allows to keep any allocator
            names, either known or new ones. The table could be easily
            extended to hold additional data such as info about special
            deallocators, etc.<br>
          </div>
        </blockquote>
      </div>
      <br>
      <div>Please do not do this. The analyzer is not thread-safe today,
        but that's no reason why we should make it harder to make it
        thread-safe tomorrow.</div>
      <div><br>
      </div>
      <div>Stepping back, there are many problems with this, the main
        one being that just keeping track of the function name isn't
        really good enough. MacOSKeychainAPIChecker can get away with it
        because its functions are <br>
      </div>
    </blockquote>
    Could you, please, send the remainder of the sentence? Tried to
    guess, but failed.<br>
    <br>
    <blockquote
      cite="mid:1BCDB3E6-F065-458F-9A83-E93381CD9BFC@apple.com"
      type="cite">
      <div><br>
      </div>
      <div>I think all Anna meant is to have a <i>static</i> table,
        like MacOSKeychainAPIChecker. If/when we support allocator
        families defined at runtime, we can do this properly, with a
        mutable table as a field of the checker, but for now this is
        both overkill and error-prone.</div>
    </blockquote>
    <blockquote
      cite="mid:1BCDB3E6-F065-458F-9A83-E93381CD9BFC@apple.com"
      type="cite">
      <div>Separately, I don't understand why you chose to <i>inherit</i> from

        SmallVector, rather than just using a SmallVector member.
        AllocDeallocDataVec isn't conceptually a vector, it's a type
        that maps family IDs to names and DeallocatorKinds.</div>
      <div><br>
      </div>
      <div><br>
      </div>
      <div>Other comments:</div>
      <div><br>
      </div>
      <div>
        <div>+enum DeallocatorKind {</div>
        <div>+  D_free,</div>
        <div>+  D_delete,</div>
        <div>+  D_deleteArray,</div>
        <div>+  D_unknown</div>
        <div>+};</div>
      </div>
      <div><br>
      </div>
      <div>Nitpicking: single-character enum prefixes feel strange to
        me. DK_?</div>
      <div><br>
      </div>
      <div><br>
      </div>
      <div>
        <div>-  enum Kind { // Reference to allocated memory.</div>
        <div>-              Allocated,</div>
        <div>-              // Reference to released/freed memory.</div>
        <div>-              Released,</div>
        <div>-              // The responsibility for freeing resources
          has transfered from</div>
        <div>-              // this reference. A relinquished symbol
          should not be freed.</div>
        <div>-              Relinquished } K;</div>
        <div>+  // First two bits of Kind represent memory kind</div>
        <div>+  static const unsigned K_MASK = 0x3;</div>
        <div>+  // The rest bits keep an index to the AllocDeallocData</div>
        <div>+  // that hold information about the allocator and
          deallocator functions</div>
        <div>+  static const unsigned I_BASE = 2;</div>
        <div>+  static const unsigned I_MASK = ~0 << I_BASE;</div>
        <div>+  // A storage for memory kind and index to the
          AllocDeallocData</div>
        <div>+  enum KindPlusData { // Reference to allocated memory.</div>
        <div>+                     Allocated,</div>
        <div>+                     // Reference to released/freed
          memory.</div>
        <div>+                     Released,</div>
        <div>+                     // The responsibility for freeing
          resources has transfered</div>
        <div>+                     // from this reference. A
          relinquished symbol should not be</div>
        <div>+                     // freed.</div>
        <div>+                     Relinquished</div>
        <div>+  } K;</div>
        <div>   const Stmt *S;</div>
      </div>
      <div><br>
      </div>
      <div>This should be implemented using bitfields, and the bitfields
        should go after the Stmt pointer so that the total object size
        is smaller.</div>
    </blockquote>
    <blockquote
      cite="mid:1BCDB3E6-F065-458F-9A83-E93381CD9BFC@apple.com"
      type="cite">
      <div><br>
      </div>
      <div><br>
      </div>
      <div>
        <div>
          <div>+  /// Auxiliary functions that return kind and print
            names of </div>
          <div>+  /// allocators/deallocators</div>
          <div>+  DeallocatorKind GetDeallocKind(CheckerContext &C,
            const Expr *E) const;</div>
          <div>+  void PrintAllocDeallocName(raw_ostream &os,
            CheckerContext &C, </div>
          <div>+                             const Expr *E) const;</div>
          <div>+  void PrintAllocDeallocName(raw_ostream &os,
            CheckerContext &C, </div>
          <div>+                             const RefState *RS) const;</div>
          <div>+  void PrintExpectedAllocName(raw_ostream &os,
            CheckerContext &C, </div>
          <div>+                              const Expr *DeallocExpr)
            const;</div>
        </div>
      </div>
      <div><br>
      </div>
      <div>Hm. Doxygen will attach the comment to the first declaration
        and ignore the others. You can use <a moz-do-not-send="true"
href="http://www.stack.nl/%7Edimitri/doxygen/manual/grouping.html#memgroup">groups</a> to
        sort of get the effect you want, but it might be worth just
        being clearer anyway.</div>
      <div><br>
      </div>
      <div>Also, the convention for function/method names in LLVM is now
        lowerCamelCase.</div>
      <div><br>
      </div>
      <div>Thank you for switching to raw_ostream. :-)</div>
      <div><br>
      </div>
      <div><br>
      </div>
      <div>
        <div>+  bool isDefaultNonPtrPlacementNewDelete(const
          FunctionDecl *FD,</div>
        <div>+                                         CheckerContext
          &C) const;</div>
      </div>
      <div><br>
      </div>
      <div>Very confusing method name. isStandardNewDelete is probably
        fine. There's nothing about "pointers" or "placement" that's
        strange here -- what you want to know is that this is the global
        declaration of operator new, and that it's not provided by the
        user.</div>
      <div><br>
      </div>
      <div><br>
      </div>
      <div>
        <div>+  if (FD->isReservedGlobalPlacementOperator())</div>
      </div>
      <div>
        <div>+    return false;</div>
      </div>
      <div><br>
      </div>
      <div>This is <i>still</i> the wrong check. You need to check if
        there are <i>any</i> placement arguments, and you need to check
        that the Decl is not part of a class. Both are easy:</div>
    </blockquote>
    This check is done to throw out standard placement new/deletes with
    an additional pointer parameter but to handle standard placement
    nothrow new/deletes. <br>
    As an option we can detect nothrow new/deletes and skip all other
    placement versions.<br>
    <br>
    <blockquote
      cite="mid:1BCDB3E6-F065-458F-9A83-E93381CD9BFC@apple.com"
      type="cite">
      <div>if (FD->getNumParams() != 1 || FD->isVariadic())</div>
      <div><span class="Apple-tab-span" style="white-space:pre"> </span>return
        false;</div>
      <div>if (isa<CXXMethodDecl>(FD))</div>
      <div><span class="Apple-tab-span" style="white-space:pre"> </span>return
        false;</div>
      <div><br>
      </div>
      <div><br>
      </div>
      <div>
        <div>+      // Process direct calls to operator
          new/new[]/delete/delete[] functions</div>
        <div>+      // as distinct from new/new[]/delete/delete[]
          expressions that are </div>
        <div>+      // processed by "checkPostStmt(const CXXNewExpr *"
          and </div>
        <div>+      // "checkPostStmt(const CXXDeleteExpr *"
          respectively</div>
      </div>
      <div><br>
      </div>
      <div>Missing a period, and also looks rather odd with the
        unbalanced parens. Maybe something like "...processed by the
        checkPostStmt callbacks for CXXNewExpr and CXXDeleteExpr"? Also,
        you can't really use "respectively" here because you're matching
        two items up with four items, although it is pretty clear what
        you mean.</div>
      <div><br>
      </div>
      <div>Good thought here, though—I wouldn't have remembered to do
        this. Hopefully someday it will be unimportant anyway when we
        use CallEvent to model the allocator and deallocator calls
        inside CXXNewExpr and CXXDeleteExpr.</div>
      <div><br>
      </div>
      <div><br>
      </div>
      <div>
        <div>+        assert(0 && "not a new/delete oparator");</div>
      </div>
      <div><br>
      </div>
      <div>This is better written llvm_unreachable("not a new/delete
        operator");</div>
      <div><br>
      </div>
      <div><br>
      </div>
      <div>
        <div>+  // the return value from operator new is already bound
          and we don't want to </div>
        <div>+  // break this binding so we call MallocUpdateRefState
          instead of MallocMemAux</div>
      </div>
      <div><br>
      </div>
      <div>Capitalization, period.</div>
      <div><br>
      </div>
      <div><br>
      </div>
      <div>
        <div>+  StringRef Name = "";</div>
        <div>+  if (const CallExpr *CE = dyn_cast<CallExpr>(E)) {</div>
        <div>+    if (const FunctionDecl *FD = C.getCalleeDecl(CE)) {</div>
        <div>+      if (FD->getDeclName().isIdentifier()) {</div>
        <div>+        Name = FD->getName();</div>
        <div>+      }</div>
        <div>+    }</div>
        <div>+  }</div>
        <div>+</div>
        <div>+  unsigned Idx = AllocDeallocData.GetOrInsert(Name,
          dKind);</div>
        <div>   // Set the symbol's state to Allocated.</div>
        <div>-  return state->set<RegionState>(Sym,
          RefState::getAllocated(CE));</div>
        <div>-</div>
        <div>+  return state->set<RegionState>(Sym,
          RefState::getAllocated(E, Idx));</div>
      </div>
      <div><br>
      </div>
      <div>Yeah, honestly it makes a lot more sense to me to just store
        the dKind in the RefState, and lazily derive the name when we
        need to print it at the end. (And not do it so specifically.)</div>
    </blockquote>
    <blockquote
      cite="mid:1BCDB3E6-F065-458F-9A83-E93381CD9BFC@apple.com"
      type="cite">
      <div><br>
      </div>
      <div><br>
      </div>
      <div>
        <div>+  if (const CXXDeleteExpr *DE =
          dyn_cast_or_null<CXXDeleteExpr>(E))</div>
        <div>+    return DE->isArrayForm() ? D_deleteArray :
          D_delete;</div>
      </div>
      <div><br>
      </div>
      <div>You already tested for null, so you can just use dyn_cast
        here.</div>
      <div><br>
      </div>
      <div><br>
      </div>
      <div>
        <div>+  if (isa<ObjCMessageExpr>(E))</div>
        <div>+    return D_free;</div>
      </div>
      <div><br>
      </div>
      <div>There are no Objective-C messages that free memory
        immediately; just take this out.</div>
    </blockquote>
    Should be fixed somehow for situations when an ObjCMessageExpr is
    passed as a deallocator to FreeMemAux(). Just removing the above
    code will cause GetDeallocKind() to return D_unknown for an
    ObjCMessageExpr and we end up with assert(0 && "unknown or
    unhandled DeallocatorKind"); triggering in
    PrintExpectedAllocName().  malloc.mm test contain testcases
    illustrating this.<br>
    <br>
    <blockquote
      cite="mid:1BCDB3E6-F065-458F-9A83-E93381CD9BFC@apple.com"
      type="cite">
      <div><br>
      </div>
      <div><br>
      </div>
      <div>
        <div>+  // get the exact name of an allocation function</div>
      </div>
      <div><br>
      </div>
      <div>Capitalization, spelling.</div>
      <div><br>
      </div>
      <div><br>
      </div>
      <div>
        <div>+      if (FD->getKind() == Decl::Function) {</div>
      </div>
      <div><br>
      </div>
      <div>Please don't do this; it rules out C++ methods. Actually, how
        about this, for the whole method?</div>
      <div><br>
      </div>
      <div>
        <div><font face="Courier">bool
            MallocChecker::PrintAllocDeallocName(raw_ostream &os,
            CheckerContext &C, </font></div>
        <div><font face="Courier">                                     
                const Expr *E) const {</font></div>
        <div><font face="Courier">  if (const CallExpr *CE =
            dyn_cast<CallExpr>(E)) {</font></div>
        <div><font face="Courier">    // FIXME: This doesn't handle
            indirect calls.</font></div>
        <div><font face="Courier">    const FunctionDecl *FD =
            CE->getDirectCallee();</font></div>
        <div><font face="Courier">    if (!FD)</font></div>
        <div><font face="Courier">      return false;</font></div>
        <div><font face="Courier">    </font></div>
        <div><font face="Courier">    os << *FD;</font></div>
        <div><font face="Courier">    if
            (!FD->isOverloadedOperator())</font></div>
        <div><font face="Courier">      os << "()";</font></div>
        <div><font face="Courier">    return true;</font></div>
        <div><font face="Courier">  }</font></div>
        <div><font face="Courier"><br>
          </font></div>
        <div><font face="Courier">  if (const ObjCMessageExpr *Msg =
            dyn_cast<ObjCMessageExpr>(E)) {</font></div>
        <div><font face="Courier">    if (Msg->isInstanceMessage())</font></div>
        <div><font face="Courier">      os << "-";</font></div>
        <div><font face="Courier">    else</font></div>
        <div><font face="Courier">      os << "+";</font></div>
        <div><font face="Courier">    os <<
            Msg->getSelector().getAsString();</font></div>
        <div><font face="Courier">    return true;</font></div>
        <div><font face="Courier">  }</font></div>
        <div><font face="Courier"><br>
          </font></div>
        <div><font face="Courier">  if (const CXXNewExpr *NE =
            dyn_cast<CXXNewExpr>(E)) {</font></div>
        <div><font face="Courier">    os <<
            *NE->getOperatorNew();</font></div>
        <div><font face="Courier">    return true;</font></div>
        <div><font face="Courier">  }</font></div>
        <div><font face="Courier"><br>
          </font></div>
        <div><font face="Courier">  if (const CXXDeleteExpr *DE =
            dyn_cast<CXXDeleteExpr>(E)) {</font></div>
        <div><font face="Courier">    os <<
            *DE->getOperatorDelete();</font></div>
        <div><font face="Courier">    return true;</font></div>
        <div><font face="Courier">  }</font></div>
        <div><font face="Courier"><br>
          </font></div>
        <div><font face="Courier">  return false;</font></div>
        <div><font face="Courier">}</font></div>
      </div>
      <div><font face="Courier"><br>
        </font></div>
      <div>This actually seems generally useful and could even go on
        CheckerContext as a convenience helper. Note the bool return, so
        that rather than come up with some placeholder text like
        "unknown means", we can just rephrase the message to not mention
        the allocator.</div>
    </blockquote>
    Great!<br>
    <br>
    <blockquote
      cite="mid:1BCDB3E6-F065-458F-9A83-E93381CD9BFC@apple.com"
      type="cite">
      <div><br>
      </div>
      <div><br>
      </div>
      <div>
        <div>+  switch(dKind) {</div>
        <div>+    case D_free: os << "malloc()"; return;</div>
        <div>+    case D_delete: os << "operator new"; return;</div>
        <div>+    case D_deleteArray: os << "operator new[]";
          return;</div>
        <div>+    case D_unknown:</div>
        <div>+    default: assert(0 && "unknown or unhandled
          DeallocatorKind");</div>
        <div>+  }</div>
      </div>
      <div><br>
      </div>
      <div>LLVM style is to not include default cases for an
        enum-covered switch statement. Also llvm_unreachable again.</div>
      <div><br>
      </div>
      <div>Thanks for your patience in iterating on this!</div>
      <div>Jordan</div>
    </blockquote>
    So we rollback to the approach with a single DeallocatorKind
    implicitly stored inside RefState::Kind as subkinds are stored
    inside SVal::BaseKind in Sval.h, right?<br>
    <br>
    <pre class="moz-signature" cols="72">-- 
Anton</pre>
  </body>
</html>