<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    This sounds like a facility that can get fairly complicated and will
    never be completely reliable or do exactly what we want. I guess it
    could be taught to handle simple cases, but the old approach is
    never really going away.<br>
    <br>
    Say, if we are to write such prototypes for C++ collection methods,
    we'll probably want to drop template arguments completely because we
    can't list all possible arguments. This basically means that using
    the actual compiler to parse such prototypes will inevitably fail in
    this case, since the stripped-down prototype is no longer valid C++.
    Another recurring problem with C++ is inline namespaces, such as
    the inline namespace __1 that shows up in libc++ method prototypes
    and should be actively ignored by any such system.<br>
    <br>
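    To make the inline namespace point concrete, here is a minimal
    sketch of the kind of normalization a name-based matcher would need
    before comparing qualified names. The helper name
    stripInlineNamespace is hypothetical (it is not an existing
    analyzer API), and the sketch assumes that __1 is the only inline
    namespace we care about:<br>
    <br>
```cpp
#include <string>

// Hypothetical helper, for illustration only: remove every "<Inline>::"
// component from a qualified name so that "std::__1::vector" and
// "std::vector" compare equal.
std::string stripInlineNamespace(std::string Name,
                                 const std::string &Inline = "__1") {
  const std::string Needle = Inline + "::";
  std::string::size_type Pos = 0;
  while ((Pos = Name.find(Needle, Pos)) != std::string::npos) {
    // Only strip a match that starts a name component, so that an
    // identifier merely ending in "__1" is left alone.
    bool AtComponentStart =
        Pos == 0 || (Pos >= 2 && Name.compare(Pos - 2, 2, "::") == 0);
    if (AtComponentStart)
      Name.erase(Pos, Needle.size());
    else
      Pos += Needle.size();
  }
  return Name;
}
```
    String editing like this is only a sketch. A real implementation
    would more likely walk the declaration contexts and skip every
    namespace for which NamespaceDecl::isInline() is true, instead of
    hardcoding __1.<br>
    <br>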
    In C, some standard functions are implemented as macros expanding to
    builtins, and such builtins can potentially have more arguments than
    the function they implement (the extra arguments are filled in
    automatically by the macro).<br>
    <br>
    I think it's better to target only plain C functions with this and
    write a completely dumb custom parser for the prototypes. We should
    probably also drop support for hard-to-parse types like function
    pointers. Anything beyond that sounds questionable to me.<br>
    <br>
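    A minimal sketch of what such a dumb parser could look like, under
    stated assumptions: ParsedPrototype and parsePrototype are
    hypothetical names, types are kept as normalized strings rather than
    being resolved to QualTypes, and a parameter name is recognized only
    when it is a trailing identifier that is not a builtin type keyword
    (so an unnamed parameter of a multi-word user-defined type such as
    "struct stat" would be misread). Anything with more than one pair of
    parentheses, which includes function pointers, is rejected:<br>
    <br>
```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical result type for this sketch; not an analyzer class.
struct ParsedPrototype {
  std::string ReturnType;
  std::string Name;
  std::vector<std::string> ParamTypes;
};

using Tokens = std::vector<std::string>;

static bool isIdentChar(char C) {
  return std::isalnum(static_cast<unsigned char>(C)) || C == '_';
}

// Split the text into identifiers and single punctuation characters.
static Tokens tokenize(const std::string &S) {
  Tokens Out;
  for (std::size_t I = 0; I < S.size();) {
    if (std::isspace(static_cast<unsigned char>(S[I]))) {
      ++I;
    } else if (isIdentChar(S[I])) {
      std::size_t J = I;
      while (J < S.size() && isIdentChar(S[J]))
        ++J;
      Out.push_back(S.substr(I, J - I));
      I = J;
    } else {
      Out.push_back(std::string(1, S[I++]));
    }
  }
  return Out;
}

// True for an identifier that is not a builtin type keyword or qualifier,
// i.e. a token that we guess to be a declared name.
static bool isName(const std::string &T) {
  static const std::set<std::string> Keywords = {
      "void",   "char",   "short",    "int",   "long",    "float",
      "double", "signed", "unsigned", "const", "volatile"};
  return isIdentChar(T[0]) && !std::isdigit(static_cast<unsigned char>(T[0])) &&
         Keywords.count(T) == 0;
}

// Join tokens [B, E) with single spaces, keeping runs of '*' together.
static std::string join(const Tokens &T, std::size_t B, std::size_t E) {
  std::string Out;
  for (std::size_t I = B; I < E; ++I) {
    if (!Out.empty() && !(T[I] == "*" && Out.back() == '*'))
      Out += ' ';
    Out += T[I];
  }
  return Out;
}

// Returns false for anything outside the supported subset. On failure the
// contents of P are unspecified.
bool parsePrototype(const std::string &Text, ParsedPrototype &P) {
  Tokens T = tokenize(Text);
  if (!T.empty() && T.back() == ";")
    T.pop_back();
  // Require exactly one '(' and one ')' that ends the declaration. Any
  // punctuation other than '*' and ',' (arrays, varargs, ...) is rejected.
  std::size_t Open = T.size();
  for (std::size_t I = 0; I < T.size(); ++I) {
    if (T[I] == "(") {
      if (Open != T.size()) return false;
      Open = I;
    } else if (T[I] == ")") {
      if (I + 1 != T.size()) return false;
    } else if (T[I] == ",") {
      if (Open == T.size()) return false;
    } else if (T[I] != "*" && !isIdentChar(T[I][0])) {
      return false;
    }
  }
  if (Open < 2 || Open == T.size() || T.back() != ")" || !isName(T[Open - 1]))
    return false;
  P.ReturnType = join(T, 0, Open - 1);
  P.Name = T[Open - 1];
  P.ParamTypes.clear();
  // Cut the parameter list at each ',' and at the final ')'.
  std::size_t Begin = Open + 1;
  for (std::size_t I = Begin; I < T.size(); ++I) {
    if (T[I] != "," && T[I] != ")")
      continue;
    std::size_t End = I;
    bool Sole = Begin == Open + 1 && T[I] == ")";
    if (Sole && (End == Begin || (End == Begin + 1 && T[Begin] == "void")))
      break; // "()" or "(void)": treated as no parameters.
    if (End == Begin)
      return false; // Empty slot, as in "f(int,)".
    if (End - Begin > 1 && isName(T[End - 1]))
      --End; // Drop the parameter name.
    P.ParamTypes.push_back(join(T, Begin, End));
    Begin = I + 1;
  }
  return true;
}
```
    For the recv prototype from the quoted mail this yields the return
    type "ssize_t", the name "recv" and the parameter types "int",
    "void *", "size_t", "int". Something like qsort, which takes a
    function pointer, is rejected and would stay on the old approach.<br>
    <br>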
    <div class="moz-cite-prefix">On 9/22/20 7:30 AM, Gábor Márton via
      cfe-dev wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAH6rKyCudMAzOnLJthR9MO9+Anf-Loc82+j6ThN-e8hse1pA2g@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">> <span style="font-family:arial,sans-serif">Why?
          This could simplify the type matching code in the Checker
          extremely. Besides, whenever we reach up to a point where we
          can read up summaries from e.g. YAML files (maybe when we
          merge with the TaintChecker) then the user could specify the
          signatures as they would write that in C/C++, which seems to
          be an ultimate convenience.</span>
        <div><span style="font-family:arial,sans-serif"><br>
          </span></div>
        <div><span style="font-family:arial,sans-serif">Another use case
            could be to boost up the CallDescriptionMap by using the
            same infrastructure. Currently we match by function names
            and by argument numbers and this has caused bugs already.</span></div>
        <div><font face="arial, sans-serif">Imagine this:</font></div>
        <div><font face="monospace"> 
            CallDescriptionMap<FnDescription> FnDescriptions = {<br>
                  {{"FILE *fopen(const char *pathname, const char
            *mode)"}, // parse and match by the full signature</font></div>
        <div><font face="monospace">      {nullptr,
            &StreamChecker::evalFopen, ArgNone}},<br>
          </font></div>
        <div><font face="arial, sans-serif"><br>
          </font></div>
        <div><font face="arial, sans-serif">Cheers,</font></div>
        <div><font face="arial, sans-serif">Gabor</font></div>
        <div><span style="font-family:arial,sans-serif"><br>
          </span></div>
        <div><span style="font-family:arial,sans-serif"><br>
          </span></div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Tue, Sep 22, 2020 at 3:30
          PM Gábor Márton <<a href="mailto:martongabesz@gmail.com"
            moz-do-not-send="true">martongabesz@gmail.com</a>> wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px
          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div dir="ltr">Hi,
            <div><br>
            </div>
            <div>Here is an example of adding a function summary in
              the StdLibraryFunctionsChecker:</div>
            <div><font face="monospace">    // ssize_t recv(int sockfd,
                void *buf, size_t len, int flags);<br>
                    addToFunctionSummaryMap(<br>
                        "recv",</font></div>
            <div><font face="monospace">        Signature(</font>ArgTypes{IntTy,
              VoidPtrTy, SizeTy, IntTy}, RetType{Ssize_tTy}),<font
                face="monospace"><br>
                        Summary(NoEvalCall)<br>
                            .ArgConstraint(ArgumentCondition(0,
                WithinRange, Range(0, IntMax)))<br>
                           
                .ArgConstraint(BufferSize(/*Buffer=*/ArgNo(1),<br>
                                                     
                /*BufSize=*/ArgNo(2))));</font><br>
            </div>
            <div><br>
            </div>
            <div>Instead, I'd like to have the following in the future:</div>
            <div><font face="monospace">    addToFunctionSummaryMap(</font></div>
            <div><font face="monospace">        "recv",<br>
                        Signature("ssize_t recv(int sockfd, void *buf,
                size_t len, int flags);</font><span
                style="font-family:monospace">"),</span></div>
            <div><font face="monospace">        Summary(NoEvalCall)<br>
                            .ArgConstraint(ArgumentCondition(0,
                WithinRange, Range(0, IntMax)))<br>
                           
                .ArgConstraint(BufferSize(/*Buffer=*/ArgNo(1),<br>
                                                     
                /*BufSize=*/ArgNo(2))));</font><br>
            </div>
            <div><font face="monospace"><br>
              </font></div>
            <div><font face="arial, sans-serif">Why? This could simplify
                the type matching code in the Checker extremely.
                Besides, whenever we reach up to a point where we can
                read up summaries from e.g. YAML files (maybe when we
                merge with the TaintChecker) then the user could specify
                the signatures as they would write that in C/C++, which
                seems to be an ultimate convenience.</font></div>
            <div><font face="monospace"><br>
              </font></div>
            <div><font face="arial, sans-serif">To achieve this I have
                to parse the string given to the Signature in the
                ASTContext of the TU that is being analyzed. I
                am considering two options to develop this:</font></div>
            <div><font face="arial, sans-serif">1) Seems like
                BodyFarm/ModelInjector does something similar (it reads
                function bodies from model files). However, I am not
                sure if that solution is flexible enough. Gabor, what do
                you think, would it make sense to extend into this
                direction, could we handle C++ declarations as well?
                What other weak points or difficulties do you see?</font></div>
            <div><font face="arial, sans-serif">2) Maybe we could use
                the parser with a custom ExternalASTSource
                implementation that could do the job. Actually, this is
                how LLDB does it, the implementation of the </font>ExternalASTSource
              interface uses the ASTImporter under the hood. I am not
              sure if ASTImporter could be used for this, but maybe we
              could reuse some parts of it.</div>
            <div><br>
            </div>
            <div>Thanks,</div>
            <div>Gabor</div>
          </div>
        </blockquote>
      </div>
      <br>
      <fieldset class="mimeAttachmentHeader"></fieldset>
      <pre class="moz-quote-pre" wrap="">_______________________________________________
cfe-dev mailing list
<a class="moz-txt-link-abbreviated" href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>
<a class="moz-txt-link-freetext" href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a>
</pre>
    </blockquote>
    <br>
  </body>
</html>