[cfe-dev] [RFC][analyzer][StdLibraryFunctionsChecker] Parsing signatures?

Gábor Márton via cfe-dev cfe-dev at lists.llvm.org
Tue Sep 22 07:30:15 PDT 2020


> Why? This could greatly simplify the type matching code in the Checker.
> Besides, once we reach the point where we can read summaries from e.g.
> YAML files (maybe when we merge with the TaintChecker), the user could
> specify the signatures as they would write them in C/C++, which seems to
> be the ultimate convenience.

Another use case could be to improve the CallDescriptionMap by using the
same infrastructure. Currently we match by function name and by argument
count, and this has already caused bugs.
Imagine this:
  CallDescriptionMap<FnDescription> FnDescriptions = {
      // Parse and match by the full signature.
      {{"FILE *fopen(const char *pathname, const char *mode)"},
       {nullptr, &StreamChecker::evalFopen, ArgNone}},
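
To make the problem concrete, here is a minimal, self-contained sketch. It is not analyzer code: ToyDecl, matchesByNameAndCount, matchesBySignature, libcFopen and userFopen are all hypothetical names invented for this illustration, and types are compared as plain strings, whereas a real implementation would compare QualTypes of a FunctionDecl.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a function declaration. The real
// checkers would inspect a clang::FunctionDecl instead.
struct ToyDecl {
  std::string RetType;
  std::string Name;
  std::vector<std::string> ParamTypes;
};

// Current style of matching: function name and argument count only.
inline bool matchesByNameAndCount(const ToyDecl &D, const std::string &Name,
                                  std::size_t NumArgs) {
  return D.Name == Name && D.ParamTypes.size() == NumArgs;
}

// Proposed style: the return type and every parameter type must agree too.
inline bool matchesBySignature(const ToyDecl &D, const ToyDecl &Expected) {
  return D.Name == Expected.Name && D.RetType == Expected.RetType &&
         D.ParamTypes == Expected.ParamTypes;
}

// The libc function we actually want to model.
inline ToyDecl libcFopen() {
  return {"FILE *", "fopen", {"const char *", "const char *"}};
}

// An unrelated user function that happens to share the name and arity.
inline ToyDecl userFopen() {
  return {"int", "fopen", {"int", "int"}};
}

// Usage: the name/count match wrongly accepts the user function, while the
// signature match rejects it and still accepts the libc declaration.
inline void demoMatching() {
  assert(matchesByNameAndCount(userFopen(), "fopen", 2));
  assert(!matchesBySignature(userFopen(), libcFopen()));
  assert(matchesBySignature(libcFopen(), libcFopen()));
}
```

The point of the sketch is only that name plus argument count cannot tell these two declarations apart, while a full signature can.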

Cheers,
Gabor



On Tue, Sep 22, 2020 at 3:30 PM Gábor Márton <martongabesz at gmail.com> wrote:

> Hi,
>
> Here is an example of adding a function summary in
> the StdLibraryFunctionsChecker:
>     // ssize_t recv(int sockfd, void *buf, size_t len, int flags);
>     addToFunctionSummaryMap(
>         "recv",
>         Signature(ArgTypes{IntTy, VoidPtrTy, SizeTy, IntTy},
>                   RetType{Ssize_tTy}),
>         Summary(NoEvalCall)
>             .ArgConstraint(
>                 ArgumentCondition(0, WithinRange, Range(0, IntMax)))
>             .ArgConstraint(BufferSize(/*Buffer=*/ArgNo(1),
>                                       /*BufSize=*/ArgNo(2))));
>
> Instead, I'd like to have the following in the future:
>     addToFunctionSummaryMap(
>         "recv",
>         Signature("ssize_t recv(int sockfd, void *buf, size_t len, "
>                   "int flags);"),
>         Summary(NoEvalCall)
>             .ArgConstraint(
>                 ArgumentCondition(0, WithinRange, Range(0, IntMax)))
>             .ArgConstraint(BufferSize(/*Buffer=*/ArgNo(1),
>                                       /*BufSize=*/ArgNo(2))));
>
> Why? This could greatly simplify the type matching code in the Checker.
> Besides, once we reach the point where we can read summaries from e.g.
> YAML files (maybe when we merge with the TaintChecker), the user could
> specify the signatures as they would write them in C/C++, which seems to
> be the ultimate convenience.
>
> To achieve this I have to parse the string given to the Signature in the
> ASTContext of the TU that is being analyzed. I am considering two options
> for developing this:
> 1) BodyFarm/ModelInjector seems to do something similar (it reads
> function bodies from model files). However, I am not sure whether that
> solution is flexible enough. Gabor, what do you think: would it make sense
> to extend it in this direction, and could we handle C++ declarations as
> well? What other weak points or difficulties do you see?
> 2) Maybe we could use the parser with a custom ExternalASTSource
> implementation that could do the job. Actually, this is how LLDB does it:
> its implementation of the ExternalASTSource interface uses the
> ASTImporter under the hood. I am not sure whether ASTImporter could be
> used for this, but maybe we could reuse some parts of it.
>
> Thanks,
> Gabor
>
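
For illustration only, the information a string-based Signature would have to extract can be sketched with a toy parser. This is not the approach proposed above (that would reuse Clang's own parser in the ASTContext of the TU); ParsedSignature and parseSignature are hypothetical names. The sketch assumes every parameter is named and does not handle "(void)", unnamed parameters, function pointers or C++ declarators.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type. A real implementation would produce a
// clang::FunctionDecl, not strings.
struct ParsedSignature {
  std::string RetType;
  std::string Name;
  std::vector<std::string> ParamTypes;
};

static bool isIdentChar(char C) {
  return std::isalnum(static_cast<unsigned char>(C)) || C == '_';
}

// Splits the text into identifier tokens and single punctuation characters,
// dropping all whitespace.
static std::vector<std::string> tokenize(const std::string &Text) {
  std::vector<std::string> Tokens;
  for (std::size_t I = 0; I < Text.size();) {
    if (std::isspace(static_cast<unsigned char>(Text[I]))) {
      ++I;
      continue;
    }
    std::size_t Begin = I;
    if (isIdentChar(Text[I])) {
      while (I < Text.size() && isIdentChar(Text[I]))
        ++I;
    } else {
      ++I;
    }
    Tokens.push_back(Text.substr(Begin, I - Begin));
  }
  return Tokens;
}

// Joins tokens [Begin, End) with single spaces, e.g. {"void", "*"} -> "void *".
static std::string join(const std::vector<std::string> &T, std::size_t Begin,
                        std::size_t End) {
  assert(Begin <= End && End <= T.size());
  std::string Out;
  for (std::size_t I = Begin; I < End; ++I) {
    if (!Out.empty())
      Out += ' ';
    Out += T[I];
  }
  return Out;
}

// Parses "ret name(type pname, ...)" with an optional trailing ';'.
// Returns false if the text does not fit this restricted shape.
bool parseSignature(const std::string &Text, ParsedSignature &Out) {
  const std::vector<std::string> T = tokenize(Text);
  std::size_t LParen = 0;
  while (LParen < T.size() && T[LParen] != "(")
    ++LParen;
  if (LParen < 2 || LParen == T.size())
    return false; // Need a return type and a name before '('.
  std::size_t RParen = LParen + 1;
  while (RParen < T.size() && T[RParen] != ")")
    ++RParen;
  if (RParen == T.size())
    return false;

  Out.Name = T[LParen - 1];
  Out.RetType = join(T, 0, LParen - 1);
  Out.ParamTypes.clear();

  std::size_t Begin = LParen + 1;
  for (std::size_t I = Begin; I <= RParen; ++I) {
    if (T[I] != "," && I != RParen)
      continue;
    // Tokens [Begin, I) form one parameter; its last token is the name.
    if (I - Begin < 2) {
      if (I == Begin && I == RParen && Out.ParamTypes.empty())
        break; // Empty parameter list "()".
      return false; // Unnamed or missing parameter: outside this sketch.
    }
    Out.ParamTypes.push_back(join(T, Begin, I - 1));
    Begin = I + 1;
  }
  return true;
}
```

With this, parsing the recv declaration from the quoted example yields the return type "ssize_t", the name "recv" and the parameter types "int", "void *", "size_t", "int", which is the same information the ArgTypes/RetType form spells out by hand.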