[cfe-dev] [Analyzer][RFC] Function Summaries

Gábor Horváth via cfe-dev cfe-dev at lists.llvm.org
Thu Feb 6 10:40:32 PST 2020

On Thu, Feb 6, 2020 at 7:59 AM Gábor Márton <martongabesz at gmail.com> wrote:

> APINotes is a generic framework for attributing any functions. And the
> metadata read from the .apinotes files is stored in the AST and can be
> queried later as `Decl->getAttr<...>()`. This mechanism could be very
> useful for the whole community, i.e. for optimization, and for other tools.
> However, with analyzer specific function summaries we don't want to have
> the burden of creating such a generic framework that is suitable for the
> whole community. The summary based experiments are way too immature now,
> and we don't want to turn our focus on creating such a generic framework.

For prototyping purposes anything that works is OK. The question is what
format we should choose once we know the direction we want to take. I
think we should revisit the question of using API notes once the prototype
is considered successful and we see what info is stored in the summaries.

> Integrating APINotes to mainstream Clang would require coordination and
> consensus within the **whole** Clang community.

There were earlier discussions about upstreaming APINotes. As far as I
remember the consensus was that the community would love to see it upstream
if C++ support is implemented. Whatever format we choose we do need to
implement C++ support (e.g. supporting overloaded functions). I would
expect the community to be on board if we choose to go with API notes. So I
think the question is not whether we would have consensus, but rather
whether it is less or more code to write and maintain.

> That would require huge efforts from all of us and we would quickly lose
> the focus from the original goal which is to experiment with summaries
> **within** the analyzer.

To experiment, any format is OK. The reason I am pushing for API Notes is
that it has been in production for years and it might be less work to
adapt that to our needs than doing everything from scratch. For a prototype
we do not need to solve certain questions, like what the distribution
model is, what the most reasonable way is for the driver to handle the
related flags, or how to have easy to understand diagnostics for all the
cases like missing files etc. Moreover, since Objective-C is one of the
languages supported by the analyzer, an upstream solution should support
that as well. APINotes already has this support. A custom solution would
need this to be designed from scratch. So I think at this point we cannot
know whether using API notes would be a huge effort or would actually save
some effort.

> Thus, we'd like to keep the summaries describing format inside the
> premises of the analyzer.
> Instead of APINotes, we propose to factor out the YAML parsing part from
> GenericTaintChecker into a modeling checker that would populate a GDM with
> the summaries metadata. That data could be used then in any checker, e.g.
> by the Taint or the StdLibraryFunctions checker.
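
The proposal quoted above can be pictured with a small stand-alone sketch:
one shared store of per-function summaries that several checkers consult.
This is only an illustration under stated assumptions. None of the types or
names below (SummaryStore, FunctionSummary, ArgRange, scale_percent,
dup_string) are Clang or analyzer APIs; the store merely stands in for the
GDM, and the YAML parsing step is replaced by hand-written entries. Keying
by plain function name is also a simplification that would not handle
overloaded C++ functions.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical constraint on one argument: Min <= value <= Max.
struct ArgRange {
  unsigned ArgIndex;
  long long Min;
  long long Max;
};

// Hypothetical summary record for a single function.
struct FunctionSummary {
  // Indices of arguments whose taint propagates to the return value.
  std::vector<unsigned> PropagatesFromArgs;
  // Value ranges the function expects for some of its arguments.
  std::vector<ArgRange> ArgRanges;
};

// Stand-in for the shared store that a modeling checker would populate.
class SummaryStore {
  std::map<std::string, FunctionSummary> Summaries;

public:
  void add(const std::string &Name, FunctionSummary S) {
    for (const ArgRange &R : S.ArgRanges)
      assert(R.Min <= R.Max && "empty range in summary");
    Summaries[Name] = std::move(S);
  }

  // Returns nullptr when no summary is known for the function.
  const FunctionSummary *lookup(const std::string &Name) const {
    auto It = Summaries.find(Name);
    return It == Summaries.end() ? nullptr : &It->second;
  }
};

// Taint-style consumer: is the return value tainted for this call?
bool returnIsTainted(const SummaryStore &Store, const std::string &Fn,
                     const std::vector<bool> &ArgTainted) {
  const FunctionSummary *S = Store.lookup(Fn);
  if (!S)
    return false;
  for (unsigned Idx : S->PropagatesFromArgs)
    if (Idx < ArgTainted.size() && ArgTainted[Idx])
      return true;
  return false;
}

// Range-style consumer: does the value satisfy every known constraint on
// argument Idx? Unknown functions and unconstrained arguments pass.
bool argInRange(const SummaryStore &Store, const std::string &Fn,
                unsigned Idx, long long Value) {
  const FunctionSummary *S = Store.lookup(Fn);
  if (!S)
    return true;
  for (const ArgRange &R : S->ArgRanges)
    if (R.ArgIndex == Idx && (Value < R.Min || Value > R.Max))
      return false;
  return true;
}

// Hand-written entries standing in for what would be read from YAML.
SummaryStore makeExampleStore() {
  SummaryStore Store;
  Store.add("scale_percent", FunctionSummary{{}, {ArgRange{0, 0, 100}}});
  Store.add("dup_string", FunctionSummary{{0}, {}});
  return Store;
}
```

Both consumers read the same store, which is the point of the proposal:
the summaries are parsed once and any checker can query them.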

Some documentation about the APINotes format is available.
I see no reason why it could not be extended for the needs of the analyzer.
It actually already contains some info useful for analysis, for example you
can use it to annotate the nullability for each parameter. It is a trivial
step to add range information etc.
Again, once we have a summary format that covers our needs we will know more,
but for now, I do not see why this format (with potential extensions) would
not fit our needs.
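
As a rough sketch of what such an extension might look like: the fragment
below uses the Name, Functions, Nullability and NullabilityOfRet keys as
they are understood to exist in the APINotes YAML format (N for nonnull,
O for optional, S for scalar), and adds an ArgRanges key. ArgRanges is
purely hypothetical and is not part of APINotes, and the module and function
names are made up for illustration.

```yaml
Name: ExampleLib            # hypothetical module name
Functions:
- Name: scale_percent       # hypothetical function: int scale_percent(int pct, const char *label)
  Nullability: [ S, N ]     # pct is a scalar, label must not be null
  NullabilityOfRet: S
  ArgRanges:                # HYPOTHETICAL analyzer extension, not an existing key
  - Arg: 0
    Min: 0
    Max: 100
```

Keying entries by name alone would still not distinguish overloaded
functions, which is the C++ support mentioned earlier.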