[cfe-dev] [RFC] A dataflow analysis framework for Clang AST

Tue Oct 12 18:39:38 PDT 2021

-vsavchenko at apple.com (which keeps bouncing)

On Tue, Oct 12, 2021 at 8:52 PM Yitzhak Mandelbaum <yitzhakm at google.com>
wrote:

> Gabor, Artem,
>
> Thanks for your support, and, Artem, thank you for quoting as plain text
> -- you're right, the system shunted it to the moderator because of its size.
>
> To answer some of the questions/comments raised:
> > There is a nice summary of some of the ad-hoc solutions we have in
> Clang in this survey: [cfe-dev] A survey of dataflow analyses in Clang
> (llvm.org)
> <https://lists.llvm.org/pipermail/cfe-dev/2020-October/066937.html>
> > Do you plan to replace the ad-hoc solutions in the long run?
>
> We mention this briefly in the RFC as motivation, but to reiterate here:
> we'd like the framework to at least be *suitable* to replace these ad-hoc
> solutions. Whether we would get to this ourselves or encourage others,
> review patches, etc. I can't say at this point.  I suspect we'll refactor
> at least a few ourselves just to test out the robustness of the framework.
>
> Just to reiterate, you want to abstract away both from the dataflow
>> algorithm and the semantics of C++. E.g., the user might not need to handle
>> the myriad of ways someone can assign a value to a variable (overloaded
>> operator=, output argument, primitive assignment, etc), only a single
>> abstract concept of assignment.
>
>
> Exactly!  Our first version already does this to some extent, we just feel
> it's somewhat tailored to our current needs and could use some
> generalization.  For example, it would be great to know when an lvalue is
> read for the last time and written to another lvalue, because that would
> suggest a potential `std::move`, among other things. Yet, that balance of
> detail and abstraction is not yet available. We only model the *effects* of
> reads and writes without explicitly abstracting them (say, tagging each CFG
> node w/ the set of reads and writes).  So, we think there's room to explore
> the design space here, but also think what we have will be valuable to a
> number of analyses right away.
>
> While most of the checks in CSA are using symbolic execution, there are
>> also some syntactic rules. One dilemma of implementing a new check is where
>> to put it (CSA? Tidy? Warnings?). While it would be great to get rid of
>> this dilemma, I think we want to keep all of the options open. I just
>> wanted to make it explicit what kinds of layering constraints we have. It
>> should be possible to add CSA checks/features using the new dataflow
>> framework.
>
>
> Great point! We hadn't thought about this possibility, but there's no good
> reason to rule out use in CSA.  Our key layering constraint is that it be
> available in clang-tidy. So, if we put it somewhere central, like
> clang/include/clang/Analysis, would that be suitable for use in CSA? We're
> certainly open to other suggestions as well.
>
> Valeriy's -Wcompletion-handler being
>> one of the notable latest additions - a backwards analysis over finite
>> state lattice with a sophisticated system of notes attached to the
>> warning to explain what's happening (generalizing over the latter may
>> possibly be a goal for your framework as well).
>
>
> Yes, we're definitely interested in being able to produce warnings that
> give the user a clear idea of what is wrong and (if possible) how to fix
> it.  This is complicated once we involve the SAT solver, since it tends to
> give us simple yes/no answers. But, actionable warnings seem crucial for
> successful user adoption.
>
> I'm also very curious about your SAT solver implementation; I have a
>> recent hobby of plugging various solvers into the static analyzer (cf.
>> https://reviews.llvm.org/D110125) because it obviously needs some -
>> arguably as much as your endeavor.
>
>
> Yes, we'd love to tell you more, but I'll have to defer to sgatev@, since
> he's the author and most familiar with the details.
>
> Cheers,
> Yitzhak
>
>
> On Tue, Oct 12, 2021 at 5:34 PM Artem Dergachev <noqnoqneo at gmail.com>
> wrote:
>
>> Sounds like the mail didn't hit the mailing list due to size
>> restriction. I'm quoting in plain text now to make sure it definitely
>> hits the list in some form.
>>
>> Like Gabor already said, It's amazing that this is happening. I
>> completely agree with your goals and I think we've discussed a lot of
>> this offline earlier. The amount of existing flow-sensitive warnings may
>> be low but it's steadily growing, Valeriy's -Wcompletion-handler being
>> one of the notable latest additions - a backwards analysis over finite
>> state lattice with a sophisticated system of notes attached to the
>> warning to explain what's happening (generalizing over the latter may
>> possibly be a goal for your framework as well).
>>
>> Aside from the ability to cover *the other 50% of warnings* compared to
>> what the static analyzer already covers (may-problems vs. must-problems
>> aka all-paths problems), another strong benefit of flow sensitive
>> warnings is their performance which allows us to use them as compiler
>> warnings (as opposed to separate tools like clang-tidy or static
>> analyzer). This in turn delivers the warning to every user of the
>> compiler which increases the user coverage dramatically. This
>> consideration may even apply to may-problems in which the static
>> analyzer already excels technically.
>>
>> I'm also very curious about your SAT solver implementation; I have a
>> recent hobby of plugging various solvers into the static analyzer (cf.
>> https://reviews.llvm.org/D110125) because it obviously needs some -
>> arguably as much as your endeavor.
>>
>>
>> On 10/12/21 8:32 AM, Yitzhak Mandelbaum wrote:
>> > Summary
>> >
>> > We propose a new static analysis framework for implementing Clang
>> > bug-finding and refactoring tools (including ClangTidy checks) based
>> > on a dataflow algorithm.
>> >
>> >
>> > Some ClangTidy checks and core compiler warnings are already
>> > dataflowbased. However, due to a lack of a reusable framework, all
>> > reimplement common dataflow-analysis and CFG-traversal algorithms. In
>> > addition to supporting new checks, the framework should allow us to
>> > clean up those parts of Clang and ClangTidy.
>> >
>> >
>> > We intend that the new framework should have a learning curve similar
>> > to AST matchers and not require prior expertise in static analysis.
>> > Our goal is a simple API that isolates the user from the dataflow
>> > algorithm itself. The initial version of the framework combines
>> > support for classic dataflow analysesover the Clang CFG with symbolic
>> > evaluation of the code and a simple model of local variables and the
>> > heap. A number of useful checks can be expressed solely in terms of
>> > this model, saving their implementers from having to reimplement
>> > modeling of the semantics of C++ AST nodes. Ultimately, we want to
>> > extend this convenience to a wide variety of checks, so that they can
>> > be written in terms of the abstract semantics of the program (as
>> > modeled by the framework), while only matching and modeling AST nodes
>> > for key parts of their domain (e.g., custom data types or functions).
>> >
>> >
>> >   Motivation
>> >
>> > Clang has a few subsystems that greatly help with implementing
>> > bug-finding and refactoring tools:
>> >
>> >  *
>> >
>> >     AST matchersallow one to easily find program fragments whose AST
>> >     matches a given pattern. This works well for finding program
>> >     fragments whose desired semantics translates to a small number of
>> >     possible syntactic expressions (e.g., "find a call to
>> >     strlen()where the argument is a null pointer constant"). Note that
>> >     patterns expressed by AST matchers often check program properties
>> >     that hold for allexecution paths.
>> >
>> >  *
>> >
>> >     Clang Static Analyzercan find program paths that end in a state
>> >     that satisfies a specific user-defined property, regardless of how
>> >     the code was expressed syntactically (e.g., "find a call to
>> >     strlen()where the argument is dynamically equal to null"). CSA
>> >     abstracts from the syntax of the code, letting analyses focus on
>> >     its semantics. While not completely eliding syntax, symbolic
>> >     evaluation abstracts over many syntactic details. Semantically
>> >     equivalent program fragments result in equivalent program states
>> >     computed by the CSA.
>> >
>> >
>> > When implementing certain bug-finding and refactoring tools, it is
>> > often necessary to find program fragments where a given property holds
>> > on all execution paths, while placing as few demands as possible on
>> > how the user writes their code. The proposed dataflow analysis
>> > framework aims to satisfy both of these needs simultaneously.
>> >
>> >
>> >     Example: Ensuring that std::optionalis used safely
>> >
>> > Consider implementing a static analysis tool that proves that every
>> > time a std::optionalis unwrapped, the program has ensured that the
>> > precondition of the unwrap operation (the optional has a value) is
>> > satisfied.
>> >
>> >
>> > voidTestOK(std::optional<int>x){
>> >
>> > if(x.has_value()){
>> >
>> > use(x.value());
>> >
>> > }
>> >
>> > }
>> >
>> >
>> > voidTestWithWarning(std::optional<int>x){
>> >
>> > use(x.value());// warning: unchecked access to optional value
>> >
>> > }
>> >
>> >
>> > If we implemented this check in CSA, it would likely find the bug in
>> > TestWithWarning, and would not report anything in TestOK. However, CSA
>> > did not prove the desired property (all optional unwraps are checked
>> > on all paths) for TestOK; CSA just could not find a path through
>> > TestOKwhere this property is violated.
>> >
>> >
>> > A simple function like TestOKonly has two paths and would be
>> > completely explored by the CSA. However, when loops are involved, CSA
>> > uses heuristics to decide when to stop exploring paths in a function.
>> > As a result, CSA would miss this bug:
>> >
>> >
>> > voidMissedBugByCSA(std::optional<int>x){
>> >
>> > for(inti =0;i <10;i++){
>> >
>> > if(i <5){
>> >
>> > use(i);
>> >
>> > }else{
>> >
>> > use(x.value());// CSA does not report a warning
>> >
>> > }
>> >
>> > }
>> >
>> > }
>> >
>> >
>> > If we implemented this check with AST matchers (as typically done in
>> > ClangTidy), we would have to enumerate way too many syntactic
>> > patterns. Consider just a couple coding variations that users expect
>> > this analysis to understand to be safe:
>> >
>> >
>> > voidTestOK(std::optional<int>x){
>> >
>> > if(x){
>> >
>> > use(x.value());
>> >
>> > }
>> >
>> > }
>> >
>> >
>> > voidTestOK(std::optional<int>x){
>> >
>> > if(x.has_value()==true){
>> >
>> > use(x.value());
>> >
>> > }
>> >
>> > }
>> >
>> >
>> > voidTestOK(std::optional<int>x){
>> >
>> > if(!x.has_value()){
>> >
>> > use(0);
>> >
>> > }else{
>> >
>> > use(x.value());
>> >
>> > }
>> >
>> > }
>> >
>> >
>> > voidTestOK(std::optional<int>x){
>> >
>> > if(!x.has_value())
>> >
>> > return;
>> >
>> > use(x.value());
>> >
>> > }
>> >
>> >
>> > To summarize, CSA can find someoptional-unwrapping bugs, but it does
>> > not prove that the code performs a check on allpaths. AST matchers can
>> > prove it for a finite set of hard-coded syntactic cases. But, when the
>> > set of safe patterns is infinite, as is often the case, AST matchers
>> > can only capture a subset. Capturing the full set requires building a
>> > framework like the one presented in this document.
>> >
>> >
>> >     Comparison
>> >
>> >
>> >
>> >
>> > AST Matchers Library
>> >
>> >
>> >
>> > Clang Static Analyzer
>> >
>> >
>> >
>> > Dataflow Analysis Framework
>> >
>> > Proves that a property holds on all execution paths
>> >
>> >
>> >
>> > Possible, depends on the matcher
>> >
>> >
>> >
>> > No
>> >
>> >
>> >
>> > Yes
>> >
>> > Checks can be written in terms of program semantics
>> >
>> >
>> >
>> > No
>> >
>> >
>> >
>> > Yes
>> >
>> >
>> >
>> > Yes
>> >
>> >
>> >     Further Examples
>> >
>> > While we have started with a std::optionalchecker, we have a number of
>> > further use cases where this framework can help. Note that some of
>> > these are already implemented in complex, specialized checks, either
>> > directly in the compiler or as clang-tidy checks. A key goal of our
>> > framework is to simplify the implementation of such checks in the
>> future.
>> >
>> >  *
>> >
>> >     “the pointer may be null when dereferenced”, “the pointer is
>> >     always null when dereferenced”,
>> >
>> >  *
>> >
>> >     "unnecessary null check on a pointer passed to delete/free",
>> >
>> >  *
>> >
>> >     “this raw pointer variable could be refactored into a
>> >     std::unique_ptr”,
>> >
>> >  *
>> >
>> >     “expression always evaluates to a constant”, “if condition always
>> >     evaluates to true/false”,
>> >
>> >  *
>> >
>> >     “basic block is never executed”, “loop is executed at most once”,
>> >
>> >  *
>> >
>> >     “assigned value is always overwritten”,
>> >
>> >  *
>> >
>> >     “value was used after having been moved”,
>> >
>> >  *
>> >
>> >     “value guarded by a lock was used without taking the lock”,
>> >
>> >  *
>> >
>> >     “unnecessary existence check before inserting a value into a
>> >     std::map”.
>> >
>> >
>> > Some of these use cases are covered by existing ClangTidy checks that
>> > are implemented with pattern matching. We believe there is an
>> > opportunity to simplify their implementation by refactoring them to
>> > use the dataflow framework, and simultaneously improve the coverage
>> > and precision of how they model C++ semantics.
>> >
>> >
>> >   Overview
>> >
>> > The initial version of our framework provides the following key
>> features:
>> >
>> >  *
>> >
>> >     a forward dataflow algorithm, parameterized by a user-defined
>> >     value domain and a transfer function, which describes how program
>> >     statements modify domain values.
>> >
>> >  *
>> >
>> >     a built-in model of local variables and the heap. This model
>> >     provides two benefits to users: it tracks the flow of values,
>> >     accounting for a wide variety of C++ language constructs, in a way
>> >     that is reusable across many different analyses; it discriminates
>> >     value approximations based on path conditions, allowing for higher
>> >     precision than dataflow alone.
>> >
>> >  *
>> >
>> >     an interface to an SMT solver to verify safety of operations like
>> >     dereferencing a pointer,
>> >
>> >  *
>> >
>> >     test infrastructure for validating results of the dataflow
>> algorithm.
>> >
>> >
>> >     Dataflow Algorithm
>> >
>> > Our framework solves forward dataflow equations over a product of a
>> > user-defined value domain and a built-in model of local variables and
>> > the heap, called the environment. The user implements the transfer
>> > function for their value domain, while the framework implements the
>> > transfer function that operates on the environment, only requiring a
>> > user-defined widening operator to avoid infinite growth of the
>> > environment model. We intend to expand the framework to also support
>> > backwards dataflow equations at some point in the future. The
>> > limitation to the forward direction is not essential.
>> >
>> >
>> > In more detail, the algorithm is parameterized by:
>> >
>> >
>> >  *
>> >
>> >     a user-defined, finite-height partial order, with a join function
>> >     and maximal element,
>> >
>> >  *
>> >
>> >     an initial element from the user-defined lattice,
>> >
>> >  *
>> >
>> >     a (monotonic) transfer function for statements, which maps between
>> >     lattice elements,
>> >
>> >  *
>> >
>> >     comparison and merge functions for environment values.
>> >
>> >
>> > We expand on the first three elements here and return to the
>> > comparison and merge functions in the next section.
>> >
>> >
>> >       The Lattice and Analysis Abstractions
>> >
>> > We start with a representation of a lattice:
>> >
>> > classBoundedJoinSemiLattice{
>> >
>> > public:
>> >
>> > friendbooloperator==(constBoundedJoinSemiLattice&lhs,
>> >
>> > constBoundedJoinSemiLattice&rhs);
>> >
>> > friendbooloperator<=(constBoundedJoinSemiLattice&lhs,
>> >
>> > constBoundedJoinSemiLattice&rhs);
>> >
>> > staticBoundedJoinSemiLatticeTop();
>> >
>> > enumclassJoinEffect{
>> >
>> > Changed,
>> >
>> > Unchanged,
>> >
>> > };
>> >
>> > JoinEffectJoin(constBoundedJoinSemiLattice&element);
>> >
>> > };
>> >
>> > Not all of these declarations are necessary. Specifically,
>> > operator==and operator<=can be implemented in terms of Joinand
>> > Topisn’t used by the dataflow algorithm. We include the operators
>> > because direct implementations are likely more efficient, and
>> > Topbecause it provides evidence of a well-founded bounded join
>> > semi-lattice.
>> >
>> >
>> > The Joinoperation is unusual in its choice of mutating the object
>> > instance and returning an indication of whether the join resulted in
>> > any changes. A key step of the fixpoint algorithm is code like j =
>> > Join(x, y); if (j == x) …. This formulation of Joinallows us to
>> > express this code instead as if (x.Join(y) == Unchanged).... For
>> > lattices whose instances are expensive to create or compare, we expect
>> > this formulation to improve the performance of the dataflow analysis.
>> >
>> >
>> > Based on the lattice abstraction, our analysis is similarly traditional:
>> >
>> > classDataflowAnalysis{
>> >
>> > public:
>> >
>> > usingBoundedJoinSemiLattice=…;
>> >
>> > BoundedJoinSemiLatticeInitialElement();
>> >
>> > BoundedJoinSemiLatticeTransferStmt(
>> >
>> > constclang::Stmt*stmt,constBoundedJoinSemiLattice&element,
>> >
>> >       clang::dataflow::Environment&environment);
>> >
>> > };
>> >
>> >
>> >     Environment: A Built-in Lattice
>> >
>> > All user-defined operations have access to an environment, which
>> > encapsulates the program context of whatever program element is being
>> > considered. It contains:
>> >
>> >  *
>> >
>> >     a path condition,
>> >
>> >  *
>> >
>> >     a storage model (which models both local variables and the heap).
>> >
>> >
>> > The built-in transfer functions model the effects of C++ language
>> > constructs (for example, variable declarations, assignments, pointer
>> > dereferences, arithmetic etc.) by manipulating the environment, saving
>> > the user from doing so in custom transfer functions.
>> >
>> >
>> > The path condition accumulates the set of boolean conditions that are
>> > known to be true on every path from function entry to the current
>> > program point.
>> >
>> >
>> > The environment maintains the maps from program declarations and
>> > pointer values to storage locations. It also maintains the map of
>> > storage locations to abstract values. Storage locations can be atomic
>> > (“scalar”) or aggregate multiple sublocations:
>> >
>> >  *
>> >
>> >     ScalarStorageLocationis a storage location that is not subdivided
>> >     further for the purposes of abstract interpretation, for example
>> >     bool, int*.
>> >
>> >  *
>> >
>> >     AggregateStorageLocationis a storage location which can be
>> >     subdivided into smaller storage locations that can be
>> >     independently tracked by abstract interpretation. For example, a
>> >     struct with public members.
>> >
>> >
>> > In addition to this storage structure, our model tracks three kinds of
>> > values: basic values (like integers and booleans), pointers (which
>> > reify storage location into the value space) and records with named
>> > fields, where each field is itself a value. For aggregate values, we
>> > ensure coherence between the structure represented in storage
>> > locations and that represented in the values: for a given aggregate
>> > storage location Laggwith child f:
>> >
>> >  *
>> >
>> >     valueAt(Lagg) is a record, and
>> >
>> >  *
>> >
>> >     valueAt(child(Lagg, f)) = child(valueAt(Lagg), f),
>> >
>> > where valueAtmaps storage locations to values.
>> >
>> >
>> > In our first iteration, the only basic values that we will track are
>> > boolean values. Additional values and variables may be relevant, but
>> > only in so far as they constitute part of a boolean expression that we
>> > care about. For example, if (x > 5) { … }will result in our tracking x
>> > > 5, but not any particular knowledge about the integer-valued variable
>> x.
>> >
>> >
>> > The path conditions and the model refer to the same sets of values and
>> > locations.
>> >
>> >
>> >       Accounting for infinite height
>> >
>> > Despite the simplicity of our storage model, its elements can grow
>> > infinitely large: for example, path conditions on loop back edges,
>> > records when modeling linked lists, and even just the number of
>> > locations for a simple for-loop.  Therefore, we provide the user with
>> > two opportunities to interpret the abstract values so as to bound it:
>> > comparison of abstract values and a merge operation at control-flow
>> joins.
>> >
>> >
>> > // Compares values from two different environments for semantic
>> > equivalence.
>> >
>> > boolCompare(
>> >
>> >       clang::dataflow::Value*value1,clang::dataflow::Environment&env1,
>> >
>> >       clang::dataflow::Value*value2,clang::dataflow::Environment&env2);
>> >
>> >
>> > // Given `value1` (w.r.t. `env1`) and `value2` (w.r.t. `env2`), returns
>> a
>> >
>> > // `Value` (w.r.t. `merged_env`) that approximates `value1` and
>> > `value2`. This
>> >
>> > // could be a strict lattice join, or a more general widening operation.
>> >
>> >   clang::dataflow::Value*Merge(
>> >
>> >       clang::dataflow::Value*value1,clang::dataflow::Environment&env1,
>> >
>> >       clang::dataflow::Value*value2,clang::dataflow::Environment&env2,
>> >
>> >       clang::dataflow::Environment&merged_env);
>> >
>> >
>> >     Verification by SAT Solver
>> >
>> > The path conditions are critical in refining our approximation of the
>> > local variables and heap. At an abstract level, they don’t need to
>> > directly influence the execution of the dataflow analysis. Instead, we
>> > can imagine that, once concluded, our results are annotated
>> > with/incorporate path conditions, which in turn allows us to make more
>> > precise conclusions about our program from the analysis results.
>> >
>> >
>> > For example, if the analysis concludes that a variable conditionally
>> > holds a non-null pointer, and, for a given program point, the path
>> > condition implies said condition, we can conclude that a dereference
>> > of the pointer at that point is safe. In this sense, the verification
>> > of the safety of a deference is separate from the analysis, which
>> > models memory states with formulae. That said, invocation of the
>> > solver need not wait until after the analysis has concluded. It can
>> > also be used during the analysis, for example, in defining the
>> > widening operation in Merge.
>> >
>> >
>> > With that, we can see the role of an SAT solver: to solve questions of
>> > the form “Does the path condition satisfy the pre-condition computed
>> > for this pointer variable at this program point”? To that end, we
>> > include an API for asking such questions of a SAT solver. The API is
>> > compatible with Z3 and we also include our own (simple) SAT solver,
>> > for users who may not want to integrate with Z3.
>> >
>> >
>> >     Testing dataflow analysis
>> >
>> > We provide test infrastructure that allows users to easily write tests
>> > that make assertions about a program state computed by dataflow
>> > analysis at any code point. Code is annotated with labels on points of
>> > interest and then test expectations are written with respect to the
>> > labeled points.
>> >
>> >
>> > For example, consider an analysis that computes the state of
>> > optionalvalues.
>> >
>> > ExpectDataflowAnalysis<OptionalChecker>(R"(
>> >
>> > void f(std::optional<int> opt) {
>> >
>> >   if (opt.has_value()) {
>> >
>> >     // [[true-branch]]
>> >
>> >   } else {
>> >
>> >     // [[false-branch]]
>> >
>> >   }
>> >
>> > })",[](constauto&results){
>> >
>> >
>>   EXPECT_LE(results.ValueAtCodePoint("true-branch"),OptionalLattice::Engaged());
>> >
>> >
>>   EXPECT_LE(results.ValueAtCodePoint("false-branch"),OptionalLattice::NullOpt());
>> >
>> > });
>> >
>> >
>> >   Timeline
>> >
>> > We have a fully working implementation of the framework as proposed
>> > here. We also are interested in contributing a clang-tidy checker for
>> > std::optionaland related types as soon as possible, given its
>> > demonstrated efficacy and interest from third parties.  To that end,
>> > as soon as we have consensus for starting this work in the clang
>> > repository, we plan to start sending patches for review. Given that
>> > the framework is relatively small, we expect it will fit within 5
>> > (largish) patches, meaning we can hope to have it available by
>> > December 2021.
>> >
>> >
>> >   Future Work
>> >
>> > We consider our first version of the framework as experimental. It
>> > suits the needs of a number of particular checks that we’ve been
>> > writing, but it has some obvious limitations with respect to general
>> > use.  After we commit our first version, here are some planned areas
>> > of improvement and investigation.
>> >
>> >
>> > Backwards analyses.The framework only solves forward dataflow
>> > equations. This clearly prevents implementing some very familiar and
>> > useful analyses, like variable liveness. We intend to expand the
>> > framework to also support backwards dataflow equations. There’s
>> > nothing inherent to our framework that limits it to forwards.
>> >
>> >
>> > Reasoning about non-boolean values.Currently, we can symbolically
>> > model equality of booleans, but nothing else. Yet, for locations on
>> > the store that haven’t changed, we can understand which values are
>> > identical (because the input value object is passed by the transfer
>> > function to the output). For example,
>> >
>> > voidf(bool*a,bool*b){
>> >
>> >     std::optional<int>opt;
>> >
>> > if(a ==b){opt =42;}
>> >
>> > if(a ==b){opt.value();}
>> >
>> > }
>> >
>> > Since neither anor bhave changed, we can reason that the two
>> > occurrences of the expression a == bevaluate to the same value (even
>> > though we don't know the exact value). We intend to improve the
>> > framework to support this increased precision, given that it will not
>> > cost us any additional complexity.
>> >
>> >
>> > Improved model of program state.A major obstacle to writing
>> > high-quality checks for C++ code is the complexity of C++’s
>> > semantics.  In writing our framework, we’ve come to see the value of
>> > shielding the user from the need to handle these directly by modeling
>> > program state and letting the user express their check in terms of
>> > that model, after it has been solved for each program point. As is,
>> > our model currently supports a number of checks, including the
>> > std::optionalcheck and another that infers which pointer parameters
>> > are out parameters. We intend to expand the set of such checks, with
>> > improvements to the model and the API for expressing domain-specific
>> > semantics relevant to a check. Additionally, we are considering
>> > explicitly tracking other abstract program properties, like reads and
>> > writes to local variables and memory, so that users need not account
>> > for all the syntactic variants of these operations.
>> >
>> >
>> > Refactoring existing checks.Once the framework is more mature and
>> > stable, we would like to refactor existing checks that use custom,
>> > ad-hoc implementations of dataflow to use this framework.
>> >
>> >
>> >   Conclusion
>> >
>> > We have proposed an addition to the Clang repository of a new static
>> > analysis framework targeting all-pathsanalyses. The intended clients
>> > of such analyses are code-checking and transformation tools like
>> > clang-tidy, where sound analyses can support sound transformations of
>> > code. We discussed the key features and APIs of our framework and how
>> > they fit together into a new analysis framework. Moreover, our
>> > proposal is based on experience with a working implementation that
>> > supports a number of practical, high-precision analyses, including one
>> > for safe use of types like std::optionaland another that finds missed
>> > opportunities for use of std::move.
>> >
>> >
>> >
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20211012/84e22eac/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 4000 bytes
Desc: S/MIME Cryptographic Signature
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20211012/84e22eac/attachment-0001.bin>