[PATCH] D127898: [clang][dataflow] Add API to separate analysis from diagnosis

Sam Estep via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Jun 23 09:56:05 PDT 2022


samestep marked an inline comment as done.
samestep added inline comments.


================
Comment at: clang/include/clang/Analysis/FlowSensitive/Diagnosis.h:38-42
+llvm::DenseSet<Diag> diagnoseCFG(
+    const ControlFlowContext &CFCtx,
+    std::vector<llvm::Optional<TypeErasedDataflowAnalysisState>> &BlockStates,
+    const Environment &Env, TypeErasedDataflowAnalysis &Analysis,
+    Diagnosis<Lattice, llvm::DenseSet<Diag>> Diagnose) {
----------------
xazax.hun wrote:
> xazax.hun wrote:
> > I feel a bit uneasy about this API. The need of helpers in the tests is a signal that this API might not be ergonomic as is. One needs to pass a lot of stuff in, and it is really error prone. For example, the user might get confused what `Environment` to pass in. Or one might pass the wrong `Analysis` if there are multiple similar ones in scope. I wonder if there is a way to reduce the ceremony needed to get diagnostics out. How about adding an optional argument to `runDataflowAnalysis` function to collect the results? I wonder if that would be a bit cleaner. You do not need to change the return type of that function, the collector object could take a reference to the `llvm::DenseSet<Diag>` that it populates. 
> Moreover, if we pass the diagnostic function to `runDataflowAnalysis`, the fact whether we collect diagnostic during the fixed-point iteration or after that in a separate pass becomes an implementation detail. We can easily change back and forth without affecting the callers. The current API mandates an implementation approach and makes it harder to change in the future.
I think this makes a lot of sense. I agree that the current set of parameters is unwieldy. I hadn't thought about your point of hiding the separate pass as an implementation detail, but that also makes sense to me. @gribozavr2 @ymandel @sgatev thoughts?


================
Comment at: clang/include/clang/Analysis/FlowSensitive/Diagnosis.h:38
+template <typename Lattice, typename Diag>
+llvm::DenseSet<Diag> diagnoseCFG(
+    const ControlFlowContext &CFCtx,
----------------
gribozavr2 wrote:
> This function seems pretty general and not necessarily tied to diagnostics.
> 
> WDYT about using "visitor" in the name?
> 
> For example, "visitDataflowAnalysisResults".
> 
> The "Diagnosis" would also become "DataflowAnalysisResultVisitor".
Sure, that sounds fine to me. What do you think about @xazax.hun's suggested alternate API? If we go with that, would we rename it to say "visitor" regardless?


================
Comment at: clang/include/clang/Analysis/FlowSensitive/Diagnosis.h:42
+    const Environment &Env, TypeErasedDataflowAnalysis &Analysis,
+    Diagnosis<Lattice, llvm::DenseSet<Diag>> Diagnose) {
+  llvm::DenseSet<Diag> Diags;
----------------
gribozavr2 wrote:
> I'm leaning towards std::vector<Diag> instead of a set, to avoid requiring that Diag is hashable. Users who need hash-based deduplication can easily do it themselves (but why would they have duplication? we only visit each Stmt once)
This makes a lot of sense, I'll make that change. Somehow along the way I forgot that the whole reason we even needed a set was because we were using it as a lattice when computing a fixpoint, but you're right that since we're pulling this out of the fixpoint iteration, we a vector would do fine.


================
Comment at: clang/include/clang/Analysis/FlowSensitive/Diagnosis.h:57
+  }
+  return std::move(Diags);
+}
----------------
samestep wrote:
> gribozavr2 wrote:
> > sgatev wrote:
> > > Better to remove this and rely on NRVO? https://en.cppreference.com/w/cpp/language/copy_elision
> > https://stackoverflow.com/questions/14856344/when-should-stdmove-be-used-on-a-function-return-value
> Ah OK, I'm not a C++ expert so I wasn't sure when to write `std::move` and when not to. Will do, thanks!
Thanks!


================
Comment at: clang/include/clang/Analysis/FlowSensitive/MatchSwitch.h:49
+template <typename DiagsT> struct DiagnoseState {
+  DiagnoseState(DiagsT &Diags, const Environment &Env)
+      : Diags(Diags), Env(Env) {}
----------------
gribozavr2 wrote:
> samestep wrote:
> > sgatev wrote:
> > > samestep wrote:
> > > > sgatev wrote:
> > > > > Move this to Diagnosis.h?
> > > > I'd prefer to keep it in `MatchSwitch.h` alongside `TransferState`, unless we plan to, e.g., move that type to `DataflowAnalysis.h`.
> > > Fair enough, but generally I see `TransferState` and `DiagnoseState` as very specific to the dataflow analysis framework whereas `MatchSwitchBuilder` seems to be a bit more generic so I'd consider separating them.
> > Sure, if people think it makes sense to put those two types elsewhere then we can do that in a followup.
> For now, since there is only a single user, I'd actually prefer to move it into UncheckedOptionalAccessDiagnosis itself. It is a trivial component, basically a std::pair, it seems that exposing it as a public API might not be that helpful.
That's also fine; I can make that change here.


================
Comment at: clang/include/clang/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.h:85
+private:
+  ASTContext &Context;
+  MatchSwitch<DiagnoseState<llvm::DenseSet<SourceLocation>>>
----------------
xazax.hun wrote:
> Do we expect this to modify the ASTContext? Could this be a const ref instead?
I'd guess it shouldn't; I was mostly just copying the `ASTContext &` field from the `DataflowAnalysis` class, which is not `const`. I can try changing it to `const` here, though.


================
Comment at: clang/include/clang/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.h:78
+public:
+  UncheckedOptionalAccessDiagnosis(
+      ASTContext &AstContext, UncheckedOptionalAccessModelOptions Options = {});
----------------
gribozavr2 wrote:
> "Diagnosis" sounds like the result. Should this be a "Diagnoser"?
Sure, I can change it to say "Diagnoser" instead (unless we want to just replace it with "Visitor").


================
Comment at: clang/unittests/Analysis/FlowSensitive/TestingSupport.h:92
     std::function<AnalysisT(ASTContext &, Environment &)> MakeAnalysis,
-    std::function<void(
-        llvm::ArrayRef<std::pair<
-            std::string, DataflowAnalysisState<typename AnalysisT::Lattice>>>,
-        ASTContext &)>
-        Expectations,
+    std::function<void(AnalysisResults)> Expectations,
     ArrayRef<std::string> Args,
----------------
gribozavr2 wrote:
> It is better to name a function with a verb. WDYT about "VerifyResults"?
> 
> If you agree, please also update the functions below.
Sounds good to me, I can do that.


================
Comment at: clang/unittests/Analysis/FlowSensitive/TestingSupport.h:190
+              });
+        }
+        Expectations(Results, AnalysisResults.Context);
----------------
gribozavr2 wrote:
> Can we use diagnoseCFG to implement the loop above?
Possibly! I'll look into it.


================
Comment at: clang/unittests/Analysis/FlowSensitive/UncheckedOptionalAccessModelTest.cpp:1925
   //  )",
-  //      UnorderedElementsAre(Pair("check-1", "unsafe: input.cc:10:9"),
-  //                           Pair("check-2", "safe")));
+  //      "safe");
 }
----------------
gribozavr2 wrote:
> The original was unsafe, is it an intentional change?
Oh! Good catch, I didn't notice that. Will fix.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127898/new/

https://reviews.llvm.org/D127898



More information about the cfe-commits mailing list