[PATCH] D122231: [clang][dataflow] Add support for `value_or` in a comparison.

Wed Mar 30 01:02:35 PDT 2022

sgatev accepted this revision.
sgatev added inline comments.

================
Comment at: clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp:279
+                 cast_or_null<BoolValue>(Env.getValue(*ComparisonExprLoc))) {
+    Env.setValue(*ComparisonExprLoc,
+                 Env.makeAnd(*CurrentValue, ComparisonValue));
----------------
ymandel wrote:
> sgatev wrote:
> > ymandel wrote:
> > > ymandel wrote:
> > > > xazax.hun wrote:
> > > > > I am still wondering a bit about this case. 
> > > > > 
> > > > > We generate: `HasValueVal and ContentsNotEqX and CurrentValue`.'
> > > > > I wonder if we want: `HasValueVal  and (ContentsNotEqX  <=> CurrentValue)` instead?  Or even `HasValueVal  and CurrentValue`?
> > > > I don't think that the iff version is right, but `HasValueVal  and CurrentValue` could be. My concern is that we're not guaranteed that `CurrentValue` is populated. And, even if we were, it doesn't feel quite right. Assuming its a high fidelity model, we get (logically): `HasValue(opt) and Ne(ValueOr(opt,X),X)`. Then, when negated (say, on an else branch) we get `not(HasValue(opt)) or not(Ne(ValueOr(opt,X),X))` which is equivalent to `not(HasValue(opt)) or Eq(ValueOr(opt,X),X)`. While  true, it seems redundant, since the first clause should be derivable from the second (assuming an interpretatable semantics to the `ValueOr` predicate).
> > > > 
> > > > Regardless, it might be better to step back and figure out how this should be done systematically. I'll try to come back with a proposal on that.
> > > > Regardless, it might be better to step back and figure out how this should be done systematically. I'll try to come back with a proposal on that.
> > > 
> > > Here's what I have: in general, we're aiming for all models to be a sound (over) approximation of reality. That is what we're doing here as well. Yet, that poses a problem for the interpretation of the boolean not operator. If its operand is an overapproximation, then I believe the naive approach gives you an under approximation. That's the problem we're hitting when reasoning about the negation.
> > > 
> > > I'm not sure how to handle this. Stanislav -- have we dealt with this issue before?
> > > 
> > > That said, if we go back to the previous approach, of adding the information to the path condition, I think we avoid this problem, since the path conditions don't get negated.  To Gabor's earlier point:
> > > > There is an implication in the reverse direction as well. In case we know the optional is empty, we can prune one of the branches from the analysis. Is it possible to implement that with the current status of the framework?
> > > I think is covered by the condition we're adding. Namely:
> > > ```
> > > ExprValue => has_value
> > > ```
> > > where `ExprValue` is the truth value of the boolean expression.
> > > 
> > > So, the implication in the reverse direction is:
> > > ```
> > > !has_value => !ExprValue
> > > ```
> > > that is, if we know the optional doesn't hold a value, then we know that `opt.value_or(X) = X`
> > > 
> > > But, that implication is the contrapositive of our own, so I think it's already implicitly covered by adding the single implication. Does that sound right?
> > I'm not following where `Env.makeAnd(*CurrentValue, ComparisonValue)` comes from so I'd question whether it's sound or not. I would have expected to see something like `ExprValue => has_value` (which I believe was the case in the first iteration) and I see no issues with the contrapositive. If you have `x => y` and `not y` in the flow condition, you'll be able to infer that `not x` is true (assuming no other statements for `x`). How we use this to prune branches from the analysis is a question of its own.
> I think the new version resolves this?
Yes, modelling these using implications looks good to me!

================
Comment at: clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp:275
+      State.Env.getValue(*ValueOrPredExpr, SkipPast::None));
+  if (ExprValue == nullptr) {
+    auto &ExprLoc = State.Env.createStorageLocation(*ValueOrPredExpr);
----------------
Why do this conditionally? I think we should set a value regardless of whether another model has already done so.

================
Comment at: clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp:525
+      // opt.value_or(X) != X
+      .CaseOf<Expr>(isValueOrCondition(), transferValueOrNotEqX)
+
----------------
Call this `isValueOrNotEqX` for consistency?

================
Comment at: clang/unittests/Analysis/FlowSensitive/UncheckedOptionalAccessModelTest.cpp:1720
+    void target() {
+      $ns::$optional<int*> opt;
+      if (opt.value_or(nullptr) != nullptr) {
----------------
I suggest making `opt` a parameter of `target` in all tests because in the current setup a more advanced analysis would identify one of the code paths we exercise as dead.

================
Comment at: clang/unittests/Analysis/FlowSensitive/UncheckedOptionalAccessModelTest.cpp:1782
+    namespace std {
+      struct string {
+        string(const char*);
----------------
Let's move this to a string header and remove the definition in the test above.

================
Comment at: clang/unittests/Analysis/FlowSensitive/UncheckedOptionalAccessModelTest.cpp:1811-1812
+    void target() {
+      $ns::$optional<int> opt;
+      auto *opt_p = &opt;
+      if (opt_p->value_or(0) != 0) {
----------------
These can be combined in a `$ns::$optional<int> *opt` parameter.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122231/new/

https://reviews.llvm.org/D122231