[PATCH] D74973: [analyzer] StdLibraryFunctionsChecker refactor w/ inheritance

Gabor Marton via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Mar 6 09:54:04 PST 2020


martong marked 4 inline comments as done.
martong added a comment.

In D74973#1900852 <https://reviews.llvm.org/D74973#1900852>, @Szelethus wrote:

> I have some high level questions, you have spent far more time with this code and I'm happy to be proven wrong! :)
>
> In D74973#1889188 <https://reviews.llvm.org/D74973#1889188>, @martong wrote:
>
> > > Is really more kind of constraint needed than range constraint?
> >
> > Yes, there are other constraints I am planning to implement:
> >
> > - Size requirements E.g.: asctime_s(char *buf, rsize_t bufsz, const struct tm *time_ptr); `buf` size must be at least `bufsz`.
> > - Not-null
> > - Not-uninitalized
> > - Not-tainted
>
>
> Are we really sure that we need to express that with constraints? Can't we just change the name of `ValueRange` (or encapsulate it in another class) and add more fields to it, such as taintedness or initializedness? Is there an incentive to keep `ValueRange` lean?


Yes there is: separation of concerns, i.e. ValueRange should handle Values with int ranges. One class should handle only one well established responsibility.

> This doesn't look too bad:
> 
>   auto Getc = [&]() {
>     return Summary(ArgTypes{Irrelevant}, RetType{IntTy}, NoEvalCall)
>         .Case(
>             {ReturnValueDescription(RangeConstraints(WithinRange, {{EOFv, EOFv}, {0, UCharMax}},
>                                     Tainted, Non_Uninitialized});
>   };
> 
> 
> 
> 
>>> A non-null can be represented as range constraint too.
>> 
>> Actually, to implement that we should have a branch in all `ValueRange::apply*` functions that handles `Loc` SVals. Unfortunately, a pointer cannot be handled as `NonLoc`, and the current Range based implementation handles `NonLoc`s only.
> 
> So, why didn't we take that route instead? Marking a pointer non-null seems to be a less invasive change.

The answer is the same here. I think we should not mix too much implementation of different cases into one monumental function/class.

> 
> 
>>> The compare constraint is used only for the return value for which a special `ReturnConstraint` can be used to handle the return value not like a normal argument (and then the `Ret` special value is not needed).
>> 
>> The Compare constraint is already forced into a Range "concept" whereas it has nothing to do with ranges. By handling compare constraints separately, we attach a single responsibility to each constraint class, instead of having a monolithic god constraint class. Take a look at this coerced data representation that we have today in ValueRange:
>> 
>>   BinaryOperator::Opcode getOpcode() const {
>>     assert(Kind == ComparesToArgument);
>>     assert(Args.size() == 1);
>>     BinaryOperator::Opcode Op =
>>         static_cast<BinaryOperator::Opcode>(Args[0].first);
>>     assert(BinaryOperator::isComparisonOp(Op) &&
>>            "Only comparison ops are supported for ComparesToArgument");
>>     return Op;
>>   }
>>    
>> 
>> Subclasses are a good way to get rid of this not-so-intuitive structure and assertions.
> 
> Having more fields feels like another possible solution.

Yes, having fields is another approach of having an enum and a union (i.e. tagged union). A tagged union is a variant basically. So adding more fields would finally be very similar to a variant based solution. And that is useful only if we know the set of classes beforehand and we want to gradually implement new operations on them.
In our case we know that we need 3 operations: apply, negate, warning and no more. And we want to gradually add new classes: NonNull, NotUninitialized, ...



================
Comment at: clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp:151
+
+  using ValueConstraintPtr = std::shared_ptr<ValueConstraint>;
+  /// The complete list of constraints that defines a single branch.
----------------
Szelethus wrote:
> gamesh411 wrote:
> > martong wrote:
> > > Note here, we need a copyable, polymorphic and default initializable type (vector needs that). A raw pointer were good, however, we cannot default initialize that. unique_ptr makes the Summary class non-copyable, therefore not an option.
> > > Releasing the copyablitly requirement would render the initialization of the Summary map infeasible.
> > > Perhaps we could come up with a [[ https://www.youtube.com/watch?v=bIhUE5uUFOA | type erasure technique without inheritance ]] once we consider the shared_ptr as restriction, but for now that seems to be overkill.
> > std::variant (with std::monostate for the default constructibility) would also be an option  (if c++17 were supported). But this is not really an issue, i agree with that.
> Ugh, we've historically been very hostile towards virtual functions. We don't mind them that much when they don't have to run a lot, like during bug report construction, but as a core part of the analysis, I'm not sure what the current stance is on it.
> 
> I'm not inherently (haha) against it, and I'm fine with leaving this as-is for the time being, though I'd prefer if you placed a `TODO` to revisit this issue.
> std::variant (with std::monostate for the default constructibility) would also be an option (if c++17 were supported). But this is not really an issue, i agree with that.

Variant would be useful if we knew the set of classes prior and we wanted to add operations gradually. Class hierarchies (or run-time concepts [Sean Parent]) are very useful if we know the set of operations prior and we want to add classes gradually, and we have this case here.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp:151
+
+  using ValueConstraintPtr = std::shared_ptr<ValueConstraint>;
+  /// The complete list of constraints that defines a single branch.
----------------
martong wrote:
> Szelethus wrote:
> > gamesh411 wrote:
> > > martong wrote:
> > > > Note here, we need a copyable, polymorphic and default initializable type (vector needs that). A raw pointer were good, however, we cannot default initialize that. unique_ptr makes the Summary class non-copyable, therefore not an option.
> > > > Releasing the copyablitly requirement would render the initialization of the Summary map infeasible.
> > > > Perhaps we could come up with a [[ https://www.youtube.com/watch?v=bIhUE5uUFOA | type erasure technique without inheritance ]] once we consider the shared_ptr as restriction, but for now that seems to be overkill.
> > > std::variant (with std::monostate for the default constructibility) would also be an option  (if c++17 were supported). But this is not really an issue, i agree with that.
> > Ugh, we've historically been very hostile towards virtual functions. We don't mind them that much when they don't have to run a lot, like during bug report construction, but as a core part of the analysis, I'm not sure what the current stance is on it.
> > 
> > I'm not inherently (haha) against it, and I'm fine with leaving this as-is for the time being, though I'd prefer if you placed a `TODO` to revisit this issue.
> > std::variant (with std::monostate for the default constructibility) would also be an option (if c++17 were supported). But this is not really an issue, i agree with that.
> 
> Variant would be useful if we knew the set of classes prior and we wanted to add operations gradually. Class hierarchies (or run-time concepts [Sean Parent]) are very useful if we know the set of operations prior and we want to add classes gradually, and we have this case here.
> Ugh, we've historically been very hostile towards virtual functions. We don't mind them that much when they don't have to run a lot, like during bug report construction, but as a core part of the analysis, I'm not sure what the current stance is on it.

I did not find any evidence for this statement. Consider as a counter example the ExternalASTSource interface in Clang, which is filled with virtual functions and is part of the C/C++ lookup mechanism, which is quite on the hot path of C/C++ parsing I think. Did not find any prohibition in LLVM coding guidelines neither. I do believe virtual functions have their use cases exactly where (runtime) polimorphism is needed, such as in this patch.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74973/new/

https://reviews.llvm.org/D74973





More information about the cfe-commits mailing list