[PATCH] D144269: [Analyzer] Show "taint originated here" note of alpha.security.taint.TaintPropagation checker at the correct place

Thu Feb 23 14:05:00 PST 2023

NoQ added a comment.

Yeah looks like I replied without properly reading the patch.

`TaintBugReport` is brilliant and we already have a precedent for subclassing `BugReport` in another checker. However I'm somewhat worried that once we start doing more of this, we'll eventually end up with multiple inheritance situations when the report needs multiple kinds of information. So at a glance my approach with a "generic data map" in bugreport objects looks a bit more future-proof to me. Also a bit easier to set up, no need to deal with custom RTTI.

So I think interestingness is just an example of such "generic data" attached to bug report. Interestingness is also somewhat confusing, because indeed, there are existing interesting rules, and I don't think anybody remembers what they are or what was even the purpose of having interestingness in the first place. Interestingness is currently used for tracking symbols with `trackExpressionValue()`, and we have those tracking kinds added by @Szelethus to make tracking behave slightly differently. So, yeah, I think interestingness shouldn't be used; it's already in use. I think it should be generalized upon i.e. just let checkers track whatever/however they want.

I guess my main point is, there shouldn't be a need to assist tracking by adding extra information to the program state. Information in the state should ideally be "material" to program execution, "tangible", it has to describe something that's actually stored somewhere in memory (either by directly defining it, or by constraining it). In particular, if two nodes result in indistinguishable future behavior of the program, we're supposed to merge them; but any "immaterial" bits of information in the state will prevent that from happening.

In our case it should be enough to have the lambda for propagation method ask "Hey, is this freshly produced propagation target value relevant to this specific report?" and if yes, mark the corresponding propagation source value as relevant to the report as well; also emit a note and "consume" the mark on the target value. Such chain of local decisions can easily replace the global taint flow identifier, and it's more flexible because this way the flow doesn't need to be "linear", it may branch in various ways and that's ok.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144269/new/

https://reviews.llvm.org/D144269