[PATCH] D144269: [Analyzer] Show "taint originated here" note of alpha.security.taint.TaintPropagation checker at the correct place

Daniel Krupp via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Apr 19 08:52:02 PDT 2023


dkrupp marked an inline comment as done.
dkrupp added a comment.

@steakhal  thanks for your review. All your remarks have been fixed.



================
Comment at: clang/lib/StaticAnalyzer/Checkers/Taint.cpp:147-150
 bool taint::isTainted(ProgramStateRef State, const Stmt *S,
                       const LocationContext *LCtx, TaintTagType Kind) {
-  SVal val = State->getSVal(S, LCtx);
-  return isTainted(State, val, Kind);
+  return !getTaintedSymbols(State, S, LCtx, Kind).empty();
 }
----------------
steakhal wrote:
> TBH I'm not sure if I like that now we allocate unbounded amount of times (because of `getTaintedSymbols()` is recursive and returns by value), where we previously did not.
> 
> What we could possibly do is to compute the elements of this sequence lazily.
> I'm thinking of the `llvm::mapped_iterator`, but I'm not sure if it's possible to have something like that as a return type as it might encode the map function in the type or something like that.
> Anyway, I'm just saying that it would be nice to not do more than it's necessary, and especially not allocate a lot of short-lived objects there.
> 
> Do you think there is a way to have the cake and eat it too?
> 
> ---
> 
> I did some investigation, and one could get pretty far in the implementation, and maybe even complete it but it would be really complicated as of now. Maybe we could revisit this subject when we have coroutines.
> 
> So, I would suggest to have two sets of APIs:
>  - the usual `isTainted(.) -> bool`
>  - and a `getTaintedSymbols(.) -> vector<Sym>`
> The important point would be that the `isTainted()` version would not eagerly collect all tainted sub-syms but return on finding the first one.
> While, the `getTaintedSymbols()` would collect eagerly all of them, as its name suggests.
> 
> Imagine if `getTaintedSymbolsImpl()` had an extra flag like `bool returnAfterFirstMatch`. This way `isTainted()` can call it like that. While in the other case, the parameter would be `false`, and eagerly collect all symbols.
> 
> This is probably the best of both worlds, as it prevents `isTainted` from doing extra work and if we need to iterate over the tainted symbols, we always iterate over all of them, so doing it lazily wouldn't gain us much in that case anyway.
> As a bonus, the user-facing API would be self-descriptive.
> 
> WDYT?
Good idea. I implemented the early return option in getTaintedSymbols(). This is used now by the isTainted() function.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144269/new/

https://reviews.llvm.org/D144269



More information about the cfe-commits mailing list