[PATCH] D66014: [analyzer] Avoid unnecessary enum range check on LValueToRValue casts

Tue Aug 13 13:50:35 PDT 2019

NoQ accepted this revision.
NoQ added a comment.
This revision is now accepted and ready to land.

Thanks!

================
Comment at: clang/lib/StaticAnalyzer/Checkers/EnumCastOutOfRangeChecker.cpp:94-98
+  // If cast is implicit LValueToRValue, no conversion is taking place,
+  // and therefore no range check is needed.  Don't analyze further.
+  if (CE->getCastKind() == CK_LValueToRValue)
+    return;
+
----------------
chrish_ericsson_atx wrote:
> NoQ wrote:
> > chrish_ericsson_atx wrote:
> > > NoQ wrote:
> > > > chrish_ericsson_atx wrote:
> > > > > NoQ wrote:
> > > > > > If you look at the list of cast kinds, you'll be shocked to see ridiculously weird cornercases. Even though lvalue-to-rvalue cast definitely stands out (because it's the only one that has very little to do with actual casting), i'd still be much more comfortable if we *whitelist* the casts that we check, rather than blacklist them.
> > > > > > 
> > > > > > Can you take a look at the list and find which casts are we actually talking about and hardcode them here?
> > > > > I'm happy to cross-check a list; however, I'm not aware of the list you are referring to.  Would you mind providing a bit more detail?
> > > > See **[[ https://github.com/llvm-mirror/clang/blob/release_90/include/clang/AST/OperationKinds.def | include/clang/AST/OperationKinds.def ]]**.
> > > Ah.   Okay -- That is a list I've seen before, of course...  :)   
> > > 
> > > Hmm...  regarding whitelisting versus blacklisting...  In this case, it seems to me that switching to whitelisting casts that we know we should be checking increases the risk that we would introduce a new bug -- specifically that we'd accidentally leave out cast type that should have been included, which would disable a legitimate check (that some users might already be relying on).  By contrast, blacklisting the one cast type we know should *not* be checked adds zero new risk and still fixes this bug.
> > > 
> > > The sense of risk for me comes partly from being fairly new to working on clang AST code -- this is my first change.   It strikes me that if I were to feel confident about having the "correct" whitelist, I would have to understand every cast type fairly deeply.  Given that I see several cast kinds in that list that I know nothing about, I'm concerned with the level of effort required, and that I still won't be able to fully mitigate the risk.
> > > 
> > > Can you help me understand your rationale for preferring a whitelist in this case?  Or, do you feel I've overstated the risk here?  I'm not emotionally invested in being proven correct-- just want to learn!  :)
> > > 
> > Generally, if you're not sure, try to not to warn. False positives are worse than false negatives. It is natural to start with a small set of cases in which you're sure, and then slowly cover more and more cases as you investigate further. Leave a comment if you're unsure if you covered all cases, so that people knew that the code is potentially incomplete and it's a good place to look for potential improvements.
> > 
> > When you're actively developing the checker, you should rely on your own experiments with the real-world code that you plan to analyze. When the checker is still in alpha, it's fine for it to be more aggressive than necessary, because "you're still experimenting with it", but this will need to be addressed when the checker is moved out of alpha.
> > 
> > In particular, you have the following tricks at your disposal:
> > - Take a large codebase, run your checker on it (you'll need to do this regularly anyway) and include the cast kind in the warning text. This will give you a nice list of casts that you want to support (and probably also a list of casts that you //don't// want to support).
> > - If you want to be notified of false negatives by your power-users, instead of a whitelist put an assertion (immediately before emitting the report) that the cast kind is one of the known cast kinds. This is a bit evil, of course, but it only affects the users that run analysis with assertions on, which means custom-built clang, which means power-users that actively want to report these bugs.
> > - If you don't know what code does a specific AST node kind represent, as a last resort you can always assert that this kind of node never gets constructed and see which clang tests fail.
> > - If you want to have some protection against newly introduced cast kinds, make a `switch (getCastKind())` and don't add a `default:` case into it. This way anybody who introduces a new cast kind would get a compiler warning ("unhandled case in switch") and will be forced to take a look at how this code should handle the new cast kind.
> Okay.  Very fair points.  I've been focused too much on released production code lately-- the mantra there is to be a surgical as possible, and minimize risk of any changes other than the one we are specifically trying to correct.  Of course, that doesn't preclude wider-scope change, but it requires a great deal more time and effort to validate.  In this case, where the code is alpha, and where missing a valid warning is better than issuing a spurious one, it makes sense to do as you've described.  I'll work on that.   Thanks for taking the time to explain in more detail.
> 
> Logistical question now (since this is my first change and review for LLVM/clang code) -- does this review stall until I have updated code (which may not happen quickly due to other work on my plate)?  Or do we greenlight this change, and I just follow it up later with the whitelist version in a separate review?
This change is already is improvement, so feel free to add a `// FIXME: Turn this into a whitelist.` and commit :)

Generally, patches don't need to be perfect, but they should be going in the right direction and shouldn't cause glaring regressions. A lot of review feedback is not in order to block the patch, but instead to help you make it even better and share common values and such.

================
Comment at: clang/test/Analysis/enum-cast-out-of-range.c:27-34
+enum unscoped_unspecified_t unused;
+void unusedExpr() {
+    // following line is not something that EnumCastOutOfRangeChecker should evaluate.  checker should either ignore this line
+    // or process it without producing any warnings.  However, compilation will (and should) still generate a warning having 
+    // nothing to do with this checker.
+    unused; // expected-warning {{expression result unused}}
+}
----------------
chrish_ericsson_atx wrote:
> NoQ wrote:
> > I guess this covers D33672#1537287!
> > 
> > That said, there's one thing about this test that i don't understand. Why does this AST contain an implicit lvalue-to-rvalue cast at all? This looks like a (most likely otherwise harmless) bug in the AST; if i understand correctly, this should be a freestanding `DeclRefExpr`.
> It sure looks like D33672 is the same bug, so yes, I bet it will fix that one.   This fix also addresses https://bugs.llvm.org/show_bug.cgi?id=41388.  (Forgive me if there's a way to directly link to bugs.llvm.org with this markup.)
> 
> Not sure about whether the AST should have represented this node as a DeclRefExpr-- that certainly makes some sense, but the LValueToRValue cast isn't really wrong either.   At the end of the day, this statement is useless, so I'm not sure it matters (much) how the AST represents it.  Representing it as LValueToRValue cast wouldn't cause side effects, would it? (E.g.,  cause IR to contain explicit dereferencing code?)
Aha, i think i made some sense out of it. That's different in C and C++. In particular, if you have a freestanding `*x` where `x` is a null pointer, it's UB in C but not in C++ (though `&*x` is explicitly defined to be the same as `x` in C, so it's not a UB anymore if you proceed to turn it back into a pointer rvalue). I figure, Clang models these different rules by automatically wrapping any freestanding lvalue into an implicit lvalue-to-rvalue cast in C but not in C++.

Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66014/new/

https://reviews.llvm.org/D66014