[PATCH] D137379: [-Wunsafe-buffer-usage] Add warnings for unsafe buffer accesses by array subscript operations

Mon Dec 5 15:19:02 PST 2022

xazax.hun added inline comments.

================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:168-169
+  static Matcher matcher() {
+    // FIXME: What if the index is integer literal 0? Should this be
+    // a safe gadget in this case?
+    return stmt(
----------------
NoQ wrote:
> aaron.ballman wrote:
> > xazax.hun wrote:
> > > As per some of the discussions, in the future the compiler might be able to recognize certain safe patterns, e.g., when there is a simple for loop with known bounds, or when both the index and the array size is statically known.
> > > 
> > > I think here we need to make a very important design decision: Do we want the gadgets to have the right "safety" category when it is created (e.g., we have to be able to decide if a gadget is safe or not using matchers), or do we want some mechanisms to be able to promote an unsafe gadget to be a safe one? (E.g., do we want to be able to prove some unsafe gadgets safe using dataflow analysis in a later pass?)
> > (FWIW, this is a great question and I really appreciate you asking it!)
> My original design implies that safe gadgets are a separate hierarchy, so there will be a new gadget class for index zero, and this gadget will be changed to skip index zero. But I don't immediately see why such early separation is strictly necessary, other than for a bit of extra type safety (extra virtual methods of the `UnsafeGadget` class don't make sense on safe gadgets). We *do* have time to make this distinction later, before we get to emitting warnings.
> 
> So maybe eventually we'll end up replacing `isSafe()` with a pure virtual method and deprecate `SafeGadget` and `UnsafeGadget` base classes, if we see it cause too much duplication or it turns out that the extra analysis necessary to establish safety can't be nicely implemented in ASTMatchers. In this case I'll admit that I over-engineered it a bit.
I'd strongly prefer `isSafe()` method over inheritance. While adding safe and unsafe gadgets for zero indices is not too bad, if we plan to expand this to more cases, like known array size + compile time constant index, it will get harder and harder to keep all the gadgets in sync. We can end up missing cases or detecting a code snippet both as safe and unsafe. The `isSafe` method would basically get rid of a whole class of problems, where safe and unsafe versions of the gadgets are overlapping. That being said, I am not insisting doing this change in this PR, it is OK doing the change later in a follow-up.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137379/new/

https://reviews.llvm.org/D137379