[PATCH] D137379: [-Wunsafe-buffer-usage] Add warnings for unsafe buffer accesses by array subscript operations

Mon Nov 28 13:46:01 PST 2022

NoQ added inline comments.

================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:168-169
+  static Matcher matcher() {
+    // FIXME: What if the index is integer literal 0? Should this be
+    // a safe gadget in this case?
+    return stmt(
----------------
aaron.ballman wrote:
> xazax.hun wrote:
> > As per some of the discussions, in the future the compiler might be able to recognize certain safe patterns, e.g., when there is a simple for loop with known bounds, or when both the index and the array size are statically known.
> > 
> > I think here we need to make a very important design decision: Do we want each gadget to have the right "safety" category when it is created (e.g., we have to be able to decide if a gadget is safe or not using matchers), or do we want some mechanism to promote an unsafe gadget to a safe one? (E.g., do we want to be able to prove some unsafe gadgets safe using dataflow analysis in a later pass?)
> (FWIW, this is a great question and I really appreciate you asking it!)
My original design implies that safe gadgets are a separate hierarchy, so there will be a new gadget class for index zero, and this gadget will be changed to skip index zero. But I don't immediately see why such early separation is strictly necessary, other than for a bit of extra type safety (extra virtual methods of the `UnsafeGadget` class don't make sense on safe gadgets). We *do* have time to make this distinction later, before we get to emitting warnings.

So maybe eventually we'll end up replacing `isSafe()` with a pure virtual method and deprecating the `SafeGadget` and `UnsafeGadget` base classes, if we see them cause too much duplication or if it turns out that the extra analysis necessary to establish safety can't be nicely implemented in ASTMatchers. In that case I'll admit that I over-engineered it a bit.
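
To make the trade-off concrete, here's a rough sketch of the two shapes side by side. Everything in it is a simplified stand-in, not the code from the patch: the `by_hierarchy`/`by_method` namespaces, `ZeroIndexSubscriptGadget`, the `ConstantIndex` field and `compareDesigns()` are hypothetical names, and the real gadgets are built from matcher results rather than from a bare integer.
```lang=c++
#include <cassert>
#include <cstdint>
#include <optional>

// Shape 1: safety is fixed by the class a gadget belongs to, so it has to be
// decided at match time. Index zero gets its own (hypothetical) class.
namespace by_hierarchy {
class Gadget {
public:
  virtual ~Gadget() = default;
  virtual bool isSafe() const = 0;
};
class SafeGadget : public Gadget {
public:
  bool isSafe() const final { return true; }
};
class UnsafeGadget : public Gadget {
public:
  bool isSafe() const final { return false; }
};
// Would be matched only for subscripts with a literal zero index.
class ZeroIndexSubscriptGadget final : public SafeGadget {};
// Would be matched for every other subscript.
class ArraySubscriptGadget final : public UnsafeGadget {};
} // namespace by_hierarchy

// Shape 2: no Safe/Unsafe base classes. Each gadget answers isSafe() itself,
// so the answer can depend on facts established after matching.
namespace by_method {
class Gadget {
public:
  virtual ~Gadget() = default;
  virtual bool isSafe() const = 0;
};
class ArraySubscriptGadget final : public Gadget {
  // Assumption for the sketch: the index value if it is a known constant.
  std::optional<std::int64_t> ConstantIndex;

public:
  explicit ArraySubscriptGadget(std::optional<std::int64_t> Index)
      : ConstantIndex(Index) {}
  bool isSafe() const override {
    return ConstantIndex.has_value() && *ConstantIndex == 0;
  }
};
} // namespace by_method

// Both shapes give the same answers for `buf[0]` and for `buf[i]`.
inline void compareDesigns() {
  assert(by_hierarchy::ZeroIndexSubscriptGadget().isSafe());
  assert(!by_hierarchy::ArraySubscriptGadget().isSafe());
  assert(by_method::ArraySubscriptGadget(0).isSafe());
  assert(!by_method::ArraySubscriptGadget(std::nullopt).isSafe());
}
```
The first shape keeps unsafe-only virtual methods off safe gadgets at the cost of one extra class per safe pattern; the second needs no extra classes but leaves nothing in the type system to stop you from asking a safe gadget an unsafe-only question.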

================
Comment at: clang/test/SemaCXX/warn-unsafe-buffer-usage.cpp:10-13
+void foo(...);
+
+void * bar(void);
+char * baz(void);
----------------
ziqingluo-90 wrote:
> steakhal wrote:
> > I would expect this test file to grow quite a bit.
> > As such, I think we should have more self-descriptive names for these functions.
> > 
> > I'm also curious what's the purpose of `foo()` in the examples.
> Thanks for the comment. I agree that they should have better names or at least explanatory comments.
> 
> > I'm also curious what's the purpose of `foo()` in the examples.
> 
> I make all the test expressions arguments of `foo` so that I do not have to create statements to use these expressions, while avoiding irrelevant warnings.
That's pretty cool, but please note that when `foo()` is declared this way, it becomes a "C-style variadic function" - a very exotic construct that you don't normally see in code (the most familiar practical example is the `printf`/`scanf` family of functions). So it may be good that we cover this exotic case from the start, but it may also be really bad that we don't cover the *basic* case.

C++ offers a different way to declare variadic functions: //variadic templates// (https://en.cppreference.com/w/cpp/language/parameter_pack). These are less valuable to test in their own right because they expand to an AST that's very similar to the basic case, but for the same reason they let you cover the basic case better.
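
A minimal sketch of what that could look like, assuming it's acceptable for the helper to take its arguments by value. The empty body is only there so the snippet links on its own (in the test file a bare declaration would probably do), and `countArgs()` and `testSubscripts()` are hypothetical names added for illustration:
```lang=c++
// Accepts any number of arguments of any type and ignores them. Each call
// instantiates an ordinary fixed-arity specialization, e.g. foo<int, int>.
template <typename... Args>
void foo(Args...) {}

// Illustration only: reports how many arguments a call passed.
template <typename... Args>
constexpr unsigned countArgs(Args...) {
  return sizeof...(Args);
}

// The subscript expressions are ordinary arguments of an ordinary call.
void testSubscripts(int *p, int i) {
  foo(p[1], p[i], p[i + 1]);
}

static_assert(countArgs(1, 2, 3) == 3, "one template parameter per argument");
```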

Or you can always make yourself happy with a few overloads that cover all your needs; it's not like we're worried about code duplication in tests:
```lang=c++
void foo(int);
void foo(int, int);
void foo(int, int, int);
void foo(int, int, int, int);
void foo(int, int, int, int, int);
void foo(int, int, int, int, int, int);
```

https://reviews.llvm.org/D137379