[PATCH] D76996: [analyzer] Improve PlacementNewChecker

Karasev Nikita via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Sun Apr 19 00:30:25 PDT 2020


f00kat marked 4 inline comments as done.
f00kat added a comment.

Ping? :)



================
Comment at: clang/lib/StaticAnalyzer/Checkers/CheckPlacementNew.cpp:189
+  if (const MemRegion *MRegion = PlaceVal.getAsRegion()) {
+    if (const ElementRegion *TheElementRegion =
+            MRegion->getAs<ElementRegion>()) {
----------------
NoQ wrote:
> NoQ wrote:
> > The sequence of `FieldRegion`s and `ElementRegion`s on top of a base region may be arbitrary: `var.a[0].b[1][2].c.d[3]` etc.
> > 
> > I'd rather unwrap those regions one-by-one in a loop and look at the alignment of each layer.
> Alternatively, just decompose the whole region into base region and offset and see if base region has the necessary alignment and the offset is divisible by the necessary alignment.
> The sequence of FieldRegions and ElementRegions on top of a base region may be arbitrary: var.a[0].b[1][2].c.d[3] etc.

But i think(hope) I already do this and even have tests for this cases. For example
```void test22() {
  struct alignas(alignof(short)) Z {
    char p;
    char c[10];
  };

  struct Y {
    char p;
    Z b[10];
  };

  struct X {
    Y a[10];
  } Xi; // expected-note {{'Xi' initialized here}}

  // ok. 2(X align) + 1 (offset Y.p) + 1(align Z.p to 'short') + 1(offset Z.p) + 3(index)
  ::new (&Xi.a[0].b[0].c[3]) long;
}```

Cases with multidimensional arrays will also be handled correctly because method 'TheElementRegion->getAsArrayOffset()' calculates the offset for multidimensional arrays
```void testXX() {
	struct Y {
		char p;
		char b[10][10];
	};

	struct X {
		Y a[10];
	} Xi;

	::new (&Xi.a[0].b[0][0]) long;
}```

I can explain the code below for ElementRegion if needed.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/CheckPlacementNew.cpp:189
+  if (const MemRegion *MRegion = PlaceVal.getAsRegion()) {
+    if (const ElementRegion *TheElementRegion =
+            MRegion->getAs<ElementRegion>()) {
----------------
f00kat wrote:
> NoQ wrote:
> > NoQ wrote:
> > > The sequence of `FieldRegion`s and `ElementRegion`s on top of a base region may be arbitrary: `var.a[0].b[1][2].c.d[3]` etc.
> > > 
> > > I'd rather unwrap those regions one-by-one in a loop and look at the alignment of each layer.
> > Alternatively, just decompose the whole region into base region and offset and see if base region has the necessary alignment and the offset is divisible by the necessary alignment.
> > The sequence of FieldRegions and ElementRegions on top of a base region may be arbitrary: var.a[0].b[1][2].c.d[3] etc.
> 
> But i think(hope) I already do this and even have tests for this cases. For example
> ```void test22() {
>   struct alignas(alignof(short)) Z {
>     char p;
>     char c[10];
>   };
> 
>   struct Y {
>     char p;
>     Z b[10];
>   };
> 
>   struct X {
>     Y a[10];
>   } Xi; // expected-note {{'Xi' initialized here}}
> 
>   // ok. 2(X align) + 1 (offset Y.p) + 1(align Z.p to 'short') + 1(offset Z.p) + 3(index)
>   ::new (&Xi.a[0].b[0].c[3]) long;
> }```
> 
> Cases with multidimensional arrays will also be handled correctly because method 'TheElementRegion->getAsArrayOffset()' calculates the offset for multidimensional arrays
> ```void testXX() {
> 	struct Y {
> 		char p;
> 		char b[10][10];
> 	};
> 
> 	struct X {
> 		Y a[10];
> 	} Xi;
> 
> 	::new (&Xi.a[0].b[0][0]) long;
> }```
> 
> I can explain the code below for ElementRegion if needed.
?


================
Comment at: clang/lib/StaticAnalyzer/Checkers/CheckPlacementNew.cpp:25
 public:
   void checkPreStmt(const CXXNewExpr *NE, CheckerContext &C) const;
 
----------------
NoQ wrote:
> Before i forget: Ideally @martong should have subscribed to [[ https://clang.llvm.org/doxygen/classclang_1_1ento_1_1CheckerDocumentation.html#a7fdb3b5ff726f4c5e782cef0d59c01ad | `checkNewAllocator` ]] because it fires before the construct-expression whereas this callback fires after construct-expression which is too late as the UB we're trying to catch has occured much earlier.
Ops, I forgot about it. Will be fixed soon.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/CheckPlacementNew.cpp:25
 public:
   void checkPreStmt(const CXXNewExpr *NE, CheckerContext &C) const;
 
----------------
f00kat wrote:
> NoQ wrote:
> > Before i forget: Ideally @martong should have subscribed to [[ https://clang.llvm.org/doxygen/classclang_1_1ento_1_1CheckerDocumentation.html#a7fdb3b5ff726f4c5e782cef0d59c01ad | `checkNewAllocator` ]] because it fires before the construct-expression whereas this callback fires after construct-expression which is too late as the UB we're trying to catch has occured much earlier.
> Ops, I forgot about it. Will be fixed soon.
When I use checkNewAllocator instead of check::PreStmt<CXXNewExpr> an error occures in method PathDiagnosticBuilder::generate.
In the code ErrorNode->getLocation().getTag() because ProgramPoint contains an empty ProgramPointTag. 
I`m not very good with analyzer so its hard to understand what`s going on, but I`m still trying..



================
Comment at: clang/lib/StaticAnalyzer/Checkers/CheckPlacementNew.cpp:165-166
+  if (const MemRegion *MRegion = PlaceVal.getAsRegion()) {
+    if (const FieldRegion *TheFieldRegion = MRegion->getAs<FieldRegion>())
+      MRegion = TheFieldRegion->getBaseRegion();
+
----------------
NoQ wrote:
> You're saying that `A` is a struct and `a` is of type `A` and `&a` is sufficiently aligned then for //every// field `f` in the struct `&a.f` is sufficiently aligned. I'm not sure it's actually the case.
Yeah..Maybe we should take into account struct type align plus field offset? Currently, I am relying only on the fact that just struct type is aligned properly

For example
```struct X
{
  char a;
  int b;
} x;```

Type X is aligned to 'int' and thus field 'x.a' is also aligned to 'int' because it goes first

```struct Y
{
  char a;
  int b;
  char c;
  char d;
} y;```

But here field 'y.d' is aligned to 'char'

I will learn more about RegionOffset and will be back :)


================
Comment at: clang/lib/StaticAnalyzer/Checkers/CheckPlacementNew.cpp:196-197
+    }
+  } else if (Optional<loc::ConcreteInt> TheConcreteInt =
+                 PlaceVal.getAs<loc::ConcreteInt>()) {
+    uint64_t PlaceAlign = *TheConcreteInt.getValue().getValue().getRawData();
----------------
NoQ wrote:
> I don't think you'll ever see this case in a real-world program. Even if you would, i doubt we'll behave as expected, because we have [[ https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/clang/lib/StaticAnalyzer/Core/Store.cpp#L456 | certain hacks ]] in place that mess up arithmetic on concrete pointers. I appreciate your thinking but i suggest removing this section for now as it'll probably cause more false positives than true positives.
Okay I will remove it!


================
Comment at: clang/lib/StaticAnalyzer/Checkers/CheckPlacementNew.cpp:125
+        Msg = std::string(llvm::formatv(
+            "Possibly not enough {0} bytes for array allocation which "
+            "requires "
----------------
martong wrote:
> Maybe the below could be a wording that's more easy to follow?
> `{0} bytes is possibly not enough for array allocation which ...`
Yeah. Sure. I have some troubles with english :)


================
Comment at: clang/test/Analysis/placement-new.cpp:265
+
+  // bad 2(custom align) + 1(index '2' offset)
+  ::new (&Xi.b[1]) long; // expected-warning{{Storage type is aligned to 3 bytes but allocated type is aligned to 8 bytes}} expected-note 1 {{}}
----------------
martong wrote:
> Maybe it is just me, but the contents of the parens here and above seems a bit muddled `(index '2' offset)`. This should be `(index '1' offset)`, shouldn't it?
> What is the exact meaning of the number in the hyphens (`'2'` in this case), could you please elaborate?
Yeah, sorry it is copy-paste error. Of course there should be '1'.
```// bad 2(custom align) + 1(index '1' offset)```


================
Comment at: clang/test/Analysis/placement-new.cpp:256
+
+void f9() {
+  struct X {
----------------
martong wrote:
> First I was wondering if we indeed handle correctly structs with nested arrays whose element's type is a structs with nested arrays (... and so on).
> 
> So, I tried the below test, and it seems okay. Thus I think it might be worth to add something similar to it.
> 
> ```
> void f9_1() {
>   struct Y {
>     char a;
>     alignas(alignof(short)) char b[20];
>   };
>   struct X {
>     char e;
>     Y f[20];
>   } Xi; // expected-note {{'Xi' initialized here}}
> 
>   // ok 2(custom align) + 6*1(index '6' offset)
>   ::new (&Xi.f[6].b[6]) long;
> 
>   // bad 2(custom align) + 1*1(index '1' offset)
>   ::new (&Xi.f[1].b[1]) long; // expected-warning{{Storage type is aligned to 3 bytes but allocated type is aligned to 8 bytes}} expected-note 1 {{}}
> }
> 
> ```
Thanks! Added tests for this cases.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76996/new/

https://reviews.llvm.org/D76996





More information about the cfe-commits mailing list