[PATCH] D76996: [analyzer] Improve PlacementNewChecker
Karasev Nikita via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Sun Apr 19 00:30:25 PDT 2020
f00kat marked 4 inline comments as done.
f00kat added a comment.
Ping? :)
================
Comment at: clang/lib/StaticAnalyzer/Checkers/CheckPlacementNew.cpp:189
+ if (const MemRegion *MRegion = PlaceVal.getAsRegion()) {
+ if (const ElementRegion *TheElementRegion =
+ MRegion->getAs<ElementRegion>()) {
----------------
NoQ wrote:
> NoQ wrote:
> > The sequence of `FieldRegion`s and `ElementRegion`s on top of a base region may be arbitrary: `var.a[0].b[1][2].c.d[3]` etc.
> >
> > I'd rather unwrap those regions one-by-one in a loop and look at the alignment of each layer.
> Alternatively, just decompose the whole region into base region and offset and see if base region has the necessary alignment and the offset is divisible by the necessary alignment.
> The sequence of FieldRegions and ElementRegions on top of a base region may be arbitrary: var.a[0].b[1][2].c.d[3] etc.
But i think(hope) I already do this and even have tests for this cases. For example
```void test22() {
struct alignas(alignof(short)) Z {
char p;
char c[10];
};
struct Y {
char p;
Z b[10];
};
struct X {
Y a[10];
} Xi; // expected-note {{'Xi' initialized here}}
// ok. 2(X align) + 1 (offset Y.p) + 1(align Z.p to 'short') + 1(offset Z.p) + 3(index)
::new (&Xi.a[0].b[0].c[3]) long;
}```
Cases with multidimensional arrays will also be handled correctly because method 'TheElementRegion->getAsArrayOffset()' calculates the offset for multidimensional arrays
```void testXX() {
struct Y {
char p;
char b[10][10];
};
struct X {
Y a[10];
} Xi;
::new (&Xi.a[0].b[0][0]) long;
}```
I can explain the code below for ElementRegion if needed.
================
Comment at: clang/lib/StaticAnalyzer/Checkers/CheckPlacementNew.cpp:189
+ if (const MemRegion *MRegion = PlaceVal.getAsRegion()) {
+ if (const ElementRegion *TheElementRegion =
+ MRegion->getAs<ElementRegion>()) {
----------------
f00kat wrote:
> NoQ wrote:
> > NoQ wrote:
> > > The sequence of `FieldRegion`s and `ElementRegion`s on top of a base region may be arbitrary: `var.a[0].b[1][2].c.d[3]` etc.
> > >
> > > I'd rather unwrap those regions one-by-one in a loop and look at the alignment of each layer.
> > Alternatively, just decompose the whole region into base region and offset and see if base region has the necessary alignment and the offset is divisible by the necessary alignment.
> > The sequence of FieldRegions and ElementRegions on top of a base region may be arbitrary: var.a[0].b[1][2].c.d[3] etc.
>
> But i think(hope) I already do this and even have tests for this cases. For example
> ```void test22() {
> struct alignas(alignof(short)) Z {
> char p;
> char c[10];
> };
>
> struct Y {
> char p;
> Z b[10];
> };
>
> struct X {
> Y a[10];
> } Xi; // expected-note {{'Xi' initialized here}}
>
> // ok. 2(X align) + 1 (offset Y.p) + 1(align Z.p to 'short') + 1(offset Z.p) + 3(index)
> ::new (&Xi.a[0].b[0].c[3]) long;
> }```
>
> Cases with multidimensional arrays will also be handled correctly because method 'TheElementRegion->getAsArrayOffset()' calculates the offset for multidimensional arrays
> ```void testXX() {
> struct Y {
> char p;
> char b[10][10];
> };
>
> struct X {
> Y a[10];
> } Xi;
>
> ::new (&Xi.a[0].b[0][0]) long;
> }```
>
> I can explain the code below for ElementRegion if needed.
?
================
Comment at: clang/lib/StaticAnalyzer/Checkers/CheckPlacementNew.cpp:25
public:
void checkPreStmt(const CXXNewExpr *NE, CheckerContext &C) const;
----------------
NoQ wrote:
> Before i forget: Ideally @martong should have subscribed to [[ https://clang.llvm.org/doxygen/classclang_1_1ento_1_1CheckerDocumentation.html#a7fdb3b5ff726f4c5e782cef0d59c01ad | `checkNewAllocator` ]] because it fires before the construct-expression whereas this callback fires after construct-expression which is too late as the UB we're trying to catch has occured much earlier.
Ops, I forgot about it. Will be fixed soon.
================
Comment at: clang/lib/StaticAnalyzer/Checkers/CheckPlacementNew.cpp:25
public:
void checkPreStmt(const CXXNewExpr *NE, CheckerContext &C) const;
----------------
f00kat wrote:
> NoQ wrote:
> > Before i forget: Ideally @martong should have subscribed to [[ https://clang.llvm.org/doxygen/classclang_1_1ento_1_1CheckerDocumentation.html#a7fdb3b5ff726f4c5e782cef0d59c01ad | `checkNewAllocator` ]] because it fires before the construct-expression whereas this callback fires after construct-expression which is too late as the UB we're trying to catch has occured much earlier.
> Ops, I forgot about it. Will be fixed soon.
When I use checkNewAllocator instead of check::PreStmt<CXXNewExpr> an error occures in method PathDiagnosticBuilder::generate.
In the code ErrorNode->getLocation().getTag() because ProgramPoint contains an empty ProgramPointTag.
I`m not very good with analyzer so its hard to understand what`s going on, but I`m still trying..
================
Comment at: clang/lib/StaticAnalyzer/Checkers/CheckPlacementNew.cpp:165-166
+ if (const MemRegion *MRegion = PlaceVal.getAsRegion()) {
+ if (const FieldRegion *TheFieldRegion = MRegion->getAs<FieldRegion>())
+ MRegion = TheFieldRegion->getBaseRegion();
+
----------------
NoQ wrote:
> You're saying that `A` is a struct and `a` is of type `A` and `&a` is sufficiently aligned then for //every// field `f` in the struct `&a.f` is sufficiently aligned. I'm not sure it's actually the case.
Yeah..Maybe we should take into account struct type align plus field offset? Currently, I am relying only on the fact that just struct type is aligned properly
For example
```struct X
{
char a;
int b;
} x;```
Type X is aligned to 'int' and thus field 'x.a' is also aligned to 'int' because it goes first
```struct Y
{
char a;
int b;
char c;
char d;
} y;```
But here field 'y.d' is aligned to 'char'
I will learn more about RegionOffset and will be back :)
================
Comment at: clang/lib/StaticAnalyzer/Checkers/CheckPlacementNew.cpp:196-197
+ }
+ } else if (Optional<loc::ConcreteInt> TheConcreteInt =
+ PlaceVal.getAs<loc::ConcreteInt>()) {
+ uint64_t PlaceAlign = *TheConcreteInt.getValue().getValue().getRawData();
----------------
NoQ wrote:
> I don't think you'll ever see this case in a real-world program. Even if you would, i doubt we'll behave as expected, because we have [[ https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/clang/lib/StaticAnalyzer/Core/Store.cpp#L456 | certain hacks ]] in place that mess up arithmetic on concrete pointers. I appreciate your thinking but i suggest removing this section for now as it'll probably cause more false positives than true positives.
Okay I will remove it!
================
Comment at: clang/lib/StaticAnalyzer/Checkers/CheckPlacementNew.cpp:125
+ Msg = std::string(llvm::formatv(
+ "Possibly not enough {0} bytes for array allocation which "
+ "requires "
----------------
martong wrote:
> Maybe the below could be a wording that's more easy to follow?
> `{0} bytes is possibly not enough for array allocation which ...`
Yeah. Sure. I have some troubles with english :)
================
Comment at: clang/test/Analysis/placement-new.cpp:265
+
+ // bad 2(custom align) + 1(index '2' offset)
+ ::new (&Xi.b[1]) long; // expected-warning{{Storage type is aligned to 3 bytes but allocated type is aligned to 8 bytes}} expected-note 1 {{}}
----------------
martong wrote:
> Maybe it is just me, but the contents of the parens here and above seems a bit muddled `(index '2' offset)`. This should be `(index '1' offset)`, shouldn't it?
> What is the exact meaning of the number in the hyphens (`'2'` in this case), could you please elaborate?
Yeah, sorry it is copy-paste error. Of course there should be '1'.
```// bad 2(custom align) + 1(index '1' offset)```
================
Comment at: clang/test/Analysis/placement-new.cpp:256
+
+void f9() {
+ struct X {
----------------
martong wrote:
> First I was wondering if we indeed handle correctly structs with nested arrays whose element's type is a structs with nested arrays (... and so on).
>
> So, I tried the below test, and it seems okay. Thus I think it might be worth to add something similar to it.
>
> ```
> void f9_1() {
> struct Y {
> char a;
> alignas(alignof(short)) char b[20];
> };
> struct X {
> char e;
> Y f[20];
> } Xi; // expected-note {{'Xi' initialized here}}
>
> // ok 2(custom align) + 6*1(index '6' offset)
> ::new (&Xi.f[6].b[6]) long;
>
> // bad 2(custom align) + 1*1(index '1' offset)
> ::new (&Xi.f[1].b[1]) long; // expected-warning{{Storage type is aligned to 3 bytes but allocated type is aligned to 8 bytes}} expected-note 1 {{}}
> }
>
> ```
Thanks! Added tests for this cases.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D76996/new/
https://reviews.llvm.org/D76996
More information about the cfe-commits
mailing list