[all-commits] [llvm/llvm-project] 820403: [analyzer] Never create Regions wrapping reference...

Balazs Benics via All-commits all-commits at lists.llvm.org
Fri Nov 29 11:36:45 PST 2024


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 820403c4e04db1f4adc8528bec33d393a5be3856
      https://github.com/llvm/llvm-project/commit/820403c4e04db1f4adc8528bec33d393a5be3856
  Author: Balazs Benics <benicsbalazs at gmail.com>
  Date:   2024-11-29 (Fri, 29 Nov 2024)

  Changed paths:
    M clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
    M clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
    M clang/lib/StaticAnalyzer/Core/MemRegion.cpp
    M clang/lib/StaticAnalyzer/Core/ProgramState.cpp
    M clang/test/Analysis/initializer.cpp

  Log Message:
  -----------
  [analyzer] Never create Regions wrapping reference TypedValueRegions (NFCI) (#118096)

Like in the test case:
```c++
struct String {
  String(const String &) {}
};

struct MatchComponent {
  unsigned numbers[2];
  String prerelease;
  MatchComponent(MatchComponent const &) = default;
};

MatchComponent get();
void consume(MatchComponent const &);

MatchComponent parseMatchComponent() {
  MatchComponent component = get();
  component.numbers[0] = 10;
  component.numbers[1] = 20;
  return component; // We should have no stack addr escape warning here.
}

void top() {
  consume(parseMatchComponent());
}
```

When calling `consume(parseMatchComponent())` the
`parseMatchComponent()` would return a copy of a temporary of
`component`. That copy would invoke the
`MatchComponent::MatchComponent(const MatchComponent &)` ctor.

That ctor would have a (reference typed) ParamVarRegion, holding the
location (lvalue) of the object we are about to copy (&component). So
far so good, but just before evaluating the binding operation for
initializing the `numbers` field of the temporary, we evaluate the
ArrayInitLoopExpr representing the by-value elementwise copy of the
array `component.numbers`. This is represented by a LazyCompoundVal,
because we (usually) don't just copy large arrays and bind many
individual direct bindings. Rather, we take a snapshot by using a LCV.

However, notice that the LCV representing this copy would look like
this:
  lazyCompoundVal{ParamVarRegion{"reference param of cctor"}.numbers}

Notice that it refers to the numbers field of a reference. It would be
much better to desugar the reference to the actual object, thus it
should be: `lazyCompoundVal{component.numbers}`

Actually, when binding the result of the ArrayInitLoopExpr to the
`temp_object.numbers` in the compiler-generated member initializer of
the cctor, we should have two individual direct bindings because this is
a "small array":
```
  binding &Element{temp_object.numbers, 0} <- loadFrom(&Element{component.numbers, 0})
  binding &Element{temp_object.numbers, 1} <- loadFrom(&Element{component.numbers, 1})
```
Where `loadFrom(...)` would be:
```
  loadFrom(&Element{component.numbers, 0}): 10 U32b
  loadFrom(&Element{component.numbers, 1}): 20 U32b
```
So the store should look like this, after PostInitializer of
`temp_object.numbers`:
```
  temp_object at offset  0: 10 U32b
  temp_object at offset 32: 20 U32b
```

The lesson is that it's okay to have TypedValueRegions of references as
long as we don't form subregions of those. If we ever want to refer to a
subregion of a "reference" we actually meant to "desugar" the reference
and slice a subregion of the pointee of the reference instead.

Once this canonicalization is in place, we can also drop the special
handling of references in `ProcessInitializer`, because now reference
TypedValueRegions are eagerly desugared into their referee region when
forming a subregion of it.

There should be no practical differences, but there are of course bugs
that this patch may surface.



To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list