[clang] [clang] Fix a crash from nested ArrayInitLoopExpr (PR #67722)

via cfe-commits cfe-commits at lists.llvm.org
Thu Sep 28 14:46:33 PDT 2023


isuckatcs wrote:

> I don't think I understand the bug based on your description.

I'll try my best to explain it a bit better.

So, this is the function that runs and triggers the assertion failure at one point.
```c++
bool ArrayExprEvaluator::VisitArrayInitLoopExpr(const ArrayInitLoopExpr *E) {
  LValue CommonLV;
  if (E->getCommonExpr() &&
      !Evaluate(Info.CurrentCall->createTemporary( // assertion failure triggered from createTemporary()
                    E->getCommonExpr(),
                    getStorageType(Info.Ctx, E->getCommonExpr()),
                    ScopeKind::FullExpression, CommonLV),
                Info, E->getCommonExpr()->getSourceExpr()))
    return false;

  auto *CAT = cast<ConstantArrayType>(E->getType()->castAsArrayTypeUnsafe());

  uint64_t Elements = CAT->getSize().getZExtValue();
  Result = APValue(APValue::UninitArray(), Elements, Elements);

  LValue Subobject = This;
  Subobject.addArray(Info, E, CAT);

  bool Success = true;
  for (EvalInfo::ArrayInitLoopIndex Index(Info); Index != Elements; ++Index) {
    if (!EvaluateInPlace(Result.getArrayInitializedElt(Index),
                         Info, Subobject, E->getSubExpr()) ||
        !HandleLValueArrayAdjustment(Info, E, Subobject,
                                     CAT->getElementType(), 1)) {
      if (!Info.noteFailure())
        return false;
      Success = false;
    }
  }
  return Success;
}
```

Let's see how it runs on the following snippet and how the AST looks like:
```c++
  S s[2][2];

  auto [a1,a2] = s; // the ast is only shown for this line
```
```
`-DeclStmt ...
  `-DecompositionDecl ... used 'S[2][2]' cinit
    |-ArrayInitLoopExpr... 'S[2][2]'
    | |-OpaqueValueExpr ... 'S[2][2]' lvalue
    | | `-DeclRefExpr ... 'S[2][2]' lvalue Var ... 's' 'S[2][2]'
    | `-ArrayInitLoopExpr ... 'S[2]'
    |   |-OpaqueValueExpr ... 'S[2]' lvalue
    |   | `-ArraySubscriptExpr ... 'S[2]' lvalue
    |   |   |-ImplicitCastExpr ... 'S (*)[2]' <ArrayToPointerDecay>
    |   |   | `-OpaqueValueExpr ... 'S[2][2]' lvalue
    |   |   |   `-DeclRefExpr ... 'S[2][2]' lvalue Var ... 's' 'S[2][2]'
    |   |   `-ArrayInitIndexExpr 'unsigned long'
    |   `-<initializer used for each array member>
    |-BindingDecl ... a1 'S[2]'
    | `-...
    `-BindingDecl ... a2 'S[2]'
      `-...
```

A quick anatomy guide for `ArrayInitLoopExpr`:
```c++
ArrayInitLoopExpr
|-OpaqueValueExpr // getCommonExpr() returns this
| `-DeclRefExpr <source_array> // always wrapped inside an OpaqueValueExpr
`-<initializer_used_for_each_element> // getSubExpr() returns this
  `-ArrayInitIndexExpr // a placeholder node denoting the "current" index of the element being initialized
```
For more details please check the clang documentation of [ArrayInitLoopExpr](https://clang.llvm.org/doxygen/classclang_1_1ArrayInitLoopExpr.html) and [ArrayInitIndexExpr](https://clang.llvm.org/doxygen/classclang_1_1ArrayInitIndexExpr.html)

So, let's see the step by step execution of `VisitArrayInitLoopExpr()`.
```c++
-ArrayInitLoopExpr... 'S[2][2]' // <-- we are here
 |-OpaqueValueExpr ... 'S[2][2]' lvalue
 | `-DeclRefExpr ... 'S[2][2]' lvalue Var ... 's' 'S[2][2]'
 `-ArrayInitLoopExpr ... 'S[2]'
```

1. We create a temporary region for the `OpaqueValueExpr`, returned by `E->getCommonExpr()` and try to evaluate the `DeclRefExpr` inside this temporary.

```c++
if (
    E->getCommonExpr() &&
    !Evaluate(Info.CurrentCall->createTemporary(E->getCommonExpr(), ...), Info, E->getCommonExpr()->getSourceExpr())
)
    return false;
```
I say we create it for the `OpaqueValueExpr`, because that will be the key of the region inside the stack frame.
```c++
// Key is passed from `E->getCommonExpr()`.
template<typename KeyT>
APValue &CallStackFrame::createTemporary(const KeyT *Key, QualType T,
                                         ScopeKind Scope, LValue &LV) {
  unsigned Version = getTempVersion();
  APValue::LValueBase Base(Key, Index, Version);
  LV.set(Base);
  return createLocal(Base, Key, T, Scope);
}
```
```c++
APValue &CallStackFrame::createLocal(APValue::LValueBase Base, const void *Key, ...) {
  ...
  unsigned Version = Base.getVersion();
  APValue &Result = Temporaries[MapKeyTy(Key, Version)];
  assert(Result.isAbsent() && "local created multiple times");
  ...
}
```
So, from what we know, currently the stack frame looks like this:
```
{
  OpaqueValueExpr ... 'S[2][2]' (temporary)
}
```

2. We start iterating over the initializer that is used for each element, returned by `getSubExpr()` as many times, as many elements the source array has and attempt to analyze it.
```c++
for (EvalInfo::ArrayInitLoopIndex Index(Info); Index != Elements; ++Index) {
    if (!EvaluateInPlace(Result.getArrayInitializedElt(Index),
                         Info, Subobject, E->getSubExpr()) ||
        !HandleLValueArrayAdjustment(Info, E, Subobject,
                                     CAT->getElementType(), 1)) {
      if (!Info.noteFailure())
        return false;
      Success = false;
    }
  }
```

3. Since in the current case, this initializer also happens to be an `ArrayInitLoopExpr`, we go back to step 1 again.
```c++
-ArrayInitLoopExpr... 'S[2][2]'
 |-OpaqueValueExpr ... 'S[2][2]' lvalue
 | `-DeclRefExpr ... 'S[2][2]' lvalue Var ... 's' 'S[2][2]'
 `-ArrayInitLoopExpr ... 'S[2]' // <-- we are here
   |-OpaqueValueExpr ... 'S[2]' lvalue // <-- temporary created for this
   | `-ArraySubscriptExpr ... 'S[2]' lvalue
   |   |-ImplicitCastExpr ... 'S (*)[2]' <ArrayToPointerDecay>
   |   | `-OpaqueValueExpr ... 'S[2][2]' lvalue
   |   |   `-DeclRefExpr ... 'S[2][2]' lvalue Var ... 's' 'S[2][2]'
   |   `-ArrayInitIndexExpr 'unsigned long'
   `-<initializer used for each array member>
```
After creating the the new temporary, the stack frame looks like this:
```
{
  OpaqueValueExpr ... 'S[2]' (temporary)
  OpaqueValueExpr ... 'S[2][2]' (temporary)
}
```
Now let's pretend that the loop presented in step 2 finished execution inside `ArrayInitLoopExpr ... 'S[2]'` and jump back to the point where we were before step 3.

4. We finished evaluating the `ArrayInitLoopExpr ... 'S[2]'` once, so `!Info.noteFailure()` is checked.
```c++
-ArrayInitLoopExpr... 'S[2][2]' // <-- we are here
 |-OpaqueValueExpr ... 'S[2][2]' lvalue
 | `-DeclRefExpr ... 'S[2][2]' lvalue Var ... 's' 'S[2][2]'
 `-ArrayInitLoopExpr ... 'S[2]' // <-- we evaluated this in step 3
```
```c++
for (EvalInfo::ArrayInitLoopIndex Index(Info); Index != Elements; ++Index) {
    if (!EvaluateInPlace(Result.getArrayInitializedElt(Index),
                         Info, Subobject, E->getSubExpr()) ||
        !HandleLValueArrayAdjustment(Info, E, Subobject,
                                     CAT->getElementType(), 1)) {
      if (!Info.noteFailure()) // <-- we are here
        return false;
      Success = false;
    }
  }
```
In the snippet that we're currently analyzing, `Info.noteFailure()` will return `false`, so we don't loop over the same `Expr` once again (note that the array has 2 elements and because of that the loop would iterate twice), but return `false` immediately, so there is no assertion failure.

Note that the stack frame still looks like this:
```
{
  OpaqueValueExpr ... 'S[2]' (temporary) <-- yes, this is not cleaned up
  OpaqueValueExpr ... 'S[2][2]' (temporary)
}
```


Now let's see the snippet that hits the assertion failure and why it actually happens, although I think I spoiled it already.
```c++
S s[2][2];

int res[4];

res[0] = [s] { return s[0][0].i; }();
```
Everything up until step 4 is exactly the same as it was with the previous snippet. The difference comes in step 4, where `!Info.noteFailure()` return `true`, so we attempt to evaluate the same `ArrayInitLoopExpr ... 'S[2]'` once again.

Remember that when we enter that `Expr`, we are at step 1 and create a temporary, however as I noted previously an `OpaqueValueExpr ... 'S[2]' (temporary)` is already in the stack frame and we attempt to create it once again since the key expression is the same. Creating the same local variable twice is not allowed, so we fail the assertion.
```c++
APValue &CallStackFrame::createLocal(...) {
  ...
  unsigned Version = Base.getVersion();
  APValue &Result = Temporaries[MapKeyTy(Key, Version)];
  assert(Result.isAbsent() && "local created multiple times"); // <- it is not absent
  ...
}
```

And the reason for why `!Info.noteFailure()` returns `true` for this snippet is that `CheckingForUndefinedBehavior` is `true`, which is returned like this:
```c++
    [[nodiscard]] bool noteFailure() {
      bool KeepGoing = keepEvaluatingAfterFailure();
      EvalStatus.HasSideEffects |= KeepGoing;
      return KeepGoing;
    }
```
```c++
    bool keepEvaluatingAfterFailure() const override {
      if (!StepsLeft)
        return false;

      switch (EvalMode) {
      case EM_ConstantExpression:
      case EM_ConstantExpressionUnevaluated:
      case EM_ConstantFold:
      case EM_IgnoreSideEffects:
        return checkingPotentialConstantExpression() ||
               checkingForUndefinedBehavior();
      }
      llvm_unreachable("Missed EvalMode case");
    }
```
```c++
    /// Are we checking an expression for overflow?
    // FIXME: We should check for any kind of undefined or suspicious behavior
    // in such constructs, not just overflow.
    bool checkingForUndefinedBehavior() const override {
      return CheckingForUndefinedBehavior;
    }
```
I didn't investigate where this gets set to `true` or should it be checked at all, since I thought the root cause of the problem was that the temporary is not cleaned up, so I wanted to address that.

https://github.com/llvm/llvm-project/pull/67722


More information about the cfe-commits mailing list