[clang] [Clang][Sema] Fix malformed AST for anonymous class access in template. (PR #90842)

via cfe-commits cfe-commits at lists.llvm.org
Thu May 2 02:59:58 PDT 2024


https://github.com/martinboehme created https://github.com/llvm/llvm-project/pull/90842

# Observed erroneous behavior

Prior to this change, a `MemberExpr` that accesses an anonymous class might have
a prvalue as its base (even though C++ mandates that the base of a `MemberExpr`
must be a glvalue), if the code containing the `MemberExpr` was in a template.

Here's an example on [godbolt](https://godbolt.org/z/Gz1Mer9oz) (that is
essentially identical to the new test this patch adds).

This example sets up a struct containing an anonymous struct:

```cxx
struct S {
  struct {
    int i;
  };
};
```

It then accesses the member `i` using the expression `S().i`.

When we do this in a non-template function, we get the following AST:

```
`-ExprWithCleanups <col:10, col:14> 'int'
  `-ImplicitCastExpr <col:10, col:14> 'int' <LValueToRValue>
    `-MemberExpr <col:10, col:14> 'int' xvalue .i 0xbdcb3c0
      `-MemberExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' xvalue .S::(anonymous struct at line:2:3) 0xbdcb488
        `-MaterializeTemporaryExpr <col:10, col:12> 'S' xvalue
          `-CXXTemporaryObjectExpr <col:10, col:12> 'S' 'void () noexcept' zeroing
```

As expected, the AST contains a `MaterializeTemporarExpr` to materialize the
prvalue `S()` before accessing its members.

When we perform this access in a function template (that doesn't actually even
use its template parameter), the AST for the template itself looks the same as
above. However, the AST for an instantiation of the template looks different:

```
`-ExprWithCleanups <col:10, col:14> 'int'
  `-ImplicitCastExpr <col:10, col:14> 'int' <LValueToRValue>
    `-MemberExpr <col:10, col:14> 'int' xvalue .i 0xbdcb3c0
      `-MaterializeTemporaryExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' xvalue
        `-MemberExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' .S::(anonymous struct at line:2:3) 0xbdcb488
          `-CXXTemporaryObjectExpr <col:10, col:12> 'S' 'void () noexcept' zeroing
```

Note how the inner `MemberExpr` (the one accessing the anonymous struct) acts on
a prvalue.

Interestingly, this does not appear to cause any problems for CodeGen, probably
because CodeGen is set up to deal with `MemberExpr`s on rvalues in C. However,
it does cause issues in the dataflow framework, which only supports C++ today
and expects the base of a `MemberExpr` to be a glvalue.

Beyond the issues with the dataflow framework, I think this issue should be
fixed because it goes contrary to what the C++ standard mandates, and the AST
produced for the non-template case indicates that we want to follow the C++
rules here.

# Reasons for erroneous behavior

Here's why we're getting this malformed AST.

First of all, `TreeTransform` [strips any
`MaterializeTemporaryExpr`s](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/TreeTransform.h#L14853)
from the AST.

It is therefore up to
[`TreeTransform::RebuildMemberExpr()`](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/TreeTransform.h#L2853)
to recreate a `MaterializeTemporaryExpr` if needed. In the [general
case](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/TreeTransform.h#L2915),
it does this: It calls `Sema::BuildMemberReferenceExpr()`, which ensures that
the base is a glvalue by
[materializing a
temporary](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/SemaExprMember.cpp#L1016)
if needed. However, when `TreeTransform::RebuildMemberExpr()` encounters an
anonymous class, it [calls
`Sema::BuildFieldReferenceExpr()`](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/TreeTransform.h#L2880),
which, unlike `Sema::BuildMemberReferenceExpr()`, does not make sure that the
base is a glvalue.

# Proposed fix

I considered several possible ways to fix this issue:

-  Add logic to `Sema::BuildFieldReferenceExpr()` that materializes a temporary
   if needed. This appears to work, but it feels like the fix is in the wrong
   place:
   - AFAIU, other callers of `Sema::BuildFieldReferenceExpr()` don't need this
     logic.
   - The issue is caused by `TreeTransform` removing the
     `MaterializeTemporaryExpr`, so it seems the fix should also be in
     `TreeTransform`

-  Materialize the temporary directly in `TreeTransform::RebuildMemberExpr()` if
   needed (within the case that deals with anonymous classes).

   This would work, too, but it would duplicate logic that already exists in
   `Sema::BuildMemberReferenceExpr()` (which we leverage for the general case).

-  Use `Sema::BuildMemberReferenceExpr()` instead of
   `Sema::BuildFieldReferenceExpr()` for the anonymous class case, so that it
   also uses the existing logic for materializing the temporary.

   This is the option I've decided to go with here. There's a slight wrinkle in
   that we create a `LookupResult` that claims we looked up the unnamed field
   for the anonymous class -- even though we would obviously never be able to
   look up an unnamed field. I think this is defensible and still better than
   the other alternatives, but I would welcome feedback on this from others who
   know the code better.


>From 4cbb2ae74abba0cff20d2c9018680c7bcc743316 Mon Sep 17 00:00:00 2001
From: Martin Braenne <mboehme at google.com>
Date: Thu, 2 May 2024 09:59:16 +0000
Subject: [PATCH] [Clang][Sema] Fix malformed AST for anonymous class access in
 template.

# Observed erroneous behavior

Prior to this change, a `MemberExpr` that accesses an anonymous class might have
a prvalue as its base (even though C++ mandates that the base of a `MemberExpr`
must be a glvalue), if the code containing the `MemberExpr` was in a template.

Here's an example on [godbolt](https://godbolt.org/z/Gz1Mer9oz) (that is
essentially identical to the new test this patch adds).

This example sets up a struct containing an anonymous struct:

```cxx
struct S {
  struct {
    int i;
  };
};
```

It then accesses the member `i` using the expression `S().i`.

When we do this in a non-template function, we get the following AST:

```
`-ExprWithCleanups <col:10, col:14> 'int'
  `-ImplicitCastExpr <col:10, col:14> 'int' <LValueToRValue>
    `-MemberExpr <col:10, col:14> 'int' xvalue .i 0xbdcb3c0
      `-MemberExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' xvalue .S::(anonymous struct at line:2:3) 0xbdcb488
        `-MaterializeTemporaryExpr <col:10, col:12> 'S' xvalue
          `-CXXTemporaryObjectExpr <col:10, col:12> 'S' 'void () noexcept' zeroing
```

As expected, the AST contains a `MaterializeTemporarExpr` to materialize the
prvalue `S()` before accessing its members.

When we perform this access in a function template (that doesn't actually even
use its template parameter), the AST for the template itself looks the same as
above. However, the AST for an instantiation of the template looks different:

```
`-ExprWithCleanups <col:10, col:14> 'int'
  `-ImplicitCastExpr <col:10, col:14> 'int' <LValueToRValue>
    `-MemberExpr <col:10, col:14> 'int' xvalue .i 0xbdcb3c0
      `-MaterializeTemporaryExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' xvalue
        `-MemberExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' .S::(anonymous struct at line:2:3) 0xbdcb488
          `-CXXTemporaryObjectExpr <col:10, col:12> 'S' 'void () noexcept' zeroing
```

Note how the inner `MemberExpr` (the one accessing the anonymous struct) acts on
a prvalue.

Interestingly, this does not appear to cause any problems for CodeGen, probably
because CodeGen is set up to deal with `MemberExpr`s on rvalues in C. However,
it does cause issues in the dataflow framework, which only supports C++ today
and expects the base of a `MemberExpr` to be a glvalue.

Beyond the issues with the dataflow framework, I think this issue should be
fixed because it goes contrary to what the C++ standard mandates, and the AST
produced for the non-template case indicates that we want to follow the C++
rules here.

# Reasons for erroneous behavior

Here's why we're getting this malformed AST.

First of all, `TreeTransform` [strips any
`MaterializeTemporaryExpr`s](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/TreeTransform.h#L14853)
from the AST.

It is therefore up to
[`TreeTransform::RebuildMemberExpr()`](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/TreeTransform.h#L2853)
to recreate a `MaterializeTemporaryExpr` if needed. In the [general
case](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/TreeTransform.h#L2915),
it does this: It calls `Sema::BuildMemberReferenceExpr()`, which ensures that
the base is a glvalue by
[materializing a
temporary](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/SemaExprMember.cpp#L1016)
if needed. However, when `TreeTransform::RebuildMemberExpr()` encounters an
anonymous class, it [calls
`Sema::BuildFieldReferenceExpr()`](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/TreeTransform.h#L2880),
which, unlike `Sema::BuildMemberReferenceExpr()`, does not make sure that the
base is a glvalue.

# Proposed fix

I considered several possible ways to fix this issue:

-  Add logic to `Sema::BuildFieldReferenceExpr()` that materializes a temporary
   if needed. This appears to work, but it feels like the fix is in the wrong
   place:
   - AFAIU, other callers of `Sema::BuildFieldReferenceExpr()` don't need this
     logic.
   - The issue is caused by `TreeTransform` removing the
     `MaterializeTemporaryExpr`, so it seems the fix should also be in
     `TreeTransform`

-  Materialize the temporary directly in `TreeTransform::RebuildMemberExpr()` if
   needed (within the case that deals with anonymous classes).

   This would work, too, but it would duplicate logic that already exists in
   `Sema::BuildMemberReferenceExpr()` (which we leverage for the general case).

-  Use `Sema::BuildMemberReferenceExpr()` instead of
   `Sema::BuildFieldReferenceExpr()` for the anonymous class case, so that it
   also uses the existing logic for materializing the temporary.

   This is the option I've decided to go with here. There's a slight wrinkle in
   that we create a `LookupResult` that claims we looked up the unnamed field
   for the anonymous class -- even though we would obviously never be able to
   look up an unnamed field. I think this is defensible and still better than
   the other alternatives, but I would welcome feedback on this from others who
   know the code better.
---
 clang/lib/Sema/TreeTransform.h                | 17 +++++--
 clang/test/AST/ast-dump-anonymous-class.cpp   | 49 +++++++++++++++++++
 .../Analysis/FlowSensitive/TransferTest.cpp   | 35 +++++++++++++
 3 files changed, 98 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/AST/ast-dump-anonymous-class.cpp

diff --git a/clang/lib/Sema/TreeTransform.h b/clang/lib/Sema/TreeTransform.h
index f47bc219e6fa32..9c800c5705c7c6 100644
--- a/clang/lib/Sema/TreeTransform.h
+++ b/clang/lib/Sema/TreeTransform.h
@@ -2876,10 +2876,21 @@ class TreeTransform {
         return ExprError();
       Base = BaseResult.get();
 
+      // We want to use `BuildMemberReferenceExpr()` so we can use its logic
+      // that materializes `Base` into a temporary if it's a prvalue.
+      // To do so, we need to create a `LookupResult` for `Member`, even though
+      // it's an unnamed field (that we could never actually have looked up).
+      // This small hack seems preferable to duplicating the logic for
+      // materializing the temporary.
+      LookupResult R(getSema(), MemberNameInfo, Sema::LookupMemberName);
+      R.addDecl(Member);
+      R.resolveKind();
+
       CXXScopeSpec EmptySS;
-      return getSema().BuildFieldReferenceExpr(
-          Base, isArrow, OpLoc, EmptySS, cast<FieldDecl>(Member),
-          DeclAccessPair::make(FoundDecl, FoundDecl->getAccess()), MemberNameInfo);
+      return getSema().BuildMemberReferenceExpr(
+          Base, Base->getType(), OpLoc, isArrow, EmptySS, TemplateKWLoc,
+          FirstQualifierInScope, R, ExplicitTemplateArgs,
+          /*S*/ nullptr);
     }
 
     CXXScopeSpec SS;
diff --git a/clang/test/AST/ast-dump-anonymous-class.cpp b/clang/test/AST/ast-dump-anonymous-class.cpp
new file mode 100644
index 00000000000000..393c084c913d15
--- /dev/null
+++ b/clang/test/AST/ast-dump-anonymous-class.cpp
@@ -0,0 +1,49 @@
+// RUN: %clang_cc1 -std=c++17 -triple x86_64-unknown-unknown -ast-dump %s \
+// RUN: | FileCheck -strict-whitespace %s
+
+struct S {
+  struct {
+    int i;
+  };
+};
+
+int accessInRegularFunction() {
+  return S().i;
+  // CHECK: FunctionDecl {{.*}} accessInRegularFunction 'int ()'
+  // CHECK:      |   `-ReturnStmt {{.*}}
+  // CHECK-NEXT: |     `-ExprWithCleanups {{.*}} 'int'
+  // CHECK-NEXT: |       `-ImplicitCastExpr {{.*}} 'int' <LValueToRValue>
+  // CHECK-NEXT: |         `-MemberExpr {{.*}} 'int' xvalue .i
+  // CHECK-NEXT: |           `-MemberExpr {{.*}} 'S::(anonymous struct at {{.*}})
+  // CHECK-NEXT: |             `-MaterializeTemporaryExpr {{.*}} 'S' xvalue
+  // CHECK-NEXT: |               `-CXXTemporaryObjectExpr {{.*}} 'S' 'void () noexcept' zeroing
+}
+
+// AST should look the same in a function template with an unused template
+// parameter.
+template <class>
+int accessInFunctionTemplate() {
+  return S().i;
+  // CHECK: FunctionDecl {{.*}} accessInFunctionTemplate 'int ()'
+  // CHECK:      |   `-ReturnStmt {{.*}}
+  // CHECK-NEXT: |     `-ExprWithCleanups {{.*}} 'int'
+  // CHECK-NEXT: |       `-ImplicitCastExpr {{.*}} 'int' <LValueToRValue>
+  // CHECK-NEXT: |         `-MemberExpr {{.*}} 'int' xvalue .i
+  // CHECK-NEXT: |           `-MemberExpr {{.*}} 'S::(anonymous struct at {{.*}})
+  // CHECK-NEXT: |             `-MaterializeTemporaryExpr {{.*}} 'S' xvalue
+  // CHECK-NEXT: |               `-CXXTemporaryObjectExpr {{.*}} 'S' 'void () noexcept' zeroing
+}
+
+// AST should look the same in an instantiation of the function template.
+// This is a regression test: The AST used to contain the
+// `MaterializeTemporaryExpr` in the wrong place, causing a `MemberExpr` to have
+// a prvalue base (which is not allowed in C++).
+template int accessInFunctionTemplate<int>();
+  // CHECK: FunctionDecl {{.*}} accessInFunctionTemplate 'int ()' explicit_instantiation_definition
+  // CHECK:          `-ReturnStmt {{.*}}
+  // CHECK-NEXT:       `-ExprWithCleanups {{.*}} 'int'
+  // CHECK-NEXT:         `-ImplicitCastExpr {{.*}} 'int' <LValueToRValue>
+  // CHECK-NEXT:           `-MemberExpr {{.*}} 'int' xvalue .i
+  // CHECK-NEXT:             `-MemberExpr {{.*}} 'S::(anonymous struct at {{.*}})
+  // CHECK-NEXT:               `-MaterializeTemporaryExpr {{.*}} 'S' xvalue
+  // CHECK-NEXT:                 `-CXXTemporaryObjectExpr {{.*}} 'S' 'void () noexcept' zeroing
diff --git a/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp b/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
index 301bec32c0cf1d..3f68adaf1c62d2 100644
--- a/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
@@ -27,6 +27,14 @@
 #include <string>
 #include <utility>
 
+namespace clang {
+namespace dataflow {
+namespace {
+AST_MATCHER(FunctionDecl, isTemplated) { return Node.isTemplated(); }
+} // namespace
+} // namespace dataflow
+} // namespace clang
+
 namespace {
 
 using namespace clang;
@@ -7205,4 +7213,31 @@ TEST(TransferTest, ConditionalRelation) {
       });
 }
 
+// This is a crash repro.
+// We used to crash while transferring `S().i` because Clang contained a bug
+// causing the AST to be malformed.
+TEST(TransferTest, AnonymousUnionMemberExprInTemplate) {
+  using ast_matchers::functionDecl;
+  using ast_matchers::hasName;
+  using ast_matchers::unless;
+
+  std::string Code = R"cc(
+    struct S {
+      struct {
+        int i;
+      };
+    };
+
+    template <class>
+    void target() {
+        S().i;
+    }
+
+    template void target<int>();
+  )cc";
+  auto Matcher = functionDecl(hasName("target"), unless(isTemplated()));
+  ASSERT_THAT_ERROR(checkDataflowWithNoopAnalysis(Code, Matcher),
+                    llvm::Succeeded());
+}
+
 } // namespace



More information about the cfe-commits mailing list