[clang] fcd020d - [Clang][Sema] Fix malformed AST for anonymous class access in template. (#90842)

via cfe-commits cfe-commits at lists.llvm.org
Tue May 14 00:45:57 PDT 2024


Author: martinboehme
Date: 2024-05-14T09:45:54+02:00
New Revision: fcd020d561f28a2b33b6cc12a5a0164a6d5e4172

URL: https://github.com/llvm/llvm-project/commit/fcd020d561f28a2b33b6cc12a5a0164a6d5e4172
DIFF: https://github.com/llvm/llvm-project/commit/fcd020d561f28a2b33b6cc12a5a0164a6d5e4172.diff

LOG: [Clang][Sema] Fix malformed AST for anonymous class access in template. (#90842)

# Observed erroneous behavior

Prior to this change, a `MemberExpr` that accesses an anonymous class
might have a prvalue as its base (even though C++ mandates that the base
of a `MemberExpr` must be a glvalue), if the code containing the
`MemberExpr` was in a template.

Here's an example on [godbolt](https://godbolt.org/z/Gz1Mer9oz) (that is
essentially identical to the new test this patch adds).

This example sets up a struct containing an anonymous struct:

```cxx
struct S {
  struct {
    int i;
  };
};
```

It then accesses the member `i` using the expression `S().i`.

When we do this in a non-template function, we get the following AST:

```
`-ExprWithCleanups <col:10, col:14> 'int'
  `-ImplicitCastExpr <col:10, col:14> 'int' <LValueToRValue>
    `-MemberExpr <col:10, col:14> 'int' xvalue .i 0xbdcb3c0
      `-MemberExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' xvalue .S::(anonymous struct at line:2:3) 0xbdcb488
        `-MaterializeTemporaryExpr <col:10, col:12> 'S' xvalue
          `-CXXTemporaryObjectExpr <col:10, col:12> 'S' 'void () noexcept' zeroing
```

As expected, the AST contains a `MaterializeTemporarExpr` to materialize
the prvalue `S()` before accessing its members.

When we perform this access in a function template (that doesn't
actually even use its template parameter), the AST for the template
itself looks the same as above. However, the AST for an instantiation of
the template looks different:

```
`-ExprWithCleanups <col:10, col:14> 'int'
  `-ImplicitCastExpr <col:10, col:14> 'int' <LValueToRValue>
    `-MemberExpr <col:10, col:14> 'int' xvalue .i 0xbdcb3c0
      `-MaterializeTemporaryExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' xvalue
        `-MemberExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' .S::(anonymous struct at line:2:3) 0xbdcb488
          `-CXXTemporaryObjectExpr <col:10, col:12> 'S' 'void () noexcept' zeroing
```

Note how the inner `MemberExpr` (the one accessing the anonymous struct)
acts on a prvalue.

Interestingly, this does not appear to cause any problems for CodeGen,
probably because CodeGen is set up to deal with `MemberExpr`s on rvalues
in C. However, it does cause issues in the dataflow framework, which
only supports C++ today and expects the base of a `MemberExpr` to be a
glvalue.

Beyond the issues with the dataflow framework, I think this issue should
be fixed because it goes contrary to what the C++ standard mandates, and
the AST produced for the non-template case indicates that we want to
follow the C++ rules here.

# Reasons for erroneous behavior

Here's why we're getting this malformed AST.

First of all, `TreeTransform` [strips any
`MaterializeTemporaryExpr`s](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/TreeTransform.h#L14853)
from the AST.

It is therefore up to
[`TreeTransform::RebuildMemberExpr()`](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/TreeTransform.h#L2853)
to recreate a `MaterializeTemporaryExpr` if needed. In the [general
case](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/TreeTransform.h#L2915),
it does this: It calls `Sema::BuildMemberReferenceExpr()`, which ensures
that the base is a glvalue by [materializing a
temporary](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/SemaExprMember.cpp#L1016)
if needed. However, when `TreeTransform::RebuildMemberExpr()` encounters
an anonymous class, it [calls
`Sema::BuildFieldReferenceExpr()`](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/TreeTransform.h#L2880),
which, unlike `Sema::BuildMemberReferenceExpr()`, does not make sure
that the base is a glvalue.

# Proposed fix

I considered several possible ways to fix this issue:

- Add logic to `Sema::BuildFieldReferenceExpr()` that materializes a
temporary if needed. This appears to work, but it feels like the fix is
in the wrong place:
- AFAIU, other callers of `Sema::BuildFieldReferenceExpr()` don't need
this logic.
- The issue is caused by `TreeTransform` removing the
`MaterializeTemporaryExpr`, so it seems the fix should also be in
`TreeTransform`

- Materialize the temporary directly in
`TreeTransform::RebuildMemberExpr()` if needed (within the case that
deals with anonymous classes).

This would work, too, but it would duplicate logic that already exists
in `Sema::BuildMemberReferenceExpr()` (which we leverage for the general
case).

- Use `Sema::BuildMemberReferenceExpr()` instead of
`Sema::BuildFieldReferenceExpr()` for the anonymous class case, so that
it also uses the existing logic for materializing the temporary.

This is the option I've decided to go with here. There's a slight
wrinkle in that we create a `LookupResult` that claims we looked up the
unnamed field for the anonymous class -- even though we would obviously
never be able to look up an unnamed field. I think this is defensible
and still better than the other alternatives, but I would welcome
feedback on this from others who know the code better.

Added: 
    clang/test/AST/ast-dump-anonymous-class.cpp

Modified: 
    clang/docs/ReleaseNotes.rst
    clang/lib/Sema/TreeTransform.h
    clang/unittests/Analysis/FlowSensitive/TransferTest.cpp

Removed: 
    


################################################################################
diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 28ac54127383a..49ab222bec405 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -715,6 +715,7 @@ Bug Fixes to AST Handling
 ^^^^^^^^^^^^^^^^^^^^^^^^^
 - Clang now properly preserves ``FoundDecls`` within a ``ConceptReference``. (#GH82628)
 - The presence of the ``typename`` keyword is now stored in ``TemplateTemplateParmDecl``.
+- Fixed malformed AST generated for anonymous union access in templates. (#GH90842)
 
 Miscellaneous Bug Fixes
 ^^^^^^^^^^^^^^^^^^^^^^^

diff  --git a/clang/lib/Sema/TreeTransform.h b/clang/lib/Sema/TreeTransform.h
index 2d903dc525563..c039b95293af2 100644
--- a/clang/lib/Sema/TreeTransform.h
+++ b/clang/lib/Sema/TreeTransform.h
@@ -2874,10 +2874,21 @@ class TreeTransform {
         return ExprError();
       Base = BaseResult.get();
 
+      // `TranformMaterializeTemporaryExpr()` removes materialized temporaries
+      // from the AST, so we need to re-insert them if needed (since
+      // `BuildFieldRefereneExpr()` doesn't do this).
+      if (!isArrow && Base->isPRValue()) {
+        BaseResult = getSema().TemporaryMaterializationConversion(Base);
+        if (BaseResult.isInvalid())
+          return ExprError();
+        Base = BaseResult.get();
+      }
+
       CXXScopeSpec EmptySS;
       return getSema().BuildFieldReferenceExpr(
           Base, isArrow, OpLoc, EmptySS, cast<FieldDecl>(Member),
-          DeclAccessPair::make(FoundDecl, FoundDecl->getAccess()), MemberNameInfo);
+          DeclAccessPair::make(FoundDecl, FoundDecl->getAccess()),
+          MemberNameInfo);
     }
 
     CXXScopeSpec SS;

diff  --git a/clang/test/AST/ast-dump-anonymous-class.cpp b/clang/test/AST/ast-dump-anonymous-class.cpp
new file mode 100644
index 0000000000000..393c084c913d1
--- /dev/null
+++ b/clang/test/AST/ast-dump-anonymous-class.cpp
@@ -0,0 +1,49 @@
+// RUN: %clang_cc1 -std=c++17 -triple x86_64-unknown-unknown -ast-dump %s \
+// RUN: | FileCheck -strict-whitespace %s
+
+struct S {
+  struct {
+    int i;
+  };
+};
+
+int accessInRegularFunction() {
+  return S().i;
+  // CHECK: FunctionDecl {{.*}} accessInRegularFunction 'int ()'
+  // CHECK:      |   `-ReturnStmt {{.*}}
+  // CHECK-NEXT: |     `-ExprWithCleanups {{.*}} 'int'
+  // CHECK-NEXT: |       `-ImplicitCastExpr {{.*}} 'int' <LValueToRValue>
+  // CHECK-NEXT: |         `-MemberExpr {{.*}} 'int' xvalue .i
+  // CHECK-NEXT: |           `-MemberExpr {{.*}} 'S::(anonymous struct at {{.*}})
+  // CHECK-NEXT: |             `-MaterializeTemporaryExpr {{.*}} 'S' xvalue
+  // CHECK-NEXT: |               `-CXXTemporaryObjectExpr {{.*}} 'S' 'void () noexcept' zeroing
+}
+
+// AST should look the same in a function template with an unused template
+// parameter.
+template <class>
+int accessInFunctionTemplate() {
+  return S().i;
+  // CHECK: FunctionDecl {{.*}} accessInFunctionTemplate 'int ()'
+  // CHECK:      |   `-ReturnStmt {{.*}}
+  // CHECK-NEXT: |     `-ExprWithCleanups {{.*}} 'int'
+  // CHECK-NEXT: |       `-ImplicitCastExpr {{.*}} 'int' <LValueToRValue>
+  // CHECK-NEXT: |         `-MemberExpr {{.*}} 'int' xvalue .i
+  // CHECK-NEXT: |           `-MemberExpr {{.*}} 'S::(anonymous struct at {{.*}})
+  // CHECK-NEXT: |             `-MaterializeTemporaryExpr {{.*}} 'S' xvalue
+  // CHECK-NEXT: |               `-CXXTemporaryObjectExpr {{.*}} 'S' 'void () noexcept' zeroing
+}
+
+// AST should look the same in an instantiation of the function template.
+// This is a regression test: The AST used to contain the
+// `MaterializeTemporaryExpr` in the wrong place, causing a `MemberExpr` to have
+// a prvalue base (which is not allowed in C++).
+template int accessInFunctionTemplate<int>();
+  // CHECK: FunctionDecl {{.*}} accessInFunctionTemplate 'int ()' explicit_instantiation_definition
+  // CHECK:          `-ReturnStmt {{.*}}
+  // CHECK-NEXT:       `-ExprWithCleanups {{.*}} 'int'
+  // CHECK-NEXT:         `-ImplicitCastExpr {{.*}} 'int' <LValueToRValue>
+  // CHECK-NEXT:           `-MemberExpr {{.*}} 'int' xvalue .i
+  // CHECK-NEXT:             `-MemberExpr {{.*}} 'S::(anonymous struct at {{.*}})
+  // CHECK-NEXT:               `-MaterializeTemporaryExpr {{.*}} 'S' xvalue
+  // CHECK-NEXT:                 `-CXXTemporaryObjectExpr {{.*}} 'S' 'void () noexcept' zeroing

diff  --git a/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp b/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
index e1fb16b64fd6f..5c0582e872eb0 100644
--- a/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
@@ -27,6 +27,14 @@
 #include <string>
 #include <utility>
 
+namespace clang {
+namespace dataflow {
+namespace {
+AST_MATCHER(FunctionDecl, isTemplated) { return Node.isTemplated(); }
+} // namespace
+} // namespace dataflow
+} // namespace clang
+
 namespace {
 
 using namespace clang;
@@ -7416,4 +7424,31 @@ TEST(TransferTest, ConditionalRelation) {
       });
 }
 
+// This is a crash repro.
+// We used to crash while transferring `S().i` because Clang contained a bug
+// causing the AST to be malformed.
+TEST(TransferTest, AnonymousUnionMemberExprInTemplate) {
+  using ast_matchers::functionDecl;
+  using ast_matchers::hasName;
+  using ast_matchers::unless;
+
+  std::string Code = R"cc(
+    struct S {
+      struct {
+        int i;
+      };
+    };
+
+    template <class>
+    void target() {
+        S().i;
+    }
+
+    template void target<int>();
+  )cc";
+  auto Matcher = functionDecl(hasName("target"), unless(isTemplated()));
+  ASSERT_THAT_ERROR(checkDataflowWithNoopAnalysis(Code, Matcher),
+                    llvm::Succeeded());
+}
+
 } // namespace


        


More information about the cfe-commits mailing list