[clang] [clang][dataflow] Fix an issue with `Environment::getResultObjectLocation()`. (PR #75483)

Fri Dec 15 01:45:38 PST 2023

================
@@ -726,27 +726,69 @@ void Environment::setStorageLocation(const Expr &E, StorageLocation &Loc) {
   // so allow these as an exception.
   assert(E.isGLValue() ||
          E.getType()->isSpecificBuiltinType(BuiltinType::BuiltinFn));
-  setStorageLocationInternal(E, Loc);
+  const Expr &CanonE = ignoreCFGOmittedNodes(E);
+  assert(!ExprToLoc.contains(&CanonE));
+  ExprToLoc[&CanonE] = &Loc;
 }
 
 StorageLocation *Environment::getStorageLocation(const Expr &E) const {
   // See comment in `setStorageLocation()`.
   assert(E.isGLValue() ||
          E.getType()->isSpecificBuiltinType(BuiltinType::BuiltinFn));
-  return getStorageLocationInternal(E);
+  auto It = ExprToLoc.find(&ignoreCFGOmittedNodes(E));
+  return It == ExprToLoc.end() ? nullptr : &*It->second;
+}
+
+// Returns whether a prvalue of record type is the one that originally
+// constructs the object (i.e. it doesn't propagate it from one of its
+// children).
+static bool isOriginalRecordConstructor(const Expr &RecordPRValue) {
----------------
martinboehme wrote:

> I am wondering whether it is better to enumerate the "transparent" AST nodes where we need to do the propagation. My questions are:
> 
> * Roughly how many such transparent nodes do we have?

[This function](https://github.com/google/crubit/blob/f2bba114d350a0e3c59676504e02047a74637b20/lifetime_analysis/object_repository.cc#L316) in the Crubit lifetime analysis gives a rough idea. It does essentially what the FIXME in the documentation for `getResultObjectLocation()` says we eventually want to do also in the framework.

>From this, we see that there are more "transparent" nodes than "is original record constructor" nodes. Also, the "is original record constructor" nodes are the ones for which we know that the framework produces a `RecordValue` -- so it seemed to make more sense to enumerate those.

> * In case we miss something, is it less desirable to erroneously propagate something or to erroneously create a new PRValue?

I think it's a wash. In both cases:

* We will hit one of the assertions (or will perform the fallback behavior in non-assertion compiles)
* We will model the behavior of the code incorrectly. I don't think one of the two variants of wrong behavior will clearly be more benign in all cases -- I think there are some cases where we would prefer one variant and some cases where we would prefer the other, but of course ultimately we should fix such errors anyway.

https://github.com/llvm/llvm-project/pull/75483