[PATCH] D149144: [clang][dataflow] Eliminate intermediate `ReferenceValue`s from `Environment::DeclToLoc`.

Tue May 2 07:21:49 PDT 2023

mboehme marked 5 inline comments as done.
mboehme added inline comments.

================
Comment at: clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp:307-311
+      auto &ParamLoc =
+          createStorageLocation(ParamDecl->getType().getNonReferenceType());
       setStorageLocation(*ParamDecl, ParamLoc);
-      if (Value *ParamVal = createValue(ParamDecl->getType()))
-        setValue(ParamLoc, *ParamVal);
+      if (Value *ParamVal =
+              createValue(ParamDecl->getType().getNonReferenceType()))
----------------
ymandel wrote:
> Maybe comment as to why we're not use the `VarDecl` overloads of `createStorageLocation` and why we're specifically using `getNonReferenceType` for `createValue`.
Added comment on the `getNonReferenceType()`.

As to why we're not using the `VarDecl` overload -- it's really just because we don't need the "stable" location behavior here. Wasn't sure if this rises to the level of needing to be addressed in a comment, so I've left it out for now. WDYT?

================
Comment at: clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp:616
+  bool erased = DeclToLoc.erase(&D);
+  if (!erased)
+    D.dump();
----------------
xazax.hun wrote:
> Would we dump this in release builds? Do we want to limit this to debug/assert builds?
True, we'd want to do this only in debug builds. For simplicity, I've taken out the `dump` entirely -- it's easy enough to add back if we do actually encounter assertion failures here.

================
Comment at: clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp:631
+
+  assert(dyn_cast_or_null<ReferenceValue>(getValue(*Loc)) == nullptr);
+
----------------
ymandel wrote:
> would `isa_and_nonnull` work?
Good point, done!

================
Comment at: clang/lib/Analysis/FlowSensitive/Transfer.cpp:329
+          // against this above.
+          ProcessVarDecl(*VD);
+          auto *VDLoc = Env.getStorageLocation(*VD);
----------------
ymandel wrote:
> why the recursive call rather than relying on what we know about their structure?
Not sure exactly what you're asking?

If we don't want to make a recursive call here, we'd need to duplicate behavior that `ProcessVarDecl()` already contains and inline that code here. It seems preferable to just call `ProcessVarDecl()` instead? (Or to put it differently, what is a reason against making this call?)

================
Comment at: clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp:329
+static void
+builtinTransferScopeEnd(const CFGScopeEnd &Elt,
+                        TypeErasedDataflowAnalysisState &InputState) {
----------------
xazax.hun wrote:
> I think one question is whether we are interested in ScopeEnd or LifetimeEnd, see the discussions in https://reviews.llvm.org/D15031 and https://reviews.llvm.org/D16403, specifically Devin's comment: 
> 
> >>! In D16403#799926, @dcoughlin wrote:
> >> @dcoughlin As a reviewer of both patches - could you tell us what's the difference between them? And how are we going to resolve this issue?
> > 
> > These two patches are emitting markers in the CFG for different things.
> > 
> > Here is how I would characterize the difference between the two patches.
> > 
> > - Despite its name, https://reviews.llvm.org/D15031, is really emitting markers for when the lifetime of a C++ object in an automatic variable ends.  For C++ objects with non-trivial destructors, this point is when the destructor is called. At this point the storage for the variable still exists, but what you can do with that storage is very restricted by the language because its contents have been destroyed. Note that even with the contents of the object have been destroyed, it is still meaningful to, say, take the address of the variable and construct a new object into it with a placement new. In other words, the same variable can have different objects, with different lifetimes, in it at different program points.
> > 
> > - In contrast, the purpose of this patch (https://reviews.llvm.org/D16403) is to add markers for when the storage duration for the variable begins and ends (this is, when the storage exists). Once the storage duration has ended, you can't placement new into the variables address, because another variable may already be at that address.
> > 
> > I don't think there is an "issue" to resolve here.  We should make sure the two patches play nicely with each other, though. In particular, we should make sure that the markers for when lifetime ends and when storage duration ends are ordered correctly.
> 
> 
> What I wanted to add, I wonder if we might not get ScopeEnd event for temporaries and there might be other differences. The Clang implementation of P1179 used LifetimeEnd instead of ScopeEnd, and I believe probably most clients are more interested in LifetimeEnd events rather than ScopeEnd.
> 
> I think I understand why you went with ScopeEnd for this specific problem, but to avoid the confusion from having both in the Cfg (because I anticipate someone will need LifetimeEnd at some point), I wonder if we can make this work with LifetimeEnd. 
Hm, thanks for bringing it up.

Conincidentally, I realized while chasing another issue that `CFGScopeEnd` isn't the right construct here. I assumed that we would get a `CFGScopeEnd` for every variable, but I now realize that we only get a `CFGScopeEnd` for the first variable in the scope.

So `CFGLifetimeEnds` definitely looks like the right construct to use here, and indeed it's what I originally tried to use. Unfortuately, `CFG::BuildOptions::AddLifetime` cannot be used at the same time as `AddImplicitDtors`, which we already use. We don't actually need the `CFGElement`s added by `AddImplicitDtors`, but we do need `AddTemporaryDtors`, and that in turn can only be turned on if `AddImplicitDtors` is turned on.

It looks as if I will need to break one of these constraints. It looks as if the constraint that is easiest to break is that `AddLifetime` currently cannot be used at the same time as `AddImplicitDtors`. I'm not sure why this constraint currently exists (I'd appreciate any insights you or others may have), but I suspect it's because it's hard to ensure that the `CFGElement`s added by `AddImplicitDtors` are ordered correctly with respect to the `CFGLifetimeEnds` elements.

Anyway, this requires a change to `CFG` one way or another. I'll work on that next. I'll then make the required changes to this patch and will ask for another look before submitting.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149144/new/

https://reviews.llvm.org/D149144