[PATCH] D104827: [DebugInfo] Enforce implicit constraints on `distinct` MDNodes

Stephen Tozer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 16 04:03:29 PST 2021


StephenTozer added inline comments.


================
Comment at: llvm/lib/IR/DebugInfoMetadata.cpp:1668-1673
+  auto *Uniqued = uniquify();
+  if (Uniqued != this) {
+    auto &Store = getContext().pImpl->MetadataAsValues;
+    auto MAV = Store.find(this);
+    if (MAV != Store.end())
+      MAV->second->handleChangedMetadata(Uniqued);
----------------
StephenTozer wrote:
> scott.linder wrote:
> > dexonsmith wrote:
> > > I'm sorry; I'm lacking some context here; I'd appreciate a bit of hand-holding.
> > > 
> > > I see how this updates the MetadataAsValue reference.
> > > 
> > > Can you walk me through in way way it's enforced that DIArgList is only referenced by MetadataAsValue? It sounds like textual IR has something, but can you confirm? What about the C++ API? The verifier? And what about historical bitcode... i.e., what can be found there?
> > > - If there are holes in enforcement somehow, what's going to happen to the dangling references to this node?
> > > - Can/should we delete this node now? And/or somehow mark it as dead, so that ValueAsMetadata will look through it?
> > > Can you walk me through in way way it's enforced that DIArgList is only referenced by MetadataAsValue?
> > 
> > I don't know that it is enforced currently, but I believe it was an intended restriction. The LLVM IR langref entry for [[ https://llvm.org/docs/LangRef.html#diarglist | DIArgList  ]] currently says (emphasis mine):
> > 
> > > Because a DIArgList may refer to local values within a function, **it must only be used as a function argument**, must always be inlined, and cannot appear in named metadata.
> > 
> > The only way I can interpret the bold portion is to make it equivalent to "it must only be used through a MetadataAsValue", as that is the only means of using metadata as a function argument.
> > 
> > > It sounds like textual IR has something, but can you confirm? What about the C++ API? The verifier? And what about historical bitcode... i.e., what can be found there?
> > 
> > Would it be appropriate to just add as a verifier check? No matter how the IR is produced it would then fail to verify, so we would cover any current or future APIs. For bitcode I've implemented an upgrade path to force `IsDistinct=false` when loading old bitcode.
> > 
> > > If there are holes in enforcement somehow, what's going to happen to the dangling references to this node?
> > > Can/should we delete this node now? And/or somehow mark it as dead, so that ValueAsMetadata will look through it?
> > 
> > I need to look closer at the lifetime of metadata nodes, it seems clear there is more to do here I'm just not sure exactly what that is.
> > Can you walk me through in way way it's enforced that DIArgList is only referenced by MetadataAsValue? It sounds like textual IR has something, but can you confirm? What about the C++ API? The verifier? And what about historical bitcode... i.e., what can be found there?
> 
> In textual IR parsing, a DIArgList cannot be parsed without a reference to `PerFunctionState`, which is provided for metadata exclusively when parsing the contents of `MetadataAsValue`. There is no explicit assertion to this effect, and there is also no assertion in Bitcode parsing or in the verifier either - this should probably be added! Although I don't think there should be any existing cases of this, as the Bitcode writer can also only write `DIArgList`s as part of a `MetadataAsValue` and this has been true since the metadata was added, so the only possible scenario would be someone manually editing a bitcode file I think.
Just to clarify, because it's slightly unclear in my last comment: I believe that the right move is to just add a verifier check. It doesn't really make sense for a DIArgList to exist outside of a MetadataAsValue in any circumstance, as it may refer to local function values and so must exist as a Value within a function, and the only way for that to happen is by being wrapped in a MetadataAsValue. As such, I don't think we need to put assertions all over the place to catch this, only to express it in the verifier as a rule of IR that will likely never be hit.

The equivalent assertion for LocalAsMetadata, which must be wrapped by MetadataAsValue for the same reason, is in `Verifier::visitMDNode`. I've also noticed while checking this that `Verifier::visitMetadataAsValue` does not call `Verifier::visitDIArgList` if it contains one, which should also be adjusted (this doesn't need to be a part of this patch).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104827/new/

https://reviews.llvm.org/D104827



More information about the llvm-commits mailing list