[PATCH] D93656: Moving UniqueInternalLinkageNamesPass to the start of IR pipelines.

Tue Dec 22 13:39:29 PST 2020

dblaikie added inline comments.

================
Comment at: llvm/include/llvm/IR/DebugInfoMetadata.h:2056-2059
+  void replaceRawLinkageName(MDString *LinkageName) {
+    replaceOperandWith(3, LinkageName);
+  }
+
----------------
hoy wrote:
> tmsriram wrote:
> > dblaikie wrote:
> > > The need to add this API makes me a bit uncertain that it's the right direction - any chance you could find other examples of Function duplication in LLVM passes (maybe the partial inliner? Perhaps when it partially inlines an external function it has to clone the function so the external version remains identical?) and how they deal with debug info? Perhaps there's an existing API/mechanism to update the DISubprogram as you want without adding this.
> > This does not seem like a radical new API to me.  We could use existing setOperand or replaceOperandWith here but need a public wrapper to expose it, just like getRawLinkageName.  For example, DICompositeType is mutated like this with buildODRType.
> Yeah, it would be nice to leverage an existing API like this but unfortunately I haven't found any.
> 
> Regarding passes that duplicate functions, I think the renaming done by `UniqueInternalLinkageNamesPass` is transparent to subsequent passes, since it is placed very early in the pipeline before any inlining or cloning passes.
Not super radical, no - but more "we haven't had a need for this so far, I wonder how other similar use cases (if there are any) deal with this situation without this API addition"

> Regarding passes that duplicate functions, I think the renaming done by UniqueInternalLinkageNamesPass is transparent to subsequent passes, since it is placed very early in the pipeline before any inlining or cloning passes.

Sorry, I wasn't meaning to ask "how does this pass interact with those other passes" but "how do those other passes deal with debug info, and can we learn anything from them/use similar techniques here that they use in their implementations"

================
Comment at: llvm/lib/Transforms/Utils/UniqueInternalLinkageNames.cpp:48
+      if (UniqueDebugAndProfileNames) {
+        F.addFnAttr("sample-profile-suffix-elision-policy", "selected");
+        if (DISubprogram *SP = F.getSubprogram()) {
----------------
hoy wrote:
> dblaikie wrote:
> > What's this about/for? Should this be in an independent change?
> This is about how the sample profile loader deals with renamed functions. The sample loader will use the original function name (i.e, by ignoring all suffixes appended) to retrieve a profile by default. With the attribute set here. the sample loader will only ignore certain suffixes not including the one (.__uniq.) appended here. Please see `getCanonicalFnName` in `SampleProf.h` for details. The reason that names are uniquefied here for AutoFDO is to have the sample loader differentiate same-named static functions, so the change is sort of related here.
> 
> I'll put more comments for that.  
Sounds like an independent patch, perhaps? (ie: Adding this attribute is relevant even if we weren't fixing the debug info name? And fixing the debug info name would be relevant even if we weren't adding this attribute) It'd be good for it to be in a separate patch to clarify the intent, test coverage, etc.

================
Comment at: llvm/lib/Transforms/Utils/UniqueInternalLinkageNames.cpp:51
+          auto Name = MDB.createString(F.getName());
+          SP->replaceRawLinkageName(Name);
+        }
----------------
tmsriram wrote:
> hoy wrote:
> > tmsriram wrote:
> > > Do we need to check if it had a rawLinkageName before replacing it?  
> > Good point. `rawLinkageName` is missing for C programs. I think it might still be appropriate to assign a new linkage name in the case so as to differ from the original source function name. What do you think?
> @dblaikie Adding the expert here.  As far as I understand, the linkage name is the mangled name and this would capture this too correctly. 
Best guess, yes.

Though the C case is interesting - it means you'll end up with C functions with the same DWARF 'name' but different linkage name. I don't know what that'll do to DWARF consumers - I guess they'll probably be OK-ish with it, as much as they are with C++ overloading. I think there are some cases of C name mangling so it's probably supported/OK-ish with DWARF Consumers. Wouldn't hurt for you to take a look/see what happens in that case with a debugger like gdb/check other cases of C name mangling to see what DWARF they end up creating (with both GCC and Clang).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93656/new/

https://reviews.llvm.org/D93656