[PATCH] D73307: Unique Names for Functions with Internal Linkage

Reid Kleckner via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 19 14:15:20 PDT 2020


rnk added a comment.

I guess my initial reaction is that it is disappointing that downstream consumers can't cope with non-unique symbol names (What is the point of having internal linkage if we can't rely on it?), but given @mtrofin's input about AFDO, this seems like it may be a practical solution. I can think of at least two other cases where I have wanted similar functionality, the first being unique names for CodeView types in anonymous namespaces, the second being Jumbo support, producing one object from two .cpp source inputs.

To handle the case for type identifiers, we came up with this solution:
https://github.com/llvm/llvm-project/blob/master/clang/lib/AST/MicrosoftMangle.cpp#L401
Your solution is pretty similar.

In D73307#1839829 <https://reviews.llvm.org/D73307#1839829>, @MaskRay wrote:

> I haven't followed Propeller development, but hope someone with profile experience can confirm InternalLinkage is the only linkage we need to care about (otherwise the option will be a misnomer if we ever extend it) and check whether this feature is useful on its own.


There is one other "local" linkage type, `private` linkage, which I believe corresponds with temporary assembler labels (.L*) that don't survive into the object file symbol table. I think the main use case for private globals is to save on symbol table size, so if the user passes this flag, it seems reasonable to upgrade the linkage to internal, and assign a unique name.

There is also the case of unnamed globals, which I think are private. I don't think clang ever generates these, so they are not very interesting.

I grepped the clang/test subdirectory for 'define.*private', and it seems there are no tests for private functions, so I doubt clang ever emits them.

---

At a higher level, should this just be an IR pass that clang adds into the pipeline when the flag is set? It should be safe to rename internal functions and give private functions internal linkage. It would be less invasive to clang and have better separation of concerns. As written, this is based on the source filename on the module, which is accessible from IR. The only reason I can think of against this is that the debug info might refer to the function linkage name, but maybe that is calculated later during codegen.



================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1062
 
   return std::string(Out.str());
 }
----------------
We seem to be defeating NRVO with our "clever" small string type. :(


================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1125
   const auto *ND = cast<NamedDecl>(GD.getDecl());
   std::string MangledName = getMangledNameImpl(*this, GD, ND);
 
----------------
There are several other callers of getMangledNameImpl. Should they also use this suffix? They mostly have to do with function multiversioning. Those seem like they could be internal and need this treatment.


================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1131
+  if (getCodeGenOpts().UniqueInternalFuncNames &&
+      dyn_cast<FunctionDecl>(GD.getDecl()) &&
+      getFunctionLinkage(GD) == llvm::GlobalValue::InternalLinkage &&
----------------
`isa` would be more idiomatic than testing `dyn_cast` results for truth.


================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1135
+    llvm::MD5 Md5;
+    Md5.update(getModule().getSourceFileName());
+    llvm::MD5::MD5Result R;
----------------
davidxl wrote:
> Source filenames are not guaranteed to be unique, or it does contain the path as well?
It appears to contain the path as the compiler receives it on the command line. However, it is possible to design a build where this string is not unique:
```
cd foo
clang -c name_conflict.c 
cd ../bar
clang -c name_conflict.c
clang foo/name_conflict.o bar/name_conflict.o
```

I don't think we should try to handle this case, but we should document the limitation somewhere. Some build systems do operate changing the working directory as they go.

Can you add some to the clang/docs/UsersManual.rst file? I'd expect it to show up here:
https://clang.llvm.org/docs/UsersManual.html#command-line-options

You can elaborate that the unique names are calculated using the source path passed to the compiler, and in a build system where that is not unique, the function names may not be unique.

---

I have also used this construct in the past for building single-file ABI compatibility test cases:

```
// foo.cpp
#ifdef PART1
// ...
#elif defined(PART2)
// ...
#endif

$ cc -c foo.cpp -DPART1
$ cc -c foo.cpp -DPART2
```


================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1139
+    SmallString<32> Str;
+    llvm::MD5::stringifyResult(R, Str);
+    std::string UniqueSuffix = ("." + Str).str();
----------------
I think it would be preferable to calculate the source filename MD5 hash once, when CodeGenModule is constructed, stringify it, and then use that every time we mangle a name.


================
Comment at: clang/test/CodeGen/unique-internal-funcnames.c:3
+
+// RUN: %clang -target x86_64 -S -o - %s | FileCheck %s --check-prefix=PLAIN
+// RUN: %clang -target x86_64 -S -funique-internal-funcnames -o -  %s | FileCheck %s --check-prefix=UNIQUE
----------------
tmsriram wrote:
> davidxl wrote:
> > MaskRay wrote:
> > > You can hardly find any .c -> .s test in clang/test. We mostly do .c -> .ll testing. `.ll` -> `.s` are in llvm/test/CodeGen. 
> > Is this convention documented somewhere? Any concerns of producing .s?
> There are more than 200 tests in clang that do -S from .c -> .s and tens of tests in CodeGen alone.  I can change this to .ll but curious where this "hardly any" comes from?
I can't specifically find documentation for this convention, but it is generally understood that every feature should have targeted unit tests that exercise the minimum amount of functionality. In this case, that means using `%clang_cc1 -emit-llvm -triple NNN` and checking the IR.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73307/new/

https://reviews.llvm.org/D73307





More information about the cfe-commits mailing list