[PATCH] D127751: [MergeFunctions] Preserve symbols used llvm.used/llvm.compiler.used

Eli Friedman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 15 14:20:14 PDT 2022


efriedma added inline comments.


================
Comment at: llvm/lib/Transforms/IPO/MergeFunctions.cpp:841
+      // usually from inline asm.
+      if (G->hasGlobalUnnamedAddr() && !Used.contains(G)) {
         // G might have been a key in our GlobalNumberState, and it's illegal
----------------
nikic wrote:
> MaskRay wrote:
> > efriedma wrote:
> > > Amanieu wrote:
> > > > efriedma wrote:
> > > > > I'm a bit suspicious of the hasGlobalUnnamedAddr() check here.  hasGlobalUnnamedAddr() generally implies it's legal to merge two constants/functions, but we don't actually figure out if we're going to perform an actual merge at this point; that doesn't happen until we try to discard it, or call writeThunkOrAlias.  (unnamed_addr only allows rewriting all the uses of a variable/function at once, not just some of them.  Otherwise you end up with weird inconsistencies where a value isn't equal to itself.)
> > > > I think the code is valid as it is:
> > > > - If hasGlobalUnnamedAddr() == true then we replace all uses of the function with replaceAllUsesWith.
> > > > - Otherwise we only replace direct caller, which is equivalent to inlining the thunk/alias.
> > > replaceAllUsesWith() only replaces uses in the current module, not uses in other modules.  So despite the name, it's not replacing "all" the uses in the sense we need here.
> > > 
> > > If the function has internal linkage, we're fine because we're actually rewriting all the uses.  If we replace the function with an alias, we're fine because every use refers to the same address.  Otherwise, there's a problem: we're replacing the uses in the current module, but not the uses in other modules.
> > > 
> > > (Making sure unnamed_addr works correctly in this respect is more likely to be a practical issue with constant variables, as opposed to functions, but I'd rather keep this consistent if possible.)
> > hasGlobalUnnamedAddr was from D34805. So what's the recommendation here? Further restrict the condition to local linkage functions?
> Trying to understand what the concern here is based on an example. Would it be something like this https://llvm.godbolt.org/z/e3WYWr69P? Where in this module we observe that `@a == @b`, and then we have another module with linkonce_odr a,b (and lets say those definitions get picked by the linker) and we might observe `@a != @b` in that module?
> 
> 
> Trying to understand what the concern here is based on an example.

The fundamental invariant we want to ensure here is basically `@a == @a`.  Or more generally, "%a == %a" for any pointer that isn't poison/undef.  This should remain true even if one side of the comparison comes from a different module.

That also implies `@a == @b` is the same in all modules.

For a real-world example of how things can explode if unnamed_addr is mishandled in this respect, see https://github.com/llvm/llvm-project/issues/32127 .

> hasGlobalUnnamedAddr was from D34805. So what's the recommendation here? Further restrict the condition to local linkage functions?

Local linkage should be safe to optimize like this. Situations where canCreateAliasFor() is true should also be safe.  (Assuming the global isn't referred to by llvm.used.)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127751/new/

https://reviews.llvm.org/D127751



More information about the llvm-commits mailing list