[PATCH] D88697: [WebAssembly] Rename Emscripten EH functions

Sam Clegg via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 6 08:07:49 PDT 2020


sbc100 accepted this revision.
sbc100 added inline comments.


================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp:205
+      if (Sym->getName().startswith("invoke_"))
+        if (!InvokeSymbols.insert(Sym).second)
+          continue;
----------------
Perhaps merge these two lines into single condition.


================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp:204
+      // Emscripten EH symbol so we don't emit the same symbol twice.
+      if (Sym->getName().startswith("invoke_"))
+        if (!EmSymbols.insert(Sym).second)
----------------
aheejin wrote:
> sbc100 wrote:
> > aheejin wrote:
> > > aheejin wrote:
> > > > sbc100 wrote:
> > > > > What happens here if there is a user symbol that starts with `invoke_`?   Perhaps getMCSymbolForFunction could also return bool if it detected an `__invoke_` wrapper? 
> > > > > 
> > > > > Might be worth adding a function called `invoke_ignoreme` as a normal symbol in one of the tests?
> > > > > What happens here if there is a user symbol that starts with `invoke_`?
> > > > 
> > > > We haven't handled this situation so far, and this patch does not change this. Invokes' format has set in Emscripten, so any user code that uses function name starting with `invoke_` has a problem now. I think we should consider prepending Emscripten's invokes with two underscores as in the names generated by LowerEmscriptenEHSjLj pass, given that functions with two leading underscores are mostly considered compiler-specific functions. But I'd like to do it as a separate patch, as that changes the interface between LLVM, Binaryen, and Emscripten. Do you think this is a good idea?
> > > > 
> > > > > Perhaps getMCSymbolForFunction could also return bool if it detected an `__invoke_` wrapper?
> > > > 
> > > > What will it be used for?
> > > I ended up renaming invokes to use two leading underscores in this patch. I wanted to it as a separate patch, but that will again need three different patches each from LLVM, Binaryen, and Emscripten, and landing it through the roller using a double roll twice is also a pain, so I ended up doing it here. Will do the same for the pending Binaryen and Emscripten paches.
> > I was actually thinking of the case where there a *not* a collision.
> > 
> > e.g. `invoke_foo` 
> > 
> > This cannot possibly collide today because `foo` is not a valid signature.  The code today seems to allow this and pass through such symbols.  wasm-emscripten-finalize includes this function in the normal `exports` section not in the `invokeFuncs` section of the metadata.
> > 
> > By renaming the functions to use two underscores we solve this issue but I still think it might be nicer for `getMCSymbolForFunction` to return an additional bool if it created a wrapper.. rather then trying to detect this post-hoc with a call to isEmscriptenInvokeWrapperName on the result.   
> I added `invoke_ignoreme` test. This test passes without a problem in the current code.
> 
> Adding `InvokeDetected` output parameter to `getMCSymbolForFunction` seems tricky. `getMCSymbolForFunction` can either create a new invoke symbol or retrieve an existing invoke symbol. 
> 
> Note that this function is used in two different places. When used in `MCInstLower`, it always creates a symbol. But when used in `AsmPrinter`, it can either create a symbol or retrieve an existing symbol. (It creates a symbol in `AsmPrinter` when the symbol has not been used in any instructions, such as calls, so it didn't have a chance to be created in `MCInstLower`.)
> 
> So when retrieving an already created invoke symbol, `getMCSymbolForFunction` just returns the existing name, so it cannot distinguish `invoke_vi` from `invoke_foo`. So if we only count newly created invoke symbols, we are gonna miss already created invoke symbols in calls from `emitEndOfAsmFile` in `AsmPrinter`.
> 
> But I'm not sure if this is a problem. Without this we can use `invoke_ignoreme` just fine. It will be put into `EmSymbols` set under the incorrect assumption that it is an invoke symbol, but that does not cause any problem here. We can actually put all symbols in the set; I didn't do that because I didn't want to grow a large set every time.
I think I still don't really understand.. isn't this condition basically trying to say saying "is this a symbol that possibly got remapped by `getMCSymbolForFunction`"?

if that is true then the more precise condition would be the same as the one in line of getMCSymbolForFunction wouldn't it: `if (EnableEmEH && isEmscriptenInvokeWrapperName(F->getName()))`  ?    

It seems like this line is trying to infer this condition based on the name of the resulting MC symbol.

But as you say maybe it doesn't matter if this condition is conservative?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88697/new/

https://reviews.llvm.org/D88697



More information about the llvm-commits mailing list