[PATCH] D83906: [CodeGen] Emit a call instruction instead of an invoke if the called llvm function is marked nounwind

Thu Mar 9 10:00:02 PST 2023

hoy added a comment.

In D83906#4180435 <https://reviews.llvm.org/D83906#4180435>, @dexonsmith wrote:

> Oh, de-refining is pretty nifty / evil. This patch has background:
> https://reviews.llvm.org/D18634
>
> Since 2016, the optimizer is not allowed to do IPA on functions that can be de-refined (such as `linkonce_odr` functions).
>
> Here's a hypothetical problematic scenario for the optimizer:
>
> - original IR for B has a `throw` somewhere
> - function A invokes function B
> - in this TU, B is optimized and removes exceptions, and gets marked `nounwind`
> - function A leverages the `nounwind` to turn the `invoke` into a `call`
> - function B is de-refined at link/load time: linker chooses a *different* function B which still has a `throw`
> - "evil magic" happens (see the discussions around the patch linked above)
> - a crash is introduced
>
> At first blush, it sounds like this could only be a problem if the code has UB in it. However, read the above patch (and follow-ups, and related discussion) for a variety of examples of non-UB cases where IPA on de-refineable functions introduces crashes. I don't know for sure this could be a problem for `nounwind` specifically, but in general the LLVM optimizer doesn't look at attributes of de-refineable functions.
>
> (Note that if you're doing LTO (like AutoFDO), this usually won't block optimization, since at LTO time there are very few de-refineable functions (most `linkonce_odr` functions are internalized, and not exported as weak). So if we added a `-cc1` flag to prefer "stable IR" over "frontend peepholes", it would make sense for `-flto` to imply it.)
>
> On the other hand, the frontend knows the token sequence from the source language. It knows whether function `B` is inherently `nounwind` based on its ODR token sequence; in which case, it's safe to use the attribute for an IPA peephole.
>
> ----
>
> BTW, I'm not personally against removing this peephole from the frontend (even without a flag), or limiting it somehow to cases where it doesn't make IR output unstable. I like the idea of stable IRGen output.
>
> Nevertheless, it feels like removing IPA-based peepholes from Clang in the name of stable IRGen output is a change in direction, which might deserve a discussion in the forums rather than in a particular patch review.

In D83906#4181876 <https://reviews.llvm.org/D83906#4181876>, @rjmccall wrote:

> There are *some* properties we can still assume about `linkonce_odr` functions despite them being replaceable at link time.  The high-level language guarantee we're starting from is that the source semantics of all versions of the function are identical.  The version of the function we're looking at has been transformed from the original source — it is, after all, now LLVM IR, not C/C++ — but it has presumably faithfully preserved the source semantics.  We can therefore rely on any properties of the semantics that are required to be preserved by transformation, which includes things like "does it terminate", "what value does it return", "what side effects does it perform", and so on.  What we can't rely on are properties of the implementation that are not required to be preserved by transformation, like whether or not it uses a certain argument — transformations are permitted to change that.
>
> The output-stability argument is an interesting one.  The critical thing here is to avoid instability on the same source.  When the source is different, I mean, it'd be nice to make a best effort at stability, but even putting optimization aside, things like header processing order or template instantiation order are necessarily going to affect things like order in the functions lists.  That's going to affect output, at the very least in terms of object file order, but also in that we can't realistically promise that function processing order in the optimization will *never* have any impact.  Our interprocedural passes generally try to work in call-dependency order, but that's not a perfect tree, and function order inevitably comes into it.
>
> With all that said, I don't feel strongly that we need to preserve this frontend optimization if it's causing real problems.

Thanks for the detailed explanation about de-refining!

I feel a bit confused about `linkonce_odr`. From the LLVM IR reference I see the definition of

  linkonce_odr, weak_odr
  Some languages allow differing globals to be merged, such as two functions with different semantics. Other languages, such as C++, ensure that only equivalent globals are ever merged (the “one definition rule” — “ODR”). Such languages can use the linkonce_odr and weak_odr linkage types to indicate that the global will only be merged with equivalent globals. These linkage types are otherwise the same as their non-odr versions.

It sounds to me that at link time only equivalent symbols can replace each other. Then de-refining some of those equivalent symbols should not affect their semantics as far as nothrow is concerned? Just as @rjmccall pointed out, the C++ language guarantee we're starting from is that the source semantics of all versions of the function are identical.

That said, the LLVM optimizer does not strictly subsume the front-end because of how it fails to handle `linkonce_odr` functions as in https://reviews.llvm.org/D18634. I'm wondering how common the `linkonce_odr` linkage is for C++. In @wlei's example, none of the functions there is `linkonce_odr`. Is there a particular source-level annotate that specifies functions to be `linkonce_odr`?

Discussing a path to stable IR gen in general in the forum would be great. In the meantime I'm appealing to remove this specific peephole to unblock AutoFDO, if nobody objects.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83906/new/

https://reviews.llvm.org/D83906