[PATCH] D60074: [Attributor] Deduce "no-recurse" function attribute

Thu Jun 6 01:30:59 PDT 2019

nicholas added a comment.

In D60074#1531819 <https://reviews.llvm.org/D60074#1531819>, @jdoerfert wrote:

> In D60074#1531809 <https://reviews.llvm.org/D60074#1531809>, @efriedma wrote:
>
> > Is it actually correct to derive norecurse like this?
> >
> > Suppose I have two functions:
> >
> >   void f(int x) { if (x) { g(); } }
> >   void g() { f(0); }
> >
> >
> > As far as I can tell, it would be correct to mark g() norecurse.  And if g() is marked norecurse, we'll mark f() norecurse, which is wrong.
>
>
> We should/do not mark `g` as `norecurse` in this example. At least not if `f` is externally visible.

Could you explain a little further? I don't see how changing whether `f` is externally visible changes whether it's OK to mark `g` with `norecurse`. Here's my thinking. If `f` is externally visible that means it can be called from external code, and we have to make the worst-case assumption about the call stack, but in this case we know that `g` can't be on the call stack since `g` doesn't call any external functions. Is that right?

I think Eli's question can be considered in two parts. The first is about what LangRef permits, not what the Attributor pass will deduce. Is it legal to mark `g` with `norecurse`? If it is, some other user of LLVM, perhaps a user of LLVM as a library, may mark it `norecurse` themselves which would be correct/truthful/not a bug yet. The second part of Eli's question, is that if the first part is right and `g` has been annotated with `norecurse` before the Attributor runs, won't it mark `f` as `norecurse`, and that would be an incorrect deduction on the Attributor's part?

>> Granted, this is being sort of pedantic with the definition in LangRef, which says the function "does not call itself".  It might make sense to redefine norecurse to mean the function doesn't call "any function currently on the stack", instead of just itself.

The LangRef says "down any possible call path" which might be an attempt to make `g` not legal to be marked with `norecurse`. I think the wording is unclear, is it a "possible path" to have `g` -> `f` -> `g` or not? Statically both `g` -> `f` and `f` -> `g` calls are present, but we can also statically prove that the `g` -> `f` -> `g` route combination is impossible. I think LangRef's wording should focus on forbidding the function from occurring more than once on the call stack, and remove the wording about possible paths. The net result of that change would permit marking of `g` as `norecurse`.

I'm not sure the proposed "does not call any function currently on the stack" is right either. Consider a call stack of A -> B -> C -> B. If this is the call stack that is generated every time (there is no other caller of B or C and this happens for all calls to A) it would be true that A and C are `norecurse` and B is not. With the proposed definition, `c` could not be marked `norecurse`. I haven't considered whether it's useful for optimization to think of C as being `norecurse`.

If the above scenario were real, it would require that B has a branch that determines whether a call is made, splitting the function into a part that may recurse and a part that may not. Some day, we may want to pursue `norecurse` at the instruction level with the semantic of "after this instruction is executed, it is not executed again by any direct or indirect callee until the return of the instruction parent's function on the call stack" (awkward wording so as to not forbid loops within the function). Offhand I think this would also work to represent the result of inlining a `norecurse` callee into a non-norecurse caller. (Again, I haven't considered whether this information is useful for any optimization.)

> What we did before and with this patch is derive `norecurse` only if no call to a recursive function is contained. So in contrast to other attributes we actually do not look at the optimistic/assumed state of called functions.

I think the right fix is to extend it to derive `norecurse` only if no call to a recursive function or a function in the same SCC (even if marked `norecurse`) is contained. That would allow the attributor to avoid marking `f` as `norecurse` even if `g` were marked `norecurse`.

================
Comment at: llvm/lib/Transforms/IPO/Attributor.cpp:299
+      //       and:
+      //       http://llvm.1065342.n5.nabble.com/llvm-dev-Is-every-intrinsic-norecurse-td109617.html
+      if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) {
----------------
LLVM hosts the mailing list archives on lists.llvm.org. I think the "official" link would be https://lists.llvm.org/pipermail/llvm-dev/2017-June/113714.html

(I think I understand the problem here is that the discussion continued over multiple months and llvm's official archives are generated split per month. If that's the reason behind linking to nabble, I think it's worth a bug report on bugs.llvm.org , and I won't comment again about whether it's a link to lists.llvm.org or nabble.com -- though I'm rather unhappy that nabble has misconfigured HTTPS.)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60074/new/

https://reviews.llvm.org/D60074