[PATCH] D63312: [Attributor] Deduce attributes for non-exact functions

Thu Jun 13 22:49:42 PDT 2019

jdoerfert marked 2 inline comments as done.
jdoerfert added a comment.

Thanks for the quick feedback!

In D63312#1542977 <https://reviews.llvm.org/D63312#1542977>, @nicholas wrote:

> > We can, however, use information from non-exact definitions to improve
> >  themselves as they are either replaced as a whole or not at all.
>
> I didn't understand the motivation of this patch by reading the description, I understood it with this comment. In particular I had the impression that you were using the attributes of one internalized function to optimize another internalized function.

With this patch there is no "inter-internalized-function communication". I have another one to do that. I think I have to reword the whole commit message to make it clear.

With these shallow wrappers you can only use the function itself when it comes to inter-procedural (IP) reasoning. That is why I need to check `WrappedFunctions` in `mayDependOnNonExactDefinition`.
You cannot do cross function reasoning because we actually do call the shallow wrapper which is a non-exact decleration. However, you can do recursive reasoning and manifest attributes on the function, parameters and return value of the internalized version.

> I'm curious what function-local optimizations we're missing? If I understand the design, you can deduce that a function is 'readnone', but no callers of the function can be optimized based on that deduction, since the 'readnone' body may be replaced. I'm mostly confident we don't use readnone to optimize the body of the function. Do you have an example of an attribute that we can deduce and then use to optimize the body of the function?

The shallow wrappers are not the solution I want to have. Real internalization is what we actually want (later patch).
Shallow wrappers are however not useless. We could, for example, derive `nofree` and `nosync` on them which together allow to transform `dereferenceable` arguments into `dereferenceable_globally` (D61652 <https://reviews.llvm.org/D61652>) arguments. Invariant loads of the latter can be hoisted out of loops. Direct recursion can be optimized this way as well.

I did run a test on the LLVM-TS and SPEC2006 with "returned" and "nocapture" attribute enabled in the Attributor.
The changes in statistics > 1% are listed here: https://gist.github.com/jdoerfert/aa94861fd59d1564436d2ab490164d19 
There was <5 benchmarks that were impacted wrt. compile and runtime. I don't have the numbers right now but I will show more elaborate results before we turn anything on. With deep wrappers 7 or so benchmarks got actually faster, 3-30% if I remember correctly.

> If it were my design de novo, I would prefer treating the function attributes as being "as authoritative as the linkage", which pushes the problem of deciding whether `F->doesNotReturn()` is sufficient to the caller, it needs to check the linkage (or `isExactDefinition`) before asking that question. Unfortunately that would churn the API as we would need to update attribute consumers to use a convenience wrapper API when they're examining a callee instead of their function parent, but I expect pushing that decision into the pass might be necessary to get corner cases optimal and correct in passes that look at multiple functions. That seems to me like it would be a lot of work, what do you think?

I dislike this solution mostly because it makes user information less valuable (assuming I understand you correctly). If the user gives us information, `__restrict__ -> noalias`, `int & -> dereferenceable(sizeof(int))`, etc. we should be able to use that to the fullest extend regardless of the linkage.

> Even if the wrapper function approach works, I don't think the representation is right. There aren't two functions, which is why you'll need to teach the inliner, etc., that these aren't truly functions. Without adding the concept to the LangRef, you've got two separate passes in LLVM that coordinate optimizations with each other though implementation details.

Even without teaching the inliner anything, deep wrappers did show a performance benefit (see above). I actually want to teach the inliner something, but I was hoping it is something general, namely:
If a function only contains a call and a return and the call can be implemented with a single `jmp` instruction (the arguments are passed in order,...), do not inline the call. This seems to me like a heuristic that could even save compile time while not sacrificing performance. I hope it would be applicable to user code as well. Even if not, I actually have another patch to use a similar wrapper concept in conjuction with the `__attribute__((callback(...)))`. The use case there is to allow us to unpack struct arguments with more than 3 members without ever actually increasing/changing the arguments of a function. That is useful to communicate information about the struct members to the callee (and potentially back).

> I'm not sure the wrapper can be used on functions that aren't marked `unnamed_addr` or you prove the address isn't taken? You update all users, which means that someone who takes the address of the function in this module might get a different address than someone who takes the address in a different module since they're two different internal functions now? Those pointers need to compare equal if you pass the function pointer around.

That is a very good point which I haven't considered yet. I will check for `unnamed_addr` and, if not present, check that the address is not taken.

================
Comment at: llvm/lib/Transforms/IPO/Attributor.cpp:1187
+
+  StringRef Prefix = "__internal_";
+  F.setName(Prefix + F.getName());
----------------
nicholas wrote:
> FYI, I don't think we have a reserved prefix for llvm-produced functions, but you can make it anonymous with `F.setName("");`. Since the function has internal linkage, it doesn't need a name to link with other modules.
I'll do that. The Prefix was just to avoid clashes ("__" is often reserved).

================
Comment at: llvm/lib/Transforms/IPO/Attributor.cpp:1310
+  };
+
   // Return attributes are only appropriate if the return type is non void.
----------------
This lambda got somehow in this review, will be committed before.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63312/new/

https://reviews.llvm.org/D63312