[PATCH] D66987: [InlineCost] Perform caller store analysis

Tue Sep 3 17:06:18 PDT 2019

mpdenton added a comment.

In D66987#1656628 <https://reviews.llvm.org/D66987#1656628>, @chandlerc wrote:

> This idea isn't fundamentally flawed, its a good idea and something we've discussed many times.
>
> The tricky thing I think will be getting it right.
>
> I think you're going to need a much more powerful form of analysis, specifically something like MemorySSA to do this kind of thing well. Integrating MemorySSA into the inliner and using it to do powerful store->load forwarding would be *awesome*, but it is also going to be a quite a lot of work I fear.
>
> Doing something simpler is likely possible along the lines of what you're starting to do here, but I fear we will have to chase a very long tail of subtle corner cases even with these kinds of simplifications (only look at stores in the callee, only look at constant offsets, etc. etc.).
>
> Not sure how far down you want to go on this rabbit hole? That'll somewhat dictate the direction I suggest...

Indeed, I saw the MemorySSA comment in the source. :) I'm interested in the rabbit hole.

As for this simple case, I tried to make the patch heavily integrated with the current inlining infrastructure, so that corner cases are primarily handled by by the existing logic. For example, most of the information used comes from the ConstantOffsetPtrs map and the SROAArgValues map. SimplifiedValues map is used for propagating constants, just like for the function arguments. And, clearing the map of "CallerStackConstants" is handled by the existing disableLoadElimination(), which is called whenever future loads may be clobbered.

So, the only non-trivial things added here are
(1) CallerStackConstants map, which has the corner case of overlapping stores (hasn't been dealt with)
(2) The caller "store search" performed in searchForConstantAtOffset. It's quite simple and doesn't go past many calls, nor past the callsite's basic block. These simplifications mostly eliminate all the corner cases.

So (1) has a corner case that should be solvable, and (2) might have some corner cases--can anything modify memory/clobber stores other than Callbase's and Stores?
There are plenty of corner cases in the inliner altogether, but I think the additions here don't necessarily *add* many extra.

As for efficacy, (2) is the part that is sad to do without MemorySSA. However, it covers some fairly significant cases that come up all the time in C++, as with the no-op destructors.

It seems it would be cool to do something like this first, and then dive down the MemorySSA rabbit hole to see how much it can add.

Thoughts?

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66987/new/

https://reviews.llvm.org/D66987