[PATCH] D19821: [EarlyCSE] Optionally use MemorySSA. NFC.

Tue Sep 6 13:33:10 PDT 2016

To follow up on this, I've been experimenting with an alternative usage 
of MemorySSA in EarlyCSE and would like to get your opinion on it.  For 
the pass of EarlyCSE added at the start of 
addFunctionSimplificationPasses(), instead of requiring MemorySSA as an 
analysis, I have EarlyCSE delay building MemorySSA until it is needed 
(i.e. until one of the simple memory generation checks fails).  
Furthermore, when it is constructed I added a flag to turn off the 
memory use optimization, on the assumption that EarlyCSE is only going 
to query a small subset of the loads/stores in the function (I'm still 
using getClobberingMemoryAccess() to do the check).  I also don't 
preserve MemorySSA, side-stepping the issues of having to update 
uses/phis when removing stores.  This approach misses only a few 
optimization opportunities and brings the compile-time impact down into 
the noise for all benchmarks in the test-suite at -O2.
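
To make the lazy-construction idea concrete, here is a minimal, self-contained 
sketch.  It does not use the real LLVM classes: ToyMemoryAnalysis, ToyEarlyCSE, 
Inst and isClobberedBetween are hypothetical stand-ins invented for 
illustration, and the toy does not model MemorySSA's use optimization at all.  
It only shows the control flow: try the cheap generation check first, and 
build the expensive analysis the first time that check fails.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Toy instruction: a load or a store to an abstract location ID.
struct Inst {
  bool IsStore;
  int Loc;
};

// Hypothetical stand-in for MemorySSA; constructing it is the costly step.
class ToyMemoryAnalysis {
public:
  static int NumBuilt; // counts constructions, for demonstration only

  explicit ToyMemoryAnalysis(const std::vector<Inst> &Insts) : Insts(Insts) {
    ++NumBuilt;
  }

  // True if some store strictly between Earlier and Later writes Loc.
  bool isClobberedBetween(std::size_t Earlier, std::size_t Later,
                          int Loc) const {
    for (std::size_t I = Earlier + 1; I < Later; ++I)
      if (Insts[I].IsStore && Insts[I].Loc == Loc)
        return true;
    return false;
  }

private:
  const std::vector<Inst> &Insts;
};
int ToyMemoryAnalysis::NumBuilt = 0;

// Hypothetical stand-in for the pass: a cheap generation check first, with
// the analysis built lazily only when that check fails.
class ToyEarlyCSE {
public:
  explicit ToyEarlyCSE(const std::vector<Inst> &Insts) : Insts(Insts) {
    // Generation of an instruction = number of stores that precede it.
    unsigned Gen = 0;
    for (const Inst &I : Insts) {
      Generation.push_back(Gen);
      if (I.IsStore)
        ++Gen;
    }
  }

  bool isSameMemGeneration(std::size_t Earlier, std::size_t Later) {
    if (Generation[Earlier] == Generation[Later])
      return true; // cheap check succeeded, no analysis needed
    if (!MA) // first failure of the cheap check: build the analysis now
      MA = std::make_unique<ToyMemoryAnalysis>(Insts);
    return !MA->isClobberedBetween(Earlier, Later, Insts[Later].Loc);
  }

private:
  const std::vector<Inst> &Insts;
  std::vector<unsigned> Generation;
  std::unique_ptr<ToyMemoryAnalysis> MA;
};
```

In this sketch a function whose queries all pass the generation check never 
pays for the analysis, and the analysis is simply dropped with the pass 
object rather than preserved.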

Does this seem like a reasonable approach for the near term?  It seems 
okay to me to miss out on having EarlyCSE preserve MemorySSA since it 
seems like it is going to get invalidated pretty soon in the pipeline 
anyway (e.g. by JumpThreading).  Are there any internal issues that 
would be raised by building an unoptimized version of MemorySSA?  I 
would guess not since there is already an optimization threshold that 
prevents MemorySSA from always being fully optimized.

On 8/25/2016 1:11 PM, Daniel Berlin wrote:
>
>
> On Thu, Aug 25, 2016 at 8:36 AM, Geoff Berry <gberry at codeaurora.org> wrote:
>
>     Sounds good.  I'll go ahead and check in what I have now (since it
>     is NFC) and we can proceed from there.
>
>     I'm currently trying out an approach where upon removing a store I
>     eagerly re-optimize phis, but just mark uses as dirty, then only
>     re-compute the clobber for dirty uses if they come up in an
>     isSameMemGeneration query or at the end of the pass, which is sort
>     of a hybrid of the two approaches you described.
>
> Okay.
> Internally, I'm going to add an optimized bit that gets reset, and 
> make the default walker only rewalk uses that are not optimized (and 
> optimize them).
>
> That way if you just call getClobberingMemoryAccess, you'd get the 
> right answer all the time and it would only be expensive when it needs 
> to actually be recomputed.
>
> Note that this will be imperfect.
> It relies on not having indirect uses (which use optimization 
> initially guarantees).
>
> That is, it relies on every load that uses a given store having that 
> store as its clobbering definition.
>
> Otherwise, when you remove/RAUW the store, you will not invalidate the 
> full set of uses.
>
>
> I will also make a batch updater that can do phi insertion and redo 
> use optimization for pieces of memoryssa.
>
> --Dan
>
>
>
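
For reference, the "optimized bit" scheme described in the quoted message 
could be sketched roughly as follows.  This is a toy model over a single 
straight-line sequence of memory operations, not the MemorySSA 
implementation: ToyWalker, MemOp, getClobberingAccess and removeStore are 
hypothetical names made up for illustration.  Each load caches its clobbering 
store together with an optimized bit; removing a store resets the bit only on 
loads whose cached clobber is that store, so only those loads are re-walked.

```cpp
#include <cstddef>
#include <vector>

// Toy memory operation: a load or a store to an abstract location ID.
struct MemOp {
  bool IsStore;
  int Loc;
};

// Hypothetical walker that caches one clobbering store per load.
class ToyWalker {
public:
  int NumWalks = 0; // counts real walks, for demonstration only

  explicit ToyWalker(const std::vector<MemOp> &InOps) : Ops(InOps) {
    Removed.assign(Ops.size(), false);
    Cached.assign(Ops.size(), -1);
    Optimized.assign(Ops.size(), false);
  }

  // Returns the index of the clobbering store for the load at Use, or -1
  // if no earlier store writes its location. Walks only when the cached
  // answer is not marked optimized.
  int getClobberingAccess(std::size_t Use) {
    if (!Optimized[Use]) {
      Cached[Use] = walk(Use);
      Optimized[Use] = true;
    }
    return Cached[Use];
  }

  // Removes a store and resets the optimized bit on the loads that have it
  // as their cached clobber. This is only complete if every load depending
  // on the store actually points at it (no indirect uses).
  void removeStore(std::size_t Store) {
    Removed[Store] = true;
    for (std::size_t I = 0; I < Ops.size(); ++I)
      if (!Ops[I].IsStore && Optimized[I] &&
          Cached[I] == static_cast<int>(Store))
        Optimized[I] = false;
  }

private:
  // Scans backwards for the nearest live store to the same location.
  int walk(std::size_t Use) {
    ++NumWalks;
    for (std::size_t I = Use; I-- > 0;)
      if (Ops[I].IsStore && !Removed[I] && Ops[I].Loc == Ops[Use].Loc)
        return static_cast<int>(I);
    return -1;
  }

  std::vector<MemOp> Ops;
  std::vector<bool> Removed;
  std::vector<int> Cached;
  std::vector<bool> Optimized;
};
```

The point of the sketch is that callers keep calling the same query and get 
an up-to-date answer, while the cost of re-walking is paid only for loads 
that a removal actually invalidated.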

-- 
Geoff Berry
Employee of Qualcomm Datacenter Technologies, Inc.
  Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
