[PATCH] D15124: Use @llvm.invariant.start/end intrinsics to extend the meaning of basic AA's pointsToConstantMemory(), for GVN-based load elimination purposes [Local objects only]

Wed Jan 13 03:17:33 PST 2016

----- Original Message -----

> From: "Chandler Carruth" <chandlerc at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>, "Chandler Carruth"
> <chandlerc at gmail.com>
> Cc: reviews+D15124+public+ddba332c127856cb at reviews.llvm.org, "Larisse
> Voufo" <lvoufo at gmail.com>, "llvm-commits"
> <llvm-commits at lists.llvm.org>
> Sent: Wednesday, January 13, 2016 4:27:13 AM
> Subject: Re: [PATCH] D15124: Use @llvm.invariant.start/end intrinsics
> to extend the meaning of basic AA's pointsToConstantMemory(), for
> GVN-based load elimination purposes [Local objects only]

> On Wed, Jan 13, 2016 at 2:00 AM Hal Finkel via llvm-commits <
> llvm-commits at lists.llvm.org > wrote:

> > Actually, can you please remind me why we want post-dominance
> > information here at all?
> 
> FWIW, I also would like to see a clear explanation of this -- I
> haven't yet.

> > For the invariant_start -> load, it seems like dominance
> > information
> > is what we want (all paths to the load need to pass through the
> > invariant_start (or some invariant_start, should there be multiple
> > ones for the same pointer). Regarding invariant_end, we need to
> > make
> > sure that no paths from the invariant_start to the load pass
> > through
> > an invariant_end. For this, it seems conservatively correct to
> > check
> > that the load dominates the invariant_end (or all invariant_ends,
> > should there be multiple), and that any containing loop has
> > backedges that branch to blocks dominating the invariant_start(s).
> > We could use post-dom information for this check, but that's just a
> > different approximation.
> 

> Sure.

> I think that your statement is really hitting on the critical point
> here though: the issue is whether the invariant ends between the
> start and the load, nothing else. As you demonstrate, postdom
> actually gets the wrong answer:

> >       invariant_start
> >           /   \
> >          |   load #1
> >          |     |
> >           \   /
> >            \ /
> >             |
> >          load #2
> >           /   \
> >          /     \
> >        ret   invariant_end
> >                  \
> >                   \
> >                   ret
> 

> > So in this case (assuming my attempt at ASCII-art is successful),
> > both loads here can be considered to be invariant (both are
> > dominated by the invariant_start, and there are no paths from the
> > invariant_start to either load passing through an invariant_end).
> > However, while it is true that load #2 is post-dominated by the
> > invariant_end w.r.t. one of the exits, it is not w.r.t. all exits.
> > load #1 here does not even post-dominate the invariant_start, and
> > that's fine. However, load #1 does not dominate the invariant_end
> > either.
> 

> This is a great example. Larisse, this should really be a good test
> case for this patch.

> > Maybe we need a dedicated CFG walk to collect the necessary
> > information?
> 

> Yea. We check for the same kinds of things in a number of other
> places without post dominators. It might be useful to mirror the
> logic on them.

> However, maybe there are easier ways. Thinking about this, for GVN in
> particular, I think I see really easy ways to make this fast and
> reasonably precise... Check my reasoning:

> 1) The start must dominate the load.
> 2) Therefore, if an end exists on a path from the start to the load,
> the start must dominate the end.
> 3) GVN never introduces ends or changes the dominance relationship of
> starts and ends.
> 4) We scan the function finding all of these intrinsics ahead of time
> already.
> 5) We can build a map from domtree node (or basic block essentially)
> to the list of invariant end intrinsics in that block.
> 6) We can walk the dom tree from the start to the load, and check
> each end in each block

> I feel like I'm missing a case... do you see the hole in this plan?

This works if there's only one start. If there are multiple starts, then only the set of starts needs to dominate the load; no individual start has to:

invariant_start     invariant_start
       |                   |
       |___________________|
                 |
               load
                 |
                ret

At least in theory, this is fine too. The "right" way to solve this might be to set it up as a lattice/dataflow problem on the CFG and iterate until convergence (which should happen in two iterations, aside perhaps from pathological cases). The tricky part, I suspect, is the representation: you can't actually build a large map covering (all pointers x all loads) and then prune it.

Aside from that, you mark all pointers live-in to each block as invariant, except in the entry block and at the instructions that define the pointer value, then iterate the system, accounting for the invariant_start/end intrinsics, and you should find the maximal fixed point. The result would be a map of which pointers are invariant at which loads (which, as mentioned, you'd need to represent intelligently). Doing this might actually be faster than locally walking the CFG many times anyway.
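
To make the dataflow formulation concrete, here is a minimal sketch for a single pointer. It is not LLVM code: the event encoding ("start", "end", "def", "load:<name>"), the dictionary-based CFG, and the function names are all assumptions made for illustration, and it sidesteps the representation problem entirely by tracking only one pointer.

```python
from typing import Dict, List

# Hypothetical encoding: each block holds the events that touch ONE pointer:
#   "start" (invariant_start), "end" (invariant_end),
#   "def"   (the pointer value is defined here), "load:<name>".
Blocks = Dict[str, List[str]]
Edges = Dict[str, List[str]]  # block -> successor blocks


def transfer(state: bool, events: List[str], loads: Dict[str, bool]) -> bool:
    """Run one block's events over the incoming state; record it at each load."""
    for ev in events:
        if ev == "start":
            state = True
        elif ev in ("end", "def"):
            state = False
        elif ev.startswith("load:"):
            loads[ev[len("load:"):]] = state
    return state


def invariant_loads(blocks: Blocks, succs: Edges, entry: str) -> Dict[str, bool]:
    """Return, for each load, whether the pointer is invariant on every path."""
    preds: Dict[str, List[str]] = {b: [] for b in blocks}
    for b, ss in succs.items():
        for s in ss:
            preds[s].append(b)

    # Optimistic initial state: invariant everywhere. The meet (logical AND
    # over predecessors) can only lower it, so the loop reaches the maximal
    # fixed point and terminates.
    out = {b: True for b in blocks}
    loads: Dict[str, bool] = {}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            live_in = False if b == entry else all(out[p] for p in preds[b])
            new_out = transfer(live_in, blocks[b], loads)
            if new_out != out[b]:
                out[b] = new_out
                changed = True
    return loads


# Two starts, neither of which dominates the load on its own.
two_starts = invariant_loads(
    {"entry": [], "a": ["start"], "b": ["start"], "merge": ["load:x"]},
    {"entry": ["a", "b"], "a": ["merge"], "b": ["merge"]},
    "entry",
)
print(two_starts)
```

In this toy model the load in the merge block comes out invariant because both incoming edges carry the invariant state, even though no single start dominates it. A real implementation would have to do this for many pointers at once, which is exactly the representation question raised above.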

Here's another test case: 

     invariant_start
        ____|
       |    |
       |   load
       |    |
       |   invariant_end
       |____|
            |
           ret

Here, we can't treat the load as invariant, because there's a path from the invariant_start to the load that passes through the invariant_end via the loop backedge.
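
As a rough illustration of what a dedicated CFG walk for one load could look like, the sketch below walks backwards from the load and stops at each invariant_start. It is an assumption-laden toy, not the patch's logic: every node is a single instruction, `kinds` and `succs` are hypothetical dictionaries, and only one pointer is considered.

```python
from typing import Dict, List


def load_is_invariant(kinds: Dict[str, str],
                      succs: Dict[str, List[str]],
                      load: str) -> bool:
    """Backward walk from `load`.

    kinds maps each node to "start", "end", or "other". The load counts as
    invariant only if every backward path hits an invariant_start before it
    hits an invariant_end or runs off the top of the function.
    """
    preds: Dict[str, List[str]] = {n: [] for n in kinds}
    for n, ss in succs.items():
        for s in ss:
            preds[s].append(n)

    if not preds[load]:
        return False  # the load is the first instruction; no start precedes it

    seen = set()
    work = list(preds[load])
    while work:
        n = work.pop()
        if n in seen:
            continue
        seen.add(n)
        if kinds[n] == "start":
            continue  # this path is covered; do not look past the start
        if kinds[n] == "end" or not preds[n]:
            return False  # reached an end, or the entry, with no start between
        work.extend(preds[n])
    return True


# The loop from the diagram above: the backedge runs from invariant_end
# back to the loop header.
kinds = {"start": "start", "header": "other", "load": "other",
         "end": "end", "ret": "other"}
with_backedge = {"start": ["header"], "header": ["load"],
                 "load": ["end"], "end": ["header", "ret"]}
without_backedge = {"start": ["header"], "header": ["load"],
                    "load": ["end"], "end": ["ret"]}

print(load_is_invariant(kinds, with_backedge, "load"))
print(load_is_invariant(kinds, without_backedge, "load"))
```

With the backedge present, the walk reaches the invariant_end through the loop header and rejects the load; with the backedge removed, the only backward path ends at the invariant_start and the load is accepted.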

> It'll be a bit expensive because of the linear scan if there are a
> very large number of invariants. Maybe that's OK? Hard for me to
> tell...

GVN already scans the entire (reachable) function body, and its running time is dominated, by an order of magnitude, by the cost of AA queries. I doubt we'll notice. That having been said, we can avoid the linear scan if we use a registration approach for these intrinsics, just as we did with @llvm.assume (the AssumptionCache).
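
The registration idea can be pictured roughly as follows. This is only a sketch of the concept, not the AssumptionCache interface: the class `InvariantRegistry` and its methods are hypothetical names, and pointers and blocks are plain strings.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple


class InvariantRegistry:
    """Hypothetical per-function registry of invariant_start/end markers.

    Markers are registered once, when they are created, so a later query for
    one pointer does not have to rescan the whole function body.
    """

    def __init__(self) -> None:
        # pointer name -> list of (kind, block) pairs, in registration order
        self._markers: DefaultDict[str, List[Tuple[str, str]]] = defaultdict(list)

    def register(self, pointer: str, kind: str, block: str) -> None:
        if kind not in ("start", "end"):
            raise ValueError("kind must be 'start' or 'end'")
        self._markers[pointer].append((kind, block))

    def markers_for(self, pointer: str) -> List[Tuple[str, str]]:
        # Return a copy so callers cannot mutate the registry by accident.
        return list(self._markers.get(pointer, []))


registry = InvariantRegistry()
registry.register("p", "start", "entry")
registry.register("p", "end", "exit")
registry.register("q", "start", "entry")
print(registry.markers_for("p"))
```

A query then costs time proportional to the markers for that one pointer. Keeping such a registry up to date when intrinsics are deleted or moved is the part this sketch leaves out.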

Another issue worth considering: 

                entry
        __________|__________
       |                     |
invariant_start       invariant_start
       |                     |
      load                  load
       |                     |
      ...                   ...

Generally, we'd PRE this load (adding it to the entry block). Either the invariant_start will block this (which is bad), or we'll want to also PRE the invariant_starts to avoid PRE killing the invariant info. 

-Hal 

-- 

Hal Finkel 
Assistant Computational Scientist 
Leadership Computing Facility 
Argonne National Laboratory 