[PATCH] D15124: Use @llvm.invariant.start/end intrinsics to extend the meaning of basic AA's pointsToConstantMemory(), for GVN-based load elimination purposes [Local objects only]

Wed Jan 13 01:59:42 PST 2016

----- Original Message -----
> From: "Chandler Carruth" <chandlerc at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>, "Larisse Voufo" <lvoufo at gmail.com>
> Cc: reviews+D15124+public+ddba332c127856cb at reviews.llvm.org, "llvm-commits" <llvm-commits at lists.llvm.org>
> Sent: Tuesday, January 12, 2016 9:00:17 PM
> Subject: Re: [PATCH] D15124: Use @llvm.invariant.start/end intrinsics to extend the meaning of basic AA's
> pointsToConstantMemory(), for GVN-based load elimination purposes [Local objects only]
> 
> 
> Sorry to dig up an old email, but I'm going through the history here,
> and I think this is the right place to comment on one particular
> aspect of this design:
> 
> 
> 
> On Fri, Dec 11, 2015 at 4:03 PM Hal Finkel via llvm-commits <
> llvm-commits at lists.llvm.org > wrote:
> 
> 
> > I think a larger problem here may be that we actually need
> > post-dominance information, as in "does this load instruction
> > post-dominate an invariant_start (associated with the loaded-from
> > pointer values), and if an equivalent invariant_end exists, does it
> > not post-dominate but rather dominate the invariant_end?"...
> > However, I also understand from Chandler that LLVM has a history of
> > avoiding post-dominance info computation in part because it is just
> > too expensive.
> 
> Having post-dominance information is the right solution here, and
> that's what you should use. Yes, we've traditionally thought of it
> as expensive, but that seems less a function of theory than of
> infrastructure. Computing dominance information is also expensive,
> but we mitigate this by caching the analysis, and actively updating
> it to avoid unnecessary invalidations.
> 
> Last I spoke to Chandler about this, it seemed like the right way to
> go about this was to compute a post-dominance tree (which is,
> strictly speaking, not really a tree because it can have multiple
> roots), at least for those blocks reachable from the entry block, as
> part of the same walk that computes the dominator tree. Then we need
> to update it whenever we update the dominator tree. If we do that,
> it won't be expensive, and we can use it.
> 
> 
> 
> While I generally agree that this is the right direction for the LLVM
> project, I don't think that this patch is the right place to force
> the issue.
> 
> 
> Specifically, there is a very large amount of infrastructure work
> which will need to be done to make post-dominators readily
> available. As you outline, we'll need to change the postdom pass,
> and teach *many* other passes to preserve it when they currently
> preserve the dominator tree. I *do* think we should do this, but I
> don't think that we should hold up invariants until that is ready.
> 
> 
> So I think the correct direction for the initial work on invariants
> is to figure out the conservatively correct approach *without*
> post-dominance information, and leave clear comments about how to
> augment the results when post-dominance becomes readily available.

Actually, can you please remind me why we want post-dominance information here at all? For the invariant_start -> load relationship, it seems like dominance information is what we want: all paths to the load need to pass through the invariant_start (or some invariant_start, should there be multiple ones for the same pointer). Regarding invariant_end, we need to make sure that no paths from the invariant_start to the load pass through an invariant_end. For this, it seems conservatively correct to check that the load dominates the invariant_end (or all invariant_ends, should there be multiple), and that any containing loop has backedges that branch to blocks dominating the invariant_start(s). We could use post-dom information for this check, but that's just a different approximation.

 invariant_start
     /     \
     |    load #1
     |     |
     \    /
      \  /
       |
      load #2
     /   \
    /     \
   ret     \
           invariant_end
             \
              \
              ret

So in this case (assuming my attempt at ASCII art is successful), both loads can be considered invariant: both are dominated by the invariant_start, and there are no paths from the invariant_start to either load passing through an invariant_end. While load #2 is post-dominated by the invariant_end w.r.t. one of the exits, it is not w.r.t. all exits. load #1 does not even post-dominate the invariant_start, and that's fine. However, load #1 does not dominate the invariant_end either, so a check requiring the load to dominate the invariant_end would not prove it invariant.
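
To make the dominance claims about the diagram concrete, here is a rough sketch on a toy graph. This is plain Python, not LLVM code: the node names, the CFG dictionary, and both function names are made up for illustration, and it assumes an acyclic CFG in which every node is reachable from the entry, so the loop-backedge condition is not modelled at all.

```python
# Toy CFG for the diagram; node names are hypothetical.
CFG = {
    "start": ["load1", "load2"],  # invariant_start, also the entry here
    "load1": ["load2"],
    "load2": ["ret_a", "end"],
    "end": ["ret_b"],             # invariant_end
    "ret_a": [],
    "ret_b": [],
}


def dominators(cfg, entry):
    """Map each node to the set of nodes dominating it (simple iterative dataflow)."""
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)
    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in preds[n]]
            new = {n} | set.intersection(*pred_doms) if pred_doms else {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def conservatively_invariant(cfg, entry, start, load, ends):
    """Dominance-only approximation: start dominates load, load dominates every end."""
    dom = dominators(cfg, entry)
    return start in dom[load] and all(load in dom[e] for e in ends)
```

On this graph the check accepts load #2 and rejects load #1, even though load #1 is in fact invariant; that is the conservative gap described above.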

Maybe we need a dedicated CFG walk to collect the necessary information?
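
One possible shape for such a walk, again as a toy Python sketch rather than anything resembling the actual patch: the graph encoding and the names reachable and load_is_invariant are assumptions for illustration, there is a single invariant_start, and the start, load, and end nodes are assumed distinct. It checks the two conditions directly with plain reachability instead of dominator trees.

```python
# Toy CFG for the diagram; node names are hypothetical.
CFG = {
    "start": ["load1", "load2"],  # invariant_start, also the entry here
    "load1": ["load2"],
    "load2": ["ret_a", "end"],
    "end": ["ret_b"],             # invariant_end
    "ret_a": [],
    "ret_b": [],
}


def reachable(cfg, src, blocked=frozenset()):
    """Nodes reachable from src over one or more edges, never entering a blocked node."""
    seen = set()
    stack = [s for s in cfg[src] if s not in blocked]
    while stack:
        n = stack.pop()
        if n in seen:
            continue
        seen.add(n)
        stack.extend(s for s in cfg[n] if s not in blocked and s not in seen)
    return seen


def load_is_invariant(cfg, entry, start, load, ends):
    # (1) Every path from the entry to the load must pass through start:
    #     the load must be unreachable once start is blocked.
    if entry != start:
        if load == entry or load in reachable(cfg, entry, blocked={start}):
            return False
    # (2) No path start -> invariant_end -> load may exist.
    after_start = reachable(cfg, start)
    for end in ends:
        if end in after_start and load in reachable(cfg, end):
            return False
    return True
```

With this walk both loads in the diagram come out invariant. Adding a hypothetical backedge from the invariant_end block to load #2 makes load #2 fail the check, since a path through the invariant_end then reaches it. The walk is conservative for loops that re-execute the invariant_start, because it does not treat a second pass through start as re-establishing invariance.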

 -Hal

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory