[LLVMdev] Pointer Context Metadata (was: Parallel Loop Metadata)

Mon Feb 18 09:42:54 PST 2013

----- Original Message -----
> From: "Pekka Jääskeläinen" <pekka.jaaskelainen at tut.fi>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Andrew Trick" <atrick at apple.com>, "Tobias Grosser" <tobias at grosser.es>, "llvmdev at cs.uiuc.edu Dev"
> <llvmdev at cs.uiuc.edu>
> Sent: Monday, February 18, 2013 2:32:14 AM
> Subject: Re: [LLVMdev] Pointer Context Metadata (was: Parallel Loop Metadata)
> 
> On 02/17/2013 11:15 PM, Hal Finkel wrote:
> > If the unroller somehow differentiates the metadata coming from different
> > loop iterations, then BBVectorize can use this information as well. Even
> > better, we could make BasicAA understand that appropriately marked loads
> > and stores from different iterations don't alias. Then the AA-based
> > dependency breaker in the scheduler could also make use of the information.
> > Thoughts?
> 
> This is roughly what we did in our first version of work-group
> autovectorization in pocl that works on "fully unrolled wi-loops"
> (we call it the 'replication' work group generation method).
> 
> We forked BBVectorize into the pocl code base and added explicit
> knowledge of the separate work-items (that are really just parallel loop
> iterations) so it tries to pair the matching instructions from the
> different iterations (WIs) directly.
> 
> We also have an AA that exploits the independent-iteration (WI)
> information along with other OpenCL features that help AA (disjoint
> address spaces). We use this AA all the way down to our custom
> instruction scheduler for the TCE target to help the VLIW-style
> scheduling/bundling of multiple WIs.
> 
> I have hoped to get the BBVectorizer and the "unrolled parallel loop AA"
> functionality upstreamed as it applies to all fully parallel loops, not just
> the OpenCL "work-item loops", and I hate to have the forked BBVectorizer in
> pocl.

Agreed.

> 
> The metadata scheme should be thought through, however, to make it cleaner
> than our OpenCL-specific hackish attempt, and possibly usable for other
> similar "context-dependent scenarios".
> 
> The earlier idea I had was to attach "context information" to the
> memory accesses. In this case it would communicate that the mem access
> belongs (or belonged, if fully unrolled) to a loop and it can alias only with
> the accesses from the same iteration, or with accesses without the metadata.
> 
> Something like:
> 
> llvm.mem.parallel_loop_iteration [loopid] [iteration_id_integer]

Why don't we just add an optional iteration id to !llvm.mem.parallel_loop_access?

> 
> This can help the "pairing" of the BBVectorizer: it can try to pair with
> the different iterations first.

This makes sense.
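
To make the pairing preference concrete, here is a rough standalone sketch in Python. It is not BBVectorize code: the `Inst` record, its fields, and both function names are hypothetical, and the only thing it models is "prefer same-opcode candidates that come from different iterations of the same loop".

```python
from itertools import combinations
from typing import List, NamedTuple, Optional, Tuple


class Inst(NamedTuple):
    """Hypothetical stand-in for an instruction carrying iteration metadata."""
    name: str
    opcode: str
    loop_id: Optional[str]       # None models an instruction without the metadata
    iteration_id: Optional[int]


def is_cross_iteration(a: Inst, b: Inst) -> bool:
    # True only when both instructions are tagged with the same loop
    # but with different iteration ids.
    return (
        a.loop_id is not None
        and a.loop_id == b.loop_id
        and a.iteration_id is not None
        and b.iteration_id is not None
        and a.iteration_id != b.iteration_id
    )


def candidate_pairs(insts: List[Inst]) -> List[Tuple[Inst, Inst]]:
    # Only instructions with the same opcode are pairing candidates here.
    pairs = [(a, b) for a, b in combinations(insts, 2) if a.opcode == b.opcode]
    # sorted() is stable and False sorts before True, so cross-iteration
    # pairs move to the front while the remaining order is preserved.
    return sorted(pairs, key=lambda p: not is_cross_iteration(*p))
```

A real pairing heuristic would weigh many other factors; this only shows where the iteration id could feed into the candidate ordering.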

> The ParallelLoopIterationAA can look at this
> metadata and if the other instruction also has a parallel_loop_iteration MD
> that points to the same loopid (the self-referencing id metadata from the
> llvm.loop.parallel patch), check their iteration identifier, and if it's
> different, return NO ALIAS.

This also makes sense.
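
The rule described above can be modelled in a few lines. This is a minimal Python sketch, not LLVM code: `LoopAccessTag`, `AliasResult`, and `parallel_iteration_alias` are hypothetical names, the iteration id is treated as optional (as suggested earlier in this reply), and MAY_ALIAS here only means "this check cannot decide; ask the other analyses".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AliasResult(Enum):
    NO_ALIAS = "NoAlias"
    MAY_ALIAS = "MayAlias"  # this check alone cannot rule out aliasing


@dataclass(frozen=True)
class LoopAccessTag:
    """Hypothetical model of the per-access metadata."""
    loop_id: str                        # the loop's self-referencing id
    iteration_id: Optional[int] = None  # optional: absent means "unknown"


def parallel_iteration_alias(
    a: Optional[LoopAccessTag], b: Optional[LoopAccessTag]
) -> AliasResult:
    # An access without the metadata may alias with anything.
    if a is None or b is None:
        return AliasResult.MAY_ALIAS
    # Tags that refer to different loops say nothing about each other.
    if a.loop_id != b.loop_id:
        return AliasResult.MAY_ALIAS
    # Without iteration ids the two accesses cannot be told apart.
    if a.iteration_id is None or b.iteration_id is None:
        return AliasResult.MAY_ALIAS
    # Same parallel loop, different iterations: independent by construction.
    if a.iteration_id != b.iteration_id:
        return AliasResult.NO_ALIAS
    # Same iteration: ordinary aliasing rules apply.
    return AliasResult.MAY_ALIAS
```

For example, two accesses tagged `LoopAccessTag("loop0", 0)` and `LoopAccessTag("loop0", 1)` give `NO_ALIAS`, while any pair involving an untagged access falls through to `MAY_ALIAS`.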

> 
> A similar idea could be applied to preserve the 'restrict' info across
> function inlining:
> 
> llvm.mem.restricted_access [funcid] [pointerid]
> 
> Similarly, if the RestrictedPointerAA finds that both of the accesses are
> marked with this metadata and point to the same funcid, and the [pointerid]
> is different, it can return NO ALIAS.

Interesting idea. I worked out a proposal for more general restrict support some weeks ago, but have not yet had a chance to implement it. We should make sure that everything ends up integrated nicely.
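
For reference, the quoted [funcid]/[pointerid] rule reduces to a small predicate. The following is a standalone Python sketch with hypothetical names (`RestrictTag`, `restricted_no_alias`); it models only the rule as stated in the quoted text and deliberately ignores the finer points of C's restrict semantics.

```python
from typing import NamedTuple, Optional


class RestrictTag(NamedTuple):
    """Hypothetical model of the restricted-access metadata on one access."""
    func_id: str     # the function whose restrict arguments this refers to
    pointer_id: int  # which restrict pointer the access was derived from


def restricted_no_alias(a: Optional[RestrictTag], b: Optional[RestrictTag]) -> bool:
    # Untagged accesses give no information, so nothing can be concluded.
    if a is None or b is None:
        return False
    # Same function, different restrict pointers: report "no alias".
    # Any other combination is left to the other alias analyses.
    return a.func_id == b.func_id and a.pointer_id != b.pointer_id
```

After inlining, accesses that came through two different restrict arguments of the same callee would carry the same `func_id` and different `pointer_id` values, so the predicate returns `True` for that pair only.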

Thanks again,
Hal

> 
> --
> Pekka
>