[llvm-dev] LLVM Alias Analysis Technical Call - New Doodle Poll

Wed Jun 24 08:33:20 PDT 2020

Hi, everyone,

We had a great call last month, and progress is definitely being made on several fronts. The notes from our last call are available here:
  https://docs.google.com/document/d/1ybwEKDVtIbhIhK50qYtwKsL50K-NvB6LfuBsfepBZ9Y/edit#heading=h.vpxs8lkuxy79

and, also, pasted below.

DOODLE POLL:
As we discussed on our last call, I would like to schedule a regular call to discuss alias-analysis issues within LLVM. I've put together an initial Doodle poll to pick a time for this call: https://doodle.com/poll/9iqfaqttvvic5rfp

Please fill out this poll with the understanding that the meeting will recur every four weeks. If you're interested in participating, but none of the times on the poll would work for you, please let me know.

Notes from our last call:

****

  *   Scalability challenges and other issues discovered with the current infrastructure (especially, perhaps, with the noalias metadata).

     *   Issue #1: MDNode::intersect uses an O(n^2) algorithm. The operation does not scale for large NoAlias sets.

     *   Issue #2: ScopedNoAliasAAResult::mayAliasInScopes includes overhead (per query) of partitioning the input set based on the domain metadata.

     *   Issue #3: Memory footprint of the flattened Alias.Scope or NoAlias set can be large.

     *   Issue #4: correctness: current implementation has problems after inlining in a loop.
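
To make the complexity concern in Issue #1 concrete, here is a minimal sketch that contrasts a nested membership scan with a single merge pass over sorted inputs. It is a toy model, not LLVM code: ScopeId, intersectQuadratic and intersectSorted are hypothetical names, and it assumes the quadratic cost comes from scanning one operand list for every element of the other.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a scope metadata operand; not an LLVM type.
using ScopeId = std::uint32_t;

// Quadratic intersection: for every element of A, scan all of B.
// Cost is O(|A| * |B|).
std::vector<ScopeId> intersectQuadratic(const std::vector<ScopeId> &A,
                                        const std::vector<ScopeId> &B) {
  std::vector<ScopeId> Result;
  for (ScopeId X : A)
    if (std::find(B.begin(), B.end(), X) != B.end())
      Result.push_back(X);
  return Result;
}

// Linear intersection: one merge pass over two sorted inputs.
// Cost is O(|A| + |B|), but both inputs must already be sorted.
std::vector<ScopeId> intersectSorted(const std::vector<ScopeId> &A,
                                     const std::vector<ScopeId> &B) {
  assert(std::is_sorted(A.begin(), A.end()) &&
         std::is_sorted(B.begin(), B.end()));
  std::vector<ScopeId> Result;
  std::set_intersection(A.begin(), A.end(), B.begin(), B.end(),
                        std::back_inserter(Result));
  return Result;
}
```

The linear version only pays off if the operand lists are kept in a stable order, so the ordering would have to be maintained wherever the lists are built or updated.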

Notes:

  *   Jeroen notes that he has not observed scalability issues with the “full restrict” implementation.

  *   Michael notes that part of the scalability challenge comes from the fact that the noalias scheme marks all those things with which you don’t alias, but for restrict-like pointers, that’s the default for everything in that scope.

  *   Tarique notes that there may still be a need for AA from the frontend, e.g., for Fortran, above what the C-like restrict provides.

  *   Eric asks whether anyone is considering a ground-up design to support Fortran.

     *   Troy notes that we would prefer one representation to cover Fortran, C, etc. Fortran is, in some sense, simpler than C/C++. Also notes that language-specific AA can be a useful technique as well.

     *   Johannes notes that other languages, e.g., Rust, can also benefit, and we might need something where we can have universes, negated aliasing edges, etc.

     *   Michael notes that we can chain AA, and so it’s possible to have something more domain specific.

     *   Hal notes that if we have language-specific MD, we need to figure out how to maintain it (otherwise, all of the passes will drop it, but maintaining it means that enough of the semantics must be available). Jeroen notes that reusing the in-tree passes gets the most out of an implementation that is already correct in combination with loop unrolling, etc.

  *   Proposed solutions: progress, outstanding challenges, how to make progress going forward.

     *   Proposal #1: (Related to both Issue #1 and Issue #2) In the MDNode class, pre-partition the set of Metadata operands where each partition is an ordered set of Metadata operands belonging to a specific domain.

     *   Proposal #2: (Related to Issue #3) Design a hierarchical representation for the metadata operands in Alias.Scope and NoAlias sets.

     *   Proposal #3: https://reviews.llvm.org/D68484 provides the infrastructure to fix issue #4. It might also help with issue #1 and issue #3 as it makes it possible to share scopes.
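
As a rough illustration of Proposal #1, the sketch below partitions a scope list by domain once, keeps each partition sorted, and then answers a query with one ordered subset test per domain instead of re-partitioning on every query. Everything here is hypothetical (Scope, PartitionedScopes, mayAliasInScopesSketch); it is not the MDNode or ScopedNoAliasAAResult API. The rule it assumes is the documented scoped-noalias one: two accesses are treated as non-aliasing if, for some domain, the scopes of one access in that domain are a subset of the other access's noalias scopes in that domain.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-ins for scope and domain metadata; not LLVM types.
using DomainId = std::uint32_t;
using ScopeId = std::uint32_t;

struct Scope {
  ScopeId Id;
  DomainId Domain;
};

// A scope list partitioned once by domain; each partition is sorted and
// duplicate-free so that later queries can use ordered-set algorithms.
class PartitionedScopes {
public:
  explicit PartitionedScopes(const std::vector<Scope> &Scopes) {
    for (const Scope &S : Scopes)
      ByDomain[S.Domain].push_back(S.Id);
    for (auto &Entry : ByDomain) {
      std::vector<ScopeId> &Ids = Entry.second;
      std::sort(Ids.begin(), Ids.end());
      Ids.erase(std::unique(Ids.begin(), Ids.end()), Ids.end());
    }
  }

  const std::map<DomainId, std::vector<ScopeId>> &partitions() const {
    return ByDomain;
  }

private:
  std::map<DomainId, std::vector<ScopeId>> ByDomain;
};

// Returns false ("no alias") if, for some domain, every scope of the first
// access in that domain also appears in the second access's noalias list.
bool mayAliasInScopesSketch(const PartitionedScopes &AliasScopes,
                            const PartitionedScopes &NoAlias) {
  for (const auto &Entry : NoAlias.partitions()) {
    auto It = AliasScopes.partitions().find(Entry.first);
    if (It == AliasScopes.partitions().end())
      continue; // The access has no scopes in this domain.
    const std::vector<ScopeId> &ScopeIds = It->second;
    const std::vector<ScopeId> &NoAliasIds = Entry.second;
    // Ordered subset test: linear in the sizes of the two partitions.
    if (std::includes(NoAliasIds.begin(), NoAliasIds.end(),
                      ScopeIds.begin(), ScopeIds.end()))
      return false;
  }
  return true;
}
```

The partitioning cost is paid once when the list is built rather than on each query, which is the trade-off the proposal describes. Whether such a cache can live in the generic MDNode implementation is a separate question.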

Notes:

  *   Tarique presented challenges with the current prototype, including that there’s currently no way to differentiate noalias MD from other kinds of metadata, so it’s hard to keep a useful cache in the generic MDNode implementation.

  *   Jeroen presented some slides on the full-restrict implementation and noted how this scheme might solve or mitigate some of the scalability challenges associated with the pure-metadata solution.

  *   The full-restrict implementation is being used, and Jeroen is currently updating it based on feedback from that usage. He will also update the implementation to use provenance instead of sidechannel and repost it.

  *   Johannes notes that we need more reviewers for this patch set.

****

Thanks again,
Hal

Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

________________________________
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Finkel, Hal J. via llvm-dev <llvm-dev at lists.llvm.org>
Sent: Thursday, May 28, 2020 7:35 AM
To: Jeroen Dobbelaere <Jeroen.Dobbelaere at synopsys.com>; Alina Sbirlea <alina.sbirlea at gmail.com>
Cc: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>; Doerfert, Johannes <jdoerfert at anl.gov>
Subject: Re: [llvm-dev] LLVM Alias Analysis Technical Call - Doodle Poll

Hi, everyone,

Just a quick reminder, this call starts in approximately 1.5 hours.

At the present time, our agenda has:

  *   Scalability challenges and other issues discovered with the current infrastructure (especially, perhaps, with the noalias metadata).

     *   Issue #1: MDNode::intersect uses an O(n^2) algorithm. The operation does not scale for large NoAlias sets.

     *   Issue #2: ScopedNoAliasAAResult::mayAliasInScopes includes overhead (per query) of partitioning the input set based on the domain metadata.

     *   Issue #3: Memory footprint of the flattened Alias.Scope or NoAlias set can be large.

     *   Issue #4: correctness: current implementation has problems after inlining in a loop.

  *   Proposed solutions: progress, outstanding challenges, how to make progress going forward.

     *   Proposal #1: (Related to both Issue #1 and Issue #2) In the MDNode class, pre-partition the set of Metadata operands where each partition is an ordered set of Metadata operands belonging to a specific domain.

     *   Proposal #2: (Related to Issue #3) Design a hierarchical representation for the metadata operands in Alias.Scope and NoAlias sets.

     *   Proposal #3: https://reviews.llvm.org/D68484 provides the infrastructure to fix issue #4. It might also help with issue #1 and issue #3 as it makes it possible to share scopes.

 -Hal

________________________________
From: Finkel, Hal J. <hfinkel at anl.gov>
Sent: Monday, May 18, 2020 11:40 AM
To: Jeroen Dobbelaere <Jeroen.Dobbelaere at synopsys.com>; Alina Sbirlea <alina.sbirlea at gmail.com>
Cc: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>; Doerfert, Johannes <jdoerfert at anl.gov>
Subject: Re: LLVM Alias Analysis Technical Call - Doodle Poll

To join our call on Thursday, May 28th @ 9-10 AM central time / 2-3 PM UTC please use this information:

Meeting URL
https://bluejeans.com/643493129?src=join_info

Meeting ID
643 493 129

Want to dial in from a phone?

Dial one of the following numbers:
+1.312.216.0325 (US (Chicago))
+1.408.740.7256 (US (San Jose))
+1.866.226.4650 (US Toll Free)
(see all numbers - https://www.bluejeans.com/premium-numbers)

Enter the meeting ID and passcode followed by #

Connecting from a room system?
Dial: bjn.vc or 199.48.152.152 and enter your meeting ID & passcode

On our agenda, we'll have:

 1. Scalability challenges and other issues discovered with the current infrastructure (especially, perhaps, with the noalias metadata).
 2. Proposed solutions: progress, outstanding challenges, how to make progress going forward.

We'll formulate the detailed agenda and take notes from the call using this Google doc: https://docs.google.com/document/d/1ybwEKDVtIbhIhK50qYtwKsL50K-NvB6LfuBsfepBZ9Y/edit?usp=sharing

A summary will then be sent to the mailing list after the call. If you would like to add items to the agenda, please edit the document (or reply to this email).

Thanks again,
Hal

________________________________
From: Finkel, Hal J. <hfinkel at anl.gov>
Sent: Monday, May 18, 2020 10:24 AM
To: Jeroen Dobbelaere <Jeroen.Dobbelaere at synopsys.com>; Alina Sbirlea <alina.sbirlea at gmail.com>; Finkel, Hal J. <hfinkel at anl.gov>
Cc: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>; Doerfert, Johannes <jdoerfert at anl.gov>
Subject: Re: LLVM Alias Analysis Technical Call - Doodle Poll

Thanks to everyone who participated in the poll. The time that maximizes availability is:

  Thursday, May 28th @ 9-10 AM central time / 2-3 PM UTC.

I'll send out meeting information shortly.

 -Hal

________________________________
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Finkel, Hal J. via llvm-dev <llvm-dev at lists.llvm.org>
Sent: Wednesday, May 13, 2020 11:14 AM
To: Jeroen Dobbelaere <Jeroen.Dobbelaere at synopsys.com>; Alina Sbirlea <alina.sbirlea at gmail.com>
Cc: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>; Doerfert, Johannes <jdoerfert at anl.gov>
Subject: [llvm-dev] LLVM Alias Analysis Technical Call - Doodle Poll

Hi, everyone,

We've had a number of discussions recently, including on the Flang technical call, about potential improvements to LLVM's alias analysis to support handling restrict and restrict-like semantics.

We would like to try having a call to discuss these issues further. Please, if you're interested in joining, indicate your availability (prior to the end of this week):

  https://doodle.com/poll/evhwr2eyfvcf8ib3

Thanks again,
Hal

________________________________
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Alina Sbirlea via llvm-dev <llvm-dev at lists.llvm.org>
Sent: Monday, May 11, 2020 9:49 AM
To: Jeroen Dobbelaere <Jeroen.Dobbelaere at synopsys.com>
Cc: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>; Doerfert, Johannes <jdoerfert at anl.gov>
Subject: Re: [llvm-dev] Full restrict support - status update

Hi Johannes et al,

Trying to revive this discussion, as the restrict support is relevant for one of our teams.

Thank you,
Alina

On Tue, Nov 12, 2019 at 1:16 PM Jeroen Dobbelaere via llvm-dev <llvm-dev at lists.llvm.org> wrote:
Hi Johannes et al,

> -----Original Message-----
> From: Doerfert, Johannes <jdoerfert at anl.gov>
[..]
> On 11/06, Jeroen Dobbelaere wrote:
> > >From: Alexey Zhikhartsev
> > [..]
> > > We would love to see your patches merged as soon as possible, so I was
> > > wondering: do you think the lack of bitcode support will prevent that from
> > > happening?
> >
> > Yes, I think that the lack of bitcode support will prevent it.
> >
> > During the Developers meeting, I also talked with Hal and Johannes.
> > They had some extra remarks:
> > - (1) the restrict implementation deserves a separate document. (I am
> >   working on that one)
> > - (2) they don't like the naming of 'noalias_sidechannel'.
> > - (3) they also have some other mechanisms in mind to add the 'sidechannel'
> >   to the load/store instructions
> >   (and maybe to function calls, intrinsics; currently that is handled through
> >   llvm.noalias.arg.guard)
> >
> > For (2) and (3), I am waiting for a proposal from them ;)
>
> I would like to see the restrict support be merged but, as Jeroen
> mentions above, I feel there are two design choices we have to
> reconsider. Here are short descriptions to get some feedback from the
> community:
>
> (A) Naming and restriction
>
> The name "sidechannel" is unfortunate, it has various negative
> connotations, e.g., the release notes that read:
>  "LLVM 10.0 now has sidechannel support for your restrict pointer"
> will raise a lot of follow up questions.
>
> What I think we actually do, and what we should call it, is "provenance"
> tracking.
>
> Now, beyond the pure renaming of "sidechannel" into "provenance" (or
> something similar), I want us to decouple provenance tracking from the noalias
> logic. Noalias/restrict is one use case in which (pointer) provenance
> information is useful but not the only one. If we add some mechanism to
> track provenance, let's make it generic and reusable. Note that the
> basic ideas are not much different to what the noalias RFC proposed.
> The major difference would be that we have provenance information and if
> that originates in an `llvm.restrict.decl` call we can use it for
> (no)alias queries.

"provenance" might indeed be a good name.

There is a big difference between a restrict declaration, and a restrict usage:
- the declaration intrinsic (llvm.noalias.decl) is used to track in the cfg the location
   where the restrict variable was declared. This is important to handle code motion,
   merging, duplication in a correct way (inlining, loop unrolling, ...)
- the restrict usage intrinsics (llvm.noalias and llvm.side.noalias) are used to indicate
   that from that point on, restrict (noalias) properties are introduced for that pointer.
  They can exist without an associated 'llvm.noalias.decl' (when the declaration is outside
   the function.)
Given that, I assume that you mean 'llvm.provenance.noalias' (~ llvm.side.noalias) instead
of 'llvm.restrict.decl'.

>
>
>
> (B) Using operand bundles
>
> Right now, loads and stores are treated differently and given a new
> operand. Then there are intrinsics to decode other kinds of information.
> As an alternative, we could allow operand bundles on all instructions
> and use them to tie information to an instruction. The "sidechannel"
> operand of a load would then look something like:
>   load i32* %p [ "ptr_provenance"(%p_decl) ]
> and for a store we could have
>   store i32** %p.addr, i32* %p [ "ptr_provenance"(%p_decl) ]
>
> The benefit is that we do not change the operand count (which causes a
> lot of noise) but we still have to make sure ptr/value uses are not
> confused with operand bundle uses. We can attach the information to more
> than load/store instructions, also to remove the need for some of the
> intrinsics.

To me, operand bundles sound more or less equivalent to the current
solution. They might also make 'instruction cloning' easier, if we can omit the
'ptr_provenance' there. The change in the number of operands caused some
noise, but it was the change in the number of 'uses' of a pointer that refer to the
same instruction that caused the most problems, especially when that instruction
was going to be erased. Operand bundles will still need those code changes
(like in parts of D68516 and D68518).

As the 'Call' instruction already supports operand bundles, it could eliminate the need
for the 'llvm.noalias.arg.guard' intrinsic, which combines the normal pointer with the
side channel (aka provenance). But, after inlining, we still need to put that information
somewhere. Or it should be propagated during inlining.
Care must be taken not to lose that information when the 'call' is changed by optimizations
as, after inlining, that might result in wrong alias analysis conclusions.

Are you thinking of "operand bundles" support for just LoadInst/StoreInst, or for all
instructions?

Greetings,

Jeroen Dobbelaere

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev