[llvm-dev] [RFC] IR-level Region Annotations
Yonghong Yan via llvm-dev
llvm-dev at lists.llvm.org
Fri Jan 20 10:45:06 PST 2017
On Fri, Jan 20, 2017 at 12:52 PM, Mehdi Amini via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
>
> On Jan 20, 2017, at 6:59 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> On 01/13/2017 12:11 PM, Mehdi Amini wrote:
>
>
> On Jan 13, 2017, at 9:41 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
>
> On 01/13/2017 12:29 AM, Mehdi Amini wrote:
>
>
> On Jan 12, 2017, at 5:02 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> On 01/12/2017 06:20 PM, Reid Kleckner via llvm-dev wrote:
>
> On Wed, Jan 11, 2017 at 8:13 PM, Mehdi Amini <mehdi.amini at apple.com>
> wrote:
>>
>> Can you elaborate why? I’m curious.
>>
>
> The con of proposal c was that many passes would need to learn about many
> region intrinsics. With tokens, you only need to teach all passes about
> tokens, which they should already know about because WinEH and other things
> use them.
>
> With tokens, we can add as many region-introducing intrinsics as makes
> sense without any additional cost to the middle end. We don't need to make
> one omnibus region intrinsic set that describes every parallel loop
> annotation scheme supported by LLVM. Instead we would factor things
> according to other software design considerations.
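For concreteness, here is a minimal sketch of what a token-based region
annotation could look like in IR. The intrinsic names below are hypothetical
placeholders, not the ones proposed in the RFC:

  ; hypothetical token-returning region intrinsics; the exit is tied to its
  ; entry through the token operand, so passes that already understand tokens
  ; keep the region well nested without knowing what the region means
  declare token @llvm.experimental.region.entry()
  declare void @llvm.experimental.region.exit(token)

  define void @work() {
  entry:
    %r = call token @llvm.experimental.region.entry()
    ; ... body of the annotated region ...
    call void @llvm.experimental.region.exit(token %r)
    ret void
  }

Adding another kind of region then only means adding another entry/exit pair
of intrinsics; the token plumbing the middle end has to respect stays the
same.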
>
>
> I think that, unless we allow frontends to add their own intrinsics
> without recompiling LLVM, this severely restricts the usefulness of this
> feature.
>
>
> I’m not convinced that “building a frontend without recompiling LLVM while
> injecting custom passes” is a strongly compelling use case, i.e. can you
> explain why requiring such frontends to rebuild LLVM is so limiting?
>
>
> I don't understand your viewpoint. Many frontends either compose their own
> pass pipelines or use the existing extension-point mechanism. Some
> frontends, Chapel for example, can insert code using custom address spaces
> and then insert passes later to turn accesses using pointers to those
> address spaces into runtime calls. This is the kind of design we'd like to
> support, without forcing frontends to use custom versions of LLVM, but with
> annotated regions instead of just with address spaces.
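As a rough illustration of the address-space approach (the runtime entry
point name here is made up, and the address-space number is arbitrary):

  ; before the frontend's lowering pass: a load through a pointer in a
  ; frontend-reserved address space
  define i32 @get(i32 addrspace(100)* %p) {
  entry:
    %v = load i32, i32 addrspace(100)* %p
    ret i32 %v
  }

  ; after the custom lowering pass: the access has become a call into the
  ; frontend's runtime (hypothetical function)
  declare i32 @remote_load_i32(i32 addrspace(100)*)

  define i32 @get.lowered(i32 addrspace(100)* %p) {
  entry:
    %v = call i32 @remote_load_i32(i32 addrspace(100)* %p)
    ret i32 %v
  }

Until that pass runs, the rest of the standard pipeline treats the
addrspace(100) accesses as ordinary loads and stores.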
>
>
> I think we’re talking about two different things here: you originally
> mentioned “without recompiling LLVM”, which I don’t see as a major blocker,
> while now you’re clarifying, I think, that you’re more concerned about
> putting a requirement on a *custom* LLVM, as in “it wouldn’t work with the
> source from a vanilla upstream LLVM”, which I agree is a different story.
>
> That said, it extends the point from the other email (in parallel) about
> the semantics of the intrinsics: while your solution allows these frontends
> to reuse the intrinsics, it means that upstream optimizations have to treat
> such intrinsics as optimization barriers because their semantics are
> unknown.
>
>
> I see no reason why this needs to be true (at least so long as you're
> willing to accept a certain amount of "as if" parallelism).
>
>
> Sorry, I didn’t quite get that?
>
> Moreover, if it is true, then we'll lose the benefits of, for example,
> being able to hoist scalar loads out of parallel loops. We might need to
> include dependencies on "inaccessible memory" to cover natural runtime
> dependencies by default (this can be refined with custom AA logic), but
> that is not a complete code-motion barrier. Memory being explicitly managed
> will end up as arguments to the region intrinsics, so we'll automatically
> get more fine-grained information.
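One way this could be expressed, again with a made-up intrinsic name:
marking the region intrinsic inaccessiblemem_or_argmemonly tells alias
analysis that it may touch runtime-internal state and the pointers
explicitly passed to it, but nothing else, so loads from unrelated memory
can still move across it:

  ; hypothetical entry intrinsic that takes the explicitly managed pointers
  ; as arguments; it is not a blanket code-motion barrier
  declare token @llvm.experimental.region.entry.mem(i8*, i8*)
      inaccessiblemem_or_argmemonly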
>
>
> Sanjoy gave an example of the kind of optimization that can break the
> semantics:
> http://lists.llvm.org/pipermail/llvm-dev/2017-January/109302.html
> I haven’t yet seen an explanation of how this is addressed?
>
If you were asking how this is addressed in current clang/OpenMP: the
frontend outlines the code of the whole parallel region into a new function,
and the parallel fork-join is transformed into a runtime call
(__kmpc_fork_call) that takes a pointer to the outlined function as an
argument. So intraprocedural optimizations would not perform the
transformations Sanjoy listed.
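In (heavily simplified) IR, ignoring the exact types and attributes clang
emits, the result looks roughly like this:

  %ident_t = type { i32, i32, i32, i32, i8* }
  @.loc = private unnamed_addr constant %ident_t zeroinitializer

  ; the body of the parallel region lives in its own function, so passes
  ; running on @caller only see an opaque runtime call
  define internal void @.omp_outlined.(i32* noalias %gtid, i32* noalias %btid,
                                       i32* %a) {
  entry:
    store i32 42, i32* %a
    ret void
  }

  declare void @__kmpc_fork_call(%ident_t*, i32, void (i32*, i32*, ...)*, ...)

  define void @caller(i32* %a) {
  entry:
    ; the fork-join is a call into the OpenMP runtime that receives the
    ; outlined function and the captured variables
    call void (%ident_t*, i32, void (i32*, i32*, ...)*, ...)
        @__kmpc_fork_call(%ident_t* @.loc, i32 1,
          void (i32*, i32*, ...)* bitcast (void (i32*, i32*, i32*)* @.omp_outlined.
                                           to void (i32*, i32*, ...)*),
          i32* %a)
    ret void
  }

Since the region body is no longer in @caller, the transformations Sanjoy
described simply have nothing to operate on there.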
Yonghong
>
> I’m not sure how you imagine getting around the optimization barrier that
> comes with “this intrinsic has unknown semantics that can implicitly impact
> the control flow of the program”, unless it acts as a “hint” only
> (but I don’t believe that is the direction?).
>
> —
> Mehdi
>