<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Jan 20, 2017, at 10:45 AM, Yonghong Yan <<a href="mailto:yanyh15@gmail.com" class="">yanyh15@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><br class="Apple-interchange-newline"><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><div class="gmail_quote" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">On Fri, Jan 20, 2017 at 12:52 PM, Mehdi Amini via llvm-dev<span class="Apple-converted-space"> </span><span dir="ltr" class=""><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a>></span><span class="Apple-converted-space"> </span>wrote:<br class=""><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;"><div style="word-wrap: break-word;" class=""><br class=""><div class=""><span class=""><blockquote type="cite" class=""><div class="">On Jan 20, 2017, at 6:59 AM, Hal Finkel <<a href="mailto:hfinkel@anl.gov" target="_blank" class="">hfinkel@anl.gov</a>> wrote:</div><br class="m_3683543841662625865Apple-interchange-newline"><div class=""><div bgcolor="#FFFFFF" text="#000000" class=""><p class="">On 01/13/2017 12:11 PM, Mehdi Amini wrote:<br class=""></p><blockquote type="cite" class=""><br class=""><div class=""><blockquote type="cite" class=""><div class="">On Jan 13, 2017, at 9:41 AM, Hal Finkel <<a href="mailto:hfinkel@anl.gov" target="_blank" class="">hfinkel@anl.gov</a>> wrote:</div><br class="m_3683543841662625865Apple-interchange-newline"><div class=""><div bgcolor="#FFFFFF" text="#000000" class=""><p class=""><br class=""></p><div class="m_3683543841662625865moz-cite-prefix">On 01/13/2017 12:29 AM, Mehdi Amini wrote:<br class=""></div><blockquote type="cite" class=""><br class=""><div class=""><blockquote type="cite" class=""><div class="">On Jan 12, 2017, at 5:02 PM, Hal Finkel <<a href="mailto:hfinkel@anl.gov" target="_blank" class="">hfinkel@anl.gov</a>> wrote:</div><div class=""><div bgcolor="#FFFFFF" text="#000000" class=""><p class="">On 01/12/2017 06:20 PM, Reid Kleckner via llvm-dev wrote:</p><blockquote type="cite" class=""><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote">On Wed, Jan 11, 2017 at 8:13 PM, Mehdi Amini<span class="Apple-converted-space"> </span><span dir="ltr" class=""><<a href="mailto:mehdi.amini@apple.com" target="_blank" class="">mehdi.amini@apple.com</a>></span><span class="Apple-converted-space"> </span>wrote:<blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;"><div style="word-wrap: break-word;" class=""><div class=""><div class="">Can you elaborate why? I’m curious.</div></div></div></blockquote><div class=""><br class=""></div><div class="">The con of proposal c was that many passes would need to learn about many region intrinsics. With tokens, you only need to teach all passes about tokens, which they should already know about because WinEH and other things use them.</div><div class=""><br class=""></div><div class="">With tokens, we can add as many region-introducing intrinsics as makes sense without any additional cost to the middle end. We don't need to make one omnibus region intrinsic set that describes every parallel loop annotation scheme supported by LLVM. Instead we would factor things according to other software design considerations.</div></div></div></div></blockquote><br class="">I think that, unless we allow frontends to add their own intrinsics without recompiling LLVM, this severely restricts the usefulness of this feature.</div></div></blockquote><div class=""><br class=""></div><div class="">I’m not convinced that “building a frontend without recompiling LLVM while injecting custom passes” is a strong compelling use-case, i.e. can you explain why requiring such use-case/frontends to rebuild LLVM is so limiting?</div></div></blockquote><br class="">I don't understand your viewpoint. Many frontends either compose their own pass pipelines or use the existing extension-point mechanism. Some frontends, Chapel for example, can insert code using custom address spaces and then insert passes later to turn accesses using pointers to those address spaces into runtime calls. This is the kind of design we'd like to support, without forcing frontends to use custom versions of LLVM, but with annotated regions instead of just with address spaces.<br class=""></div></div></blockquote><div class=""><br class=""></div><div class="">I think we’re talking about two different things here: you mentioned originally “without recompiling LLVM”, which I don’t see as major blocker, while now you’re now clarifying I think that you’re more concerned about putting a requirement on a *custom* LLVM, as in “it wouldn’t work with the source from a vanilla upstream LLVM”, which I agree is a different story.</div><div class=""><br class=""></div><div class="">That said, it extends the point from the other email (in parallel) about the semantics of the intrinsics: while your solution allows these frontend to reuse the intrinsics, it means that upstream optimization have to consider such intrinsics as optimization barrier because their semantic is unknown.</div></div></blockquote><br class="">I see no reason why this needs to be true (at least so long as you're willing to accept a certain amount of "as if" parallelism).</div></div></blockquote><div class=""><br class=""></div></span><div class="">Sorry, I didn’t quite get that?</div><span class=""><br class=""><blockquote type="cite" class=""><div class=""><div bgcolor="#FFFFFF" text="#000000" class="">Moreover, if it is true, then we'll lose the benefits of, for example, being able to hoist scalar loads out of parallel loops. We might need to include dependencies on "inaccessible memory", so cover natural runtime dependencies by default (this can be refined with custom AA logic), but that is not a complete code-motion barrier. Memory being explicitly managed will end up as arguments to the region intrinsics, so we'll automatically get more-fine-grained information.<br class=""></div></div></blockquote><div class=""><br class=""></div></span><div class="">Sanjoy gave an example of the kind of optimization that can break the semantic: <a href="http://lists.llvm.org/pipermail/llvm-dev/2017-January/109302.html" target="_blank" class="">http://lists.llvm.<wbr class="">org/pipermail/llvm-dev/2017-<wbr class="">January/109302.html</a> ; I haven’t yet seen an explanation about how this is addressed?</div></div></div></blockquote><div class="">If you were asking how this is addressed in the current clang/openmp, the code in the whole parallel region is outlined into a new function by frontend and parallel fork-join is transformed to a runtime call (kmpc_fork_call) that takes as input a pointer to the outlined function. so procedure-based optimization would not perform those optimization Sanjoy listed.</div><div class=""><br class=""></div></div></div></blockquote></div><br class=""><div class="">Right, but the question is rather how it’ll work with this proposal.</div><div class=""><br class=""></div><div class="">— </div><div class="">Mehdi</div><div class=""><br class=""></div></body></html>