<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p><br>
</p>
<div class="moz-cite-prefix">On 12/05/2017 01:06 PM, Joel E. Denny
wrote:<br>
</div>
<blockquote
cite="mid:CAA=AU40sBy58vkvWdw1AgOwviaBVWKQqQd_np2J2Ms-Pa+T78A@mail.gmail.com"
type="cite">
<div dir="ltr">
<div><span
class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span>Hi,<br>
<br>
We are working on a new project, clacc, that extends clang
with OpenACC support. Clacc's approach is to translate
OpenACC (a descriptive language) to OpenMP (a prescriptive
language) and thus to build on clang's existing OpenMP
support. While we plan to develop clacc to support our
own research, an important goal is to contribute clacc as
a production-quality component of upstream clang.<br>
</span></span></div>
</div>
</blockquote>
<br>
Great.<br>
<br>
<blockquote
cite="mid:CAA=AU40sBy58vkvWdw1AgOwviaBVWKQqQd_np2J2Ms-Pa+T78A@mail.gmail.com"
type="cite">
<div dir="ltr">
<div><span
class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span><br>
We have begun implementing an early prototype of clacc.
Before we get too far into the implementation, we would
like to get feedback from the LLVM community to help
ensure our design would ultimately be acceptable for
contribution. For that purpose, below is an analysis of
several high-level design alternatives we have considered
and their various features. We welcome any feedback.<br>
<br>
Thanks.<br>
<br>
Joel E. Denny</span></span></div>
<div><span
class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span>Future
Technologies Group<br>
Oak Ridge National Laboratory</span></span></div>
<div><span
class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span><br>
</span></span></div>
<div><span
class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span><span
style="font-family:monospace,monospace"><br>
</span></span></span></div>
<div><span
class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span><span
style="font-family:monospace,monospace">Design
Alternatives<br>
-------------------<br>
<br>
We have considered three design alternatives for the
clacc compiler:<br>
<br>
1. acc src --parser--> omp
AST --codegen--> LLVM IR + omp rt calls<br>
</span></span></span></div>
</div>
</blockquote>
<br>
I don't think that we want this option because, if nothing else, it
will preclude building source-level tooling for OpenACC.<br>
<br>
<blockquote
cite="mid:CAA=AU40sBy58vkvWdw1AgOwviaBVWKQqQd_np2J2Ms-Pa+T78A@mail.gmail.com"
type="cite">
<div dir="ltr">
<div><span
class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span><span
style="font-family:monospace,monospace">2. acc src
--parser--> acc AST
--codegen--> LLVM IR + omp rt calls<br>
3. acc src --parser--> acc AST --ttx--> omp
AST --codegen--> LLVM IR + omp rt calls<br>
</span></span></span></div>
</div>
</blockquote>
<br>
My recommendation: We should think about the very best way we could
refactor the code to implement (2), and if that is too ugly (or
otherwise significantly degrades maintainability of the OpenMP
code), then we should choose (3).<br>
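<br>
As a rough illustration of what "acc src" and its lowered omp form might
look like at the source level, here is a minimal sketch in C. The
directive pairing shown (acc parallel loop to omp target teams distribute
parallel for, copyin/copy to map(to)/map(tofrom)) is an assumption made
for this example only, not a mapping that clacc has committed to, and the
function names saxpy_acc and saxpy_omp are hypothetical.<br>
<br>
```c
#include <assert.h>

/* "acc src": what the application developer writes (descriptive). */
void saxpy_acc(int n, float a, const float *x, float *y) {
    assert(n >= 0 && x && y);
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* One plausible omp form of the same loop (prescriptive).
 * ASSUMPTION: this pairing is illustrative, not clacc's chosen mapping. */
void saxpy_omp(int n, float a, const float *x, float *y) {
    assert(n >= 0 && x && y);
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```
<br>
A compiler that does not recognize one of the pragmas just ignores it and
runs the loop serially, so both functions compute the same result. With
(1) only the second form ever exists in the AST; with (2) only the first
does; with (3) both do, which is what makes source-level tooling possible
on either side.<br>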
<br>
<blockquote
cite="mid:CAA=AU40sBy58vkvWdw1AgOwviaBVWKQqQd_np2J2Ms-Pa+T78A@mail.gmail.com"
type="cite">
<div dir="ltr">
<div><span
class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span><span
style="font-family:monospace,monospace"><br>
In the above diagram:<br>
<br>
* acc src = C source code containing acc constructs.<br>
* acc AST = a clang AST in which acc constructs are
represented by<br>
nodes with acc node types. Of course, such node types
do not<br>
already exist in clang's implementation.<br>
* omp AST = a clang AST in which acc constructs have
been lowered<br>
to omp constructs represented by nodes with omp node
types. Of<br>
course, such node types do already exist in clang's<br>
implementation.<br>
* parser = the existing clang parser and semantic
analyzer,<br>
extended to handle acc constructs.<br>
* codegen = the existing clang backend that translates a
clang AST<br>
to LLVM IR, extended if necessary (depending on which
design is<br>
chosen) to perform codegen from acc nodes.<br>
* ttx (tree transformer) = a new clang component that
transforms<br>
acc to omp in clang ASTs.<br>
<br>
Design Features<br>
---------------<br>
<br>
There are several features to consider when choosing
among the designs<br>
in the previous section:<br>
<br>
1. acc AST as an artifact -- Because they create acc AST
nodes,<br>
designs 2 and 3 best facilitate the creation of
additional acc<br>
source-level tools (such as pretty printers,
analyzers, lint-like<br>
tools, and editor extensions). Some of these tools,
such as pretty<br>
printing, would be available immediately or as minor
extensions of<br>
tools that already exist in clang's ecosystem.<br>
<br>
2. omp AST/source as an artifact -- Because they create
omp AST<br>
nodes, designs 1 and 3 best facilitate the use of
source-level<br>
tools to help an application developer discover how
clacc has<br>
mapped his acc to omp, possibly in order to debug a
mapping<br>
specification he has supplied. With design 2
instead, an<br>
application developer has to examine low-level LLVM
IR + omp rt<br>
calls. Moreover, with designs 1 and 3, permanently
migrating an<br>
application's acc source to omp source can be
automated.<br>
<br>
3. omp AST for mapping implementation -- Designs 1 and 3
might<br>
also make it easier for the compiler developer to
reason about and<br>
implement mappings from acc to omp. That is, because
acc and omp<br>
syntax is so similar, implementing the translation at
the level of<br>
a syntactic representation is probably easier than
translating to<br>
LLVM IR.<br>
<br>
4. omp AST for codegen -- Designs 1 and 3 simplify the<br>
compiler implementation by enabling reuse of clang's
existing omp<br>
support for codegen. In contrast, design 2 requires
at least some<br>
extensions to clang codegen to support acc nodes.<br>
<br>
5. Full acc AST for mapping -- Designs 2 and 3
potentially<br>
enable the compiler to analyze the entire source (as
opposed to<br>
just the acc construct currently being parsed) while
choosing the<br>
mapping to omp. It is not clear if this feature will
prove useful,<br>
but it might enable more optimizations and compiler
research<br>
opportunities.<br>
</span></span></span></div>
</div>
</blockquote>
<br>
We'll end up doing this, but most of this falls within the scope of
the "parallel IR" designs that many of us are working on. Doing this
kind of analysis in the frontend is hard (because it essentially
requires it to do inlining, simplification, and analysis akin to
what the optimizer itself does).<br>
<br>
<blockquote
cite="mid:CAA=AU40sBy58vkvWdw1AgOwviaBVWKQqQd_np2J2Ms-Pa+T78A@mail.gmail.com"
type="cite">
<div dir="ltr">
<div><span
class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span><span
style="font-family:monospace,monospace"><br>
6. No acc node classes -- Design 1 simplifies the
compiler<br>
implementation by eliminating the need to implement
many acc node<br>
classes. While we have so far found that
implementing these<br>
classes is mostly mechanical, it does take a
non-trivial amount of<br>
time.<br>
</span></span></span></div>
<span style="font-family:monospace,monospace"><br>
7. No omp mapping -- Design 2 does not require acc to be
mapped to<br>
omp. That is, it is conceivable that, for some acc
constructs,<br>
there will prove to be no omp syntax to capture the
semantics we<br>
wish to implement. <br>
</span></div>
</blockquote>
<br>
I'm fairly certain that not everything maps exactly. There will be some
things we need to deal with explicitly in CodeGen.<br>
<br>
<blockquote
cite="mid:CAA=AU40sBy58vkvWdw1AgOwviaBVWKQqQd_np2J2Ms-Pa+T78A@mail.gmail.com"
type="cite">
<div dir="ltr"><span style="font-family:monospace,monospace"> It
is also conceivable that we might one day<br>
want to represent some acc constructs directly as
extensions to<br>
LLVM IR, where some acc analyses or optimizations might be
more<br>
feasible to implement. This possibility dovetails with
recent<br>
discussions in the LLVM community about developing LLVM IR<br>
extensions for various parallel programming models.</span><span
style="font-family:monospace,monospace"><br>
</span></div>
</blockquote>
<br>
+1<br>
<br>
<blockquote
cite="mid:CAA=AU40sBy58vkvWdw1AgOwviaBVWKQqQd_np2J2Ms-Pa+T78A@mail.gmail.com"
type="cite">
<div dir="ltr"><span style="font-family:monospace,monospace"><br>
</span>
<div>
<div><span
class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span><span
style="font-family:monospace,monospace">Because of
features 4 and 6, design 1 is likely the fastest
design to<br>
implement, at least at first while we focus on simple
acc features and<br>
simple mappings to omp. However, we have so far found
no advantage<br>
that design 1 has but that design 3 does not have
except for feature<br>
6, which we see as the least important of the above
features in the<br>
long term.<br>
<br>
The only advantage we have found that design 2 has but
that design 3<br>
does not have is feature 7. It should be possible to
choose design 3<br>
as the default but, for certain acc constructs or
scenarios where<br>
feature 7 proves important (if any), incorporate
design 2. In other<br>
words, if we decide not to map a particular acc
construct to any omp<br>
construct, ttx would leave it alone, and we would
extend codegen to<br>
handle it directly.<br>
</span></span></span></div>
</div>
</div>
</blockquote>
<br>
This makes sense to me, and I think it is the approach most likely to
leave the CodeGen code easy to maintain, with good separation of
concerns. Nevertheless, I think we should go through the mental
refactoring exercise for (2) to decide on the value of (3).<br>
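<br>
The hybrid described above (design 3 by default, design 2 as a fallback)
can be sketched as a toy dispatcher in C. Everything here is hypothetical:
the NodeKind values are not clang's real AST classes, the two mappings are
placeholders, and ACC_UNMAPPED_EXAMPLE merely stands in for whatever acc
construct turns out to have no omp equivalent.<br>
<br>
```c
#include <assert.h>
#include <string.h>

/* Hypothetical node kinds; these are NOT clang's real AST classes. */
typedef enum {
    ACC_PARALLEL,
    ACC_LOOP,
    ACC_UNMAPPED_EXAMPLE,        /* stand-in: no omp equivalent assumed */
    OMP_TARGET_TEAMS,
    OMP_DISTRIBUTE_PARALLEL_FOR
} NodeKind;

/* True for kinds that are still acc nodes. */
int is_acc(NodeKind k) {
    return k == ACC_PARALLEL || k == ACC_LOOP || k == ACC_UNMAPPED_EXAMPLE;
}

/* ttx: return the omp kind for an acc kind, or leave the node alone
 * when no mapping is defined. The mappings are placeholders. */
NodeKind ttx(NodeKind k) {
    switch (k) {
    case ACC_PARALLEL: return OMP_TARGET_TEAMS;
    case ACC_LOOP:     return OMP_DISTRIBUTE_PARALLEL_FOR;
    default:           return k;
    }
}

/* Pick the codegen path for a node produced by the parser. */
const char *compile_path(NodeKind parsed) {
    assert(is_acc(parsed));      /* in this sketch the parser emits acc nodes */
    NodeKind lowered = ttx(parsed);
    return is_acc(lowered) ? "acc codegen" : "omp codegen";
}
```
<br>
In this sketch compile_path(ACC_LOOP) takes the existing omp codegen path,
while compile_path(ACC_UNMAPPED_EXAMPLE) falls through to acc-specific
codegen. The refactoring question for (2) is how large that second path
would become if every construct took it.<br>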
<br>
Thanks again,<br>
Hal<br>
<br>
<blockquote
cite="mid:CAA=AU40sBy58vkvWdw1AgOwviaBVWKQqQd_np2J2Ms-Pa+T78A@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>
<div><span
class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span><span
style="font-family:monospace,monospace"><br>
Conclusions<br>
-----------<br>
<br>
For the above reasons, and because design 3 offers the
cleanest<br>
separation of concerns, we have chosen design 3 with
the possibility<br>
of incorporating design 2 where it proves useful.<br>
<br>
Because of the immutability of clang's AST, the design
of our proposed<br>
ttx component requires careful consideration. To
shorten this initial<br>
email, we have omitted those details for now, but we
will be happy to<br>
include them as the discussion progresses.</span><br>
</span></span></div>
</div>
</div>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre>
</body>
</html>