<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">


    <div class="moz-cite-prefix">On 12/05/2017 01:06 PM, Joel E. Denny

      wrote:<br>

    </div>

    <blockquote

cite="mid:CAA=AU40sBy58vkvWdw1AgOwviaBVWKQqQd_np2J2Ms-Pa+T78A@mail.gmail.com"

      type="cite">

      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

      <div dir="ltr">

        <div><span

            class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span>Hi,<br>

              <br>

              We are working on a new project, clacc, that extends clang

              with OpenACC support.  Clacc's approach is to translate

              OpenACC (a descriptive language) to OpenMP (a prescriptive

              language) and thus to build on clang's existing OpenMP

              support.  While we plan to develop clacc to support our

              own research, an important goal is to contribute clacc as

              a production-quality component of upstream clang.<br>

            </span></span></div>

      </div>

    </blockquote>

    <br>

    Great.<br>

    <br>

    <blockquote

cite="mid:CAA=AU40sBy58vkvWdw1AgOwviaBVWKQqQd_np2J2Ms-Pa+T78A@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div><span

            class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span><br>

              We have begun implementing an early prototype of clacc. 

              Before we get too far into the implementation, we would

              like to get feedback from the LLVM community to help

              ensure our design would ultimately be acceptable for

              contribution.  For that purpose, below is an analysis of

              several high-level design alternatives we have considered

              and their various features.  We welcome any feedback.<br>

              <br>

              Thanks.<br>

              <br>

              Joel E. Denny</span></span></div>

        <div><span

            class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span>Future

              Technologies Group<br>

              Oak Ridge National Laboratory</span></span></div>

        <div><span

            class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span><br>

            </span></span></div>

        <div><span

            class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span><span

                style="font-family:monospace,monospace"><br>

              </span></span></span></div>

        <div><span

            class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span><span

                style="font-family:monospace,monospace">Design

                Alternatives<br>

                -------------------<br>

                <br>

                We have considered three design alternatives for the

                clacc compiler:<br>

                <br>

                1. acc src  --parser-->                     omp

                AST  --codegen-->  LLVM IR + omp rt calls<br>

              </span></span></span></div>

      </div>

    </blockquote>

    <br>

    I don't think that we want this option because, if nothing else, it

    will preclude building source-level tooling for OpenACC.<br>
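    <br>
    To make that concrete, here is a toy sketch (Python, not clang code;
    the node class, the helper names, and the acc-to-omp mapping are all
    made up for illustration). Once the parser lowers straight to omp
    nodes, a source-level tool such as a pretty printer can no longer
    recover the OpenACC spelling:<br>
    <pre>
```python
from dataclasses import dataclass

# Hypothetical directive mapping, used only for illustration.
ACC_TO_OMP = {"acc parallel loop": "omp target teams distribute parallel for"}


@dataclass
class Directive:
    """Toy AST node: a pragma spelling plus the loop it applies to."""
    spelling: str
    body: str


def parse_design1(pragma, body):
    """Design 1: the parser lowers acc to omp immediately, so no acc node exists."""
    return Directive(ACC_TO_OMP[pragma], body)


def parse_design2_or_3(pragma, body):
    """Designs 2 and 3: the parser keeps the directive as an acc node."""
    return Directive(pragma, body)


def pretty_print(node):
    """A source-level tool: print the AST back out as source."""
    return "#pragma " + node.spelling + "\n" + node.body


loop = "for (int i = 0; i != n; ++i) a[i] = b[i];"
print(pretty_print(parse_design1("acc parallel loop", loop)))
print(pretty_print(parse_design2_or_3("acc parallel loop", loop)))
```
</pre>
    With design 1 the printer can only emit the omp form; with an acc
    node in the AST it can round-trip the original source.<br>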

    <br>

    <blockquote

cite="mid:CAA=AU40sBy58vkvWdw1AgOwviaBVWKQqQd_np2J2Ms-Pa+T78A@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div><span

            class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span><span

                style="font-family:monospace,monospace">2. acc src 

                --parser-->  acc AST                    

                --codegen-->  LLVM IR + omp rt calls<br>

                3. acc src  --parser-->  acc AST  --ttx-->  omp

                AST  --codegen-->  LLVM IR + omp rt calls<br>

              </span></span></span></div>

      </div>

    </blockquote>

    <br>

    My recommendation: We should think about the very best way we could

    refactor the code to implement (2), and if that is too ugly (or

    otherwise significantly degrades maintainability of the OpenMP

    code), then we should choose (3).<br>

    <br>

    <blockquote

cite="mid:CAA=AU40sBy58vkvWdw1AgOwviaBVWKQqQd_np2J2Ms-Pa+T78A@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div><span

            class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span><span

                style="font-family:monospace,monospace"><br>

                In the above diagram:<br>

                <br>

                * acc src = C source code containing acc constructs.<br>

                * acc AST = a clang AST in which acc constructs are

                represented by<br>

                  nodes with acc node types.  Of course, such node types

                do not<br>

                  already exist in clang's implementation.<br>

                * omp AST = a clang AST in which acc constructs have

                been lowered<br>

                  to omp constructs represented by nodes with omp node

                types.  Of<br>

                  course, such node types do already exist in clang's<br>

                  implementation.<br>

                * parser = the existing clang parser and semantic

                analyzer,<br>

                  extended to handle acc constructs.<br>

                * codegen = the existing clang backend that translates a

                clang AST<br>

                  to LLVM IR, extended if necessary (depending on which

                design is<br>

                  chosen) to perform codegen from acc nodes.<br>

                * ttx (tree transformer) = a new clang component that

                transforms<br>

                  acc to omp in clang ASTs.<br>

                <br>

                Design Features<br>

                ---------------<br>

                <br>

                There are several features to consider when choosing

                among the designs<br>

                in the previous section:<br>

                <br>

                1. acc AST as an artifact -- Because they create acc AST

                nodes,<br>

                   designs 2 and 3 best facilitate the creation of

                additional acc<br>

                   source-level tools (such as pretty printers,

                analyzers, lint-like<br>

                   tools, and editor extensions).  Some of these tools,

                such as pretty<br>

                   printing, would be available immediately or as minor

                extensions of<br>

                   tools that already exist in clang's ecosystem.<br>

                <br>

                2. omp AST/source as an artifact -- Because they create

                omp AST<br>

                   nodes, designs 1 and 3 best facilitate the use of

                source-level<br>

                   tools to help an application developer discover how

                clacc has<br>

                   mapped his acc to omp, possibly in order to debug a

                mapping<br>

                   specification he has supplied.  With design 2

                instead, an<br>

                   application developer has to examine low-level LLVM

                IR + omp rt<br>

                   calls.  Moreover, with designs 1 and 3, permanently

                migrating an<br>

                   application's acc source to omp source can be

                automated.<br>

                <br>

                3. omp AST for mapping implementation -- Designs 1 and 3

                might<br>

                   also make it easier for the compiler developer to

                reason about and<br>

                   implement mappings from acc to omp.  That is, because

                acc and omp<br>

                   syntax is so similar, implementing the translation at

                the level of<br>

                   a syntactic representation is probably easier than

                translating to<br>

                   LLVM IR.<br>

                <br>

                4. omp AST for codegen -- Designs 1 and 3 simplify the<br>

                   compiler implementation by enabling reuse of clang's

                existing omp<br>

                   support for codegen.  In contrast, design 2 requires

                at least some<br>

                   extensions to clang codegen to support acc nodes.<br>

                <br>

                5. Full acc AST for mapping -- Designs 2 and 3

                potentially<br>

                   enable the compiler to analyze the entire source (as

                opposed to<br>

                   just the acc construct currently being parsed) while

                choosing the<br>

                   mapping to omp.  It is not clear if this feature will

                prove useful,<br>

                   but it might enable more optimizations and compiler

                research<br>

                   opportunities.<br>

              </span></span></span></div>

      </div>

    </blockquote>

    <br>

    We'll end up doing this, but most of it falls within the scope of

    the "parallel IR" designs that many of us are working on. Doing this

    kind of analysis in the frontend is hard (because it essentially

    requires the frontend to do inlining, simplification, and analysis

    akin to what the optimizer itself does).<br>

    <br>

    <blockquote

cite="mid:CAA=AU40sBy58vkvWdw1AgOwviaBVWKQqQd_np2J2Ms-Pa+T78A@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div><span

            class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span><span

                style="font-family:monospace,monospace"><br>

                6. No acc node classes -- Design 1 simplifies the

                compiler<br>

                   implementation by eliminating the need to implement

                many acc node<br>

                   classes.  While we have so far found that

                implementing these<br>

                   classes is mostly mechanical, it does take a

                non-trivial amount of<br>

                   time.<br>

              </span></span></span></div>

        <span style="font-family:monospace,monospace"><br>

          7. No omp mapping -- Design 2 does not require acc to be

          mapped to<br>

             omp.  That is, it is conceivable that, for some acc

          constructs,<br>

             there will prove to be no omp syntax to capture the

          semantics we<br>

             wish to implement. <br>

        </span></div>

    </blockquote>

    <br>

    I'm fairly certain that not everything maps exactly. There'll be some

    things we need to deal with explicitly in CodeGen.<br>

    <br>

    <blockquote

cite="mid:CAA=AU40sBy58vkvWdw1AgOwviaBVWKQqQd_np2J2Ms-Pa+T78A@mail.gmail.com"

      type="cite">

      <div dir="ltr"><span style="font-family:monospace,monospace"> It

          is also conceivable that we might one day<br>

             want to represent some acc constructs directly as

          extensions to<br>

             LLVM IR, where some acc analyses or optimizations might be

          more<br>

             feasible to implement.  This possibility dovetails with

          recent<br>

             discussions in the LLVM community about developing LLVM IR<br>

             extensions for various parallel programming models.</span><span

          style="font-family:monospace,monospace"><br>

        </span></div>

    </blockquote>

    <br>

    +1<br>

    <br>

    <blockquote

cite="mid:CAA=AU40sBy58vkvWdw1AgOwviaBVWKQqQd_np2J2Ms-Pa+T78A@mail.gmail.com"

      type="cite">

      <div dir="ltr"><span style="font-family:monospace,monospace"><br>

          <span

            class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span></span></span></span>

        <div>

          <div><span

              class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span><span

                  style="font-family:monospace,monospace">Because of

                  features 4 and 6, design 1 is likely the fastest

                  design to<br>

                  implement, at least at first while we focus on simple

                  acc features and<br>

                  simple mappings to omp.  However, we have so far found

                  no advantage<br>

                  that design 1 has but that design 3 does not have

                  except for feature<br>

                  6, which we see as the least important of the above

                  features in the<br>

                  long term.<br>

                  <br>

                  The only advantage we have found that design 2 has but

                  that design 3<br>

                  does not have is feature 7.  It should be possible to

                  choose design 3<br>

                  as the default but, for certain acc constructs or

                  scenarios where<br>

                  feature 7 proves important (if any), incorporate

                  design 2.  In other<br>

                  words, if we decide not to map a particular acc

                  construct to any omp<br>

                  construct, ttx would leave it alone, and we would

                  extend codegen to<br>

                  handle it directly.<br>

                </span></span></span></div>

        </div>

      </div>

    </blockquote>

    <br>

    This makes sense to me, and I think it is the option most likely to

    leave the CodeGen code easy to maintain (and it has good separation

    of concerns). Nevertheless, I think we should go through the mental

    refactoring exercise for (2) to decide on the value of (3).<br>
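    <br>
    For reference, the hybrid you describe might look roughly like the
    following toy sketch (Python, not clang code; AccNode, OmpNode, ttx,
    codegen, and the mapping table are hypothetical names, and "cache" is
    only a stand-in for a construct we might choose not to map). The ttx
    step builds a new node list rather than editing nodes in place, which
    is loosely in the spirit of an immutable AST:<br>
    <pre>
```python
from dataclasses import dataclass


@dataclass
class AccNode:
    """Toy acc AST node; 'kind' is the directive name."""
    kind: str


@dataclass
class OmpNode:
    """Toy omp AST node produced by ttx."""
    kind: str


# Hypothetical mapping table; a construct missing from it has no omp equivalent here.
ACC_TO_OMP = {
    "parallel loop": "target teams distribute parallel for",
    "data": "target data",
}


def ttx(nodes):
    """Design 3: rewrite acc nodes to omp nodes, leaving unmapped ones alone."""
    return [OmpNode(ACC_TO_OMP[n.kind]) if n.kind in ACC_TO_OMP else n
            for n in nodes]


def codegen(node):
    """Existing omp codegen, plus a design 2 style path for leftover acc nodes."""
    if isinstance(node, OmpNode):
        return "omp codegen: " + node.kind
    return "direct acc codegen: " + node.kind


# "cache" stands in for a construct we chose not to map; this is an assumption.
ast = [AccNode("parallel loop"), AccNode("cache")]
for line in map(codegen, ttx(ast)):
    print(line)
```
</pre>
    Only the leftover acc nodes need new CodeGen paths, so the size of
    that fallback is a fair proxy for how much of (2) we would end up
    carrying.<br>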

    <br>

    Thanks again,<br>

    Hal<br>

    <br>

    <blockquote

cite="mid:CAA=AU40sBy58vkvWdw1AgOwviaBVWKQqQd_np2J2Ms-Pa+T78A@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div>

          <div><span

              class="gmail-m_1339134059082687238gmail-m_-1727004882907755355gmail-gI"><span><span

                  style="font-family:monospace,monospace"><br>

                  Conclusions<br>

                  -----------<br>

                  <br>

                  For the above reasons, and because design 3 offers the

                  cleanest<br>

                  separation of concerns, we have chosen design 3 with

                  the possibility<br>

                  of incorporating design 2 where it proves useful.<br>

                  <br>

                  Because of the immutability of clang's AST, the design

                  of our proposed<br>

                  ttx component requires careful consideration.  To

                  shorten this initial<br>

                  email, we have omitted those details for now, but we

                  will be happy to<br>

                  include them as the discussion progresses.</span><br>

              </span></span></div>

        </div>

      </div>

    </blockquote>

    <br>

    <pre class="moz-signature" cols="72">-- 

Hal Finkel

Lead, Compiler Technology and Programming Languages

Leadership Computing Facility

Argonne National Laboratory</pre>

  </body>

</html>