[cfe-dev] RFC: clacc: translating OpenACC to OpenMP in clang
Joel E. Denny via cfe-dev
cfe-dev at lists.llvm.org
Fri Dec 8 07:58:15 PST 2017
Hi Hal,
Thanks for your feedback. It sounds like we're basically in agreement, but
I've added a few thoughts inline below.
On Wed, Dec 6, 2017 at 4:02 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> On 12/05/2017 01:06 PM, Joel E. Denny wrote:
>
> Hi,
>
> We are working on a new project, clacc, that extends clang with OpenACC
> support. Clacc's approach is to translate OpenACC (a descriptive language)
> to OpenMP (a prescriptive language) and thus to build on clang's existing
> OpenMP support. While we plan to develop clacc to support our own
> research, an important goal is to contribute clacc as a production-quality
> component of upstream clang.
>
>
> Great.
>
>
> We have begun implementing an early prototype of clacc. Before we get too
> far into the implementation, we would like to get feedback from the LLVM
> community to help ensure our design would ultimately be acceptable for
> contribution. For that purpose, below is an analysis of several high-level
> design alternatives we have considered and their various features. We
> welcome any feedback.
>
> Thanks.
>
> Joel E. Denny
> Future Technologies Group
> Oak Ridge National Laboratory
>
>
> Design Alternatives
> -------------------
>
> We have considered three design alternatives for the clacc compiler:
>
> 1. acc src --parser--> omp AST --codegen--> LLVM IR + omp rt calls
>
>
> I don't think that we want this option because, if nothing else, it will
> preclude building source-level tooling for OpenACC.
>
Agreed.
> 2. acc src --parser--> acc AST --codegen--> LLVM IR + omp rt calls
> 3. acc src --parser--> acc AST --ttx--> omp AST --codegen--> LLVM IR + omp rt calls
>
>
> My recommendation: We should think about the very best way we could
> refactor the code to implement (2), and if that is too ugly (or otherwise
> significantly degrades maintainability of the OpenMP code), then we should
> choose (3).
>
I started out with design 2 in the early prototype I'm experimenting with.
Eventually I figured out some possibilities for how to implement the ttx
component above (I'd be happy to discuss that), and I switched to design
3. So far, I'm finding design 3 to be easier to implement. Moreover, I
can use -ast-print combined with a custom option to print either OpenACC
source, OpenMP source, or both with one commented out. I like that
capability. However, I think it's clear that design 3 has greater
potential for running into difficulties as I move forward to more complex
OpenACC constructs.
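To illustrate the "both with one commented out" form, here is a minimal sketch in C. The function name saxpy, the choice of directives, and the map clauses are illustrative assumptions, not clacc's actual output or its chosen mapping. The _OPENMP guard is also an addition, there only so the file builds cleanly as serial C when OpenMP is disabled.

```c
/* Sketch only: one plausible OpenACC-to-OpenMP lowering, shown in the
   style of printing both forms with the OpenACC original commented out.
   The directive choice below is an assumption, not clacc's mapping. */
void saxpy(int n, float a, const float *x, float *y) {
  /* Original OpenACC directive, kept as a comment: */
  // #pragma acc parallel loop
  /* Assumed OpenMP equivalent; skipped entirely when OpenMP is off. */
#ifdef _OPENMP
#pragma omp target teams distribute parallel for map(to: x[0:n]) map(tofrom: y[0:n])
#endif
  for (int i = 0; i < n; ++i)
    y[i] += a * x[i];
}
```

Compiled without OpenMP, the loop runs serially and computes the same result, which makes the two forms easy to compare.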
>
>
> In the above diagram:
>
> * acc src = C source code containing acc constructs.
> * acc AST = a clang AST in which acc constructs are represented by
> nodes with acc node types. Of course, such node types do not
> already exist in clang's implementation.
> * omp AST = a clang AST in which acc constructs have been lowered
> to omp constructs represented by nodes with omp node types. Of
> course, such node types do already exist in clang's
> implementation.
> * parser = the existing clang parser and semantic analyzer,
> extended to handle acc constructs.
> * codegen = the existing clang backend that translates a clang AST
> to LLVM IR, extended if necessary (depending on which design is
> chosen) to perform codegen from acc nodes.
> * ttx (tree transformer) = a new clang component that transforms
> acc to omp in clang ASTs.
>
> Design Features
> ---------------
>
> There are several features to consider when choosing among the designs
> in the previous section:
>
> 1. acc AST as an artifact -- Because they create acc AST nodes,
> designs 2 and 3 best facilitate the creation of additional acc
> source-level tools (such as pretty printers, analyzers, lint-like
> tools, and editor extensions). Some of these tools, such as pretty
> printing, would be available immediately or as minor extensions of
> tools that already exist in clang's ecosystem.
>
> 2. omp AST/source as an artifact -- Because they create omp AST
> nodes, designs 1 and 3 best facilitate the use of source-level
> tools to help an application developer discover how clacc has
> mapped his acc to omp, possibly in order to debug a mapping
> specification he has supplied. With design 2 instead, an
> application developer has to examine low-level LLVM IR + omp rt
> calls. Moreover, with designs 1 and 3, permanently migrating an
> application's acc source to omp source can be automated.
>
> 3. omp AST for mapping implementation -- Designs 1 and 3 might
> also make it easier for the compiler developer to reason about and
> implement mappings from acc to omp. That is, because acc and omp
> syntax is so similar, implementing the translation at the level of
> a syntactic representation is probably easier than translating to
> LLVM IR.
>
> 4. omp AST for codegen -- Designs 1 and 3 simplify the
> compiler implementation by enabling reuse of clang's existing omp
> support for codegen. In contrast, design 2 requires at least some
> extensions to clang codegen to support acc nodes.
>
> 5. Full acc AST for mapping -- Designs 2 and 3 potentially
> enable the compiler to analyze the entire source (as opposed to
> just the acc construct currently being parsed) while choosing the
> mapping to omp. It is not clear if this feature will prove useful,
> but it might enable more optimizations and compiler research
> opportunities.
>
>
> We'll end up doing this, but most of this falls within the scope of the
> "parallel IR" designs that many of us are working on. Doing this kind of
> analysis in the frontend is hard (because it essentially requires it to do
> inlining, simplification, and analysis akin to what the optimizer itself
> does).
>
I agree. However, before the parallel IR efforts mature, I need to make
progress. Also, I want to keep my options open, especially at this early
stage, so I can experiment with different possibilities.
>
> 6. No acc node classes -- Design 1 simplifies the compiler
> implementation by eliminating the need to implement many acc node
> classes. While we have so far found that implementing these
> classes is mostly mechanical, it does take a non-trivial amount of
> time.
>
> 7. No omp mapping -- Design 2 does not require acc to be mapped to
> omp. That is, it is conceivable that, for some acc constructs,
> there will prove to be no omp syntax to capture the semantics we
> wish to implement.
>
>
> I'm fairly certain that not everything maps exactly. There'll be some
> things we need to deal with explicitly in CodeGen.
>
> It is also conceivable that we might one day
> want to represent some acc constructs directly as extensions to
> LLVM IR, where some acc analyses or optimizations might be more
> feasible to implement. This possibility dovetails with recent
> discussions in the LLVM community about developing LLVM IR
> extensions for various parallel programming models.
>
>
> +1
>
>
> Because of features 4 and 6, design 1 is likely the fastest design to
> implement, at least at first while we focus on simple acc features and
> simple mappings to omp. However, we have so far found no advantage
> that design 1 has but that design 3 does not have except for feature
> 6, which we see as the least important of the above features in the
> long term.
>
> The only advantage we have found that design 2 has but that design 3
> does not have is feature 7. It should be possible to choose design 3
> as the default but, for certain acc constructs or scenarios where
> feature 7 proves important (if any), incorporate design 2. In other
> words, if we decide not to map a particular acc construct to any omp
> construct, ttx would leave it alone, and we would extend codegen to
> handle it directly.
>
>
> This makes sense to me, and I think is most likely to leave the CodeGen
> code easiest to maintain (and has good separation of concerns).
> Nevertheless, I think we should go through the mental refactoring exercise
> for (2) to decide on the value of (3).
>
At this moment, I'm finding that the easiest way to explore is to just push
forward with design 3. Even so, if developers who have a deeper
understanding than I do of clang's OpenMP implementation would like to have
an email discussion on the refactoring exercise for design 2, I agree that
would be helpful.
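The fallback described in the quoted text (ttx leaves an unmapped acc construct alone, and codegen handles it directly) can be modeled with a toy sketch in C. Everything here is hypothetical: the node kinds, the struct, and the function names are invented for illustration and do not correspond to clang's AST classes or to clacc's code. The transformer returns a new node rather than modifying its input, loosely mirroring the constraint that clang's AST is immutable.

```c
#include <stdbool.h>

/* Hypothetical node kinds; clang's real AST is far richer. */
enum NodeKind { ACC_PARALLEL_LOOP, ACC_UNMAPPED, OMP_TARGET_TEAMS_LOOP };

struct Node { enum NodeKind kind; };

/* Toy ttx: builds a new node instead of mutating the input.
   An acc node with a known mapping becomes an omp node; any
   other node is returned unchanged. */
struct Node ttx(struct Node in) {
  struct Node out = in;
  if (in.kind == ACC_PARALLEL_LOOP)
    out.kind = OMP_TARGET_TEAMS_LOOP;
  return out;
}

/* Toy codegen dispatch: true if the node can take the existing omp
   codegen path, false if it needs acc-specific codegen (the design 2
   fallback inside design 3). */
bool uses_omp_codegen(struct Node n) {
  return n.kind == OMP_TARGET_TEAMS_LOOP;
}
```

In this model, only the constructs that ttx declines to translate reach the acc-specific codegen path, so the omp codegen stays untouched for everything else.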
> Thanks again,
> Hal
>
Thanks.
Joel
>
> Conclusions
> -----------
>
> For the above reasons, and because design 3 offers the cleanest
> separation of concerns, we have chosen design 3 with the possibility
> of incorporating design 2 where it proves useful.
>
> Because of the immutability of clang's AST, the design of our proposed
> ttx component requires careful consideration. To shorten this initial
> email, we have omitted those details for now, but we will be happy to
> include them as the discussion progresses.
>
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>
>