[cfe-dev] RFC: clacc: translating OpenACC to OpenMP in clang

Mon Dec 11 15:24:31 PST 2017

On Mon, Dec 11, 2017 at 9:26 AM, C Bergström via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> 2nd - there was no community usage requirement when Intel started working
> on getting OMP4 added to clang. It was allowed that it could incrementally
> be merged and reviewed. Adding this as a troll blocker just doesn't seem to
> be fair. Please stop with the politics and judge this based on technical
> merit alone. OpenACC and OpenMP4+ have similar goals, but try to achieve
> them in different ways. All my biases aside please stick to technical
> reasons for why this should be blocked. (For example, if there isn't anyone
> who has pledged to continue to support it long term.. etc)
>

There is an *explicit* requirement in the "Contributing Extensions to
Clang" section of the "Getting Involved with the Clang Project" document (
http://clang.llvm.org/get_involved.html#criteria):

"Evidence of a significant user community: This is based on a number of
factors, including an actual, existing user community, the perceived
likelihood that users would adopt such a feature if it were available, and
any "trickle-down" effects that come from, e.g., a library adopting the
feature and providing benefits to its users."

As an example, CilkPlus support was rejected partly due to concerns that
no significant user community was present at the time.

I don't remember the exact situation with OpenMP, but perhaps the concern
wasn't raised simply because usage interest (including from your
company, no?) was just too obvious to everyone?

Yours,
Andrey
===
Compiler Architect
NXP

>
> So basically if this is just some lame research project with no extended
> maintenance plan, make a fork and put your code there and not bother
> upstream. Otherwise, I think it's a very welcome idea.
>
>
>
> On Sat, Dec 9, 2017 at 2:02 AM, Jeff Hammond via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>>
>>
>> On Fri, Dec 8, 2017 at 9:00 AM, Joel E. Denny <jdenny.ornl at gmail.com> wrote:
>> >
>> > On Fri, Dec 8, 2017 at 11:32 AM, Jeff Hammond <jeff.science at gmail.com> wrote:
>> >>
>> >>
>> >>
>> >> On Fri, Dec 8, 2017 at 7:51 AM, Joel E. Denny <jdenny.ornl at gmail.com> wrote:
>> >> >
>> >> > Hi Jeff, Hal,
>> >> >
>> >> > Thanks for your feedback.  My comments are inline below.
>> >> >
>> >> > On Tue, Dec 5, 2017 at 6:43 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>> >> >>
>> >> >> On 12/05/2017 05:11 PM, Jeff Hammond via cfe-dev wrote:
>> >> >>
>> >> >> All of the usage of OpenACC outside of benchmarks/research that I
>> >> >> know about is done in Fortran.
>> >> >
>> >> > I agree that it's easier to find real apps that use OpenACC in
>> >> > Fortran than those that use OpenACC in C/C++.  However, the latter
>> >> > certainly exist.  For example:
>> >>
>> >> Two of the three examples you cite are primarily Fortran and using
>> >> OpenACC exclusively in Fortran subroutines.
>> >
>> >
>> > Are you saying that the occurrences of "pragma acc" in Nek5000 and
>> > NekCEM are unused?
>> >
>>
>> The instances of "pragma acc" in those - it's the same code in both
>> projects - are either (1) only causing host-device data synchronization or
>> (2) commented-out.
>>
>> It's unclear to me what actually happens in the code as currently
>> written.  The OpenACC C/C++ code does no more than copy data to/from the
>> device.  I didn't trace the entire code execution but I can't tell if any
>> code touches the device data that OpenACC is updating.  If it is updated,
>> it is updated by Fortran OpenACC code somewhere else in the source tree.
>>
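For readers unfamiliar with these directives, the following is a minimal
sketch in C of OpenACC code that only moves data and launches no compute
region, next to an assumed OpenMP counterpart. The function names are
hypothetical and nothing here is taken from Nek5000. A compiler without
OpenACC or OpenMP enabled ignores the pragmas, so both functions then do
nothing.

```c
#include <stddef.h>

/* Hypothetical example, not from Nek5000: only data movement, no kernel. */
void sync_only_acc(double *buf, size_t n) {
  /* Allocate a device copy of buf without initializing it. */
  #pragma acc enter data create(buf[0:n])
  /* Host -> device copy. */
  #pragma acc update device(buf[0:n])
  /* Device -> host copy; nothing changed the data in between. */
  #pragma acc update host(buf[0:n])
  /* Release the device copy. */
  #pragma acc exit data delete(buf[0:n])
}

/* Assumed OpenMP equivalent of the data movement above. */
void sync_only_omp(double *buf, size_t n) {
  #pragma omp target enter data map(alloc: buf[0:n])
  #pragma omp target update to(buf[0:n])
  #pragma omp target update from(buf[0:n])
  #pragma omp target exit data map(delete: buf[0:n])
}
```

Since no compute construct runs between the two updates, the host data
comes back unchanged. Any real work would have to be done on the device
copy by code elsewhere.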
>> What does the OpenACC standard say about interoperability of
>> compilers+runtimes, as would be required if one used Clang OpenACC for
>> C/C++ and Fortran OpenACC implemented by PGI, Cray, or GCC?  OpenMP
>> definitely does not support this, even if a subset of usage may work when
>> one uses the same runtime library with different compilers.
>>
>> /tmp/Nek5000$ git grep "pragma acc"
>> jl/gs.c:#pragma acc update host(sendbuf[0:unit_size*bufSize/2]) if(acc)
>> jl/gs.c:#pragma acc update device(buf[0:unit_size*bufSize/2]) if(acc)
>> jl/gs.c:#pragma acc update host(sendbuf[0:unit_size*bufSize/2]) if(acc)
>> jl/gs.c:#pragma acc update device(buf[0:unit_size*bufSize/2]) if(acc)
>> jl/gs.c:#pragma acc exit data delete(map0,map1)
>> jl/gs.c:#pragma acc update host(buf[0:unit_size*bufSize]) if(acc)
>> jl/gs.c:#pragma acc update device(buf[0:unit_size*bufSize]) if(acc)
>> jl/gs.c:#pragma acc update host(buf[0:unit_size*bufSize]) if(acc)
>> jl/gs.c:#pragma acc update device(buf[0:unit_size*bufSize]) if(acc)
>> jl/gs.c://#pragma acc enter data copyin(stage[0].scatter_map[0:stage[0].s_size],stage[0].scatter_mapf[0:stage[0].s_nt])
>> jl/gs.c://#pragma acc enter data copyin(stage[i].scatter_map[i:stage[i].s_size],stage[i].scatter_mapf[i:stage[i].s_nt])
>> jl/gs.c://#pragma acc enter data copyin(stage[i].gather_map[i:stage[i].g_size],stage[i].gather_mapf[i:stage[i].g_nt])
>> jl/gs.c://#pragma acc enter data copyin(stage[i].scatter_map[i:stage[i].s_size],stage[i].scatter_mapf[i:stage[i].s_nt])
>> jl/gs.c://#pragma acc enter data copyin(stage[i].gather_map[i:stage[i].g_size],stage[i].gather_mapf[i:stage[i].g_nt])
>> jl/gs.c://#pragma acc enter data copyin(stage2[0].scatter_map[0:stage2[0].s_size],stage2[0].scatter_mapf[0:stage2[0].s_nt])
>> jl/gs.c://#pragma acc enter data copyin(stage2[i].scatter_map[i:stage2[i].s_size],stage2[i].scatter_mapf[i:stage2[i].s_nt])
>> jl/gs.c://#pragma acc enter data copyin(stage2[i].gather_map[i:stage2[i].g_size],stage2[i].gather_mapf[i:stage2[i].g_nt])
>> jl/gs.c://#pragma acc enter data copyin(stage2[i].scatter_map[i:stage2[i].s_size],stage2[i].scatter_mapf[i:stage2[i].s_nt])
>> jl/gs.c://#pragma acc enter data copyin(stage2[i].gather_map[i:stage2[i].g_size],stage2[i].gather_mapf[i:stage2[i].g_nt])
>> jl/gs.c:#pragma acc exit data delete(map,mapf)
>> jl/gs.c:#pragma acc exit data delete(map,mapf)
>> jl/gs.c:#pragma acc exit data delete(map,mapf)
>> jl/gs.c:#pragma acc exit data delete(map,mapf)
>> jl/gs.c:#pragma acc update host(buf[0:vn*unit_size*bufSize]) if(acc)
>> jl/gs.c:#pragma acc update device(buf[0:vn*unit_size*bufSize]) if(acc)
>> jl/gs.c:  //#pragma acc exit data delete(ard->map_to_buf[0],ard->map_to_buf[1],ard->map_from_buf[0],ard->map_from_buf[1])
>> jl/gs.c:  //#pragma acc enter data copyin(ard->map_to_buf[0][0:ard->mt_size[0]],ard->map_from_buf[0][0:ard->mf_size[0]],ard->map_to_buf_f[0][0:ard->mt_nt[0]],ard->map_from_buf_f[0][0:ard->mf_nt[0]],ard->map_to_buf[1][0:ard->mt_size[1]],ard->map_from_buf[1][0:ard->mf_size[1]],ard->map_to_buf_f[1][0:ard->mt_nt[1]],ard->map_from_buf_f[1][0:ard->mf_nt[1]])
>> jl/gs.c:#pragma acc update host(a[0:n])
>> jl/gs.c:#pragma acc update host(a[0:n])
>> jl/gs.c:#pragma acc exit data delete(bufPtr)
>> jl/gs.c:#pragma acc enter data create(bufPtr[0:vn*gs_dom_size[dom]*gsh->r.buffer_size])
>> jl/gs.c:#pragma acc exit data delete(bufPtr)
>> jl/gs.c:#pragma acc enter data create(bufPtr[0:vn*gs_dom_size[dom]*gsh->r.buffer_size])
>> jl/gs.c:#pragma acc exit data delete(map_local0,map_local1,flagged_primaries)
>> jl/gs.c:#pragma acc enter data pcopyin(map[0:*m_size],mapf2[0:2*mf_temp])
>> jl/gs_acc.c://#pragma acc data present(buf[0:l])
>> jl/gs_acc.c://#pragma acc host_data use_device(buf)
>> jl/gs_acc.c://#pragma acc data present(buf[0:l])
>> jl/gs_acc.c://#pragma acc host_data use_device(buf)
>> jl/gs_acc.c:  //#pragma acc enter data copyin(t_mapf[0:t_m_nt*2],mapf[0:m_nt*2],snd_mapf[0:snd_m_nt*2],rcv_mapf[0:rcv_m_nt*2],fp_mapf[0:fp_m_nt*2],t_map[0:t_m_size],map[0:m_size],fp_map[0:fp_m_size],snd_map[0:snd_m_size],rcv_map[0:rcv_m_size])
>> jl/gs_acc.c:  //#pragma acc enter data copyin(t_mapf[0:t_m_nt*2],mapf[0:m_nt*2],snd_mapf[0:snd_m_nt*2],rcv_mapf[0:rcv_m_nt*2],fp_mapf[0:fp_m_nt*2],t_map[0:t_m_size],map[0:m_size],fp_map[0:fp_m_size],snd_map[0:snd_m_size],rcv_map[0:rcv_m_size])
>> jl/gs_acc.c://#pragma acc enter data pcopyin(t_mapf[0:t_m_nt*2],mapf[0:m_nt*2],snd_mapf[0:snd_m_nt*2],rcv_mapf[0:rcv_m_nt*2],fp_mapf[0:fp_m_nt*2],t_map[0:t_m_size],map[0:m_size],fp_map[0:fp_m_size],snd_map[0:snd_m_size],rcv_map[0:rcv_m_size])
>> jl/gs_acc.c://#pragma acc data present(u[0:uds],mapf[0:m_nt*2],snd_mapf[0:snd_m_nt*2],rcv_mapf[0:rcv_m_nt*2],fp_mapf[0:fp_m_nt*2],t_map[0:t_m_size],map[0:m_size],fp_map[0:fp_m_size],snd_map[0:snd_m_size],rcv_map[0:rcv_m_size])
>> jl/gs_acc.c://#pragma acc data create(sbuf[0:bl],rbuf[0:bl]) if(bl!=0)
>> jl/gs_acc.c://#pragma acc parallel loop gang vector present(u[0:uds],map[0:m_size],mapf[0:m_nt*2]) private(i,j,t) async(k+1)
>> jl/gs_acc.c://#pragma acc loop seq
>> jl/gs_acc.c://#pragma acc wait
>> jl/gs_acc.c://#pragma acc parallel loop gang vector present(u[0:uds],fp_map[0:fp_m_size],fp_mapf[0:fp_m_nt*2]) private(i,j) async(k+1)
>> jl/gs_acc.c://#pragma acc loop seq
>> jl/gs_acc.c://#pragma acc wait
>> jl/gs_acc.c://#pragma acc parallel loop gang vector present(u[0:uds],fp_map[0:fp_m_size]) private(i,k)
>> jl/gs_acc.c://#pragma acc parallel loop gang vector present(u[0:uds],snd_map[0:snd_m_size],snd_mapf[0:snd_m_nt*2],sbuf[0:bl]) private(i,j,t) async(k+1)
>> jl/gs_acc.c://#pragma acc loop seq
>> jl/gs_acc.c://#pragma acc wait
>> jl/gs_acc.c://#pragma acc parallel loop gang vector present(u[0:uds],snd_map[0:snd_m_size],sbuf[0:bl]) private(i,j,k)
>> jl/gs_acc.c://#pragma acc update host(sbuf[0:bl]) async(vn+2)
>> jl/gs_acc.c://#pragma acc wait
>> jl/gs_acc.c://#pragma acc update device(rbuf[0:bl]) async(vn+2)
>> jl/gs_acc.c://#pragma acc wait
>> jl/gs_acc.c://#pragma acc parallel loop gang vector present(u[0:uds],rcv_map[0:rcv_m_size],rcv_mapf[0:rcv_m_nt*2],rbuf[0:bl]) private(i,j,t) async(k+1)
>> jl/gs_acc.c://#pragma acc loop seq
>> jl/gs_acc.c://#pragma acc wait
>> jl/gs_acc.c:    //#pragma acc parallel loop gang vector present(u[0:uds],rcv_map[0:rcv_m_size],rbuf[0:bl]) private(i,j,k)
>> jl/gs_acc.c://#pragma acc parallel loop gang vector present(u[0:uds],t_map[0:t_m_size],t_mapf[0:t_m_nt*2]) private(i,j,t) async(k+1)
>> jl/gs_acc.c://#pragma acc loop seq
>> jl/gs_acc.c://#pragma acc wait
>>
>> >>
>> >>
>> >> > http://mrfil.github.io/PowerGrid/
>> >>
>> >> /tmp/PowerGrid$ git grep -il "pragma acc"
>> >> PowerGrid/Gfft.hpp
>> >> PowerGrid/Gnufft.hpp
>> >> PowerGrid/ftCpu.hpp
>> >> PowerGrid/gridding.hpp
>> >> PowerGrid/griddingSupport.hpp
>> >>
>> >> From http://mrfil.github.io/PowerGrid/docs/Installation:
>> >>
>> >> We have experience with PGC++ 15.7 from NVIDIA/The Portland Group as
>> >> the version we have used most extensively. There is a free license
>> >> available as part of the OpenACC Toolkit for academic users.
>> >>
>> >> GCC 6.1 has OpenACC support but has not yet been tested by the
>> >> developers, we welcome reports of anyone trying to compile with it. We hope
>> >> to support it alongside PGI compilers in the near future.
>> >>
>> >> For those lucky enough to have access to Cray supercomputers, the Cray
>> >> compiler does support OpenACC, but we have not tried to build with it.
>> >> Because the Cray compilers are not available on desktops, workstations, or
>> >> non-Cray branded clusters, we have not dedicated resources to testing
>> >> PowerGrid on it.
>> >>
>> >> So these folks support OpenACC, but haven't bothered to try the GCC
>> >> implementation in the 1+ year that it's been available.  How likely are
>> >> they to use Clang's?
>> >
>> >
>> > I cannot answer that. Perhaps they were waiting for GCC support to
>> > mature?
>>
>> Or maybe they aren't interested in using OpenACC compiler support outside
>> of PGI.
>>
>> What I'm really getting at here is who is going to use OpenACC support in
>> Clang, particularly if there is no compatible Fortran OpenACC compiler?  In
>> addition to justifying the code maintenance effort, users who are not
>> developers are essential for implementation hardening.
>>
>> Best,
>>
>> Jeff
>>
>>
>> > Thanks.
>> >
>> > Joel
>> >
>> >>
>> >> > https://nek5000.mcs.anl.gov/ (look at the openacc branch in github)
>> >>
>> >> (on the openacc branch)
>> >>
>> >> /tmp/Nek5000$ git grep -il "\$acc "
>> >> core/acc.f
>> >> core/comm_mpi.f
>> >> core/gmres.f
>> >> core/hmholtz.f
>> >> core/hsmg.f
>> >> core/math.f
>> >> core/navier1.f
>> >> core/navier4.f
>> >> core/plan4.f
>> >> core/prepost.f
>> >> core/subs2.f
>> >>
>> >> >
>> >> > https://nekcem.mcs.anl.gov/
>> >>
>> >> (on master)
>> >> /tmp/svn$ git grep -il "\$acc"
>> >> branches/maxwell-experimental/src/cem_dg.F
>> >> branches/maxwell-experimental/src/dssum2.F
>> >> branches/maxwell-experimental/src/io.F
>> >> branches/maxwell-experimental/src/mat1.F
>> >> branches/maxwell-experimental/src/maxwell.F
>> >> branches/maxwell-experimental/src/maxwell_acc.F
>> >> branches/maxwell-experimental/src/mxm_acc.F
>> >> branches/trunkQu/src/quantum_csr.F
>> >> branches/trunkQu/src/quantum_setup.f
>> >> branches/trunkQu/src/quantum_time.F
>> >> trunk/examples/openacc_gpu=1/box.usr
>> >> trunk/examples/openacc_gpu=8/box.usr
>> >> trunk/src/acoustic.F
>> >> trunk/src/cem_dg2.F
>> >> trunk/src/complex.F
>> >> trunk/src/drift1.F
>> >> trunk/src/drift1_maud.F
>> >> trunk/src/drive.F
>> >> trunk/src/drive_maud.F
>> >> trunk/src/dssum2.F
>> >> trunk/src/hmholtz.F
>> >> trunk/src/io.F
>> >> trunk/src/mat1.F
>> >> trunk/src/maxwell.F
>> >> trunk/src/maxwell_acc.F
>> >> trunk/src/mg_r2204.F
>> >> trunk/src/mxm_acc.F
>> >> trunk/src/poisson.F
>> >> trunk/src/quantum2.F
>> >> www/examples/libs/phpThumb/phpthumb.functions.php
>> >> www/examples/phpthumb.functions.php
>> >>
>> >> >>   Can you provide a list of C/C++ applications using OpenACC today
>> >> >> and estimate the number of users that will benefit from this feature?
>> >> >>
>> >> >>
>> >> >> Such lists exist, although I don't know what can be shared (and Oak
>> >> >> Ridge likely has better lists in this regard than I do).
>> >> >
>> >> > I'll look for a better list that I can share.
>> >>
>> >> That would be helpful.
>> >>
>> >> Best,
>> >>
>> >> Jeff
>> >>
>> >>
>> >>
>> >> >> I can tell you, from my own experience, that we're seeing an
>> >> >> increase in development using OpenACC, in both C/C++ and Fortran, over the
>> >> >> last couple of years (essentially because the compiler technology has
>> >> >> improved to the point where that is now a potentially-productive choice).
>> >> >
>> >> >
>> >> > Providing support in a production-quality, open-source compiler tool
>> >> > chain like LLVM will hopefully accelerate this trend.
>> >> >
>> >> > Joel
>> >> >
>> >> >>
>> >> >> Also, we have a strong desire to enable tooling over code bases
>> >> >> using OpenACC. Among many other things, at some point we'll likely want the
>> >> >> option to automatically migrate much of this code to using OpenMP. Having
>> >> >> an OpenACC-enabled Clang, with an implementation that maps to OpenMP, is an
>> >> >> important step in that process.
>> >> >>
>> >> >>  -Hal
>> >> >>
>> >> >>
>> >> >>
>> >> >> Thanks,
>> >> >>
>> >> >> Jeff
>> >> >>
>> >> >> On Tue, Dec 5, 2017 at 11:06 AM, Joel E. Denny via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>> >> >>>
>> >> >>> Hi,
>> >> >>>
>> >> >>> We are working on a new project, clacc, that extends clang with
>> >> >>> OpenACC support.  Clacc's approach is to translate OpenACC (a descriptive
>> >> >>> language) to OpenMP (a prescriptive language) and thus to build on clang's
>> >> >>> existing OpenMP support.  While we plan to develop clacc to support our own
>> >> >>> research, an important goal is to contribute clacc as a production-quality
>> >> >>> component of upstream clang.
>> >> >>>
>> >> >>> We have begun implementing an early prototype of clacc.  Before we
>> >> >>> get too far into the implementation, we would like to get feedback from the
>> >> >>> LLVM community to help ensure our design would ultimately be acceptable for
>> >> >>> contribution.  For that purpose, below is an analysis of several high-level
>> >> >>> design alternatives we have considered and their various features.  We
>> >> >>> welcome any feedback.
>> >> >>>
>> >> >>> Thanks.
>> >> >>>
>> >> >>> Joel E. Denny
>> >> >>> Future Technologies Group
>> >> >>> Oak Ridge National Laboratory
>> >> >>>
>> >> >>>
>> >> >>> Design Alternatives
>> >> >>> -------------------
>> >> >>>
>> >> >>> We have considered three design alternatives for the clacc compiler:
>> >> >>>
>> >> >>> 1. acc src  --parser-->                     omp AST  --codegen-->  LLVM IR + omp rt calls
>> >> >>> 2. acc src  --parser-->  acc AST                     --codegen-->  LLVM IR + omp rt calls
>> >> >>> 3. acc src  --parser-->  acc AST  --ttx-->  omp AST  --codegen-->  LLVM IR + omp rt calls
>> >> >>>
>> >> >>> In the above diagram:
>> >> >>>
>> >> >>> * acc src = C source code containing acc constructs.
>> >> >>> * acc AST = a clang AST in which acc constructs are represented by
>> >> >>>   nodes with acc node types.  Of course, such node types do not
>> >> >>>   already exist in clang's implementation.
>> >> >>> * omp AST = a clang AST in which acc constructs have been lowered
>> >> >>>   to omp constructs represented by nodes with omp node types.  Of
>> >> >>>   course, such node types do already exist in clang's
>> >> >>>   implementation.
>> >> >>> * parser = the existing clang parser and semantic analyzer,
>> >> >>>   extended to handle acc constructs.
>> >> >>> * codegen = the existing clang backend that translates a clang AST
>> >> >>>   to LLVM IR, extended if necessary (depending on which design is
>> >> >>>   chosen) to perform codegen from acc nodes.
>> >> >>> * ttx (tree transformer) = a new clang component that transforms
>> >> >>>   acc to omp in clang ASTs.
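
To make the diagram concrete, here is a minimal sketch in C of what
"acc src" and a corresponding OpenMP form might look like at the source
level. The function names and the choice of OpenMP directives are
assumptions made for illustration only; the RFC does not say which
OpenMP constructs clacc maps a given OpenACC construct to. A compiler
without OpenACC or OpenMP enabled ignores the pragmas and runs both
loops serially.

```c
#include <stddef.h>

/* "acc src": a loop annotated with an OpenACC construct. */
void saxpy_acc(size_t n, float a, const float *x, float *y) {
  #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
  for (size_t i = 0; i < n; ++i)
    y[i] = a * x[i] + y[i];
}

/* One possible OpenMP form of the same loop (an assumed mapping,
   not necessarily the one clacc would choose). */
void saxpy_omp(size_t n, float a, const float *x, float *y) {
  #pragma omp target teams distribute parallel for \
      map(to: x[0:n]) map(tofrom: y[0:n])
  for (size_t i = 0; i < n; ++i)
    y[i] = a * x[i] + y[i];
}
```

In designs 1 and 3 the second form would exist as an omp AST that could
be printed back out as source. In design 2 only the first form would
exist in the AST.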
>> >> >>>
>> >> >>> Design Features
>> >> >>> ---------------
>> >> >>>
>> >> >>> There are several features to consider when choosing among the
>> >> >>> designs in the previous section:
>> >> >>>
>> >> >>> 1. acc AST as an artifact -- Because they create acc AST nodes,
>> >> >>>    designs 2 and 3 best facilitate the creation of additional acc
>> >> >>>    source-level tools (such as pretty printers, analyzers,
>> >> >>>    lint-like tools, and editor extensions).  Some of these tools,
>> >> >>>    such as pretty printing, would be available immediately or as
>> >> >>>    minor extensions of tools that already exist in clang's
>> >> >>>    ecosystem.
>> >> >>>
>> >> >>> 2. omp AST/source as an artifact -- Because they create omp AST
>> >> >>>    nodes, designs 1 and 3 best facilitate the use of source-level
>> >> >>>    tools to help an application developer discover how clacc has
>> >> >>>    mapped his acc to omp, possibly in order to debug a mapping
>> >> >>>    specification he has supplied.  With design 2 instead, an
>> >> >>>    application developer has to examine low-level LLVM IR + omp rt
>> >> >>>    calls.  Moreover, with designs 1 and 3, permanently migrating an
>> >> >>>    application's acc source to omp source can be automated.
>> >> >>>
>> >> >>> 3. omp AST for mapping implementation -- Designs 1 and 3 might
>> >> >>>    also make it easier for the compiler developer to reason about
>> >> >>>    and implement mappings from acc to omp.  That is, because acc
>> >> >>>    and omp syntax is so similar, implementing the translation at
>> >> >>>    the level of a syntactic representation is probably easier than
>> >> >>>    translating to LLVM IR.
>> >> >>>
>> >> >>> 4. omp AST for codegen -- Designs 1 and 3 simplify the
>> >> >>>    compiler implementation by enabling reuse of clang's existing
>> >> >>>    omp support for codegen.  In contrast, design 2 requires at
>> >> >>>    least some extensions to clang codegen to support acc nodes.
>> >> >>>
>> >> >>> 5. Full acc AST for mapping -- Designs 2 and 3 potentially
>> >> >>>    enable the compiler to analyze the entire source (as opposed to
>> >> >>>    just the acc construct currently being parsed) while choosing
>> >> >>>    the mapping to omp.  It is not clear if this feature will prove
>> >> >>>    useful, but it might enable more optimizations and compiler
>> >> >>>    research opportunities.
>> >> >>>
>> >> >>> 6. No acc node classes -- Design 1 simplifies the compiler
>> >> >>>    implementation by eliminating the need to implement many acc
>> >> >>>    node classes.  While we have so far found that implementing
>> >> >>>    these classes is mostly mechanical, it does take a non-trivial
>> >> >>>    amount of time.
>> >> >>>
>> >> >>> 7. No omp mapping -- Design 2 does not require acc to be mapped to
>> >> >>>    omp.  That is, it is conceivable that, for some acc constructs,
>> >> >>>    there will prove to be no omp syntax to capture the semantics we
>> >> >>>    wish to implement.  It is also conceivable that we might one day
>> >> >>>    want to represent some acc constructs directly as extensions to
>> >> >>>    LLVM IR, where some acc analyses or optimizations might be more
>> >> >>>    feasible to implement.  This possibility dovetails with recent
>> >> >>>    discussions in the LLVM community about developing LLVM IR
>> >> >>>    extensions for various parallel programming models.
>> >> >>>
>> >> >>> Because of features 4 and 6, design 1 is likely the fastest design to
>> >> >>> implement, at least at first while we focus on simple acc features and
>> >> >>> simple mappings to omp.  However, we have so far found no advantage
>> >> >>> that design 1 has but that design 3 does not have except for feature
>> >> >>> 6, which we see as the least important of the above features in the
>> >> >>> long term.
>> >> >>>
>> >> >>> The only advantage we have found that design 2 has but that design 3
>> >> >>> does not have is feature 7.  It should be possible to choose design 3
>> >> >>> as the default but, for certain acc constructs or scenarios where
>> >> >>> feature 7 proves important (if any), incorporate design 2.  In other
>> >> >>> words, if we decide not to map a particular acc construct to any omp
>> >> >>> construct, ttx would leave it alone, and we would extend codegen to
>> >> >>> handle it directly.
>> >> >>>
>> >> >>> Conclusions
>> >> >>> -----------
>> >> >>>
>> >> >>> For the above reasons, and because design 3 offers the cleanest
>> >> >>> separation of concerns, we have chosen design 3 with the possibility
>> >> >>> of incorporating design 2 where it proves useful.
>> >> >>>
>> >> >>> Because of the immutability of clang's AST, the design of our proposed
>> >> >>> ttx component requires careful consideration.  To shorten this initial
>> >> >>> email, we have omitted those details for now, but we will be happy to
>> >> >>> include them as the discussion progresses.
>> >> >>>
>> >> >>> _______________________________________________
>> >> >>> cfe-dev mailing list
>> >> >>> cfe-dev at lists.llvm.org
>> >> >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>> >> >>>
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Jeff Hammond
>> >> >> jeff.science at gmail.com
>> >> >> http://jeffhammond.github.io/
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Hal Finkel
>> >> >> Lead, Compiler Technology and Programming Languages
>> >> >> Leadership Computing Facility
>> >> >> Argonne National Laboratory
>> >> >