[flang-dev] [llvm-dev] About OpenMP dialect in MLIR

Johannes Doerfert via flang-dev flang-dev at lists.llvm.org
Mon Feb 17 16:49:15 PST 2020


On 02/17, Vinay Madhusudan via flang-dev wrote:
> Please find the reply inline below
>
> [...]
>
> About Clang / MLIR / LLVM being loosely related and not being relevant in
> llvm-dev:
> 
> With the introduction of the OpenMPIRBuilder in MLIR (from this review:
> https://reviews.llvm.org/D72962), Clang and MLIR would now share the
> code for building OpenMP constructs. I do not think they are so loosely
> related anymore. Note that the MLIR and Clang frontends for LLVM are very
> different: Clang emits LLVM IR with almost no optimizations, while MLIR
> already supports a considerable number of optimizations.
> 
> The decision to use the OpenMPIRBuilder for MLIR was discussed in the
> following flang-dev threads (please correct me if I am missing some newer
> discussions on the topics below)

I am still unsure about a basic detail that seems really important:

Could you explain to me what the differences/benefits are between
lowering OpenMP ops with something called CGOpenMP [0] into the LLVM
dialect (as far as I understand it) and using the OpenMPIRBuilder to
generate LLVM IR from these ops?

I am unsure because the functions/functionality in those two files look
pretty similar to me, except that the CGOpenMP solution requires the
entire* runtime call encoding of clang to be duplicated and maintained
inside MLIR [1] while it is shared in a single space for the
OpenMPIRBuilder.

* For comparison: OpenMPLowering in FC lowers to 9 runtime calls [1].
                  Clang for the host runtime part knows about 64 [2].

Once the above is sorted out we can discuss other things, but IMHO the
arguments are all over the place right now, which makes it hard to
justify anything. I mean, the OpenMPIRBuilder is about creating OpenMP
runtime calls for OpenMP directives. To me it seems a lot of the
arguments below talk about high-level transformations on MLIR, which
I have a hard time relating to the OpenMPIRBuilder.

Cheers,
  Johannes


[0] https://github.com/compiler-tree-technologies/fc/blob/master/lib/codegen/CGOpenMP.cpp
[1] https://github.com/compiler-tree-technologies/fc/blob/master/lib/transforms/OpenMPLowering.cpp
[2] https://github.com/llvm/llvm-project/blob/master/clang/lib/CodeGen/CGOpenMPRuntime.cpp#L568


>    1. [May 2019]
>    http://lists.flang-compiler.org/pipermail/flang-dev_lists.flang-compiler.org/2019-May/000197.html
>    2. [June 2019]
>    http://lists.flang-compiler.org/pipermail/flang-dev_lists.flang-compiler.org/2019-June/000251.html
> 
> 
> However, I could not find any conclusions on the concerns raised by Kiran:
> 
> 
>    1. Early outlining (in MLIR) vs. late outlining (in LLVM)
>    2. Handling of target constructs: high-level transformations for GPUs
>    and CPUs (offloading in LLVM vs. MLIR?)
> 
> 
> Kiran seems to suggest that early outlining (version 2) would be better
> (http://lists.flang-compiler.org/pipermail/flang-dev_lists.flang-compiler.org/2019-May/000224.html).
> But currently, late outlining has been implemented in LLVM (version 1)
> (https://github.com/llvm/llvm-project/blob/master/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp).
> 
> Early outlining in MLIR would have the following benefits, as suggested in
> the thread (see the sketch after this list):
> 
>    1. Enables more optimization in MLIR (intra-procedural, because of
>    regions).
>    2. Offloading in MLIR (which is designed for heterogeneous hardware
>    compilation support).
>    3. Direct LLVM dialect lowering of OpenMP operations (no LLVM IR
>    lowering).
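> 
> To make the early-outlining idea concrete, a rough sketch at the MLIR
> level (the omp.parallel_call op and the function names here are
> hypothetical, purely for illustration): the body of the parallel region
> is split into its own function while still in MLIR, before any LLVM IR
> exists.
> 
>   // Hypothetical early outlining: region body extracted while in MLIR.
>   func @work_outlined(%x: memref<i32>) {
>     ...
>     return
>   }
>   func @work(%x: memref<i32>) {
>     // Hypothetical op standing in for the call to the outlined region.
>     omp.parallel_call @work_outlined(%x) : (memref<i32>) -> ()
>     return
>   }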
> 
> The MLIR Google Groups discussion
> (https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/4Aj_eawdHiw)
> regarding the use of the OpenMPIRBuilder doesn’t seem to discuss the above
> concerns, nor how the various design decisions in the OpenMPIRBuilder
> affect MLIR in general.
> 
> Also,
> 
>  > “The point here is that we do not want to use MLIR just as a
> pass-through layer because MLIR has a lot of strengths”
> 
>   ....
> 
>   > “The point here is that if we lower to LLVM dialect, we will not be
> able to reuse OpenMP codegen & optimisation code from Clang/LLVM.”
> 
>      --- by Kiran in
> https://lists.llvm.org/pipermail/llvm-dev/2020-February/139181.html
> 
> In his latest reply to this thread (quoted above), Kiran seems
> to suggest that lowering to the LLVM dialect (instead of LLVM IR) would
> restrict the reuse of OpenMP optimization code from LLVM, and also that
> MLIR would just be a pass-through to the OpenMPIRBuilder.
> 
> Because of the above reasons, it seems to me that the design considerations
> of using the OpenMPIRBuilder for MLIR should also be mentioned (and
> discussed) before committing the LLVM IR lowering part of the OpenMP
> dialect in https://reviews.llvm.org/D72962
> 
> 
> 
> >> Mehdi also seems to have the same suggestion: “I agree that having
> >> dialect lowering would be cleaner” in https://reviews.llvm.org/D72962
> >>
> >
> > Since you're calling me out: yes it would be cleaner from a pure MLIR
> > point of view, I don't think there is much disagreement on this (I think?).
> > However we already have the OpenMP builders available and they will
> > continue to be maintained/evolved to support OpenMP in clang.
> > Duplicating them entirely in MLIR for the sake of purity seems like a lack
> > of pragmatism here, so I support the current approach with the current
> > tradeoffs.
> >
> >
> >>
> >> > Yes, the design has mildly changed over time to incorporate feedback.
> >> But the latest is what is there in the RFC in discourse.
> >>
> >> RFC fails to discuss the following (I have also mentioned some of them in
> >> my reply to Johannes):
> >>
> >> > The proposed plan involves a) lowering F18 AST with OpenMP directly to
> >> a mix of OpenMP and FIR dialects. b) converting this finally to a mix of
> >> OpenMP and LLVM dialects.
> >>
> >> It is unclear in the RFC what other dialects are considered supported
> >> for the OpenMP dialect (std, affine, vector, loop, etc.) and how they
> >> would be transformed, used, and lowered from FIR to LLVM.
> >>
> >> It becomes important to list the various dialects / operations /
> >> types supported for OpenMP (which is mainly defined for C, C++ and Fortran
> >> programs; MLIR has a much wider scope).
> >>
> >> It wouldn’t add much value for the proposed OpenMP dialect to be in the
> >> MLIR tree if it cannot support at least the relevant standard dialect types
> >> / operations.
> >>
> >
> > I agree, and I think this was something I called out as important in the
> > RFC: "It seems that the dialect can be orthogonal to FIR and its type
> > system, which [is] the most important thing to me to integrate [in] MLIR
> > (favor reusability across other frontends / compiler frameworks)".
> > If you don't think that this is the case, then please raise this in the
> > RFC!
> > I think it is perfectly fair to ask for more examples from the author and
> > to dig a bit deeper if you're unconvinced that the proposed modeling can
> > be applicable outside of FIR. This is exactly why we ask such proposals to
> > go through an RFC, by the way: to allow people like you to point at the
> > blind spots and ask the right questions.
> >
> > Best,
> >
> > --
> > Mehdi
> >
> >
> >
> >> > We would like to take advantage of the transformations in cases where
> >> that is possible. FIR loops will be converted to the affine/loop dialect,
> >> so the loop inside an omp.do can be in these dialects, as clarified in the
> >> discussion on Discourse and also shown in slide 20 of the FOSDEM
> >> presentation (links to both below).
> >>
> >>
> >> https://llvm.discourse.group/t/rfc-openmp-dialect-in-mlir/397/7?u=kiranchandramohan
> >>
> >>
> >> https://fosdem.org/2020/schedule/event/llvm_flang/attachments/slides/3839/export/events/attachments/llvm_flang/slides/3839/flang_llvm_frontend.pdf
> >>
> >> Although it is mentioned that affine.for / loop.for is used, the following
> >> things are unclear:
> >>
> >> I am assuming that there will be lowering / conversion code in the f18
> >> repo from fir.do to loop.for / affine.for. Is that the case? If so, I
> >> think it is worth mentioning it in the “sequential code flow
> >> representation” in the RFC.
> >>
> >> This raises the following questions.
> >>
> >>
> >>
> >>    1. Which types are supported? Standard dialect types and FIR types?
> >>
> >> For example, what types are used for Fortran arrays used inside OpenMP
> >> regions? Is it std.memref, OR the Fortran array representation in the FIR
> >> dialect (fir.array?), OR both? Note that Fortran has support for
> >> column-major arrays, while std.memref supports custom memory layouts. What
> >> custom layouts are supported?
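> >>
> >> To make the layout question concrete, here is a sketch of how a
> >> column-major 2-D array might be modelled with a std.memref layout map
> >> (illustrative only; whether this is the intended modelling for Fortran
> >> arrays is exactly the question):
> >>
> >>   // A layout map that swaps the dimensions gives Fortran-style
> >>   // column-major ordering for a 10x20 array of f32.
> >>   %A = alloc() : memref<10x20xf32, affine_map<(d0, d1) -> (d1, d0)>>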
> >>
> >>
> >> How would different non-scalar types in the standard dialect be lowered
> >> to LLVM IR and passed to OpenMP runtime calls? Can you please elaborate on
> >> this?
> >>
> >> The example provided in slide 20 of the FOSDEM presentation contains
> >>
> >> “loop.for %j = %lb2 to %ub2 : !integer {“
> >>
> >> but loop.for accepts the “index” type, so it is not clear what type
> >> “!integer” represents here.
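> >>
> >> For reference, loop.for as currently defined takes index-typed bounds
> >> and an explicit step, i.e. something like:
> >>
> >>   %c1 = constant 1 : index
> >>   // %lb2 and %ub2 must be of type "index" here, not "!integer".
> >>   loop.for %j = %lb2 to %ub2 step %c1 {
> >>     ...
> >>   }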
> >>
> >>
> >>    2. What are the different memory access operations which are supported
> >>    inside the OpenMP region and lowered to proper OpenMP runtime calls in
> >>    LLVM IR?
> >>
> >>
> >> The possibilities are:
> >>
> >>    1. affine.load / affine.store
> >>    2. std.load / std.store
> >>    3. FIR dialect memory access operations
> >>
> >>
> >> > I must also point out that the question of where to do loop
> >> transformations is a topic we have not fully converged on. See the
> >> following thread for discussions.
> >> http://lists.llvm.org/pipermail/flang-dev/2019-September/000042.html
> >>
> >> Looks like placement (MLIR / LLVM) of various transformations related to
> >> OpenMP has not been finalized, from what I could infer from Johannes’s
> >> reply and the below text in the latest RFC in discourse:
> >>
> >> “So there exist some questions regarding where the optimisations should
> >> be carried out. We will decide on which framework to choose only after some
> >> experimentation.”
> >>
> >> > i) we need to keep the loops separate so as to take advantage of
> >> transformations that other dialects like affine/loop would provide.
> >>
> >> 1) Keeping the loops separate from the OpenMP operations will expose them
> >> to the “regular” transformation passes in MLIR inside the OpenMP region.
> >> Most of these are invalid or inefficient for OpenMP operations.
> >>
> >> Examples:
> >>
> >>    1. The constant propagation example mentioned by Johannes in this
> >>    thread (omp task shared(x)); see the sketch after this list.
> >>    2. Loop (nest) transformations (permute / split / fuse / tile, etc.)
> >>    will happen ignoring the surrounding OpenMP operations.
> >>    3. Hoisting and sinking of various memory / SSA values inside the
> >>    OpenMP region. This goes against the likes of “map”, “firstprivate”,
> >>    “shared”, etc. clauses and more.
> >>
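> >> A concrete version of example 1, with hypothetical omp.task /
> >> omp.taskwait operations (illustrative syntax only):
> >>
> >>   %x = alloc() : memref<i32>
> >>   %c0 = constant 0 : i32
> >>   store %c0, %x[] : memref<i32>
> >>   omp.task {            // hypothetical op; %x is shared with the task
> >>     %c1 = constant 1 : i32
> >>     store %c1, %x[] : memref<i32>
> >>   }
> >>   omp.taskwait          // hypothetical op
> >>   // Folding this load to the constant 0 would be invalid: the task
> >>   // may have overwritten %x.
> >>   %v = load %x[] : memref<i32>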
> >>
> >> 2) Various loop operations (loop.for, affine.for, fir.do) have (or will
> >> have) transformation / optimization passes which differ from one another.
> >>
> >> Examples:
> >>
> >>    1. AffineLoopInvariantCodeMotion.cpp is different from
> >>    LoopInvariantCodeMotion.cpp.
> >>    2. Other loop transformation passes for affine.for.
> >>
> >>
> >> These loops also generally use different types and memory access
> >> operations for transformations. For example, most Affine dialect
> >> transformations (if not all) work on affine.load and affine.store
> >> operations.
> >>
> >> Supporting different loop operations means that there would be
> >> OpenMP-specific transformations for each one of them, and it also requires
> >> a way to exclude each of them from existing transformations (when nested
> >> in OpenMP constructs).
> >>
> >> There would be different lowerings for different loop operations as well.
> >> For example, affine.for and loop.for would have to be lowered to omp.do in
> >> different ways.
> >>
> >> From slide 20 of the FOSDEM presentation you mentioned, the LLVM + OpenMP
> >> dialect representation is as follows:
> >>
> >> ------------------------------
> >> mlir.region(…) {
> >>    omp.parallel {
> >>      %ub3 = …
> >>      omp.do %i = 0 to %ub3 : !integer {
> >>        …
> >>      }
> >>    }
> >> }
> >> -------------------------------
> >>
> >> Currently, the LLVM dialect doesn’t contain a high-level loop operation;
> >> it is all based on a CFG representation.
> >>
> >> Will omp.do follow the same structure (SizedRegion<1>) as loop.for, or
> >> would there be a CFG-based loop operation for the LLVM dialect?
> >>
> >> Can you please show what the OpenMP + LLVM dialect would look like for
> >> the parallel do construct below?
> >>
> >> integer :: i=1, k=10
> >> integer :: a(10), b(10), c(10)
> >> ...
> >> !$omp parallel do
> >> do i = 1, k
> >>   if (i .ne. 1) cycle
> >>   c(i) = a(i) + b(i)
> >> end do
> >> !$omp end parallel do
> >> print *,c
> >>
> >
> >> Thanks,
> >>
> >> Vinay
> >>
> >> On Fri, Feb 14, 2020 at 6:52 AM Kiran Chandramohan via llvm-dev <
> >> llvm-dev at lists.llvm.org> wrote:
> >>
> >>> Hello Vinay,
> >>>
> >>> Thanks for your mail about the OpenMP dialect in MLIR. Happy to know
> >>> that you and several other groups are interested in the OpenMP dialect. At
> >>> the outset, I must point out that the design is not set in stone and will
> >>> change as we make progress. You are welcome to participate, provide
> >>> feedback and criticism to change the design as well as to contribute to the
> >>> implementation. I provide some clarifications and replies to your comments
> >>> below. If it is OK we can have further discussions in discourse as River
> >>> points out.
> >>>
> >>> 1. [May 2019] An OpenMPIRBuilder in LLVM was proposed for flang and
> >>> clang frontends. Note that this proposal was before considering MLIR for
> >>> FIR.
> >>>
> >>> A correction here. The proposal for OpenMPIRBuilder was made when MLIR
> >>> was being considered for FIR.
> >>> (i) Gary Klimowicz's minutes for Flang call in April 2019 mentions
> >>> considering MLIR for FIR.
> >>>
> >>> http://lists.flang-compiler.org/pipermail/flang-dev_lists.flang-compiler.org/2019-April/000194.html
> >>> (ii) My reply to Johannes's proposal in May 2019 mentions MLIR for FIR.
> >>>
> >>> http://lists.flang-compiler.org/pipermail/flang-dev_lists.flang-compiler.org/2019-May/000220.html
> >>>
> >>> b. Review of barrier construct is in progress:
> >>> https://reviews.llvm.org/D72962
> >>>
> >>> Minor correction here. The addition of the barrier construct was accepted
> >>> and has landed (https://reviews.llvm.org/D72400). It is the review for
> >>> translation to LLVM IR that is in progress.
> >>>
> >>> It looks like the design has evolved over time and there is no one place
> >>> which contains the latest design decisions that fit all the different
> >>> pieces of the puzzle. I will try to deduce it from the above mentioned
> >>> references. Please correct me if I am referring to anything which has
> >>> changed.
> >>>
> >>> Yes, the design has mildly changed over time to incorporate feedback.
> >>> But the latest is what is there in the RFC in discourse.
> >>>
> >>> For most OpenMP design discussions, FIR examples are used (as seen in
> >>> (2) and (3)). The MLIR examples mentioned in the design only talk about
> >>> the FIR dialect and the LLVM dialect.
> >>>
> >>> Our initial concern was how all these pieces (FIR, LLVM dialect,
> >>> OpenMPIRBuilder, LLVM IR) would fit together. Hence you see the prominence
> >>> of FIR and the LLVM dialect, and more information about
> >>> lowering/translation than transformations/optimisations.
> >>>
> >>> This completely ignores the likes of standard, affine (where most loop
> >>> transformations are supposed to happen) and loop dialects.
> >>>
> >>> Adding to the reply above. We would like to take advantage of the
> >>> transformations in cases where that is possible. FIR loops will be
> >>> converted to the affine/loop dialect, so the loop inside an omp.do can be
> >>> in these dialects, as clarified in the discussion on Discourse and also
> >>> shown in slide 20 of the FOSDEM presentation (links to both below).
> >>>
> >>> https://llvm.discourse.group/t/rfc-openmp-dialect-in-mlir/397/7?u=kiranchandramohan
> >>>
> >>> https://fosdem.org/2020/schedule/event/llvm_flang/attachments/slides/3839/export/events/attachments/llvm_flang/slides/3839/flang_llvm_frontend.pdf
> >>>
> >>> I must also point out that the question of where to do loop
> >>> transformations is a topic we have not fully converged on. See the
> >>> following thread for discussions.
> >>> http://lists.llvm.org/pipermail/flang-dev/2019-September/000042.html
> >>>
> >>> Is it the same omp.do operation which now contains the bounds and
> >>> induction variables of the loop after the LLVM conversion?
> >>>
> >>> The point here is that i) we need to keep the loops separate so as to
> >>> take advantage of transformations that other dialects like affine/loop
> >>> would provide, and ii) we will need the loop information while lowering the
> >>> OpenMP do operation. For implementation, if reusing the same operation (in
> >>> different contexts) is difficult then we can add a new operation.
> >>>
> >>> It is also not mentioned how clauses like firstprivate, shared,
> >>> private, reduce, map, etc. are lowered to the OpenMP dialect.
> >>>
> >>> Yes, it is not mentioned. We did a study of a few constructs and clauses
> >>> which was shared as mails to flang-dev and the RFC. As we make progress and
> >>> before implementation, we will share further details.
> >>>
> >>> it would be beneficial to have an omp.parallel_do operation which has
> >>> semantics similar to other loop structures (may not be LoopLikeInterface)
> >>> in MLIR.
> >>>
> >>> I am not against adding parallel_do if it can help with transformations
> >>> or reduce the complexity of lowering. Please share the details in discourse
> >>> as a reply to the RFC or a separate thread.
> >>>
> >>> it looks like having OpenMP operations based on standard MLIR types and
> >>> operations (scalars and memrefs mainly) is the right way to go.
> >>>
> >>> This will definitely be the first version that we implement. But I do
> >>> not understand why we should restrict ourselves to only the standard types
> >>> and operations. To ease lowering and translation and to avoid adding OpenMP
> >>> operations to other dialects, I believe the OpenMP dialect should also be
> >>> able to coexist with other dialects like FIR and LLVM.
> >>>
> >>> E. The lowering of target constructs mentioned in (2(d)) specifies direct
> >>> lowering to LLVM IR, ignoring all the advantages that MLIR provides.
> >>>
> >>> Also, OpenMP codegen will automatically benefit from the GPU dialect
> >>> based optimizations. For example, it would be way easier to hoist a memory
> >>> reference out of GPU kernel in MLIR than in LLVM IR.
> >>>
> >>> I might not have fully understood you here. But the dialect lives
> >>> independently of the translation to LLVM IR. If there are optimisations
> >>> (like hoisting that you mention here) I believe they can be performed as
> >>> transformation passes on the dialect. It is not ruled out.
> >>>
> >>> --Kiran
> >>> ------------------------------
> >>> From: flang-dev <flang-dev-bounces at lists.llvm.org> on behalf of Vinay
> >>> Madhusudan via flang-dev <flang-dev at lists.llvm.org>
> >>> Sent: 13 February 2020 16:33
> >>> To: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>;
> >>> flang-dev at lists.llvm.org <flang-dev at lists.llvm.org>
> >>> Subject: [flang-dev] About OpenMP dialect in MLIR
> >>>
> >>>
> >>> Hi,
> >>>
> >>> I have a few questions / concerns regarding the design of the OpenMP
> >>> dialect in MLIR that is currently being implemented, mainly for the f18
> >>> compiler. Below, I summarize the current state of various efforts in clang
> >>> / f18 / MLIR / LLVM regarding this. Feel free to add to the list in case I
> >>> have missed something.
> >>>
> >>> 1. [May 2019] An OpenMPIRBuilder in LLVM was proposed for flang and
> >>> clang frontends. Note that this proposal was before considering MLIR for
> >>> FIR.
> >>>
> >>> a. llvm-dev proposal:
> >>> http://lists.flang-compiler.org/pipermail/flang-dev_lists.flang-compiler.org/2019-May/000197.html
> >>>
> >>> b. Patches in review: https://reviews.llvm.org/D70290. This also
> >>> includes the clang codegen changes.
> >>>
> >>> 2. [July - September 2019] OpenMP dialect for MLIR was discussed /
> >>> proposed with respect to the f18 compilation stack (keeping FIR in mind).
> >>>
> >>> a. flang-dev discussion link:
> >>> https://lists.llvm.org/pipermail/flang-dev/2019-September/000020.html
> >>>
> >>> b. Design decisions captured in PPT:
> >>> https://drive.google.com/file/d/1vU6LsblsUYGA35B_3y9PmBvtKOTXj1Fu/view
> >>>
> >>> c. MLIR google groups discussion:
> >>> https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/4Aj_eawdHiw
> >>>
> >>> d. Target constructs design:
> >>> http://lists.flang-compiler.org/pipermail/flang-dev_lists.flang-compiler.org/2019-September/000285.html
> >>>
> >>> e. SIMD constructs design:
> >>> http://lists.flang-compiler.org/pipermail/flang-dev_lists.flang-compiler.org/2019-September/000278.html
> >>>
> >>> 3. [Jan 2020] OpenMP dialect RFC in LLVM Discourse:
> >>> https://llvm.discourse.group/t/rfc-openmp-dialect-in-mlir/397
> >>>
> >>> 4. [Jan - Feb 2020] Implementation of the OpenMP dialect in MLIR:
> >>>
> >>> a. The first patch which introduces the OpenMP dialect was pushed.
> >>>
> >>> b. Review of barrier construct is in progress:
> >>> https://reviews.llvm.org/D72962
> >>> https://reviews.llvm.org/D72400
> >>>
> >>> I have tried to list below different topics of interest (to different
> >>> people) around this work. Most of these are in the design phase (or very
> >>> new) and multiple parties are interested with different sets of goals in
> >>> mind.
> >>>
> >>> I.  Flang frontend and its integration
> >>>
> >>> II. Fortran representation in MLIR / FIR development
> >>>
> >>> III. OpenMP development for flang, OpenMP builder in LLVM.
> >>>
> >>> IV. Loop Transformations in MLIR / LLVM with respect to OpenMP.
> >>>
> >>> It looks like the design has evolved over time and there is no one place
> >>> which contains the latest design decisions that fit all the different
> >>> pieces of the puzzle. I will try to deduce it from the above mentioned
> >>> references. Please correct me if I am referring to anything which has
> >>> changed.
> >>>
> >>> A. For most OpenMP design discussions, FIR examples are used (as seen
> >>> in (2) and (3)). The MLIR examples mentioned in the design only talk
> >>> about the FIR dialect and the LLVM dialect.
> >>>
> >>> This completely ignores the likes of standard, affine (where most loop
> >>> transformations are supposed to happen) and loop dialects. I think it is
> >>> critical to decouple the OpenMP dialect development in MLIR from the
> >>> current flang / FIR effort. It would be useful if someone can mention these
> >>> examples using existing dialects in MLIR and also how the different
> >>> transformations / lowerings are planned.
> >>>
> >>> B. In the latest RFC (3), it is mentioned that the initial OpenMP dialect
> >>> version will be as follows,
> >>>
> >>>   omp.parallel {
> >>>     omp.do {
> >>>        fir.do %i = 0 to %ub3 : !fir.integer {
> >>>         ...
> >>>        }
> >>>     }
> >>>   }
> >>>
> >>> and then after the "LLVM conversion" it is converted as follows:
> >>>
> >>>   omp.parallel {
> >>>     %ub3 = ...
> >>>     omp.do %i = 0 to %ub3 : !llvm.integer {
> >>>     ...
> >>>     }
> >>>   }
> >>>
> >>>
> >>> a. Is it the same omp.do operation which now contains the bounds and
> >>> induction variables of the loop after the LLVM conversion? If so, will the
> >>> same operation have two different semantics during a single compilation?
> >>>
> >>> b. Will there be different lowerings for various loop operations from
> >>> different dialects? loop.for and affine.for under omp operations would need
> >>> different OpenMP / LLVM lowerings. Currently, both of them are lowered
> >>> to CFG-based loops during the LLVM dialect conversion (which happens much
> >>> before the proposed OpenMP dialect lowering).
> >>>
> >>> There would be no standard way to represent OpenMP operations
> >>> (especially the ones which involve loops) in MLIR. This would drastically
> >>> complicate lowering.
> >>>
> >>> C. It is also not mentioned how clauses like firstprivate, shared,
> >>> private, reduce, map, etc. are lowered to the OpenMP dialect. The example
> >>> in the RFC contains FIR and LLVM types and nothing about std dialect
> >>> types. Consider the below example:
> >>>
> >>> #pragma omp parallel for reduction(+:x)
> >>> for (int i = 0; i < N; ++i)
> >>>   x += a[i];
> >>>
> >>> How would the above be represented in the OpenMP dialect? And what type
> >>> would "x" be in MLIR? It is not mentioned in the design how the
> >>> various SSA values for the various OpenMP clauses are passed around in
> >>> OpenMP operations.
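> >>>
> >>> For instance, would it be something along these lines (made-up syntax,
> >>> just to make the question concrete; the reduction clause encoding and
> >>> the type of %x are precisely what is unclear)?
> >>>
> >>>   %x = alloc() : memref<i32>
> >>>   omp.parallel {
> >>>     omp.do reduction(add : %x) %i = %c0 to %N step %c1 {
> >>>       %ai = load %a[%i] : memref<?xi32>
> >>>       ...
> >>>     }
> >>>   }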
> >>>
> >>> D. Because of (A), (B) and (C), it would be beneficial to have an
> >>> omp.parallel_do operation which has semantics similar to other loop
> >>> structures (may not be LoopLikeInterface) in MLIR. To me, it looks like
> >>> having OpenMP operations based on standard MLIR types and operations
> >>> (scalars and memrefs mainly) is the right way to go.
> >>>
> >>> Why not have an omp.parallel_do operation with AffineMap-based bounds, so
> >>> as to decouple it from Value/Type, similar to affine.for?
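> >>>
> >>> Something along these lines (hypothetical syntax), mirroring the way
> >>> affine.for takes its bounds as affine expressions over symbols:
> >>>
> >>>   // %N is used as an affine symbol, as in affine.for.
> >>>   omp.parallel_do %i = 0 to %N {
> >>>     %v = affine.load %a[%i] : memref<?xf32>
> >>>     ...
> >>>   }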
> >>>
> >>> 1. With the current design, the number of transformations /
> >>> optimizations that one can write on OpenMP constructs would become limited
> >>> as there can be any custom loop structure with custom operations / types
> >>> inside it.
> >>>
> >>> 2. It would also be easier to transform loop nests containing OpenMP
> >>> constructs if the body of the OpenMP operations is well defined (i.e., does
> >>> not accept arbitrary loop structures). Having nested redundant "parallel",
> >>> "target" and "do" regions seems unnecessary.
> >>>
> >>> 3. There would also be new sets of loop structures in new dialects when
> >>> C/C++ is compiled to MLIR. It would complicate the number of possible
> >>> combinations inside the OpenMP region.
> >>>
> >>> E. The lowering of target constructs mentioned in (2(d)) specifies direct
> >>> lowering to LLVM IR, ignoring all the advantages that MLIR provides. Being
> >>> able to compile the code for heterogeneous hardware is one of the biggest
> >>> advantages that MLIR brings to the table. That is being completely missed
> >>> here. This also requires solving the problem of handling target information
> >>> in MLIR, but that is a problem which needs to be solved anyway. Using the
> >>> GPU dialect also gives us an opportunity to represent offloading semantics
> >>> in MLIR.
> >>>
> >>> Given the ability to represent multiple ModuleOps and the existence of
> >>> the GPU dialect, couldn't higher-level optimizations on offloaded code be
> >>> done at the MLIR level? The proposed design would lead us to the same
> >>> problems that we are currently facing in LLVM IR.
> >>>
> >>> Also, OpenMP codegen will automatically benefit from the GPU dialect
> >>> based optimizations. For example, it would be way easier to hoist a memory
> >>> reference out of GPU kernel in MLIR than in LLVM IR.
> >>>
> >>> Thanks,
> >>>
> >>> Vinay
> >>>
> >



-- 

Johannes Doerfert
Researcher

Argonne National Laboratory
Lemont, IL 60439, USA

jdoerfert at anl.gov

