[cfe-dev] [Openmp-dev] Comparison of 2 schemes to implement OpenMP 5.0 declare mapper codegen

Alexey Bataev via cfe-dev cfe-dev at lists.llvm.org
Tue Jul 2 18:19:22 PDT 2019

Best regards,
Alexey Bataev

On July 2, 2019, at 14:34, Finkel, Hal J. via Openmp-dev <openmp-dev at lists.llvm.org> wrote:

On 6/28/19 11:56 PM, James Beyer wrote:
Recursive data structures are important if you consider linked lists important.

I definitely agree, and I do.

Supporting these is challenging but not impossible. I would expect that if someone manages to implement a cost-effective way to support linked lists, we would add that support to OpenMP with ease.

In the context of the current proposal, supporting recursion seems to have two effects:

 1. You would not want to use a two-pass traversal to precalculate the size of the mapping table (because if you did two passes, you would traverse the list twice, and that would be unnecessarily expensive).

In this case we should review the 2 remaining schemes: the original one from Lingda, and the alternative scheme in which the functional part is moved to the runtime and the mappers are called indirectly by the runtime (see the description provided by Lingda).

 2. We'd also need to maintain a "visited addresses" hash table to prevent infinite recursion. As we build up the array of mapping descriptors, we would also add the addresses to the hash table, and should an address already be present, we'd avoid recursing (i.e., just use a regular visited set as one does with a graph traversal).

Am I overlooking something?
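
For illustration, the visited-set idea from point 2 can be sketched in C++ as follows. This is only a sketch under stated assumptions: MapEntry, Node, and mapList are hypothetical names invented for this example, not part of Clang or the OpenMP runtime, and a real mapper would record more than a base address and a size.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for one descriptor in the mapping table.
struct MapEntry {
  const void *base;
  std::size_t size;
};

// A singly linked list node; 'next' may point back into the list.
struct Node {
  int value;
  Node *next;
};

// Append one entry per reachable node. The visited set stops the walk as
// soon as an address is seen a second time (for example, a circular list).
inline void mapList(const Node *head, std::vector<MapEntry> &entries,
                    std::unordered_set<const void *> &visited) {
  for (const Node *n = head; n != nullptr; n = n->next) {
    if (!visited.insert(static_cast<const void *>(n)).second)
      break; // address already recorded: do not follow it again
    entries.push_back({n, sizeof(Node)});
  }
}
```

Because the entries are appended during the same walk that fills the visited set, the list is traversed only once, which is the property point 1 asks for.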


From: Finkel, Hal J. <hfinkel at anl.gov>
Sent: Friday, June 28, 2019 10:46 PM
To: Alexey Bataev <Alexey.Bataev at ibm.com>; Li, Lingda <lli at bnl.gov>
Subject: Re: Comparison of 2 schemes to implement OpenMP 5.0 declare mapper codegen

Hi, Alexey, Lingda,

I haven't been following this closely, so a few questions/comments:

 1. Recursive mappers are not supported in OpenMP 5, but do we expect that to change in the future?

 2. Our experience so far suggests that the most important optimization in this space is to limit the number of distinct host-to-device transfers (or data copies) on systems where data needs to be copied. In these schemes, where does that coalescing occur?

 3. So long as the mappers aren't recursive, I agree with Alexey that the total number of to-be-mapped components should be efficient to calculate. The counting function should simplify to a trivial expression in nearly all cases. The only case where it might not is where the type contains an array section with dynamic bounds, and the element type also has a mapper with an array section with dynamic bounds. In this case (similar to the unsupported recursive cases, which, as an aside, we should probably support as an extension) we might need to walk the data structure twice to precalculate the total number of components to map. However, this case is certainly detectable by static analysis of the declared mappers, so I think that we can get the best of both worlds: we could use Alexey's proposed scheme except in cases where precalculating the count would truly require walking the data structure twice, in which case we could use Lingda's combined walk/push_back scheme. Is there any reason why that wouldn't work?
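
To make the nested dynamic-bounds case in point 3 concrete, here is a rough C++ sketch of a count pass followed by a fill pass. Everything in it is an assumption made for illustration: the types (MapEntry, Leaf, Inner, Outer), the function names, and in particular the convention that each object and each Leaf contributes exactly one entry. It does not reflect what Clang actually emits.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one descriptor in the mapping table.
struct MapEntry {
  const void *base;
  std::size_t size;
};

struct Leaf {
  double value;
};

struct Inner {
  Leaf *leaves;
  std::size_t len; // dynamic bound of the section leaves[0:len]
};

struct Outer {
  Inner *items;
  std::size_t n; // dynamic bound of the section items[0:n]
};

// Assumed convention: one entry for the Inner object plus one per Leaf.
inline std::size_t countInner(const Inner &in) { return 1 + in.len; }

// Pass 1: the total depends on every items[i].len, so it cannot be reduced
// to a closed-form expression in o.n alone; the items must be walked.
inline std::size_t countOuter(const Outer &o) {
  std::size_t total = 1; // the Outer object itself
  for (std::size_t i = 0; i < o.n; ++i)
    total += countInner(o.items[i]);
  return total;
}

// Pass 2: write entries into a buffer that was sized exactly once.
// Returns a pointer to the next free slot.
inline MapEntry *fillOuter(const Outer &o, MapEntry *out) {
  *out++ = {&o, sizeof(Outer)};
  for (std::size_t i = 0; i < o.n; ++i) {
    const Inner &in = o.items[i];
    *out++ = {&in, sizeof(Inner)};
    for (std::size_t j = 0; j < in.len; ++j)
      *out++ = {&in.leaves[j], sizeof(Leaf)};
  }
  return out;
}

// Count first, allocate once, then fill.
inline std::vector<MapEntry> mapOuter(const Outer &o) {
  std::vector<MapEntry> buf(countOuter(o));
  fillOuter(o, buf.data());
  return buf;
}
```

If Inner had no dynamically sized section, countInner would be a constant and countOuter would collapse to a simple expression in o.n, which is the common case described above.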

Thanks again,

On 6/28/19 9:00 AM, Alexey Bataev wrote:

Hi Lingda, thanks for your comments.
We can allocate the buffer either on the stack or by calling an OpenMP allocation function.
With this solution, we allocate memory only once (no need to resize the buffer after push_backs), and we do not need to call a runtime function to put the map data into the buffer; the compiler-generated code can do it.
But anyway, I agree, it would be good to hear some other opinions.
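
The allocation argument above can be illustrated with a small C++ analogy, using std::vector in place of the real mapping buffer. MapEntry and growthsWhileAppending are hypothetical names for this sketch only. The helper counts how often the storage has to grow while entries are appended.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one descriptor in the mapping table.
struct MapEntry {
  const void *base;
  std::size_t size;
};

// Append n placeholder entries and report how many times the vector's
// capacity changed, i.e. how often its storage had to be reallocated.
inline std::size_t growthsWhileAppending(std::vector<MapEntry> &buf,
                                         std::size_t n) {
  std::size_t growths = 0;
  std::size_t cap = buf.capacity();
  for (std::size_t i = 0; i < n; ++i) {
    buf.push_back({nullptr, 0});
    if (buf.capacity() != cap) {
      cap = buf.capacity();
      ++growths;
    }
  }
  return growths;
}
```

Calling this on an empty vector reports several growth steps for a large n, whereas calling it on a vector that first had reserve(n) applied reports none, since the storage was sized once up front.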
Best regards,
Alexey Bataev



Hal Finkel

Lead, Compiler Technology and Programming Languages

Leadership Computing Facility

Argonne National Laboratory

