[Openmp-dev] PPC64 patch from Intel's fourth cmake patch

Hal Finkel hfinkel at anl.gov
Wed Aug 20 05:42:07 PDT 2014


----- Original Message -----
> From: "James H Cownie" <james.h.cownie at intel.com>
> To: "Hal Finkel" <hfinkel at anl.gov>, "Carlo Bertolli" <cbertol at us.ibm.com>
> Cc: "Michael Wong" <michaelw at ca.ibm.com>, openmp-dev at dcs-maillist2.engr.illinois.edu, "C. Bergström"
> <cbergstrom at pathscale.com>, "Alexey Bataev" <a.bataev at gmx.com>
> Sent: Wednesday, August 20, 2014 5:48:38 AM
> Subject: RE: [Openmp-dev] PPC64 patch from Intel's fourth cmake patch
> 
> > As you've discovered, clang-omp currently does not use the
> > arbitrary-length-parameter-list aspect of the microtasks.
> 
> Indeed, and I don't expect it to. IMO the way the Intel compiler
> does it, by passing a pointer argument for each
> reference into the parent stack, is unpleasant.

Okay, this is all fine. I recommend that we do the following:

 1. In the runtime source code (and its documentation), clearly document that the arbitrary-parameter-list aspect of microtasks is deprecated, and state the maximum number of arguments that we expect from "new" frontends.

 2. As I recall, we now have a libffi dependency to support arbitrary-length parameter lists on ARM. Is this really needed? Maybe they can use a switch statement just as I did on PPC?
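
To make the switch-statement idea concrete, here is a minimal sketch in C. It is not the actual PPC code from the patch: the names invoke_task, generic_fn, task0_fn..task2_fn, add_task and the limit MAX_TASK_ARGS are all hypothetical, chosen only for illustration. It assumes the outlined routine takes two thread-id pointers followed by a small, bounded number of pointer arguments, and dispatches on the argument count instead of building a call frame through libffi or assembler.

```c
/* Hypothetical types: a generic function pointer plus one concrete
 * signature per supported argument count. */
typedef void (*generic_fn)(void);
typedef void (*task0_fn)(int *, int *);
typedef void (*task1_fn)(int *, int *, void *);
typedef void (*task2_fn)(int *, int *, void *, void *);

enum { MAX_TASK_ARGS = 2 }; /* assumed documented limit, illustration only */

/* Call fn with argc pointer arguments taken from argv.
 * Returns 1 on success, 0 if argc is outside the supported range. */
static int invoke_task(generic_fn fn, int gtid, int tid, int argc, void **argv)
{
    switch (argc) {
    case 0: ((task0_fn)fn)(&gtid, &tid);                   return 1;
    case 1: ((task1_fn)fn)(&gtid, &tid, argv[0]);          return 1;
    case 2: ((task2_fn)fn)(&gtid, &tid, argv[0], argv[1]); return 1;
    default: return 0; /* more than MAX_TASK_ARGS: not supported */
    }
}

/* Example outlined routine with two pointer arguments: *a += *b. */
static void add_task(int *gtid, int *tid, void *a, void *b)
{
    (void)gtid;
    (void)tid;
    *(int *)a += *(int *)b;
}
```

Each case casts the generic pointer back to the exact type of the routine being called, so the call itself is ordinary C. The cost is that the supported argument count is fixed at compile time, which is why the limit needs to be documented as in item 1.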

Thanks again,
Hal

> 
> I have canvassed internally to change it, but without success. ("We've
> done it like that for 15 years, why would we change?")
> 
> If you stick with a single argument to the outlined routine you
> should be able to eliminate a bunch of code in the runtime
> for creating and copying the argument vector and (as a result) have
> no need to write some of that code in assembler.
> 
> As (I think) you're pointing out, you do need different code to
> handle disambiguating references so that
> you can vectorize. If you have code something like this:
> 
> float a[SIZE];
> float b[SIZE];
> 
> // initialize b
> #pragma omp parallel for
>    for (int i=1; i<SIZE; i++)
>       a[i] = b[i-1];
>  
> Once you outline, you have either something like this
> void outlinedFunc (float * restrict a, float * restrict b)
> {
>     // compute local slice into low, high
>     for (int i=low; i<high; i++)
>         a[i]=b[i-1];
> }
> 
> or like this
> 
> struct localState {
>     float a[SIZE];
>     float b[SIZE];
> };
> 
> void outlinedFunc( struct localState * args )
> {
>     // compute local slice into low, high
>     for (int i=low; i<high; i++)
>         args->a[i]=args->b[i-1];
> }
> 
> To vectorize you need to be able to prove to yourself that a and b
> don't overlap here.
> That doesn't seem too hard (fields in a struct can't overlap), but is
> certainly different from the other case.
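
For reference, the two outlined forms above can be written out as compilable C. This is only a sketch: the names outlined_ptrs, outlined_struct and local_state are hypothetical, SIZE is given an arbitrary small value, and the slice bounds low and high are passed in as parameters rather than computed from the thread id. In the first form the no-overlap guarantee comes from the restrict qualifiers; in the second it comes from a and b being distinct array members of one struct object.

```c
#include <stddef.h>

enum { SIZE = 8 }; /* arbitrary size, for illustration only */

/* Form 1: one restrict-qualified pointer per shared array. */
static void outlined_ptrs(float *restrict a, const float *restrict b,
                          size_t low, size_t high)
{
    for (size_t i = low; i < high; i++)
        a[i] = b[i - 1];
}

/* Form 2: a single pointer to a struct holding both arrays by value. */
struct local_state {
    float a[SIZE];
    float b[SIZE];
};

static void outlined_struct(struct local_state *args,
                            size_t low, size_t high)
{
    for (size_t i = low; i < high; i++)
        args->a[i] = args->b[i - 1];
}
```

Both routines compute the same result for a slice with low >= 1. Note that the struct argument only helps here because the arrays are embedded in it; if the struct held two float pointers instead, the compiler would be back to needing restrict or some other aliasing information.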
> 
> -- Jim
> 
> James Cownie <james.h.cownie at intel.com>
> SSG/DPD/TCAR (Technical Computing, Analyzers and Runtimes)
> Tel: +44 117 9071438
> 
> -----Original Message-----
> From: Hal Finkel [mailto:hfinkel at anl.gov]
> Sent: Wednesday, August 20, 2014 1:30 AM
> To: Carlo Bertolli
> Cc: Cownie, James H; Michael Wong;
> openmp-dev at dcs-maillist2.engr.illinois.edu; C. Bergström; Alexey
> Bataev
> Subject: Re: [Openmp-dev] PPC64 patch from Intel's fourth cmake patch
> 
> ----- Original Message -----
> > From: "Carlo Bertolli" <cbertol at us.ibm.com>
> > To: "C. Bergström" <cbergstrom at pathscale.com>
> > Cc: "Hal Finkel" <hfinkel at anl.gov>, "James H Cownie"
> > <james.h.cownie at intel.com>, "Michael Wong"
> > <michaelw at ca.ibm.com>, openmp-dev at dcs-maillist2.engr.illinois.edu
> > Sent: Wednesday, August 6, 2014 12:59:42 PM
> > Subject: Re: [Openmp-dev] PPC64 patch from Intel's fourth cmake
> > patch
> > 
> > 
> > 
> > Hi,
> > 
> > No apologies needed - I am glad that you highlighted these issues
> > and
> > that you helped make the patch stronger.
> > Let me see what I can do about the implementation of
> > __kmp_invoke_microtask.
> 
> Thanks for looking into this. As you've discovered, clang-omp
> currently does not use the arbitrary-length-parameter-list aspect of
> the microtasks. I've cc'd Alexey here, and perhaps he can describe
> why. I suspect that using function parameters directly, instead of
> putting everything into structures, will be easier on the IPA (and
> more efficient -- although it is not clear how important that is
> relative to the dispatching overhead).
> 
>  -Hal
> 
> > 
> > 
> > Thanks
> > 
> > -- Carlo
> > 
> > 
> > From: "C. Bergström" <cbergstrom at pathscale.com>
> > To: Carlo Bertolli/Watson/IBM at IBMUS
> > Cc: Hal Finkel <hfinkel at anl.gov>, "Cownie, James H"
> > <james.h.cownie at intel.com>, Michael Wong <michaelw at ca.ibm.com>,
> > "openmp-dev at dcs-maillist2.engr.illinois.edu"
> > <openmp-dev at dcs-maillist2.engr.illinois.edu>
> > Date: 08/06/2014 01:37 PM
> > Subject: Re: [Openmp-dev] PPC64 patch from Intel's fourth cmake
> > patch
> > 
> > 
> > 
> > 
> > On 08/ 6/14 11:27 PM, Carlo Bertolli wrote:
> > > 
> > > Hi C. Bergström,
> > > 
> > > My answers below interspersed with your comments.
> > > 
> > Apologies - My comments were meant to be general and your patch
> > just
> > made some fugly areas more visible. Hopefully this clears up the
> > confusion. For our BGQ (IBM Power7 A2) work I certainly appreciate
> > this.
> > 
> > Thanks
> > 
> > 
> > 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> ---------------------------------------------------------------------
> Intel Corporation (UK) Limited
> Registered No. 1134945 (England)
> Registered Office: Pipers Way, Swindon SN3 1RJ
> VAT No: 860 2173 47
> 
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory



