<div dir="ltr">Never hurts to ask them via radar, especially if you can point at specific instances where the OpenMP support would benefit in performance from that feature.</div><div class="gmail_extra"><br><br><div class="gmail_quote">

On Thu, May 29, 2014 at 2:36 PM, Steven Noonan <span dir="ltr"><<a href="mailto:steven@uplinklabs.net" target="_blank">steven@uplinklabs.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

No, I haven't. I'm pretty sure Apple's stance is that they don't<br>

*want* people to affinitize processes because they believe people<br>

would abuse it. Comments like this seem to indicate that to me,<br>

anyway:<br>

<br>

<a href="https://github.com/opensource-apple/xnu/blob/10.9/osfmk/kern/sched_prim.c#L1720" target="_blank">https://github.com/opensource-apple/xnu/blob/10.9/osfmk/kern/sched_prim.c#L1720</a><br>

<br>

They used to make it possible to affinitize to CPUs via a framework<br>

that came with Xcode called CHUD, but they never made an Intel 64-bit<br>

version of it, and now it's gone altogether.<br>

<div class="HOEnZb"><div class="h5"><br>

On Thu, May 29, 2014 at 11:13 AM, Jack Howarth<br>

<<a href="mailto:howarth.mailing.lists@gmail.com">howarth.mailing.lists@gmail.com</a>> wrote:<br>

> Steven,<br>

>      Have you filed a radar bug report with Apple on this? There always is<br>

> the remote possibility that this issue could be addressed in a future OS<br>

> release.<br>

>             Jack<br>

><br>

><br>

> On Thu, May 29, 2014 at 1:30 PM, Steven Noonan <<a href="mailto:steven@uplinklabs.net">steven@uplinklabs.net</a>><br>

> wrote:<br>

>><br>

>> Darwin has a very weak notion of "affinity hints":<br>

>><br>

>><br>

>> <a href="https://developer.apple.com/library/mac/releasenotes/Performance/RN-AffinityAPI/" target="_blank">https://developer.apple.com/library/mac/releasenotes/Performance/RN-AffinityAPI/</a><br>

>><br>

>> But it's so dumbed down (only a concept of distinct affinity "tags"<br>

>> based solely on L2 cache sharing) that it's pretty useless. I did some<br>

>> microbenchmarks with it to simulate an OpenMP workload with pinning,<br>

>> and as far as I'm able to tell, the Darwin kernel just ignores those<br>

>> hints and does whatever it pleases.<br>

>><br>

>> On Thu, May 29, 2014 at 5:39 AM, Cownie, James H<br>

>> <<a href="mailto:james.h.cownie@intel.com">james.h.cownie@intel.com</a>> wrote:<br>

>> > I think the complaint is this: on Darwin, the scaling to 4 "processes"<br>

>> > is<br>

>> > worse than on Linux.<br>

>> ><br>

>> > Four threads is small. The OpenMP runtime is tested scaling in the 200+<br>

>> > thread range for Xeon Phi, and on big-iron servers. We measure the<br>

>> > scaling<br>

>> > of a variety of more interesting things there (such as SpecOMP).<br>

>> ><br>

>> ><br>

>> ><br>

>> > Futexes are fast, but then so are our spin-locks. The difference is what<br>

>> > happens when the lock is contended (whether you enter the kernel or not,<br>

>> > and<br>

>> > therefore allow the kernel to schedule something else onto the same HW<br>

>> > thread). That should make little difference in this case, since the<br>

>> > machine<br>

>> > is not over-subscribed.<br>

>> ><br>

>> ><br>

>> ><br>

>> > If Darwin provides a fast futex interface, then iomp should use it.<br>

>> ><br>

>> > Darwin does not provide it, so we can’t use it :-)<br>

>> ><br>

>> ><br>

>> ><br>

>> > I’d guess that the issue here is more likely related to affinity choices<br>

>> > made by the operating system (whether it chooses to place threads as<br>

>> > hyper-threads on the same core, as threads in the same socket, or across<br>

>> > sockets) than details of the locking. I believe that Darwin also has no<br>

>> > specific support that would let us control that either…<br>

>> ><br>

>> ><br>

>> ><br>

>> > -- Jim<br>

>> ><br>

>> > James Cownie <<a href="mailto:james.h.cownie@intel.com">james.h.cownie@intel.com</a>><br>

>> > SSG/DPD/TCAR (Technical Computing, Analyzers and Runtimes)<br>

>> ><br>

>> > Tel: <a href="tel:%2B44%20117%209071438" value="+441179071438">+44 117 9071438</a><br>

>> ><br>

>> ><br>

>> ><br>

>> > From: Chandler Carruth [mailto:<a href="mailto:chandlerc@google.com">chandlerc@google.com</a>]<br>

>> > Sent: Thursday, May 29, 2014 1:07 PM<br>

>> > To: Cownie, James H<br>

>> > Cc: Jack Howarth; <a href="mailto:openmp-dev@dcs-maillist2.engr.illinois.edu">openmp-dev@dcs-maillist2.engr.illinois.edu</a><br>

>> ><br>

>> ><br>

>> > Subject: Re: [Openmp-dev] initial clang-omp/openmp benchmarking<br>

>> ><br>

>> ><br>

>> ><br>

>> ><br>

>> ><br>

>> > On Thu, May 29, 2014 at 4:45 AM, Cownie, James H<br>

>> > <<a href="mailto:james.h.cownie@intel.com">james.h.cownie@intel.com</a>><br>

>> > wrote:<br>

>> ><br>

>> > I don’t really understand what problem you are complaining about.<br>

>> ><br>

>> > Your numbers show clang-omp as the fastest implementation in all<br>

>> > directly<br>

>> > comparable cases. That doesn’t seem like something we want to change!<br>

>> ><br>

>> ><br>

>> > I think the complaint is this: on Darwin, the scaling to 4 "processes"<br>

>> > is<br>

>> > worse than on Linux.<br>

>> ><br>

>> ><br>

>> ><br>

>> > However, the reason is stated already: Linux provides a *very* fast<br>

>> > futex<br>

>> > implementation. Darwin either doesn't provide it or iomp doesn't use it.<br>

>> ><br>

>> ><br>

>> ><br>

>> > If Darwin provides a fast futex interface, then iomp should use it.<br>

>> > That's a<br>

>> > useful request. I don't know enough about Darwin to help investigate<br>

>> > whether<br>

>> > the OS has a futex interface exposed to userland.<br>

>> ><br>

>> ><br>

>> ><br>

>> > If Darwin doesn't provide a futex interface, there is literally nothing<br>

>> > we<br>

>> > can do about that. You aren't going to match the scalability of a<br>

>> > kernel-supported futex with something in userspace.<br>

>> ><br>

>> ><br>

>> ><br>

>> > Anyways, I do agree that micro-optimizing mutex performance for<br>

>> > something<br>

>> > like openmp seems somewhat less important....<br>

>> ><br>


>> ><br>

>> ><br>

>> > _______________________________________________<br>

>> > Openmp-dev mailing list<br>

>> > <a href="mailto:Openmp-dev@dcs-maillist2.engr.illinois.edu">Openmp-dev@dcs-maillist2.engr.illinois.edu</a><br>

>> > <a href="http://lists.cs.uiuc.edu/mailman/listinfo/openmp-dev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/openmp-dev</a><br>

>> ><br>

>><br>


><br>

><br>

</div></div></blockquote></div><br></div>