<div dir="ltr">Jason can probably take on the non-trivial task of writing this up more formally and make sure it is clearly documented.<br><div><br></div><div>I'm glad to do this, and I'm planning to start work on it next week.</div></div><br><div class="gmail_quote"><div dir="ltr">On Fri, Apr 22, 2016 at 3:24 PM Chandler Carruth <<a href="mailto:chandlerc@gmail.com">chandlerc@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div dir="ltr">On Fri, Apr 22, 2016 at 3:05 PM Mehdi Amini <<a href="mailto:mehdi.amini@apple.com" target="_blank">mehdi.amini@apple.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
> On Apr 22, 2016, at 3:01 PM, Chandler Carruth <<a href="mailto:chandlerc@gmail.com" target="_blank">chandlerc@gmail.com</a>> wrote:<br>
><br>
> I feel like this thread got a bit stalled. I'd like to pick it up and try to suggest a path forward.<br>
><br>
> I don't hear any real objections to the overall idea of having an LLVM subproject for parallelism runtimes and support libraries. I think we should get that created.<br>
<br>
I think it should be clarified whether "parallelism runtimes and support libraries" are intended to expose user-level APIs or APIs for the compiler-generated code (this may be part of your point about "writing up its charter, scope", but I also think it shouldn't be underestimated as a task, so I called it out).<br></blockquote><div><br></div></div></div><div dir="ltr"><div class="gmail_quote"><div>Absolutely. I think that needs to be clearly spelled out.</div><div><br></div><div>Personally, I'd like to see the subproject open to *both*. Here are some libraries I would love to see (but don't necessarily have concrete plans around):</div><div>- A nice vectorized math library</div><div>- Linear algebra libraries such as BLAS implementations</div><div>- Highly tuned FFT or other domain-specific libraries for GPUs. Essentially the same as the vectorized math libraries, but for GPUs and slightly higher level.</div><div>- StreamExecutor</div><div>- Any generic components of the OpenMP libraries.</div><div><br></div><div>Clearly each of these would need to be discussed on a case-by-case basis, but there seems to be a healthy mixture of both user-level APIs and compiler-level APIs. I would suggest criteria for belonging here along the lines of:</div><div><br></div><div>- Includes compiler-targeted APIs (maybe in addition to user-level APIs, maybe even with overlap), or</div><div>- Leverages compiler details for its implementation (for example, using vector extensions we know LLVM supports), or</div><div>- Wants to use compiler-specific packaging techniques or other integration techniques (for example, shipping as bitcode), or</div><div>- Helps support compiler or programming language functionality</div><div><br></div><div>The first three here seem clear-cut to me. If any part of the library is intended to be callable by the compiler, it's a good fit. SE has such interfaces. Vectorized math libraries do too, etc. 
If the implementation of the library really wants to use compiler internals like our vector math extensions, again, I think it makes sense to keep it reasonably co-located with the compiler.</div><div><br></div><div>The last seems a bit tricky, but I think it's really important. Currently, CUDA provides a pretty big programming surface, and having a well-tuned BLAS or FFT implementation, for example, that integrates with CUDA is pretty important. Similarly, in the future we expect C++ to get lots of parallel standard library interfaces, potentially even BLAS-looking ones, and we might want a good parallel BLAS implementation or other very fundamental parallel library implementation to use when implementing them.</div><div><br></div><div>But at the same time, I think it's really important to have a clear place where any library here ties back into the compiler ecosystem and/or the programming language ecosystem that are the core of LLVM.</div><div><br></div><div>Does this seem like it's going in the right direction? (Jason can probably take on the non-trivial task of writing this up more formally and making sure it is clearly documented.)</div></div></div><div dir="ltr"><div class="gmail_quote"><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
Otherwise your plan sounds good to me.<br>
<br>
--<br>
Mehdi<br>
<br>
<br>
<br>
><br>
> I don't actually see any real objections to StreamExecutor being one of the runtimes. There are some interesting questions however:<br>
> - Is there common code in the OpenMP runtime that could be unified with this?<br>
> - Could OpenMP end up using SE or some common shared library between them as a basis for offloading?<br>
> - Would it instead make more sense to have the OpenMP offload library be a plugin for StreamExecutor?<br>
><br>
> I don't know the answer to any of these really, but I also don't think that they should prevent us from making progress here. And I think if anything, they'll become easier to answer if we do.<br>
><br>
> So my suggestion would be:<br>
> 1) Create the broader scoped LLVM subproject, including writing up its charter, scope, plans, etc.<br>
><br>
> 2) Add stream executor to it<br>
><br>
> 3) Initially, leave the OpenMP offloading stuff targeted at OpenMP. Then, as it evolves, consider moving it to be another runtime in the broad project if and when it makes sense.<br>
><br>
> 4) As both OpenMP and SE evolve and are used some in the project, evaluate whether there is a common core that makes sense to extract. If so, do it and rebase them appropriately.<br>
><br>
><br>
> Does this make sense? Are there objections to moving forward here?<br>
<br>
</blockquote></div></div></blockquote></div>