<div dir="ltr"><br><br><div class="gmail_quote"><div dir="ltr">On Sat, Jun 6, 2015 at 5:02 AM C Bergström <<a href="mailto:cbergstrom@pathscale.com">cbergstrom@pathscale.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Sat, Jun 6, 2015 at 6:24 PM, Christos Margiolas<br>
<<a href="mailto:chrmargiolas@gmail.com" target="_blank">chrmargiolas@gmail.com</a>> wrote:<br>
> Hello,<br>
><br>
> Thank you a lot for the feedback. I believe that the heterogeneous engine<br>
> should be strongly connected with parallelization and vectorization efforts.<br>
> Most of the accelerators are parallel architectures where having efficient<br>
> parallelization and vectorization can be critical for performance.<br>
><br>
> I am interested in these efforts and I hope that my code can help you<br>
> manage the offloading operations. Your LLVM instruction set extensions may<br>
> require some changes in the analysis code, but I think that is going to be<br>
> straightforward.<br>
><br>
> I am planning to push my code to Phabricator in the next few days.<br>
<br>
If you're doing the extraction at the loop and LLVM IR level, why<br>
would you need to modify the IR? Wouldn't the target-level lowering<br>
happen later?<br>
<br>
How are you actually deciding what to offload? Is this tied to<br>
directives, or does it use heuristics plus some set of restrictions?<br>
<br>
Lastly, are you handling 2 targets in the same module, or do you end up<br>
emitting 2 modules and dealing with recombining things later?<br><br></blockquote><div><br></div><div>It's not possible to do this with the current structure without some significant and, honestly, icky patches.</div><div><br></div><div>-eric </div></div></div>