[PATCH] D69103: Backend for NEC SX-Aurora

Mon Oct 28 12:53:00 PDT 2019

rengolin added a comment.

In D69103#1724045 <https://reviews.llvm.org/D69103#1724045>, @simoll wrote:

> I understand. We will repackage the github code into digestible commits and put them on phab. I'll get back to you when the patch sets leading up to full scalar codegen are ready.

Thanks!

> Any code that would run on your CPU really :) The full application binary (ELF, btw) runs on the card by default and dispatches system calls to a proxy process on the host. The GitHub code already implements full scalar instruction codegen (and there are VE implementations for libcxx/libcxxabi/libunwind, etc.).

Awesome! Hopefully the callbacks to the host CPU are rare enough in optimised code that they don't become a bottleneck.

> Well, we already use the target-specific intrinsics for hand-written Tensorflow kernels. However, we are planning to implement standard vector instructions and LLVM-VP once it's available upstream. The Region Vectorizer is already capable of emitting VP intrinsics. To be useful for the VE, the current loop vectorizer would need to be extended to support tail loop predication by setting the active vector length (otherwise, e.g. for packed f32 mode, there would be up to 511 scalarized remainder iterations).

Interesting. I'm guessing VPlan would be a must-have to get decent performance on this target.
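
For readers following along, here is a rough sketch of where the "up to 511" figure in the quote comes from and what setting the active vector length buys. It is a plain C++ model, not VE or LLVM code. The lane count kMaxVL = 512 is an assumption inferred from the quoted remainder figure, and fixedWidth, withActiveVL and scaleInPlace are hypothetical names made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Assumption: 512 lanes per vector operation in packed f32 mode, inferred
// from the "up to 511 remainder iterations" figure in the quoted comment.
constexpr std::size_t kMaxVL = 512;

struct LoopShape {
  std::size_t vectorIterations;  // iterations executed as vector operations
  std::size_t scalarIterations;  // iterations left to a scalar remainder loop
};

// Fixed-width vectorization: the vector loop only covers whole multiples of
// kMaxVL, and everything left over runs in a scalar remainder loop.
constexpr LoopShape fixedWidth(std::size_t n) {
  return {n / kMaxVL, n % kMaxVL};
}

// Tail predication through the active vector length: the last vector
// iteration simply runs with a shorter length, so no scalar loop remains.
constexpr LoopShape withActiveVL(std::size_t n) {
  return {(n + kMaxVL - 1) / kMaxVL, 0};
}

// Scalar model of a strip-mined loop. The inner loop stands in for a single
// vector instruction executed with active vector length vl.
void scaleInPlace(std::vector<float> &data, float factor) {
  for (std::size_t base = 0; base < data.size(); base += kMaxVL) {
    const std::size_t vl = std::min(kMaxVL, data.size() - base);
    for (std::size_t lane = 0; lane < vl; ++lane)
      data[base + lane] *= factor;
  }
}
```

With a trip count of 1023, fixedWidth gives one vector iteration plus 511 scalar ones, while withActiveVL gives two vector iterations, the second running with a length of 511.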

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69103/new/

https://reviews.llvm.org/D69103