[llvm-dev] vectorisation, risc-v

Mon Aug 6 06:32:46 PDT 2018

On 6 August 2018 at 07:12, Luke Kenneth Casson Leighton via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> (please do cc me to preserve thread as i am subscribed digest)
>
> Hi folks, i have a requirement to develop a libre licensed low power
> embedded 3D GPU and VPU and using RISCV as the basis (GPGPU style) seems
> eminently sensible, and anyone is invited to participate.  A gsoc2017
> student named Jake has already developed a Vulkan3D software renderer and
> shader, and (parallelised) llvm is a critical dependency to achieving the
> high efficiency needed. The difference when compared to gallium3d-llvmpipe
> is that Jake's software renderer also uses llvm for the shader, where
> g3dllvm does not.

Hi Luke and welcome to the LLVM community.

> I have reviewed the llvm RV RFC and it looks very sensible, informative, and
> well thought through. Keeping VL changes restricted to function call
> boundaries is a very good idea (presumably "fake" function calls can be
> considered, as a way to break up large functions safely), the intrinsic
> vector length, ie passing in the vector length effectively as an additional
> hidden function parameter, also very sensible.
>
> I also liked that it was clear from the RFC that LLVM is divided into two
> parts, which I suspected but had not had it confirmed.
>
> As an aside I have to say that I am extremely surprised to learn that it is
> only in the past year that vectorisation or more specifically variable
> length SIMD has hit mainstream in libre licensed toolchains, through ARM and
> RISCV.
>
> So some background : I am the author of the SimpleV extension, which has
> been developed to provide a uniform *parallelism* API, *not* as a new Vector
> Microarchitecture (a common misconception). It has unintended side effects
> such as providing LD/ST multi with predication, which in turn can be used on
> function entry or context switch to save or load *up to* the entire register
> file with around three instructions. Another unintended side effect is code
> size reduction.
>
> There is a total of ZERO new RISCV instructions, the entire design is based
> around CSRs that implicitly mark the STANDARD registers as "vectorised",
> also providing a redirection table that can arbitrarily redirect the 32
> registers to 64 REAL registers (64 real FP and 64 real int), including
> empowering Compressed instructions to access the full 64 registers, even
> when the C instruction is restricted to x8-x15.  Predication similarly is
> via CSR redirection/lookups.

It's worth noting that having zero new RISC-V instruction encodings
doesn't necessarily make this easier to support than a proposal that
introduces new instructions. LLVM would have to be
taught how to handle this register bank switching / redirection scheme
and how to minimise the number of switches required. This does have
the potential to be somewhat intrusive. It reduces work for the MC
layer (assembler/disassembler), but the code generator would still
need to understand the semantics of these overloaded instructions.
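
To illustrate the point about overloaded instructions, here is a toy
model in Python. It is only a sketch under stated assumptions: the
table layout, the names (Redirect, ToyMachine) and the rule that a
vectorised register steps through consecutive real registers are all
hypothetical and are not taken from the SimpleV specification. What it
shows is that one unchanged "add" encoding can mean either a single
scalar add or VL adds, depending on CSR state that the code generator
would have to track.

```python
from dataclasses import dataclass


@dataclass
class Redirect:
    real: int      # index into the 64-entry real register file (assumed)
    vector: bool   # True if this register is marked as "vectorised"


class ToyMachine:
    """Hypothetical model of CSR-driven register redirection."""

    def __init__(self):
        self.regs = [0] * 64
        self.vl = 1
        # Default state: identity mapping, every register scalar.
        self.table = {n: Redirect(n, False) for n in range(32)}

    def add(self, rd, rs1, rs2):
        d, a, b = self.table[rd], self.table[rs1], self.table[rs2]
        # Same encoding, different semantics: loop VL times only if the
        # destination is marked as a vector.
        count = self.vl if d.vector else 1
        for i in range(count):
            ai = a.real + (i if a.vector else 0)  # scalar operands stay put
            bi = b.real + (i if b.vector else 0)
            self.regs[d.real + i] = self.regs[ai] + self.regs[bi]


m = ToyMachine()
m.regs[40:44] = [1, 2, 3, 4]
m.regs[5] = 10
m.table[8] = Redirect(32, True)   # x8 -> real 32.., vectorised
m.table[9] = Redirect(40, True)   # x9 -> real 40.., vectorised
m.vl = 4
m.add(8, 9, 5)                    # "add x8, x9, x5" now performs 4 adds
print(m.regs[32:36])
```

With the default table the same call performs a single scalar add, so
correctness of any given "add" depends on the CSR state in effect at
that point in the program.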

> 1. I note that the separation between LLVM front and backend looks like
> adding SV experimental support would be a simple matter of doing the backend
> assembly code translator, with little to no modifications to the front end
> needed, would that be about right? Particularly if LLVM-RV already adds a
> variable length concept.

As with most compilers you can separate the frontend, middle-end and
backend. Adding SV experimental support would definitely, as you say,
require work in the backend (supporting lowering of IR to machine
instructions) but potentially also middle-end modifications (IR->IR
transformations) to enable the existing vectorisation passes.

> 2. With there being absolutely no new instructions whatsoever (standard
> existing AND FUTURE scalar ops are instead made implicitly parallel), and
> given the deliberate design similarities it seems to me that SV would be a
> good first experimental backend *ahead* of RVV, for which the 240+ opcodes
> have not yet been finalised. Would people concur?

I'm not convinced it would actually be an easier starting point and I
anticipate quite a lot of work describing these new instruction
semantics and teaching LLVM how to use them.

For clarity, is this something you're proposing to be done directly in
upstream LLVM, or something you're asking for advice on in an (at
least initially) downstream project?

> 3. If there are existing patches, where can they be found?

Robin Kruppe is the main person working on RVV support. I'm not sure
if patches have been made available anywhere at this point.

> 4. From Jeff Bush's Nyuzi work it has been noted that certain 3D operations
> are just far too expensive to do as SIMD or vectors. Multiple FP ARGB to
> 24/32 bit direct overlay with transparency into a tile is therefore for
> example a high priority candidate for adding a special opcode that must
> explicitly be called. Is this relatively easy to do and is there
> documentation explaining how?

Adding a new instruction and making it available through inline
assembly or intrinsics is pretty easy. I did a mini-tutorial on this
at the RISC-V Workshop in Barcelona and really should tidy up and
publish the extended materials I started on this subject.
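
As a rough sketch of what is involved at the encoding level, the
Python below packs a standard RISC-V R-type instruction word. The
field layout and the custom-0 major opcode (0x0B) come from the RISC-V
base ISA; the choice of funct3/funct7 values and the idea of using
them for a blend-style opcode are purely hypothetical, and
encode_r_type is an illustrative helper rather than anything in LLVM.

```python
def encode_r_type(opcode, rd, funct3, rs1, rs2, funct7):
    """Pack the fields of a 32-bit RISC-V R-type instruction."""
    # Range-check each field against its bit width.
    assert 0 <= opcode < 128 and 0 <= funct7 < 128
    assert 0 <= funct3 < 8
    assert all(0 <= r < 32 for r in (rd, rs1, rs2))
    return (
        (funct7 << 25)
        | (rs2 << 20)
        | (rs1 << 15)
        | (funct3 << 12)
        | (rd << 7)
        | opcode
    )


CUSTOM_0 = 0x0B  # major opcode reserved for custom extensions
OP = 0x33        # major opcode used by add, sub, and friends

# Sanity check against a well-known encoding: add a0, a1, a2
add_word = encode_r_type(OP, rd=10, funct3=0, rs1=11, rs2=12, funct7=0)

# Hypothetical custom instruction using the same operands in custom-0
custom_word = encode_r_type(CUSTOM_0, rd=10, funct3=0, rs1=11, rs2=12, funct7=0)

print(hex(add_word), hex(custom_word))
```

The assembler/disassembler side of a new instruction amounts to
describing this kind of field packing; exposing it via an intrinsic or
inline assembly then lets code call it explicitly.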

> It is worth emphasising that this shall not be a private proprietary hard
> fork of llvm, it is an entirely libre effort including the GPGPU (I read
> Alex's lowRISC posts on such private forking practices, a hard fork would be
> just insane and hugely counterproductive), so in particular regard to (4)
> documentation, guidelines and recommendations likely to result in the
> upstreaming process going smoothly also greatly appreciated.

One additional thought: I think RISC-V is somewhat unique in LLVM in
that implementers are free to design and implement custom extensions
without need for prior approval. Many such implementers may wish to
see upstream LLVM support for their extensions. For any open source
project, it's normal to consider factors such as the following when
considering new contributions:
* Potential value to the project (will there be users?)
* Potential cost to the project (what is the maintenance burden? Is
someone stepping up to maintain the addition?)
* Is it stable? i.e. are the design and external interfaces finalised?
If not, is the level of instability compatible with the project's
release process and sufficiently explained in docs etc.?

Support for any standard RISC-V Foundation published extensions is
easy to justify. Also for custom extensions with shipping hardware
that is programmable by end-users. Cases such as experimental
non-standard extensions that haven't (yet) shipped in hardware might
require more examination on a case-by-case basis. [Just sharing my
initial thoughts here rather than official LLVM policy.]

Best,

Alex