[llvm-dev] [RFC] Upstreaming a proper SPIR-V backend
Trifunovic, Konrad via llvm-dev
llvm-dev at lists.llvm.org
Tue Mar 2 01:36:35 PST 2021
We would like to propose this RFC for upstreaming a proper SPIR-V backend to LLVM:
We at Intel are interested in contributing a proper LLVM backend that targets the Khronos SPIR-V portable IR format. It would be built on a proper backend architecture (GlobalISel) and target the compute flavour of SPIR-V, with the possibility of later extending it to the 3D shader flavour (Vulkan). What we are asking for is the LLVM community's blessing for the proposal and help in addressing the open questions. (Many of you are already familiar with the topic, so you may want to skip directly to 'Open questions' and 'Objective' rather than reading all the paragraphs.)
We would be extremely grateful for any comments, questions, and guidance on the further direction.
There have been several attempts to properly integrate SPIR-V generators into LLVM, but, to the best of our knowledge, none of them made enough progress to eventually land in LLVM.org trunk.
One of the reasons for this state is the lack of consensus on the fundamental design: should it be a translator library (the Khronos LLVM/SPIR-V translator) wrapped within a target, a 'proper' LLVM target using SelectionDAG/GlobalISel, or 'just' a binary emission layer? (These are just a few of the ideas discussed in previous mailing list threads.)
We at Intel want to give it another try by implementing a 'true' backend approach. Most importantly, we want to land the prototype code in LLVM trunk as a SPIR-V target and continue the development there as a prototype LLVM target. The starting point for the project is the code base on the Khronos GitHub.
Note: this is not meant to replace the bidirectional SPIRV-LLVM translator developed by Khronos members (including Intel). This proposal does not address SPIR-V to LLVM IR translation (what could be considered a SPIR-V front end for LLVM).
Without starting a new debate on implementation choices, we took into account the following important design points from previous discussions:
* The overall goal of this effort is to implement a proper LLVM backend for SPIR-V. That is, it registers itself as a proper target and implements the Target* interfaces (similarly to the NVPTX and AMDGPU backends). The backend uses the GlobalISel infrastructure, starting from the Khronos prototype (big thanks go to Arm for contributing this code), and we are committed to keeping it that way (i.e. no fallback to SelectionDAG is planned). This addresses some concerns raised about the first proposal.
* Support the OpenCL (compute) flavour of SPIR-V. The infrastructure is flexible, so adding Vulkan-specific opcodes/capabilities should not be a big effort (but this is not planned in the near term).
* For non-clang-based front ends it is desirable to expose intrinsics through a target-specific .td file (currently not done; we still rely on well-known names and mangling). The direction here needs discussion.
* Since SPIR-V is a virtual ISA, many of the regular backend passes, such as register allocation and scheduling, are disabled. This is quite similar to what the NVPTX backend does. Still, most of the logic is concentrated in the canonical GlobalISel passes: IRTranslator, CallLowering, Legalization, and InstructionSelection. RegBankSelect is not needed in our backend.
* One of the major differences between the SPIR-V ISA and LLVM IR is the way type information is stored. To link gMIR instructions to the SPIR-V type they produce, we use pseudo instructions, which were quite easy to fold into the actual instruction at the selection stage while still providing all the necessary information to the earlier passes.
* At the moment, some SPIR-V instructions (e.g. OpAccessChain) are generated right at the IRTranslation stage. This goes back to the original prototype; we are not sure yet whether we should get rid of it, so some advice would be helpful. Moreover, calls to OpenCL builtins are lowered to actual SPIR-V code at the CallLowering stage, i.e. they are not properly integrated into selection yet.
* Due to the aforementioned difference in how LLVM IR and SPIR-V describe values and their types, the backend legalizer performs some custom transformations on top of the existing ones, to ensure the types comply with the selector's expectations without disabling pre-ISel legality checks.
* Instruction selection patterns are split between TableGen and plain C++ (thanks to GlobalISel for allowing that). For example, most binary operators are described in .td files, while casts are selected in C++ code.
* Note: code generation is achieved with no (or minimal) changes to the general GlobalISel infrastructure. Some modifications to the existing GlobalISel implementation may happen, but at the moment we are trying to avoid them unless they are absolutely necessary or we are sure they would benefit the whole LLVM project.
* There are a couple of custom passes in the backend, e.g. for generating the required capabilities, decorations, and extensions. There is also a pass that enforces SPIR-V's basic-block layout requirements.
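To make the type-linking point above concrete, here is an illustrative gMIR-style sketch; the pseudo-instruction name (ASSIGN_TYPE) and the register annotations are hypothetical, not the prototype's actual spelling:

```
; Illustrative sketch only; names are hypothetical.
; A pseudo instruction ties a value's virtual register to the SPIR-V type
; instruction describing it, and is folded away during instruction selection.
%ty:type = OpTypeInt 32, 0          ; SPIR-V type instruction for i32
%a:id    = ASSIGN_TYPE %x, %ty      ; link value %x to that type
%sum:id  = OpIAdd %ty, %a, %a       ; after selection the real instruction
                                    ; carries the type operand directly
```

The point is that earlier passes (legalization, analyses) can query the SPIR-V type through the pseudo instruction, while the final MIR contains only real SPIR-V opcodes.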
Current state & open problems
The current code is based on LLVM 12 and is now published on the Khronos GitHub. It includes the original code contributed by Arm and additions developed at Intel (both companies being active Khronos members).
We are working on overall refactoring, implementing the missing features, and improving the pass rate (see 'Testing' below), but there are a number of problems on our TODO list:
* Remove the selection logic from the IR translation stage; this problem is inherited from the original prototype
* Proper handling of extensions (planned to be similar to the translator's approach, i.e. enabling them explicitly via an option)
* Binary file versioning: the output version number (and the header structure in general) is largely hardcoded in the current codebase
* Implement some of the currently missing OpenCL builtins
* .td descriptions for Capabilities/Decorations/etc. (work already in progress)
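On the versioning point above: the SPIR-V specification defines a fixed five-word module header, so the version word can be computed rather than hardcoded. The helper below is a hypothetical sketch for illustration, not the backend's actual code:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Build the five-word SPIR-V module header defined by the SPIR-V spec.
// 'buildHeader' is a hypothetical helper, not the backend's API.
std::vector<uint32_t> buildHeader(uint32_t major, uint32_t minor,
                                  uint32_t idBound) {
  std::vector<uint32_t> words;
  words.push_back(0x07230203u);                  // word 0: magic number
  words.push_back((major << 16) | (minor << 8)); // word 1: version, byte
                                                 // layout 0 | major | minor | 0
  words.push_back(0u);                           // word 2: generator magic
  words.push_back(idBound);                      // word 3: bound; all result
                                                 // IDs are < bound
  words.push_back(0u);                           // word 4: schema (reserved, 0)
  return words;
}
```

For example, targeting SPIR-V 1.4 yields the version word 0x00010400, matching the byte layout mandated by the spec.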
Testing
A dozen LIT tests have been contributed to facilitate offline testing. Nevertheless, there is still a lack of 'runtime testing', where a produced SPIR-V binary is actually executed on a target platform (be it a CPU, GPU, or FPGA). Intel plans to provide testing on a reference GPU platform, and other OpenCL platform providers are encouraged to do the same.
The current test suite mostly consists of LIT tests taken from the LLVM-SPIRV translator. We have not achieved a 100% pass rate on it yet, and the test suite itself is not yet complete.
Open questions
There are also a number of problems for which we have not yet settled on a final solution, so any input from the community would be greatly appreciated. Here we list some:
* Exposing compute intrinsics: mangling or Intrinsics.td? It seems that non-clang front ends would prefer a library of SPIR-V (GPU-centric) intrinsics exposed by the target. The current clang approach for OpenCL uses well-known names for OpenCL builtin functions together with name mangling (which is also the way supported by the LLVM-SPIRV translator). The bidirectional SPIRV-LLVM translator also supports a 'SPIR-V friendly' LLVM IR convention.
* Development model - in-trunk or out-of-trunk?
1) We could land the code as-is in llvm.org trunk (residing in lib/Target/SPIRV) and continue development from there, keeping it as a prototype target. This would be preferable for us, since we think that contributing the code to trunk will give it better community visibility and provide us with continuous guidance from the LLVM community.
2) Development continues on an external GitHub (based on the most recent LLVM codebase) until some agreed-upon milestone is reached. We are open to this option, though it is less preferable to us, since we would remain out of sync with mainline LLVM development and would not have the opportunity to contribute generic improvements back to the codegen infrastructure.
* Selection dilemma: .td vs C++ selection patterns; maybe there is already a best-known method for this? One of the problems with moving everything to TableGen is the increased number of variants of the same opcode (due to the generality built into SPIR-V's design; e.g. OpSelect supports integers, floats, and vectors of both). That in turn worsens the code in some places, e.g. checks concerning those opcodes.
* Promotion criteria: whichever development model is chosen, the backend will start out in an experimental state, so quality criteria for promoting it to a regular backend need to be set up. We propose tracking the quality of the current Khronos LLVM-SPIRV translator and switching to the SPIR-V backend as the production path once its quality and functionality are on par. Any other suggestions would be appreciated.
* Testing and maintenance: currently testing is performed through LIT tests, but that only facilitates 'offline' testing. Ultimately, the produced SPIR-V needs to be executed on at least one OpenCL-conformant platform that executes SPIR-V kernels. This is work in progress and will initially proceed outside the LLVM buildbot infrastructure (i.e. it will run on in-house Intel infrastructure). We want to discuss how this flow could be upstreamed to the LLVM community. Of course, other vendors are encouraged to support this effort by providing their reference platforms.
This is not a closed list of open questions; please feel free to add your opinions and points for discussion.
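To illustrate the mangling question above: with the current convention, clang emits calls to Itanium-mangled OpenCL builtin names, which the backend must recognize by name; an Intrinsics.td approach would replace these with target intrinsics. The snippet below is illustrative only, and the intrinsic name in it is purely hypothetical:

```llvm
; Current convention: the backend matches this well-known mangled name.
declare spir_func i64 @_Z13get_global_idj(i32)  ; size_t get_global_id(uint)

; Hypothetical alternative via a target-specific Intrinsics.td entry:
declare i64 @llvm.spirv.get.global.id(i32)
```

The intrinsics route would make the contract explicit for non-clang front ends, at the cost of defining and maintaining the .td descriptions.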
Objective
Our ultimate objective is to upstream the backend to the trunk LLVM repository. Since the changes are too large for a typical code review on Phabricator or the mailing list, we would like to encourage you to comment on the backend's original repository on GitHub. Eventually (in the next couple of months) we plan to commit the experimental backend to the LLVM repository and ask for post-commit review. The backend could land either in the main branch as an experimental backend, or possibly on a new branch, allowing for easier review and further work. Right now we would like to ask for general discussion and comments, and we are happy to answer any questions you might have.