[llvm-dev] Best way to integrate special-purpose hardware with LLVM
Kai Plociennik via llvm-dev
llvm-dev at lists.llvm.org
Mon Sep 27 07:03:38 PDT 2021
Dear all,
in a research project, we are developing a special-purpose
floating-point accelerator, together with C++/OpenMP offloading support
with LLVM.
Since our hardware has a limited, special-purpose instruction set, it is
often a challenge for us to decide on how to integrate with LLVM's
general-purpose concepts such as IR and instruction selection. Hence, I
would appreciate a lot if we could get some feedback w.r.t. our current
approach. Let me sketch an example.
Our accelerator has no instruction for loading/writing floating point
values from or to memory using a given address. Instead: 1) The data
layout of a structured volume of data in memory can be
hardware-configured, and then load and store instructions have a special
form with offsets around the "center point" in the data volume which is
currently processed. 2) We can write values to hardware registers and
use them in computations.
Now, let's assume we have C++ code like the following:
float array[5] {1, 2, 3, 4, 5}.
#pragma omp target
{
...
float a = array[0];
...
}
In the IR we get, we find fragments like this:
%arrayidx = getelementptr inbounds [5 x float], ..., i32 0
%1 = load float, float* %arrayidx, align 4
If we leave it like this, in the created selection DAG there will be a
lot of instructions we do not conceptually have on our hardware, and
detecting the necessary changes at instruction selection stage is
complex and hard.
Hence, we chose method 2) (use register) from above in this case and do
the following:
1) Define a target-specific intrinsic "array_access" which captures
information about the accessed array and the offset.
2) Replace the two instructions (getelementptr, load) from above by a
single "array_access" intrinsic in a special IR pass. This is a kind of
"meta-instruction" which is here for conveying necessary information to
instruction selection phase.
3) Lower the array_access intrinsic to a pseudo-instruction we also
define for our hardware, to get from IR to selection DAG.
4) Replace the pseudo instruction by a hardware register in instruction
selection. This creates valid machine code. (At runtime, we configure
the register correctly before executing the machine code.)
I would be interested in feedback on several things here:
-Is there an "intended way" of integrating such special-purpose hardware
with LLVM?
-Is it a good idea to introduce an intrinsic for the described purpose,
or does this violate some architectural ideas of intrinsics or LLVM in
general?
-We do not want to lose any relevant information on the IR or selection
DAG. For example, in the original form, general LLVM functionality will
be aware of what getelementptr and load means, so data dependencies will
not be broken. In our "meta-instruction" (intrinsic) form, there might
be a risk of either LLVM passes creating invalid code since they do not
know what our meta-instructions mean, or losing the ability to employ
optimization functionality present in LLVM due to unknown semantics of
our meta-instructions. What would be the right way to handle this?
-How to cope in general with the fact that a target architecture has no
instructions for certain concepts such as integer computation, address
calculations, "standard" memory accesses, and so on. Are there known
projects which had to tackle this?
-When modifying IR or selection DAG as above, how to get the necessary
"metadata" from IR to instruction selection or to runtime? (In this case
the data on the accessed array and offset or register)
To summarize: What would be the best "architectural approach" to
integrate support for such a kind of special-purpose hardware in LLVM?
Should we e.g. include "pseudo instructions" in OurHWInstructionInfo.td
and "pseudo registers" in OurHWRegisterInfo.td to be able to integrate
with the general-purpose LLVM concepts and in the end treat
instructions/registers in a special way? Or can we avoid having to do so
earlier?
Any feedback on this would be greatly appreciated.
Sincerely,
Kai Plociennik
--
Dr. Kai Plociennik
Fraunhofer-Institut für Techno- und Wirtschaftsmathematik ITWM
Competence Center High Performance Computing
Fraunhofer-Platz 1
67663 Kaiserslautern
Tel: +49 (0)631 31600 4081
mail: kai.plociennik at itwm.fraunhofer.de
www.itwm.fraunhofer.de
More information about the llvm-dev
mailing list