[llvm-dev] add intrinsic function support for customized backend

Thu Oct 22 08:51:59 PDT 2015

Hi Xiangyang,

When your intrinsic is passed to the back-end, it is converted into
DAG nodes automatically (as a function call would be). The back-end
then needs to know how to convert it into real instructions, or at
least how to carry it through the instruction selection pass. An
intrinsic cannot exist inside the backend (it is IR code), but it can
be converted into a pseudo-instruction instead (at least in the X86
backend). From there, there are many ways to handle pseudo-instructions
inside a backend.

I will take the X86 backend as an example.

First, your intrinsics:

def int_foo : Intrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty],
[IntrReadArgMem]>;

It will be converted into DAG nodes. You can then handle it manually
in the backend code or through the TableGen mechanism. For the latter,
you should define a pseudo-instruction that matches the DAG node your
intrinsic is translated into. The pseudo-instruction definition should
look like this:

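let isPseudo = 1 in {
// Register operands (GR32) are used because int_foo takes and returns
// i32 values; i32mem would describe a memory operand instead.
def FOO : PseudoI<(outs GR32:$dst), (ins GR32:$src1, GR32:$src2),
                  [(set GR32:$dst, (int_foo GR32:$src1, GR32:$src2))]>;
}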

First, you should always set isPseudo to 1 if it is a
pseudo-instruction. Then, if it has side effects, you should declare
them. For example, set Defs = [EFLAGS] if it modifies EFLAGS.
The tricky part is the pattern matching for DAG nodes. I'm not sure
the pattern above is correct. I usually enable the debug output in
Clang to see the resulting DAG node representation of my intrinsics,
and then write my pseudo-instruction definition based on it.

Once your intrinsic is correctly translated into a pseudo-instruction,
you can use it wherever you want. If you need to convert it into real
instructions, there are a few common places to do it.

You have the ExpandISelPseudos pass, which is called at the beginning
of addMachinePasses. Its operation is relatively simple: it walks the
MachineInstrs looking for pseudo-instructions (those marked with
usesCustomInserter = 1 in TableGen) and calls
TargetLowering::EmitInstrWithCustomInserter for each of them. This
method is virtual, so each backend that wants it provides its own
implementation, as X86TargetLowering does for the x86 backend. Because
the pass runs so early, no machine-level optimization has taken place
yet. The machine code you add will therefore be optimized in the same
way as any other part of the program. Moreover, you still have the
virtual register abstraction, which gives you more flexibility in your
implementation.
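
To make the idea behind ExpandISelPseudos concrete, here is a small
self-contained C++ sketch of the same walk-and-replace scheme. It is a
toy model, not LLVM code: ToyInstr, emitWithCustomInserter and the
ADD/SHL expansion of FOO are all invented for illustration, and the
real hook works on MachineInstr and MachineBasicBlock instead.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction; NOT llvm::MachineInstr.
struct ToyInstr {
  std::string opcode;
  std::vector<int> vregs;   // virtual registers: result first, then sources
  bool usesCustomInserter;  // mirrors the TableGen flag of the same name
};

// Hypothetical counterpart of EmitInstrWithCustomInserter: returns the
// real instructions that replace one pseudo-instruction.
std::vector<ToyInstr> emitWithCustomInserter(const ToyInstr &pseudo,
                                             int &nextVReg) {
  std::vector<ToyInstr> real;
  if (pseudo.opcode == "FOO") {
    assert(pseudo.vregs.size() == 3 && "FOO expects dst, src1, src2");
    // Invented expansion: tmp = ADD src1, src2 ; dst = SHL tmp.
    // Before register allocation a fresh virtual register is always available.
    int tmp = nextVReg++;
    real.push_back(ToyInstr{"ADD", {tmp, pseudo.vregs[1], pseudo.vregs[2]}, false});
    real.push_back(ToyInstr{"SHL", {pseudo.vregs[0], tmp}, false});
  } else {
    real.push_back(pseudo);  // unknown pseudo: leave it untouched
  }
  return real;
}

// Walk the block once and splice in the expansion of every flagged instruction.
std::vector<ToyInstr> expandISelPseudos(const std::vector<ToyInstr> &block,
                                        int &nextVReg) {
  std::vector<ToyInstr> out;
  for (const ToyInstr &mi : block) {
    if (!mi.usesCustomInserter) {
      out.push_back(mi);
      continue;
    }
    const std::vector<ToyInstr> real = emitWithCustomInserter(mi, nextVReg);
    out.insert(out.end(), real.begin(), real.end());
  }
  return out;
}
```

The point of the sketch is the temporary register: because expansion
happens before register allocation, the expander can simply ask for a
new virtual register instead of finding a free physical one.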

You also have the ExpandPostRA pass. This one comes right after
register allocation and the insertion of the prolog and epilog. It
calls TargetInstrInfo::expandPostRAPseudo(), giving the target a chance
to expand each pseudo-instruction it encounters. For the X86 backend,
the concrete TargetInstrInfo implementation is X86InstrInfo. As
register allocation and most of the earlier optimizations have already
been done, this solution ensures that the added code will not be
altered afterwards.

Finally, if the two previous passes are not suitable for some reason,
other more generic ways exist. Simply create a new MachineFunctionPass
and add it to the pipeline where you need it. For example from:
- addPreRegAlloc
- addPostRegAlloc
- addPreSched2
- addPreEmitPass
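
As a rough picture of where these hooks sit, here is a self-contained
C++ toy model of the pass pipeline. It is only a sketch: ToyPassConfig
and MyTargetPassConfig are invented names standing in for
TargetPassConfig and a target's subclass of it, the pipeline is heavily
simplified, and passes are represented by their names only.

```cpp
#include <string>
#include <vector>

// Toy stand-in for the target pass configuration; NOT the LLVM class.
class ToyPassConfig {
public:
  virtual ~ToyPassConfig() = default;

  // Hooks named after the ones listed above; by default they add nothing.
  virtual void addPreRegAlloc() {}
  virtual void addPostRegAlloc() {}
  virtual void addPreSched2() {}
  virtual void addPreEmitPass() {}

  // Simplified pipeline: a few fixed stages with the hooks called in between.
  std::vector<std::string> buildPipeline() {
    passes.clear();
    addPass("ExpandISelPseudos");
    addPreRegAlloc();
    addPass("RegisterAllocation");
    addPostRegAlloc();
    addPass("PrologEpilogInsertion");
    addPass("ExpandPostRAPseudos");
    addPreSched2();
    addPass("PostRAScheduler");
    addPreEmitPass();
    return passes;
  }

protected:
  void addPass(const std::string &name) { passes.push_back(name); }

private:
  std::vector<std::string> passes;
};

// Hypothetical target that runs its own expansion pass just before emission.
class MyTargetPassConfig : public ToyPassConfig {
public:
  void addPreEmitPass() override { addPass("MyFooExpansionPass"); }
};
```

Calling buildPipeline() on a MyTargetPassConfig yields the fixed stages
followed by MyFooExpansionPass; overriding a different hook moves the
custom pass earlier in the sequence.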

I don't have a big LLVM background, but these are my findings from
when I was playing with the middle-end/back-end some time ago.

Regards,
Gaël