[llvm-dev] Strategy for writing a new LLVM backend?
Mark R V Murray via llvm-dev
llvm-dev at lists.llvm.org
Mon Dec 31 04:07:51 PST 2018
I'm playing around with an LLVM backend for the MC6809 8-bit processor. This is both so that I can learn LLVM and because I would like a halfway-decent cross-compiler for that processor. Yes, I am a sucker for punishment!
I'm getting very tied up in the implementation details, hence my request for advice.
The MC6809 instruction set is very clean at the assembly language level, but the binary opcodes are not so helpful. Some instructions have a prefix byte, and due to the rich addressing modes, the instructions vary widely in length and are not necessarily neat or consistent. There is only very limited scope for packing known bit patterns like a $src or $dst field.
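To make the encoding problem concrete, here is a minimal standalone C++ sketch of an instruction with an optional prefix byte and a variable number of operand bytes. The `Encoding` struct and `encode` function are hypothetical names used only for illustration; they are not LLVM types or APIs, and the sketch ignores the indexed-mode postbytes entirely. It assumes the 6809's big-endian operand order.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical descriptor for one instruction encoding (not an LLVM type).
struct Encoding {
    uint8_t prefix;        // 0 means "no prefix byte"; otherwise the page prefix
    uint8_t opcode;        // primary opcode byte
    uint8_t operandBytes;  // number of operand bytes that follow (0, 1 or 2 here)
};

// Emit prefix (if any), opcode, then the operand in big-endian order.
std::vector<uint8_t> encode(const Encoding &e, uint32_t operand) {
    assert(e.operandBytes <= 2 && "sketch only handles 0..2 operand bytes");
    std::vector<uint8_t> out;
    if (e.prefix != 0)
        out.push_back(e.prefix);
    out.push_back(e.opcode);
    for (int i = e.operandBytes - 1; i >= 0; --i)
        out.push_back(static_cast<uint8_t>(operand >> (8 * i)));
    return out;
}
```

For example, `encode({0x00, 0x86, 1}, 0x42)` produces the two bytes 86 42 (LDA #$42), while `encode({0x10, 0x8E, 2}, 0x1234)` produces the four bytes 10 8E 12 34 (LDY #$1234). The opcode values are taken from the published 6809 opcode map and should be checked against the datasheet before being relied on.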
Where it would be nice to have (e.g.)
... or ...
... all matched with (say) "[(set $dst, (<opcode> $dst, $src)),(set $dst, (add $dst, $imm))]" (note the 2-argument mode), constructing the opcodes at the same time appears not to be possible. I don't think I can do instruction matching this way. Multiclasses don't seem to quite get there either.
So far I've got the whole instruction set in MC6809Instr(Format|Info).td (the indexed modes need some work as the postbytes are messy), without any matching, and I suspect I'll need to do all the instruction selection in the MC6809ISel* files. All the different types (immediate, indexed, inherent, etc.) are there as separate instructions, ordered by primary opcode (plus prefix byte if there is one).
1) Am I making sense, and is my approach so far sane?
1a) Once I have defined all the instructions, will grouping them by function and selecting the right one(s) in C++ code in the *ISel*.cpp files make more sense?
2) Using MSP430 as a base, how do I force the TwoAddressInstructionPass to be run at the right time?
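As an illustration of the grouping idea in question 1a, the following standalone C++ sketch keys a lookup table on (operation, addressing mode) and returns the matching opcode, or nothing when that combination does not exist. The `Op` and `Mode` enums and `selectOpcode` are hypothetical names for this sketch only; this is not how LLVM's instruction selector is structured, and the table covers just three operations. It requires C++17 for `std::optional`.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <utility>

// Hypothetical grouping: one logical operation, several addressing modes.
enum class Op { LDA, ADDA, LEAX };
enum class Mode { Immediate, Direct, Indexed, Extended };

// Return the primary opcode for (op, mode), or nullopt if the
// instruction set has no such form (e.g. LEAX is indexed-only).
std::optional<uint8_t> selectOpcode(Op op, Mode mode) {
    static const std::map<std::pair<Op, Mode>, uint8_t> table = {
        {{Op::LDA, Mode::Immediate}, 0x86},
        {{Op::LDA, Mode::Direct}, 0x96},
        {{Op::LDA, Mode::Indexed}, 0xA6},
        {{Op::LDA, Mode::Extended}, 0xB6},
        {{Op::ADDA, Mode::Immediate}, 0x8B},
        {{Op::ADDA, Mode::Direct}, 0x9B},
        {{Op::ADDA, Mode::Indexed}, 0xAB},
        {{Op::ADDA, Mode::Extended}, 0xBB},
        {{Op::LEAX, Mode::Indexed}, 0x30},
    };
    auto it = table.find({op, mode});
    if (it == table.end())
        return std::nullopt;
    return it->second;
}
```

The point of the sketch is only that the choice among separately defined instructions can be driven by a table once the operation and operand kind are known; the opcode values come from the published 6809 opcode map and should be verified against the datasheet.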
Thanks and happy new year to all!
Mark R V Murray