[llvm-dev] RFC/bikeshedding: Separation of instruction and pattern definitions in LLVM backends

Fri Aug 18 02:55:09 PDT 2017

As many of you know, I have a growing series of patches for a RISC-V backend
under/awaiting review
<https://reviews.llvm.org/differential/?authors=asb&order=updated>,
<http://github.com/lowrisc/riscv-llvm>. I'll be posting a larger status update
on that work either later today or tomorrow. This RFC focuses on an issue that
came up during review which I think may benefit from wider input.

David Chisnall suggested that the backend could be made easier to read and
work with by separating out instruction definitions from the patterns used to
match them for codegen. One key advantage of such a separation is that
patterns and instruction definitions can be grouped and ordered independently
in the way that makes most sense. Patterns for addi and add benefit from being
grouped together, but it may make more sense to group the reg-reg and reg-imm
instruction definitions separately. The main downside is that this style is
not quite as concise as specifying patterns alongside the instruction
definition. Repetition seems to be effectively reduced with a few simple Pat
classes.
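
To make the difference concrete, here is a minimal sketch of the two styles.
It is not copied from the patches or the gists: SketchInst, PatGprGpr and
PatGprSimm12 are illustrative names, and the stub definitions at the top are
assumed stand-ins for what a real backend gets from Target.td and
TargetSelectionDAG.td, included only so the fragment parses on its own.

```tablegen
// Stand-ins (assumptions) for definitions normally provided by
// Target.td and TargetSelectionDAG.td.
class SDNode;
def add : SDNode;
def set;
def ins;
def outs;
class RegisterClass;
def GPR : RegisterClass;
class Operand;
def simm12 : Operand;

// Hypothetical, heavily simplified instruction class.
class SketchInst<dag outs_, dag ins_, string asm, list<dag> pattern = []> {
  dag OutOperandList = outs_;
  dag InOperandList = ins_;
  string AsmString = asm;
  list<dag> Pattern = pattern;
}

// Simplified stand-in for the usual Pat class.
class Pat<dag pattern, dag result> {
  dag PatternToMatch = pattern;
  dag ResultInstr = result;
}

// "Conventional" style: the selection pattern is part of the definition.
def ADD_conventional
    : SketchInst<(outs GPR:$rd), (ins GPR:$rs1, GPR:$rs2),
                 "add $rd, $rs1, $rs2",
                 [(set GPR:$rd, (add GPR:$rs1, GPR:$rs2))]>;

// "Separate patterns" style: definitions carry no pattern...
def ADD  : SketchInst<(outs GPR:$rd), (ins GPR:$rs1, GPR:$rs2),
                      "add $rd, $rs1, $rs2">;
def ADDI : SketchInst<(outs GPR:$rd), (ins GPR:$rs1, simm12:$imm12),
                      "addi $rd, $rs1, $imm12">;

// ...and a couple of small helper classes keep the patterns short.
class PatGprGpr<SDNode OpNode, SketchInst Inst>
    : Pat<(OpNode GPR:$rs1, GPR:$rs2), (Inst GPR:$rs1, GPR:$rs2)>;
class PatGprSimm12<SDNode OpNode, SketchInst Inst>
    : Pat<(OpNode GPR:$rs1, simm12:$imm12), (Inst GPR:$rs1, simm12:$imm12)>;

// The add and addi patterns now sit next to each other, wherever the
// instruction definitions themselves live.
def : PatGprGpr<add, ADD>;
def : PatGprSimm12<add, ADDI>;
```

With helpers along these lines, each additional pattern is a single line, which
is roughly the extra cost over writing the pattern inline.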

Semantic information such as isBranch is still represented in the instruction
definition, meaning there isn't a complete split between MC-layer and codegen
concerns. Mips{64,32}r6InstrInfo.td goes further and factors out this
information as well. That further split seems less compelling to me, but
dissenting opinions are welcome!

I've demonstrated both the "conventional" approach
<https://gist.github.com/asb/0c61ebc131076c6186052c29968a491d#file-riscvinstrinfo_conventional-td>
and the "separate patterns" approach
<https://gist.github.com/asb/0c61ebc131076c6186052c29968a491d#file-riscvinstrinfo_separate_pats-td>.
Obviously once patterns and pseudo-instructions are separated out, you may
want to move them to a different .td file.

Does anyone have strong views on this sort of choice one way or another?

Best,

Alex