[llvm-dev] RFC/bikeshedding: Separation of instruction and pattern definitions in LLVM backends

Mon Aug 21 03:53:41 PDT 2017

One thing to be aware of with this is that (IIRC) tablegen uses the pattern to infer things about the instruction. One example I vaguely remember is that an empty pattern has the same effect as hasSideEffects=1, and I think there were others.
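
To make that concrete, here is a minimal sketch of the two styles side by side. The Instruction class below is a hypothetical cut-down stand-in for the real one in llvm/Target/Target.td, and outs, ins, set, add, GPR and ALU_rr are placeholder names chosen so the snippet parses on its own with llvm-tblgen. It is not taken from the RISC-V patches.

```tablegen
// Stand-ins so this parses standalone; in a real backend these come from
// llvm/Target/Target.td, TargetSelectionDAG.td and the target's register info.
def outs;
def ins;
def set;
def add;
def GPR;

// Hypothetical cut-down Instruction with only the fields discussed here.
// '?' means "unset", which is what leaves room for inference.
class Instruction {
  dag OutOperandList;
  dag InOperandList;
  string AsmString = "";
  list<dag> Pattern = [];
  bit hasSideEffects = ?;
  bit mayLoad = ?;
  bit mayStore = ?;
}

// Hypothetical reg-reg format; the pattern is a parameter so it can be empty.
class ALU_rr<string opcodestr, list<dag> pattern> : Instruction {
  let OutOperandList = (outs GPR:$rd);
  let InOperandList = (ins GPR:$rs1, GPR:$rs2);
  let AsmString = !strconcat(opcodestr, "\t$rd, $rs1, $rs2");
  let Pattern = pattern;
}

// Conventional style: the flags are left unset and the pattern is available
// for tablegen to infer them from.
def ADD_WITH_PAT : ALU_rr<"add", [(set GPR:$rd, (add GPR:$rs1, GPR:$rs2))]>;

// Separate-patterns style: the pattern is empty, so the flags are stated
// explicitly rather than relying on whatever the default turns out to be.
let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in
def ADD : ALU_rr<"add", []>;
```

If the patterns do move out of the instruction definitions, setting the flags explicitly like this looks like the safer choice, since the result then no longer depends on what tablegen can infer.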

> Semantic information such as isBranch is still represented in the instruction
> definition meaning there isn't a complete split between MC-layer and codegen
> concerns. The Mips{64,32}r6InstrInfo.td does also factor out this information.
> This seems less compelling to me, but dissenting opinions are welcome!

Mips doesn't manage a complete split between the MC-layer and CodeGen either, because there's overlap in the fields that they use. What it separates is the encoding from the operation and the availability. The encoding and availability portions are focused on being easy to verify against the specification, while the operation section describes the behaviour of the instruction, how it maps to assembly, etc.
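
Roughly, that three-way split has the following shape. This is a simplified sketch rather than code from the Mips backend: the Instruction class is a hypothetical stand-in for the one in Target.td, the EXAMPLE_* and ISA_EXAMPLE names and the predicate string are invented, and the encoding values are placeholders that are not taken from any specification.

```tablegen
// Stand-ins so this parses standalone with llvm-tblgen.
def outs;
def ins;
def GPR;

// Hypothetical cut-down Instruction with only the fields used below.
class Instruction {
  bits<32> Inst;
  dag OutOperandList;
  dag InOperandList;
  string AsmString = "";
  list<string> Predicates = [];
}

// Encoding: bit layout only, so it can be checked line by line against the
// encoding table in the specification. The values are placeholders.
class EXAMPLE_ENC : Instruction {
  bits<5> rs;
  bits<5> rt;
  bits<5> rd;
  let Inst{31-26} = 0b000000;
  let Inst{25-21} = rs;
  let Inst{20-16} = rt;
  let Inst{15-11} = rd;
  let Inst{10-0}  = 0b00000100000;
}

// Operation: operands and assembly syntax (patterns and semantic flags
// would also live here).
class EXAMPLE_DESC {
  dag OutOperandList = (outs GPR:$rd);
  dag InOperandList = (ins GPR:$rs, GPR:$rt);
  string AsmString = "example\t$rd, $rs, $rt";
}

// Availability: which ISA revisions or features provide the instruction.
class ISA_EXAMPLE {
  list<string> Predicates = ["HasExampleISA"];
}

// The instruction itself just combines the three parts.
def EXAMPLE : EXAMPLE_ENC, EXAMPLE_DESC, ISA_EXAMPLE;
```

The final def reads almost like an index entry, and each of the three classes can be reviewed against a different part of the specification.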

> On 18 Aug 2017, at 10:55, Alex Bradbury via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> As many of you know, I have a growing series of patches for a RISC-V backend
> under/awaiting review
> <https://reviews.llvm.org/differential/?authors=asb&order=updated>,
> <http://github.com/lowrisc/riscv-llvm>. I'll be posting a larger status update
> on that work either later today or tomorrow, this RFC focuses on an issue that
> came up during review which I think may benefit from wider input.
> 
> David Chisnall suggested that the backend could be made easier to read and
> work with by separating out instruction definitions from the patterns used to
> match them for codegen. One key advantage of such a separation is that
> patterns and instruction definitions can be grouped and ordered independently
> in the way that makes most sense. Patterns for addi and add benefit from being
> grouped together, but it may make more sense to group the reg-reg and reg-imm
> instruction definitions separately. The main downside is that this style is
> not quite as concise as specifying patterns alongside the instruction
> definition. Repetition seems to be effectively reduced with a few simple Pat
> classes.
> 
> Semantic information such as isBranch is still represented in the instruction
> definition meaning there isn't a complete split between MC-layer and codegen
> concerns. The Mips{64,32}r6InstrInfo.td does also factor out this information.
> This seems less compelling to me, but dissenting opinions are welcome!
> 
> I've demonstrated both the "conventional" approach
> <https://gist.github.com/asb/0c61ebc131076c6186052c29968a491d#file-riscvinstrinfo_conventional-td>
> and the "separate patterns" approach
> <https://gist.github.com/asb/0c61ebc131076c6186052c29968a491d#file-riscvinstrinfo_separate_pats-td>.
> Obviously once patterns and pseudo-instructions are separated out, you may
> want to move them to a different .td file.
> 
> Does anyone have strong views on these sorts of choices one way or another?
> 
> Best,
> 
> Alex