[llvm] [TableGen][DecoderEmitter] Add option to emit type-specialized code (PR #146593)

Thu Aug 21 14:47:50 PDT 2025

s-barannikov wrote:

> > One option is to have backends choose between the two versions (i.e., support three modes in total: templated, with a decode impl function for less code size, and with a per-bitwidth specialized decodeInstruction), but I feel that is too many options and too much complexity and maintenance over time. I think we want to drill down to just one form that works well for most backends and deprecate the templated support eventually.
> 
> (Just thoughts) After spending some time with the disassembler backend, I think in the end it might make sense to emit one table and one function that works like the one for variable-length encodings, processing all possible sizes for the namespace+hwmode combination. That is, it consumes data as it goes. That would relieve us from having to have a separate table for each size; the decoder function would determine the size itself based on the encoding in the low bits of an instruction. The `Size` field could then just be ignored for the purposes of disassembly (or used in an assert that the decoder consumed the right number of bytes).

I gave it a try. It works for some backends (e.g. RISCV), but not for others. On AVR, for example, instructions are encoded in a PDP-endian way for some reason, and merging the tables for 16-bit and 32-bit instructions leads to conflicts. Conflicts also appear on at least Mips and AMDGPU, but I didn't investigate why. The single table is a few bytes larger than the sum of the separate tables, and there are no space savings anywhere (decoders are already shared). So it is probably not worth it.
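
For reference, the "consume data as we go" idea can be sketched roughly as below. This is only an illustration, not what DecoderEmitter generates: `Decoded`, `lengthFromLowBits`, `decodeOne` and `checkConsumed` are hypothetical names, and the length rule is a simplified RISC-V-style one (low two bits other than `0b11` mean a 16-bit instruction, anything else is treated as 32-bit; longer formats are ignored).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical result type for this sketch; not an LLVM type.
struct Decoded {
  uint64_t Bits; // raw encoding, assembled in little-endian byte order
  size_t Size;   // number of bytes the decoder consumed
};

// Simplified RISC-V-style length rule (an assumption of this sketch):
// low two bits != 0b11 -> 16-bit instruction, otherwise 32-bit.
static size_t lengthFromLowBits(uint8_t FirstByte) {
  return (FirstByte & 0x3) == 0x3 ? 4 : 2;
}

// Reads one instruction starting at Offset. The size is derived from the
// encoding itself instead of being chosen by the caller per table.
static std::optional<Decoded> decodeOne(const std::vector<uint8_t> &Bytes,
                                        size_t Offset) {
  if (Offset >= Bytes.size())
    return std::nullopt;
  size_t Size = lengthFromLowBits(Bytes[Offset]);
  if (Bytes.size() - Offset < Size)
    return std::nullopt; // truncated instruction
  uint64_t Bits = 0;
  for (size_t I = 0; I < Size; ++I)
    Bits |= static_cast<uint64_t>(Bytes[Offset + I]) << (8 * I);
  return Decoded{Bits, Size};
}

// The Size field from the instruction definition would only be used as a
// sanity check that the decoder consumed the expected number of bytes.
static void checkConsumed(const Decoded &D, size_t ExpectedSize) {
  assert(D.Size == ExpectedSize && "decoder consumed unexpected size");
  (void)D;
  (void)ExpectedSize;
}
```

A scheme like this relies on the length-determining bits sitting at the same position for every instruction size, which is presumably why it fits RISCV but runs into conflicts on targets such as AVR.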

https://github.com/llvm/llvm-project/pull/146593