[llvm] [TableGen][DecoderEmitter] Add option to emit type-specialized `decodeToMCInst` (PR #146593)
Rahul Joshi via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 3 20:31:14 PDT 2025
jurahul wrote:
I have spent some more time on this and refined the change a bit. Essentially, two things changed.
(1) I made the interface a little more structured:
```
// InstrDecoderOption - This class is used to provide some options to the
// TableGen DecoderEmitter backend.
class InstrDecoderOption<string ty, list<int> bws> {
  string CPPType = ty;       // C++ type for generating non-templated code.
  list<int> Bitwidths = bws; // List of bitwidths supported by the above type.

  assert !not(!empty(CPPType)), "CPP type cannot be empty";
  assert !not(!empty(Bitwidths)), "Bitwidths cannot be empty";
}

class InstrInfo {
  ...
  list<InstrDecoderOption> DecoderOptions = [];
}

def AMDGPUInstrInfo : InstrInfo {
  let guessInstructionProperties = 1;

  // Opt in to non-templated code for the instruction decoder.
  let DecoderOptions = [
    InstrDecoderOption<"uint32_t", [32]>,
    InstrDecoderOption<"uint64_t", [64]>,
    InstrDecoderOption<"DecoderUInt128", [96, 128]>,
  ];
}
```
and
(2) When this option is provided, we do not generate *any* templated code, so all functions in the emitted code are non-templated, including `decodeInstruction`. @topperc, this is similar to what you suggested, except without the 16/32 suffixes; the variants are just overloaded functions. To avoid accidental mismatches, I made the `insn` parameter a reference so that we don't inadvertently call the wrong overload (a sketch of what the emitted overloads might look like follows the size numbers below). I then measured the size of the AMDGPU disassembler code with the `size -A build/lib/libLLVMAMDGPUDisassembler.so` command and see the following delta:
```
.text   378780 -> 244684, i.e., a 35% reduction in size
.rodata 352648 -> 334960, i.e., a 5% reduction in size
```
The reduction in .text size is expected; the reduction in .rodata size is coincidental. Because the decoder index is now assigned per C++ type, its values are generally smaller for the second and third types, so their ULEB128 encodings are smaller as well (an index below 128 fits in a single byte rather than two), resulting in a net reduction in the decoder table size.
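For concreteness, here is a minimal sketch of the shape of the emitted overloads, modeled on the signatures of the current templated output; the exact parameter lists in the PR may differ. Passing `insn` by (non-const) reference means an argument of the wrong integer type cannot bind through an implicit conversion, so each call site resolves to exactly one overload:
```
#include "llvm/MC/MCDisassembler/MCDisassembler.h"
#include "llvm/MC/MCInst.h"
#include "llvm/MC/MCSubtargetInfo.h"
#include <cstdint>
using namespace llvm;
using DecodeStatus = MCDisassembler::DecodeStatus;

struct DecoderUInt128; // Target-provided 128-bit insn type (AMDGPU).

// One overload per configured CPPType instead of one function template.
static DecodeStatus decodeInstruction(const uint8_t DecodeTable[], MCInst &MI,
                                      uint32_t &insn, uint64_t Address,
                                      const MCDisassembler *DisAsm,
                                      const MCSubtargetInfo &STI);
static DecodeStatus decodeInstruction(const uint8_t DecodeTable[], MCInst &MI,
                                      uint64_t &insn, uint64_t Address,
                                      const MCDisassembler *DisAsm,
                                      const MCSubtargetInfo &STI);
static DecodeStatus decodeInstruction(const uint8_t DecodeTable[], MCInst &MI,
                                      DecoderUInt128 &insn, uint64_t Address,
                                      const MCDisassembler *DisAsm,
                                      const MCSubtargetInfo &STI);
```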
Please let me know how this looks. I still need to add unit tests for this. Once this is in, I think there are two additional improvements that can be made:
a. Improve robustness: encode the bitwidth as the first entry in the decoder table and, in the corresponding overload of `decodeInstruction`, fail immediately if it is not one of the bitwidths supported by that variant (see the sketch after this list).
b. If it's beneficial, change the `NumToSkip` encoding per C++ type. Because the code is now specialized per type, we can support the case where, say, all 32-bit tables are small enough to use 8-bit `NumToSkip` entries while only the 64-bit tables need 16 bits. I am not sure this will pan out, but this per-size/per-type specialization opens up the possibility.
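For (a), a minimal sketch of the idea, assuming the bitwidth is stored as the first byte of each decoder table (all bitwidths listed above fit in 7 bits, so a single ULEB128 byte suffices); the names and encoding here are illustrative, not the PR's final scheme:
```
// Hypothetical prologue of the uint32_t overload of decodeInstruction: reject
// any table whose recorded bitwidth this variant does not support.
static DecodeStatus decodeInstruction(const uint8_t DecodeTable[], MCInst &MI,
                                      uint32_t &insn, uint64_t Address,
                                      const MCDisassembler *DisAsm,
                                      const MCSubtargetInfo &STI) {
  const uint8_t *Ptr = DecodeTable;
  unsigned Bitwidth = *Ptr++; // First table entry: instruction bitwidth.
  if (Bitwidth != 32)         // This overload only handles 32-bit tables.
    return MCDisassembler::Fail;
  // ... existing table-walking loop, starting at Ptr ...
  return MCDisassembler::Success;
}
```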
https://github.com/llvm/llvm-project/pull/146593