[llvm] [DecoderEmitter] Support for DecodeOrder and `resolve-conflicts-try-all` (PR #157948)

Rahul Joshi via llvm-commits llvm-commits at lists.llvm.org
Sun Sep 14 17:23:50 PDT 2025


jurahul wrote:

> Here is an interesting decoding conflict (I replaced some '.' with a/d/m to give names to fields):
> 
> ```
>     111110110101....aaaadddd0000mmmm
>     111110110101________11110000____  t2AUTG
>     111110110101____1111____0000____  t2SMMUL
>     111110110101____________0000____  t2SMMLA
> ```
> 
> - `t2AUTG` can only match if `aaaa != 1111` (otherwise it is `t2SMMUL`).
> - `t2SMMUL` can only match if `dddd != 1111` (otherwise it is `t2AUTG`).
> - `t2SMMLA` can only match if `aaaa != 1111 && dddd != 1111` (otherwise it is either `t2SMMUL` or `t2AUTG`).
> 
> 1111 is the encoding of the PC register. Now if we have `aaaa == 1111` or `dddd == 1111`, no matter which order we try, we will always successfully decode the first attempted instruction. The issue is that the `rGPR` decoder accepts `1111` as a valid (SoftFail) bit pattern, even though the `rGPR` class does _not_ contain `PC`.
> 
> One could argue that it is wrong to accept 1111 for `rGPR`, but it is not wrong for other instructions -- they should indeed be decoded and marked as SoftFail.
> 
> What I'm trying to say is that "decoding order" will not always work and we need a more sophisticated solution. I was thinking about adding a "decoder predicate" -- small function or inline code that can disambiguate encodings, but I couldn't figure out what that function/code should take as input.

I am under the (maybe mistaken) impression that the target is responsible for ensuring that, for a given bit pattern and feature set, only a single decoding succeeds, because that's what happens on the actual HW (each bit pattern has a deterministic interpretation). So an "O0" decoder can be built by essentially segregating each instruction into its own namespace, building N decoder tables, and iterating through them until one succeeds (similar to resolve-conflicts-try-all). In such a scenario, either exactly one table succeeds (i.e., returns MCDisassembler::Success or SoftFail) or none does, and that's the final result of decoding. What we then do on top of this O0 decoder is CSE/commoning of checks across these N instructions. In that case, the result should be independent of the order in which the decodes are attempted. But maybe we are not there yet.
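
To make the "O0" idea concrete, here is a rough sketch of a try-all decoder over the three encodings quoted above. It is not the DecoderEmitter implementation: `Status`, `Encoding`, `decodeOne`, `tryAll`, and `countDecodings` are hypothetical names, and the operand model is a simplifying assumption (every non-fixed `aaaa`/`dddd` field is treated as an `rGPR`-like operand for which `1111` yields SoftFail). The masks and values are derived by hand from the bit patterns in the quoted message.

```cpp
#include <cstdint>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical stand-in for MCDisassembler::DecodeStatus.
enum class Status { Fail, SoftFail, Success };

// One "decoder table" per instruction, reduced to a fixed-bit mask/value pair.
struct Encoding {
  std::string Name;
  uint32_t Mask;  // bits that are fixed in the encoding
  uint32_t Value; // required values of those fixed bits
};

// Encodings from the quoted example: aaaa = bits 15-12, dddd = bits 11-8.
std::vector<Encoding> exampleEncodings() {
  return {
      {"t2AUTG", 0xFFF00FF0u, 0xFB500F00u},  // dddd fixed to 1111
      {"t2SMMUL", 0xFFF0F0F0u, 0xFB50F000u}, // aaaa fixed to 1111
      {"t2SMMLA", 0xFFF000F0u, 0xFB500000u}, // aaaa and dddd are operands
  };
}

// Decode against a single encoding. ASSUMPTION: any aaaa/dddd field that is
// not fixed is an rGPR-like operand, so 1111 (PC) is accepted as SoftFail.
Status decodeOne(const Encoding &E, uint32_t Insn) {
  if ((Insn & E.Mask) != E.Value)
    return Status::Fail;
  Status S = Status::Success;
  for (unsigned Shift : {12u, 8u}) {
    bool IsOperand = ((E.Mask >> Shift) & 0xF) == 0;
    if (IsOperand && ((Insn >> Shift) & 0xF) == 0xF)
      S = Status::SoftFail;
  }
  return S;
}

struct Result {
  std::string Name;
  Status S;
};

// Try each table in the given order and stop at the first non-Fail result.
Result tryAll(const std::vector<Encoding> &Order, uint32_t Insn) {
  for (const Encoding &E : Order) {
    Status S = decodeOne(E, Insn);
    if (S != Status::Fail)
      return {E.Name, S};
  }
  return {"", Status::Fail};
}

// Number of tables that decode Insn. The result of tryAll is independent of
// the order only if this is at most 1 for every bit pattern.
unsigned countDecodings(const std::vector<Encoding> &Encs, uint32_t Insn) {
  unsigned N = 0;
  for (const Encoding &E : Encs)
    if (decodeOne(E, Insn) != Status::Fail)
      ++N;
  return N;
}
```

Under these assumptions, a word such as `0xFB51F200` (`aaaa == 1111`, `dddd == 0010`) is decoded by two tables: `t2SMMUL` with Success and `t2SMMLA` with SoftFail. `tryAll` therefore returns whichever of the two is attempted first, which is exactly the order dependence described in the quoted message. A check like `countDecodings(...) <= 1` over all bit patterns is one way to state the invariant I have in mind.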

https://github.com/llvm/llvm-project/pull/157948

