[Lldb-commits] [lldb] [lldb] Add SubtargetFeatures to ArchSpec (PR #173046)

Sam Elliott via lldb-commits lldb-commits at lists.llvm.org
Mon Feb 9 10:55:33 PST 2026


lenary wrote:

> In order to observe how LLDB would handle such conflicting extensions, I attempted to create an executable containing C, D, and Zcmp instructions. Clang rightfully rejects this configuration via `-march`, failing with: `'zcmp' extension is incompatible with 'c' extension when 'd' extension is enabled`. However, I was able to work around this by compiling one file with `rv64imad_zcmp` and another with `rv64gc`, then linking them together. The resulting arch string in `.riscv.attributes` becomes `rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zmmul1p0_zaamo1p0_zalrsc1p0_zca1p0_zcmp1p0` — containing all three conflicting extensions: C, D, and Zcmp.

To be clear, both of these are expected:
- You cannot create a single object file with a `-march=` with incompatible extensions.
- Linking is how you end up with an executable whose attributes list incompatible extensions.

> 
> To my surprise, when disassembling the resulting executable, LLDB and llvm-objdump produce different outputs:
> 
> * **LLDB** correctly disassembles `function_with_zcd_instructions`. However, `function_with_zcmp_extension` is completely wrong - it doesn't contain any Zcmp instructions and instead shows incorrect C/D extension instructions.
> * **llvm-objdump** correctly disassembles `function_with_zcmp_extension`, but makes an error in `function_with_zcd_instructions`: instead of `fsd fa1, 0x18(sp)`, it displays `cm.mvsa01 s0, s3`. This is the expected conflict, as both instructions share the same encoding `ac2e`, and seems like Zcmp takes priority over C/D extensions.

Arguably, both of these are wrong.

If your executable contains both Zcd and Zcmp, then you don't actually know how a specific encoding will be executed, which is the key information a debugger needs. That more specific information has to come from somewhere.
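To make the overlap concrete, here is a minimal sketch that decodes the 16-bit word `ac2e` from the quoted output under both interpretations. The helper names (`decode_as_zcd`, `decode_as_zcmp`) are made up for this illustration, the decoders only understand `c.fsdsp` and `cm.mvsa01`, and none of this reflects how the LLVM disassembler is actually structured.

```python
# ABI names for f0..f31, used to print the c.fsdsp source register.
FP_ABI = (
    [f"ft{i}" for i in range(8)]
    + ["fs0", "fs1"]
    + [f"fa{i}" for i in range(8)]
    + [f"fs{i}" for i in range(2, 12)]
    + [f"ft{i}" for i in range(8, 12)]
)


def bits(value, hi, lo):
    """Return value[hi:lo] (inclusive) as an unsigned integer."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_as_zcd(insn):
    """Hypothetical helper: decode a 16-bit word as c.fsdsp, else None."""
    if bits(insn, 15, 13) != 0b101 or bits(insn, 1, 0) != 0b10:
        return None
    # insn[12:10] holds offset[5:3], insn[9:7] holds offset[8:6].
    offset = (bits(insn, 12, 10) << 3) | (bits(insn, 9, 7) << 6)
    return f"fsd {FP_ABI[bits(insn, 6, 2)]}, {offset:#x}(sp)"


def decode_as_zcmp(insn):
    """Hypothetical helper: decode a 16-bit word as cm.mvsa01, else None."""
    if (
        bits(insn, 15, 10) != 0b101011
        or bits(insn, 6, 5) != 0b01
        or bits(insn, 1, 0) != 0b10
    ):
        return None
    r1s, r2s = bits(insn, 9, 7), bits(insn, 4, 2)
    if r1s == r2s:
        return None  # cm.mvsa01 requires two different saved registers
    return f"cm.mvsa01 s{r1s}, s{r2s}"


if __name__ == "__main__":
    word = 0xAC2E
    print(decode_as_zcd(word))   # LLDB's reading
    print(decode_as_zcmp(word))  # llvm-objdump's reading
```

Both decoders accept the same word, so the bytes alone cannot tell a tool which instruction was intended.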


> At this point I would propose to use `parseNormalizedArchString` for consistency with llvm-objdump's behavior. However, I agree it is still necessary to provide a clear warning when `.riscv.attributes` contains an invalid or inconsistent arch string, thus I am considering to call `parseArchString` in order to detect inconsistencies and warn users when the disassembler output may be invalid:

I would find this approach acceptable.

There are other approaches to be more accurate:
- Disable both conflicting extensions, which would cause the disassembler to print `<unknown>` for encodings in those extensions. This ensures the information the user gets is accurate, even if more is missing.
- You could find more accurate info in ISA mapping symbols (where the symbol that denotes executable instructions contains the normalised ISA string as well), though the proposal has not been implemented in LLVM yet (neither in the assembler nor in the disassembler). This is more accurate information for what the developer intended the encoding to mean.
- There is an object which allows you to query which extensions are actually enabled in memory. This would tell you exactly how the core would execute these encodings, if it executed them at all. This object is defined in the C API doc.

I think your proposed approach (prioritise one of the conflicting extensions, also emit a warning) is acceptable at the moment. Later we can choose which of the three more accurate approaches we prefer (or how to combine them for most accuracy).
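As a rough sketch of the two simplest policies discussed above (prefer one extension and warn, or disable both), assuming a normalised arch string like the one quoted from `.riscv.attributes`: the function names and policy labels below are hypothetical, and this is not the `parseArchString` / `parseNormalizedArchString` logic from LLVM, only an illustration of the decision being made.

```python
import re


def parse_normalized_arch(arch):
    """Split a normalised arch string into {extension: (major, minor)}."""
    prefix = re.match(r"rv(32|64)", arch)
    if prefix is None:
        raise ValueError(f"not a RISC-V arch string: {arch!r}")
    exts = {}
    for token in arch[prefix.end():].split("_"):
        parsed = re.fullmatch(r"(.+?)(\d+)p(\d+)", token)
        if parsed is None:
            raise ValueError(f"malformed or unversioned extension: {token!r}")
        exts[parsed.group(1)] = (int(parsed.group(2)), int(parsed.group(3)))
    return exts


def has_zcd_zcmp_conflict(exts):
    """C together with D implies Zcd, whose encodings overlap with Zcmp."""
    zcd = "zcd" in exts or ("c" in exts and "d" in exts)
    return zcd and "zcmp" in exts


def resolve(exts, policy="prefer-zcmp"):
    """Return (extensions to decode with, warnings) under a hypothetical policy."""
    if policy not in ("prefer-zcmp", "disable-both"):
        raise ValueError(f"unknown policy: {policy!r}")
    names = set(exts)
    warnings = []
    if has_zcd_zcmp_conflict(exts):
        warnings.append(
            "arch string enables both Zcd (via C and D) and Zcmp; "
            "disassembly of overlapping encodings may be wrong"
        )
        # Drop the Zcd half of C but keep the base compressed set (Zca).
        if "c" in names:
            names.add("zca")
        names -= {"c", "zcd"}
        if policy == "disable-both":
            names.discard("zcmp")
    return names, warnings


if __name__ == "__main__":
    arch = (
        "rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_"
        "zmmul1p0_zaamo1p0_zalrsc1p0_zca1p0_zcmp1p0"
    )
    enabled, notes = resolve(parse_normalized_arch(arch), "prefer-zcmp")
    print(sorted(enabled))
    print(notes)
```

With `disable-both`, overlapping encodings would simply have no decoder, which corresponds to the `<unknown>` output mentioned in the first bullet.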

I think I did give some of this feedback already, maybe on the other PR - I'm slightly struggling with how these changes have been split into two PRs.

https://github.com/llvm/llvm-project/pull/173046

