[llvm] MC: Restructure MCFragment as a fixed part and a variable tail (PR #148544)

Sun Jul 13 18:02:26 PDT 2025

MaskRay wrote:

> Direction looks good, will review more closely tomorrow. However, could you elaborate on why the fixed+variable part are not stored together? This seems to complicate things when accessing them. If growth during relaxation is a concern, we could allocate some padding afterwards to reduce the likelihood of moving the entire contents.

Thanks for taking a look.

Growth of the variable part during relaxation is a valid concern.
To store both the fixed and variable parts together, we need a vector-like data structure (begin, size, capacity) for the variable part, while maintaining the four variables related to `Contents`:

```
ContentStart
ContentEnd (= VarStart)
VarEnd
VarEndOfStorage
```

The current approach also uses four variables.
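
As a rough illustration of the four-variable layout listed above, here is a minimal sketch of a single buffer holding the fixed part followed by the variable tail. `ToyFragmentStorage` and `makeToy` are hypothetical names for this example only. They are not part of LLVM, and the real `MCFragment` stores its data differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy model (not LLVM's MCFragment): one contiguous buffer that
// holds the fixed part followed by the variable tail and some spare room.
struct ToyFragmentStorage {
  std::vector<uint8_t> Buf;   // backing storage for both parts
  size_t ContentStart = 0;    // start of the fixed part
  size_t ContentEnd = 0;      // end of the fixed part (= VarStart)
  size_t VarEnd = 0;          // end of the variable bytes currently in use
  size_t VarEndOfStorage = 0; // end of the space reserved for the variable part

  size_t fixedSize() const { return ContentEnd - ContentStart; }
  size_t varSize() const { return VarEnd - ContentEnd; }
  size_t varCapacity() const { return VarEndOfStorage - ContentEnd; }
};

// Lay out FixedSize fixed bytes, VarSize variable bytes, and Padding spare
// bytes that the variable part may grow into without reallocation.
ToyFragmentStorage makeToy(size_t FixedSize, size_t VarSize, size_t Padding) {
  ToyFragmentStorage S;
  S.Buf.resize(FixedSize + VarSize + Padding);
  S.ContentStart = 0;
  S.ContentEnd = FixedSize;
  S.VarEnd = FixedSize + VarSize;
  S.VarEndOfStorage = S.Buf.size();
  return S;
}
```

With `makeToy(100, 2, 10)`, the variable part occupies 2 bytes and can grow to 12 bytes before the buffer has to be reallocated.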

To avoid pathological cases, such as relaxing a 2-byte x86 instruction to 5 bytes and having to copy a 100-byte fixed part, we would need to track the maximum variable size for each fragment type and each target-specific span-dependent instruction.

The padding can be 10 bytes, or perhaps longer.
x86 is the only target whose span-dependent instructions use `MCInst::Flags` (`{evex}`, as we discovered in #147229).
(While I am not familiar with x86 APX, I suspect that we should encode the prefix into the fixed part (MCDataFragment), leaving the opcode/operands to MCRelaxableFragment.)

In short, without the maximum variable size for each fragment type and each target-specific `FT_Relaxable` instruction, relaxing a 2-byte x86 instruction to 5 bytes risks copying a 100-byte fixed part.
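
The copying cost can be sketched with a toy model of the combined layout. `ToyCombinedFragment`, `makeCombined`, and `growVar` are hypothetical names for this example, not LLVM APIs. The sketch assumes that a reallocation moves both the fixed bytes and the variable bytes already in use.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical toy model (not LLVM code): the fixed part and the variable
// part share one buffer, with optional padding after the variable part.
struct ToyCombinedFragment {
  std::vector<uint8_t> Buf; // fixed bytes, then variable bytes, then padding
  size_t FixedSize = 0;
  size_t VarSize = 0;
};

ToyCombinedFragment makeCombined(size_t FixedSize, size_t VarSize,
                                 size_t Padding) {
  ToyCombinedFragment F;
  F.Buf.resize(FixedSize + VarSize + Padding);
  F.FixedSize = FixedSize;
  F.VarSize = VarSize;
  return F;
}

// Grow the variable part to NewVarSize bytes and return how many existing
// bytes had to be moved to new storage (0 if the padding was sufficient).
size_t growVar(ToyCombinedFragment &F, size_t NewVarSize) {
  size_t Capacity = F.Buf.size() - F.FixedSize;
  if (NewVarSize <= Capacity) {
    F.VarSize = NewVarSize; // fits into the padding, nothing moves
    return 0;
  }
  // Not enough room: allocate a new buffer and copy the fixed part along
  // with the variable bytes that were in use.
  size_t Moved = F.FixedSize + F.VarSize;
  std::vector<uint8_t> NewBuf(F.FixedSize + NewVarSize);
  std::copy(F.Buf.begin(),
            F.Buf.begin() + static_cast<std::ptrdiff_t>(Moved),
            NewBuf.begin());
  F.Buf = std::move(NewBuf);
  F.VarSize = NewVarSize;
  return Moved;
}
```

In this model, growing a 2-byte variable part to 5 bytes behind a 100-byte fixed part moves 102 bytes when there is no padding, and 0 bytes when 10 bytes of padding were reserved.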

---

Although the current storage scheme could likely be optimized, it offers sufficient flexibility.
MCFragment users rely on the `{set,get}Var{Contents,Fixups}` APIs, allowing the internal implementation to be swapped out for a more efficient scheme if one is identified later.
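
To illustrate why accessor-based use keeps the storage swappable, here is a minimal sketch. `ToyFragment` and `totalSize` are hypothetical, and the signatures are simplified for this example. They are not the actual `MCFragment` declarations. Only the accessor names mirror the `{set,get}VarContents` pattern mentioned above.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical toy class (not LLVM's MCFragment). Callers only see the
// accessors, so the private storage could later be replaced, for example by
// a single combined buffer, without changing any call site.
class ToyFragment {
  std::vector<char> Fixed; // fixed part, never changes after construction
  std::vector<char> Var;   // variable tail, stored separately in this sketch

public:
  explicit ToyFragment(std::vector<char> FixedPart)
      : Fixed(std::move(FixedPart)) {}

  // Replace the variable tail, e.g. after relaxing an instruction.
  void setVarContents(const std::vector<char> &Contents) { Var = Contents; }

  // Read-only view of the current variable tail.
  const std::vector<char> &getVarContents() const { return Var; }

  // Total number of bytes in the fragment (fixed part plus variable tail).
  size_t totalSize() const { return Fixed.size() + Var.size(); }
};
```

Because the tail lives in its own container here, replacing a 2-byte tail with a 5-byte one never touches the fixed part.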

https://github.com/llvm/llvm-project/pull/148544