[llvm] [M68k] implement -mxgot (PR #119803)

Sun Feb 15 03:26:01 PST 2026

karcherm wrote:

The AI-assisted patch addresses multiple issues, many of which are not actually related to the size of the GOT and are thus unrelated to the `-mxgot` flag of gcc.

Currently, the llvm m68k backend seems to use 16-bit offsets in a multitude of places, such as position-independent addressing of the data segment. Using 16-bit offsets there requires that the distance between code and data is less than 32K. Similarly, setting up the GOT base address register from PC uses a 16-bit offset, which requires that the GOT is quite close to the code that sets up `%a5`. gcc, on the other hand, generates 32-bit offsets for these purposes even without any special flags. Using 16-bit offsets has two advantages: first, the code is smaller, and second, the m68k-inspired ColdFire architecture does not allow 32-bit offsets in many addressing modes used for this purpose.
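To make the 32K limit concrete, here is a minimal sketch of the range check a signed 16-bit displacement implies. The helper name `fits_disp16` is hypothetical and is not taken from the llvm backend or from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A signed 16-bit displacement covers -32768..32767 bytes. */
static_assert(INT16_MAX == 32767, "signed 16-bit upper bound");

/* Hypothetical helper (not from the patch): true if the byte distance
 * between the referencing instruction and its target can be encoded
 * as a signed 16-bit displacement. */
static bool fits_disp16(int64_t distance)
{
    return distance >= INT16_MIN && distance <= INT16_MAX;
}
```

Any reference whose target lies further away than that, in either direction, needs a 32-bit offset or a longer instruction sequence.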

gcc uses classic m68k instructions with 32-bit offsets unless a ColdFire architecture is selected, and for the purposes mentioned above, it also uses slightly longer instruction sequences on ColdFire that allow 32-bit offsets. The only place where gcc uses 16-bit offsets is the offset of GOT slots relative to the GOT base, and gcc only uses 16-bit offsets there if ColdFire is selected, because classic m68k allows 32-bit offsets in register-relative addressing.

For example, I put source code like this into Compiler Explorer:

```c
extern int my_global_int;

int main(void)
{
    return my_global_int;
}
```

I observe these translations using gcc:

m68k -fPIC:
```
        move.l my_global_int@GOT(%a5),%a0 ; 32 Bit
        move.l (%a0),%d0
```
ColdFire -fPIC:
```
        move.l my_global_int@GOT(%a5),%a0 ; 16 Bit (ColdFire does not allow 32-bit offsets in this addressing mode)
        move.l (%a0),%d0
```
ColdFire -fPIC -mxgot:
```
        move.l %a5,%a0
        add.l #my_global_int@GOT,%a0      ; 32 Bit
        move.l (%a0),%a0
        move.l (%a0),%d0
```

If the llvm backend for m68k is meant to generate general-purpose 32-bit code, the backend should be changed to use 32-bit compatible addressing (direct 32-bit addressing on m68k, 32-bit "emulation" on ColdFire) for intra-ELF references unconditionally. The two situations in which 16-bit offsets may be sensible in general-purpose code generation are:

* offsets inside one code section (e.g. switch jump tables) in which the code generator might even be able to verify whether a 16-bit table fits
* offsets inside the GOT on ColdFire. The gcc documentation explains that 16-bit addressing overflow in the GOT is mostly worked around by the GNU linker being able to split the GOT on m68k. It can generate separate GOTs for individual object files if the GOT size overflows the range addressable with 16 bits, so the `-mxgot` mode is only required if a single object file requires more than 8192 GOT entries.
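The 8192 figure can be reproduced with a short calculation. This is only a sketch under two assumptions of mine: each GOT slot is a 4-byte address, and slots are addressed with non-negative signed 16-bit offsets from the GOT base. The names `GOT_SLOT_BYTES`, `GOT_MAX_SLOTS_16BIT` and `needs_32bit_got_offsets` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: one GOT slot holds one 32-bit address. */
enum { GOT_SLOT_BYTES = 4 };

/* Assumption: slots are reached with non-negative signed 16-bit
 * offsets from the GOT base, i.e. offsets 0..32767. */
enum { GOT_DISP16_SPAN = INT16_MAX + 1 };  /* 32768 bytes */

enum { GOT_MAX_SLOTS_16BIT = GOT_DISP16_SPAN / GOT_SLOT_BYTES };

static_assert(GOT_MAX_SLOTS_16BIT == 8192, "matches the limit quoted above");

/* Hypothetical helper: true if a single object file references more GOT
 * entries than 16-bit offsets can reach, which is the case where a
 * -mxgot style mode would be needed. */
static bool needs_32bit_got_offsets(uint32_t got_entries)
{
    return got_entries > GOT_MAX_SLOTS_16BIT;
}
```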

Generating 16-bit offsets for inter-module calls in the `bsr` instruction, for addressing module-local constants and static variables, or for setting up the GOT may be useful for generating compact firmware for embedded systems that fit all their code and data into 32K. That could be an optional feature enabled by a "tiny code" compilation model, but IMHO it should not be the default in llvm.

It seems the AI-suggested patch respects the limits of the ColdFire architecture and avoids 32-bit register-relative addressing in the 32-bit support code it generates (but I did not read the whole patch, just enough to get the gist of what's going on here). For example, that patch changes relative calls using the `BSR` instruction to absolute calls using the `JSR` instruction, which is not desired in position-independent code. Classic m68k has a 32-bit `BSR` instruction, which should be used instead unless ColdFire support is a design goal.

So I suggest first deciding whether ColdFire compatibility is a goal at the moment. If it is not, we don't need to care about workarounds for the addressing mode limitations of ColdFire. Then we can split the AI-suggested changes into smaller, more easily reviewable chunks like this:
1. Support setting up the GOT base register if the GOT is more than 32K away from the code.
2. Support addressing data segment items (e.g. string constants, static local variables) in PIC mode using 32-bit offsets to the PIC register.
3. Use 32-bit BSR instead of 16-bit BSR in intra-module calls.
4. Use 32-bit `%a5`-relative addressing for GOT slots.
5. Add an option to generate 32-bit switch tables.

If ColdFire support is desired, it might be sensible to conditionally back out the GOT slot patch for ColdFire by default instead of generating code sequences to support 32-bit register-relative addressing for accessing GOT slots, and just enable this feature if a "feature enable flag" is given. gcc calls this flag `-mxgot`.

As a bonus feature, if someone is interested in 16-bit-focussed code generation, they could implement a "tiny code" model that works approximately like `llvm` works now. I don't see the users for this feature at the moment, so I suggest considering it out of scope for the first series of patches.

https://github.com/llvm/llvm-project/pull/119803