[PATCH] D139888: [lld][ARM] support absolute thunks for Armv4T Thumb and interworking

Peter Smith via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 13 03:07:41 PST 2022


peter.smith added a comment.

> What are the "absolute relocations"? They usually refer to R_ARM_ABS32 and other relocation types that are represented as R_ABS in lld.

When I added the Thunk support I borrowed terminology from Arm's linker. It uses ABS as short for ABSOLUTE which was its term for Non-PI or position dependent.

> Does the support make GBA and NDS programs with mixed A32/T32 code linkable with lld? If yes, that'll be great... I've played many GBA/NDS games in the past...

GBA is v4T so in theory this should make it possible as I think the existing homebrew dev-kits are based on GCC. NDS has a v5T and a v4T, not sure what the development story was for it, whether the software used v4T exclusively or the chips were programmed independently.

A few small comments, but overall looking good to me.

For v4T PI thunks I think the V4 Thumb to Arm is essentially

  P:  bx pc
      b #-6
      ldr ip, [pc, #0] ; L2
  L1: add pc, pc, ip
  L2: .word s - (p + (L1-p) + 8)
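
As a rough check of the literal in this sequence, here is a minimal C++ sketch that computes the word at L2 and models the `add pc, pc, ip` at L1. The helper names and constants are hypothetical and not taken from lld; the layout assumes the thunk starts at a 4-byte aligned P, with the two Thumb instructions occupying P..P+3 and L1 at P + 8.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout of the Thumb-to-Arm PI thunk (hypothetical helper, not lld code):
//   P+0   bx pc               Thumb; pc reads as P+4, so execution continues at P+4 in Arm state
//   P+2   b #-6               Thumb; never executed
//   P+4   ldr ip, [pc, #0]    Arm; pc reads as P+12, which is L2
//   P+8   L1: add pc, pc, ip  Arm; pc reads as L1+8 = P+16
//   P+12  L2: .word S - (P + (L1 - P) + 8)
constexpr uint32_t kL1Offset = 8;  // L1 - P
constexpr uint32_t kArmPcBias = 8; // in Arm state pc reads 8 bytes ahead

// Value stored at L2 for destination S and thunk address P.
uint32_t thumbToArmPiLiteral(uint32_t s, uint32_t p) {
  assert((p & 3) == 0 && "bx pc needs a 4-byte aligned thunk start");
  return s - (p + kL1Offset + kArmPcBias);
}

// Models `add pc, pc, ip` at L1: the address execution continues at.
uint32_t simulateAddPc(uint32_t p, uint32_t literal) {
  return (p + kL1Offset + kArmPcBias) + literal;
}
```

The subtraction wraps modulo 2^32, so a destination below the thunk works just as well as one above it.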

The Arm to Thumb looks like it can reuse the existing v5 PI Thunk.



================
Comment at: lld/ELF/Thunks.cpp:239
 
+
 // MIPS LA25 thunk
----------------
Is this extra new line intentional?


================
Comment at: lld/ELF/Thunks.cpp:733
 
-bool ARMV5PILongThunk::isCompatibleWith(const InputSection &isec,
-                                        const Relocation &rel) const {
-  // Thumb branch relocations can't use BLX
-  return rel.type != R_ARM_THM_JUMP19 && rel.type != R_ARM_THM_JUMP24;
+void ARMV4ABSLongThunk::writeLong(uint8_t *buf) {
+  const uint8_t data[] = {
----------------
It looks like the original naming convention for Thunks has broken down a bit. When I first saw the sequence I left a comment that you can't use this for V4 as it can't handle interworking; I only saw the ARMLongBXThunk later.

As I understand it, the thunk has a code sequence that is V4 ABS (non-interworking), but V5 ABS (interworking). The new v4 ABS Long BX thunk is only needed for V4 interworking.

Although a more significant name change, perhaps:
`ARMV4V5LongLdrPcThunk`
`ARMV4LongBxThunk`

Not sure what the capitalisation should be for Ldr and Bx. As instructions they need to be all one case, but this makes it difficult to read the class name.




================
Comment at: lld/ELF/Thunks.cpp:770
+      0x78, 0x47,             // bx pc
+      0xfd, 0xe7,             // b #-6
+      0x04, 0xf0, 0x1f, 0xe5, // ldr pc, [pc, #-4]
----------------
Perhaps worth a comment: `Arm recommended sequence to follow bx pc`. The branch looks strange, and in theory anything could be used there as it is never executed, but the choice can make a difference if run on a CPU with speculative execution.
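
To see where that `b #-6` actually points, here is a small C++ sketch that decodes the 16-bit Thumb unconditional branch encoding (`11100:imm11`). The function name is hypothetical; the only inputs are the halfword `0xe7fd` from the bytes above and the address of the instruction.

```cpp
#include <cassert>
#include <cstdint>

// Decodes a 16-bit Thumb unconditional branch (bits 15..11 == 11100) and
// returns its target, given the address of the branch instruction itself.
// In Thumb state pc reads as the instruction address + 4.
int64_t thumbBranchTarget(uint16_t insn, int64_t insnAddr) {
  assert((insn >> 11) == 0x1c && "not a 16-bit Thumb unconditional branch");
  int32_t imm11 = insn & 0x7ff;
  if (imm11 & 0x400) // sign-extend the 11-bit field
    imm11 -= 0x800;
  return insnAddr + 4 + imm11 * 2; // offset is counted in halfwords
}
```

With the `bx pc` at offset 0 and the branch at offset 2, `thumbBranchTarget(0xe7fd, 2)` gives 0, so under this decoding the filler branch simply targets the preceding `bx pc`.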


================
Comment at: lld/ELF/Thunks.cpp:1143
+//
+// TODO: proper and efficient PIC relocation support for V4T
+// TODO: use B for short Thumb->Arm thunks instead of LDR (this doesn't work for
----------------
I think it will be worth separating what is not supported from what is supported inefficiently:
// TODO: Support PIC interworking thunks for V4T.
// TODO: More efficient PIC non-interworking thunks for V4T.


================
Comment at: lld/ELF/Thunks.cpp:1145
+// TODO: use B for short Thumb->Arm thunks instead of LDR (this doesn't work for
+//       Arm->Thumb, as in Arm state no BX PC trick; it doesn't switch state).
+static Thunk *addThunkArmv4(RelType reloc, Symbol &s, int64_t a) {
----------------
This would be useful, as it looks like there is a potential problem with the existing implementation for V5T and below: it always uses the 16 MiB V6+ Thumb branch range. This puts it at risk of relocation-out-of-range errors if there is more than 4 MiB of code in a single output section.
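
A minimal sketch of the range difference, assuming the two limits mentioned above (4 MiB for the older Thumb BL encoding, 16 MiB for the newer one). The function and parameter names are hypothetical and this is not lld's range check; the displacement is measured from the Thumb pc, which reads as the branch address + 4.

```cpp
#include <cassert>
#include <cstdint>

constexpr int64_t kMiB = 1024 * 1024;

// Returns true if a Thumb BL at `src` can reach `dst` directly.
// hasWideBranchEncoding selects the 16 MiB range; otherwise the
// 4 MiB range of the older encoding applies.
bool thumbBlInRange(int64_t src, int64_t dst, bool hasWideBranchEncoding) {
  assert((src & 1) == 0 && "Thumb instructions are halfword aligned");
  const int64_t range = hasWideBranchEncoding ? 16 * kMiB : 4 * kMiB;
  const int64_t disp = dst - (src + 4); // Thumb pc bias
  return disp >= -range && disp < range;
}
```

A call that spans 5 MiB passes the check with the wide range but fails it with the narrow one, which is the case where a thunk would be needed on V5T and below.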


================
Comment at: lld/ELF/Thunks.cpp:1154
+  case R_ARM_CALL:
+    if (config->picThunk && thumb_target)
+      fatal("PIC relocations across state change not supported for Armv4T");
----------------
I think the ARMV5PILongThunk can work for V4T as well here.
```
P:  ldr ip, [pc, #4] ; L2
L1: add ip, pc, ip
    bx ip
L2: .word S - (P + (L1 - P) + 8)
```
The entry is in Arm state so no BLX is required, and the LDR is just loading an offset; the state change is done by the bx ip.
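
A small C++ model of that sequence, to illustrate where the state change comes from. It assumes S carries bit 0 set for a Thumb destination, so that `bx ip` switches state; the type and function names are hypothetical and not from lld.

```cpp
#include <cassert>
#include <cstdint>

// Result of a modelled `bx`: the branch target and the state it selects.
struct BxResult {
  uint32_t target;
  bool thumb;
};

// Value stored at L2: S - (P + (L1 - P) + 8), with L1 - P = 4.
uint32_t armToThumbPiLiteral(uint32_t s, uint32_t p) {
  assert((p & 3) == 0 && "an Arm-state thunk must be word aligned");
  return s - (p + 4 + 8);
}

// Models `add ip, pc, ip` at L1 (pc reads as L1 + 8) followed by `bx ip`.
BxResult simulateThunk(uint32_t p, uint32_t literal) {
  const uint32_t ip = (p + 4 + 8) + literal;
  return {ip & ~1u, (ip & 1u) != 0}; // bit 0 of ip selects Thumb state
}
```

The literal itself is only an offset; the Thumb bit survives the addition because P + 12 is word aligned, and it is the `bx` that consumes it.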


================
Comment at: lld/ELF/Thunks.cpp:1233
   if (!config->armHasMovtMovw) {
-    if (!config->armJ1J2BranchEncoding)
-      return addThunkPreArmv7(reloc, s, a);
+    if (!config->armJ1J2BranchEncoding && config->armHasBlx)
+      return addThunkArmv5v6(reloc, s, a);
----------------
Possible to rearrange to:
```
if (config->armJ1J2BranchEncoding)
  return addThunkV6M(reloc, s, a);
else if (config->armHasBlx)
  return addThunkArmv5v6(reloc, s, a);
return addThunkArmv4(reloc, s, a);
```
Should be similar but could be easier to read.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139888/new/

https://reviews.llvm.org/D139888


