[llvm] [AIX][TLS] Optimize the small local-exec access sequence for non-zero offsets (PR #71485)

Amy Kwan via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 31 12:18:18 PST 2024


================
@@ -1523,30 +1590,73 @@ void PPCAsmPrinter::emitInstruction(const MachineInstr *MI) {
     EmitToStreamer(*OutStreamer, MCInstBuilder(PPC::EnforceIEIO));
     return;
   }
-  case PPC::ADDI8: {
-    // The faster non-TOC-based local-exec sequence is represented by `addi`
-    // with an immediate operand having the MO_TPREL_FLAG. Such an instruction
-    // does not otherwise arise.
-    unsigned Flag = MI->getOperand(2).getTargetFlags();
-    if (Flag == PPCII::MO_TPREL_FLAG ||
-        Flag == PPCII::MO_GOT_TPREL_PCREL_FLAG ||
-        Flag == PPCII::MO_TPREL_PCREL_FLAG) {
-      assert(
-          Subtarget->hasAIXSmallLocalExecTLS() &&
-          "addi with thread-pointer only expected with local-exec small TLS");
-      LowerPPCMachineInstrToMCInst(MI, TmpInst, *this);
-      TmpInst.setOpcode(PPC::LA8);
-      EmitToStreamer(*OutStreamer, TmpInst);
-      return;
-    }
-    break;
-  }
   }
 
   LowerPPCMachineInstrToMCInst(MI, TmpInst, *this);
   EmitToStreamer(*OutStreamer, TmpInst);
 }
 
+// For non-TOC-based local-exec variables that have a non-zero offset,
+// we need to create a new MCExpr that adds the non-zero offset to the address
+// of the local-exec variable that will be used in either an addi, load or
+// store. However, the final displacement for these instructions must be
+// between [-32768, 32768), so if the TLS address + its non-zero offset is
+// greater than 32KB, a new MCExpr is produced to accommodate this situation.
+const MCExpr *PPCAsmPrinter::getAdjustedLocalExecExpr(const MachineOperand &MO,
+                                                      int64_t Offset) {
+  // Non-zero offsets (for loads, stores or `addi`) require additional handling.
+  // When the offset is zero, there is no need to create an adjusted MCExpr.
+  if (!Offset)
+    return nullptr;
+
+  assert(MO.isGlobal() && "Only expecting a global MachineOperand here!");
+  const GlobalValue *GValue = MO.getGlobal();
+  assert(TM.getTLSModel(GValue) == TLSModel::LocalExec &&
+         "Only local-exec accesses are handled!");
+
+  bool IsGlobalADeclaration = GValue->isDeclarationForLinker();
+  // Find the GlobalVariable that corresponds to the particular TLS variable
+  // in the TLS variable-to-address mapping. All TLS variables should exist
+  // within this map, with the exception of TLS variables marked as extern.
+  const auto TLSVarsMapEntryIter = TLSVarsToAddressMapping.find(GValue);
+  if (TLSVarsMapEntryIter == TLSVarsToAddressMapping.end())
+    assert(IsGlobalADeclaration &&
+           "Only expecting to find extern TLS variables not present in the TLS "
+           "variable-to-address map!");
+
+  unsigned TLSVarAddress =
+      IsGlobalADeclaration ? 0 : TLSVarsMapEntryIter->second;
+  ptrdiff_t FinalAddress = (TLSVarAddress + Offset);
+  // If the address of the TLS variable + the offset is less than 32KB,
+  // or if the TLS variable is extern, we simply produce an MCExpr to add the
+  // non-zero offset to the TLS variable address.
+  // For when TLS variables are extern, this is safe to do because we can
+  // assume that the address of extern TLS variables are zero.
+  const MCExpr *Expr = MCSymbolRefExpr::create(
+      getSymbol(GValue), MCSymbolRefExpr::VK_PPC_AIX_TLSLE, OutContext);
+  Expr = MCBinaryExpr::createAdd(
+      Expr, MCConstantExpr::create(Offset, OutContext), OutContext);
+  if (FinalAddress >= 32768) {
+    // Handle the written offset for cases where:
+    //   TLS variable address + Offset > 32KB.
+
+    // The assembly that is printed will look like:
+    //  TLSVar at le + Offset - Delta
+    // where Delta is a multiple of 64KB: ((FinalAddress + 32768) & ~0xFFFF).
+    ptrdiff_t Delta = ((FinalAddress + 32768) & ~0xFFFF);
+    // Check that the total instruction displacement fits within [-32768,32768).
+    ptrdiff_t InstDisp = TLSVarAddress + Offset - Delta;
+    assert((InstDisp < 32768) ||
+           (InstDisp >= -32768) &&
+               "Expecting the instruction displacement for local-exec TLS "
+               "variables to be between [-32768, 32768)!");
----------------
amy-kwan wrote:

For larger variables, the assert will be not triggered. This is because in [the initial patch](https://github.com/llvm/llvm-project/commit/3f46e5453d9310b15d974e876f6132e3cf50c4b1#diff-909a72141a3ecd6bfde54a634[…]4c3eb79622c5845f6c7c119145af6f) that introduced this feature, I only restricted this non-TOC-based access sequence if the size of the TLS variable is less than 32751 within `PPCISelLowering.cpp`.
```
constexpr uint64_t AIXSmallTlsPolicySizeLimit = 32751;
....

     // With the -maix-small-local-exec-tls option, produce a faster access
      // sequence for local-exec TLS variables where the offset from the TLS
      // base is encoded as an immediate operand.
      //
      // We only utilize the faster local-exec access sequence when the TLS
      // variable has a size within the policy limit. We treat types that are
      // not sized or are empty as being over the policy size limit.
      if (HasAIXSmallLocalExecTLS && IsTLSLocalExecModel) {
        Type *GVType = GV->getValueType();
        if (GVType->isSized() && !GVType->isEmptyTy() &&
            GV->getParent()->getDataLayout().getTypeAllocSize(GVType) <=
                AIXSmallTlsPolicySizeLimit)
          return DAG.getNode(PPCISD::Lo, dl, PtrVT, VariableOffsetTGA, TLSReg);
      }
```
So in the case where the size is `8187`, the test case for the large variable in `aix-small-local-exec-tls-largeaccess.ll` looks like:
```
        stw r3, mySmallLocalExecTLSv1[TL]@le(r13)
```
If the size is increased past `AIXSmallTlsPolicySizeLimit`, it will load from the TOC instead and do the regular local-exec sequence, which is expected:
```
        ld r4, L..C0(r2)                        # target-flags(ppc-tprel) @mySmallLocalExecTLSv1
        li r3, 1
        add r4, r13, r4
        stw r3, 0(r4)
. . .
```
Hope the above answers your question. 

https://github.com/llvm/llvm-project/pull/71485


More information about the llvm-commits mailing list