[llvm] [AIX][TLS] Optimize the small local-exec access sequence for non-zero offsets (PR #71485)

Qiu Chaofan via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 31 00:36:27 PST 2024


================
@@ -1523,30 +1590,73 @@ void PPCAsmPrinter::emitInstruction(const MachineInstr *MI) {
     EmitToStreamer(*OutStreamer, MCInstBuilder(PPC::EnforceIEIO));
     return;
   }
-  case PPC::ADDI8: {
-    // The faster non-TOC-based local-exec sequence is represented by `addi`
-    // with an immediate operand having the MO_TPREL_FLAG. Such an instruction
-    // does not otherwise arise.
-    unsigned Flag = MI->getOperand(2).getTargetFlags();
-    if (Flag == PPCII::MO_TPREL_FLAG ||
-        Flag == PPCII::MO_GOT_TPREL_PCREL_FLAG ||
-        Flag == PPCII::MO_TPREL_PCREL_FLAG) {
-      assert(
-          Subtarget->hasAIXSmallLocalExecTLS() &&
-          "addi with thread-pointer only expected with local-exec small TLS");
-      LowerPPCMachineInstrToMCInst(MI, TmpInst, *this);
-      TmpInst.setOpcode(PPC::LA8);
-      EmitToStreamer(*OutStreamer, TmpInst);
-      return;
-    }
-    break;
-  }
   }
 
   LowerPPCMachineInstrToMCInst(MI, TmpInst, *this);
   EmitToStreamer(*OutStreamer, TmpInst);
 }
 
+// For non-TOC-based local-exec variables that have a non-zero offset,
+// we need to create a new MCExpr that adds the non-zero offset to the address
+// of the local-exec variable that will be used in either an addi, load or
+// store. However, the final displacement for these instructions must be
+// between [-32768, 32768), so if the TLS address + its non-zero offset is
+// greater than 32KB, a new MCExpr is produced to accommodate this situation.
+const MCExpr *PPCAsmPrinter::getAdjustedLocalExecExpr(const MachineOperand &MO,
+                                                      int64_t Offset) {
+  // Non-zero offsets (for loads, stores or `addi`) require additional handling.
+  // When the offset is zero, there is no need to create an adjusted MCExpr.
+  if (!Offset)
+    return nullptr;
+
+  assert(MO.isGlobal() && "Only expecting a global MachineOperand here!");
+  const GlobalValue *GValue = MO.getGlobal();
+  assert(TM.getTLSModel(GValue) == TLSModel::LocalExec &&
+         "Only local-exec accesses are handled!");
+
+  bool IsGlobalADeclaration = GValue->isDeclarationForLinker();
+  // Find the GlobalVariable that corresponds to the particular TLS variable
+  // in the TLS variable-to-address mapping. All TLS variables should exist
+  // within this map, with the exception of TLS variables marked as extern.
+  const auto TLSVarsMapEntryIter = TLSVarsToAddressMapping.find(GValue);
+  if (TLSVarsMapEntryIter == TLSVarsToAddressMapping.end())
+    assert(IsGlobalADeclaration &&
+           "Only expecting to find extern TLS variables not present in the TLS "
+           "variable-to-address map!");
+
+  unsigned TLSVarAddress =
+      IsGlobalADeclaration ? 0 : TLSVarsMapEntryIter->second;
+  ptrdiff_t FinalAddress = (TLSVarAddress + Offset);
+  // If the address of the TLS variable + the offset is less than 32KB,
+  // or if the TLS variable is extern, we simply produce an MCExpr to add the
+  // non-zero offset to the TLS variable address.
+  // For when TLS variables are extern, this is safe to do because we can
+  // assume that the address of extern TLS variables are zero.
+  const MCExpr *Expr = MCSymbolRefExpr::create(
+      getSymbol(GValue), MCSymbolRefExpr::VK_PPC_AIX_TLSLE, OutContext);
+  Expr = MCBinaryExpr::createAdd(
+      Expr, MCConstantExpr::create(Offset, OutContext), OutContext);
+  if (FinalAddress >= 32768) {
+    // Handle the written offset for cases where:
+    //   TLS variable address + Offset > 32KB.
+
+    // The assembly that is printed will look like:
+    //  TLSVar at le + Offset - Delta
+    // where Delta is a multiple of 64KB: ((FinalAddress + 32768) & ~0xFFFF).
+    ptrdiff_t Delta = ((FinalAddress + 32768) & ~0xFFFF);
+    // Check that the total instruction displacement fits within [-32768,32768).
+    ptrdiff_t InstDisp = TLSVarAddress + Offset - Delta;
+    assert((InstDisp < 32768) ||
+           (InstDisp >= -32768) &&
+               "Expecting the instruction displacement for local-exec TLS "
+               "variables to be between [-32768, 32768)!");
----------------
ecnelises wrote:

(not a review comment)

Per my understanding, allow peephole for non-zero offsets but without changes to AsmPrinter, we'll get assembler error like `The displacement must be greater than or equal to -32768 and less than or equal to 32767` (no-integrated-as).

Since code here is to rewrite offsets exceeding upper limit (32768) into negative, will this assert be hit if the offset is even larger? (for example, twice the size as objects in `aix-small-local-exec-tls-largeaccess.ll`)

https://github.com/llvm/llvm-project/pull/71485


More information about the llvm-commits mailing list