[PATCH] D155600: [AIX][TLS] Produce a faster local-exec access sequence with -maix-small-local-exec-tls (And optimize when load/store offsets are 0)

Amy Kwan via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 7 22:06:16 PDT 2023


amyk added inline comments.


================
Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:1525-1527
+    // Handle assembly printing for -maix-small-local-exec-tls on AIX (64-bit).
+    if (emitAIXSmallLocalExecTLSAccess(MI, TmpInst))
+        return;
----------------
hubert.reinterpretcast wrote:
> Not a single test failed when I removed this. Perhaps this does not belong in this patch (but a future one)?
This is no longer needed, as Nemanja previously pointed out.


================
Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:1542-1559
+  case PPC::LBZ:
+  case PPC::STB:
+  case PPC::LBZ8:
+  case PPC::STB8:
+  case PPC::LHA:
+  case PPC::LHZ:
+  case PPC::STH:
----------------
hubert.reinterpretcast wrote:
> Same comment re: the cases other than `ADDI8`.
I'm going to remove the other cases since they are not needed.


================
Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:1571
 
+bool PPCAsmPrinter::emitAIXSmallLocalExecTLSAccess(const MachineInstr *MI,
+                                                   MCInst &TmpInst) {
----------------
nemanjai wrote:
> I'm not sure I follow what the purpose of this function is. It would seem that it checks whether we are lowering the AIX Small TLS instruction and change any `PPCIADDI8` to `PPC::LA8`. However, if neither of those are true, it doesn't do anything.
> 
> The issue I have with it is that if this returns `false`, we break out of the switch and simply call
> ```
> LowerPPCMachineInstrToMCInst(MI, TmpInst, *this);
> EmitToStreamer(*OutStreamer, TmpInst);
> ```
> which would have happened if we never called this function anyway. So it would seem that we don't need the additions from lines 1541 to 1564. We can simply add `PPC::ADDI8` into the switch above, and change it to `PPC::LA8` if both of the following conditions are met:
> `Subtarget->hasAIXSmallLocalExecTLS() && MO.getTargetFlags() & PPCII::MO_TPREL_FLAG`
> 
> Or maybe I'm missing something.
Yes, you're right. This appears not to be needed. I originally had this because the implementation was more complex than expected, but now it is more simplified so I don't need this.


================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:3364
+      uint64_t GVTypeSize =
+          GV->getParent()->getDataLayout().getTypeAllocSize(GV->getValueType());
+      if (HasAIXSmallLocalExecTLS &&
----------------
hubert.reinterpretcast wrote:
> ICE/assert:
> ```
> clang: /terrannew/hstong/.Lpcoral03/llvmbld/dev/llvm-project/llvm/include/llvm/IR/DataLayout.h:669: TypeSize llvm::DataLayout::getTypeSizeInBits(Type *) const: Assertion `Ty->isSized() && "Cannot getTypeInfo() on a type that is unsized!"' failed.
> ```
> with
> ```
> clang --target=powerpc64-ibm-aix -c -o /dev/null -xc++ -<<<$'extern __thread struct X x [[gnu::tls_model("local-exec")]];\nvoid *f() { return &x; }'
> ```
> 
> A similar case with an array of unknown bound does not assert; however, it does generate the small TLS access pattern despite the size being unknown (and potentially huge).
Thanks for catching this. I've made it so that extra cases like these just simply use the TOC-based local-exec method.


================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:3366
+      if (HasAIXSmallLocalExecTLS &&
+          (GVTypeSize < (AIXTLSUpperDisplacement - 8)))
+        return DAG.getNode(PPCISD::Lo, dl, PtrVT, VariableOffsetTGA, TLSReg);
----------------
nemanjai wrote:
> hubert.reinterpretcast wrote:
> > What does `AIXTLSUpperDisplacement` represent? It is already given a value of 32K - 8. Should the hard coded subtraction be here, should the value of `AIXTLSUpperDisplacement` be further adjusted instead, or is there unintentional double adjustment happening?
> > 
> > For reference, I encountered no issues linking the result of assembly that performs `la` (using the subject access pattern) for the past-the-end address of a 32K - 1 local-exec TLS variable.
> I agree. We shouldn't be doing any arithmetic here on `AIXTLSUpperDisplacement`. We should set it to what we want it and only perform comparisons.
I have set `AIXTLSUpperDisplacement` to be the value that we want.
It is meant to represent that maximum size of the TLS variable, in which we can produce the small local-exec sequence for.


================
Comment at: llvm/lib/Target/PowerPC/PPCMCInstLower.cpp:32-69
 static MCSymbol *GetSymbolFromOperand(const MachineOperand &MO,
                                       AsmPrinter &AP) {
   const TargetMachine &TM = AP.TM;
   Mangler &Mang = TM.getObjFileLowering()->getMangler();
   const DataLayout &DL = AP.getDataLayout();
   MCContext &Ctx = AP.OutContext;
 
----------------
hubert.reinterpretcast wrote:
> I replaced the entire function to unconditionally use `AP.getSymbol(GV)` for `MO_GlobalAddress` cases. All LIT and test-suite cases pass in 3-stage bootstrap builds on ppc64le Linux.
> 
> This should make sense: Going through the previously-default path to get a symbol using just the name is a loss of information (which was found to be problematic for toc-data and small TLS). I see no reason to keep the lossy path.
> 
> I strongly suggest going with a generic common path (which becomes well-tested) versus having special-case paths.
> 
> Since the change is not actually NFC and this patch seems to be the thing that we know can test it, I guess it goes in with this patch.
> 
> Note: I removed the part that does things with the offset. I am not sure if belongs in this function. The name of the function, `GetSymbolFromOperand` suggests that perhaps only the symbol itself should be considered. Also, the result of this function gets passed to `GetSymbolRef`, which already expects to get the offset from the MachineOperand.
Thanks for the suggestion and for discussing your questions, and concerns with me offline regarding this change.


================
Comment at: llvm/lib/Target/PowerPC/PPCMCInstLower.cpp:115-116
     RefKind = MCSymbolRefExpr::VK_PPC_GOT_TPREL_PCREL;
+  else if (MO.getTargetFlags() == PPCII::MO_TPREL_FLAG)
+    RefKind = MCSymbolRefExpr::VK_PPC_AIX_TLSLE;
 
----------------
hubert.reinterpretcast wrote:
> In light of https://reviews.llvm.org/D156292, I believe this should have an assertion added about the TLS model (see `llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp` change in the aforementioned patch).
I have updated it to add an assert for the local-exec model. Is this what you meant?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155600/new/

https://reviews.llvm.org/D155600



More information about the llvm-commits mailing list