[PATCH] D155600: [AIX][TLS] Produce a faster local-exec access sequence with -maix-small-local-exec-tls (And optimize when load/store offsets are 0)

Amy Kwan via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 17 22:14:27 PDT 2023


amyk marked 6 inline comments as done.
amyk added a comment.

@hubert.reinterpretcast I was going to update the patch, and then I thought about your comments a bit more.
Just wanted to double check on some suggestions before I misunderstand  anything and update the patch if that's OK.



================
Comment at: llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp:7625-7630
+      if (RequiresMod4Offset) {
+        if (isa<GlobalAddressSDNode>(BaseOp1) && HasAIXSmallLocalExecTLS) {
+          const auto *GV = dyn_cast<GlobalAddressSDNode>(BaseOp1)->getGlobal();
+          Align Alignment = GV->getPointerAlignment(CurDAG->getDataLayout());
+          if (Alignment < 4)
+            continue;
----------------
hubert.reinterpretcast wrote:
> hubert.reinterpretcast wrote:
> > If we are to keep the condition, this needs a comment for why `HasAIXSmallLocalExecTLS` is a sufficient guard to limit the behaviour to just the new AIX "small TLS" accesses. The logic would be that the option is only valid for AIX and, on AIX, a `ADDI`/`ADDI8` node with a global address for the immediate operand would necessarily be for "small TLS" (although possibly for the `local-dynamic` TLS model with future developments).
> > 
> > However: The optimization (with the checks) is generally sound, so maybe there should be no such check at all (except to limit the scope of the current change because some tests need updating). Yes, there can be downstream code that may have issues with generating the global address as an immediate operand for the load/store instruction, but changing the downstream code seems to be the solution for those cases.
> > 
> > If we keep the check, there should be a TODO comment to remove the artificial limitation.
> Aside from the checks that limit the global address (as opposed to constant) immediate operand handling to "small TLS", this code does the same thing as the code for the toc-data cases (except that which operand is the immediate operand is different between the two cases).
> 
> The code can be made common (using `ImmOpnd`) in the non-`ReplaceFlags` path at line 7717.
> 
> Note: `isa` followed by `cast`/`dyn_cast` is an anti-pattern.
- I think there are possibilities that `HasAIXSmallLocalExecTLS` may not be necessary here. Just wanted to check, when you say "artificial limitation",  do you mean that `HasAIXSmallLocalExecTLS` check?
- Are you suggesting that this `else if` should be removed as we have `ImmOpnd` on 7677, and that it goes into the the code in 7717? Or am I misunderstanding? Just figured I'd double check because as you mentioned, this is checking the operand of the `ADDI`, whereas I believe the below accounts for the cases where we're dealing with the offset from the load/store and combined immediate operand.
- Are you also suggesting that the initial condition should not be `if (isa<GlobalAddressSDNode>(BaseOp1)`  but rather a `dyn_cast` of a `GlobalAddressSDNode`?


================
Comment at: llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp:7623
       // other cases provide aligned addresses and are always safe.)
-      if (RequiresMod4Offset &&
+      if (RequiresMod4Offset && !HasAIXSmallLocalExecTLS &&
           (!isa<ConstantSDNode>(Base.getOperand(1)) ||
----------------
nemanjai wrote:
> hubert.reinterpretcast wrote:
> > This change looks suspicious.
> > 
> > I am getting:
> > ```
> > Global must be word-aligned for LD, STD, LWA!
> > UNREACHABLE executed at /terrannew/hstong/.Lpcoral03/llvmbld/dev/llvm-project/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:1523!
> > ```
> > 
> > With:
> > ```
> > clang --target=powerpc64-ibm-aix -maix-small-local-exec-tls -O -c -o /dev/null -xc - \
> > <<<$'__attribute__((__tls_model__("local-exec"))) __thread struct { int x __attribute__((__packed__)); } a;\nint f() { return a.x; }'
> > ```
> Right, if `Base.getOperand(1)` is not a constant, we need to check the alignment of the `GlobalAddress`.
This has been accounted for in `PPCISelLowering.cpp`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155600/new/

https://reviews.llvm.org/D155600



More information about the llvm-commits mailing list