[PATCH] D155600: [AIX][TLS] Produce a faster local-exec access sequence with -maix-small-local-exec-tls (And optimize when load/store offsets are 0)

Amy Kwan via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 28 22:18:16 PDT 2023


amyk added inline comments.


================
Comment at: llvm/test/CodeGen/PowerPC/ppc64-nonfunc-calls.ll:40
 ; CHECK-DAG: addi [[REG3:[0-9]+]], [[REG1]], tls_something at tprel@l
-; CHECK-DAG: ld [[REG2:[0-9]+]], 0([[REG3]])
+; CHECK-DAG: ld [[REG2:[0-9]+]], tls_something at tprel@l([[REG1]])
 ; CHECK-DAG: ld 11, 16([[REG3]])
----------------
@nemanjai I think remember before that you mentioned that perhaps it is not necessary to check if the `Subtarget.hasAIXSmallLocalExecTLS();` flag in `PPCISelDAGToDAG.cpp` - and that checking if the global variable is aligned should be sufficient.

Through the recent comments on this patch, I think it makes sense that we don’t need to check for this flag.  This would mean the change that I am doing apply to Linux, as well. Furthermore, I also updated `PPCISelDAGToDAG.cpp` to remove a portion that seemed like it had an equivalent portion later on in the file.

After doing some functional testing, I didn’t see any issues, with the exception of one Linux test case affected by the change I am making. This test case was introduced awhile back in `87deb0b8e3f14855e6c4652993be86684c2dd429`. I believe in here we’re bitcasting a TLS global and we’re treating this as a call through a function pointer. The node we’re dealing with in the test case is a `GlobalAddressSDNode` and it is aligned/passes my alignment check, so it ends up getting folded.

Before my patch, the TLS relocation is on the addis/addi as expected, and the subsequent loads load off of r3:
```
# %bb.0:                                # %entry
	mflr 0
	stdu 1, -112(1)
	addis 3, 13, tls_something at tprel@ha
	std 0, 128(1)
	addi 3, 3, tls_something at tprel@l
	std 2, 40(1)
	ld 4, 0(3)
	ld 11, 16(3)
	ld 2, 8(3)
	mtctr 4
	bctrl
	ld 2, 40(1)
	addi 1, 1, 112
	ld 0, 16(1)
	mtlr 0
	blr
```
And after my patch,  the ld with the zero offset gets affected, and none of the others do (which makes sense because we have non-zero offsets).
```
# %bb.0:                                # %entry
	mflr 0
	stdu 1, -112(1)
	addis 3, 13, tls_something at tprel@ha
	std 0, 128(1)
	addi 4, 3, tls_something at tprel@l
	ld 3, tls_something at tprel@l(3)
	std 2, 40(1)
	ld 11, 16(4)
	ld 2, 8(4)
	mtctr 3
	bctrl
	ld 2, 40(1)
	addi 1, 1, 112
	ld 0, 16(1)
	mtlr 0
	blr
```
In any case, I figured I’d ask if you have any thoughts/opinions on this since you requested changes and also had the original thought of not checking for `hasAIXSmallLocalExecTLS()` in the beginning.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155600/new/

https://reviews.llvm.org/D155600



More information about the llvm-commits mailing list