[all-commits] [llvm/llvm-project] 2a5092: [AIX][TLS] Optimize the small local-exec access se...
Amy Kwan via All-commits
all-commits at lists.llvm.org
Thu Feb 1 06:29:34 PST 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 2a50921553798d2db52ca6330c89f0f8a5bc2215
https://github.com/llvm/llvm-project/commit/2a50921553798d2db52ca6330c89f0f8a5bc2215
Author: Amy Kwan <amy.kwan1 at ibm.com>
Date: 2024-02-01 (Thu, 01 Feb 2024)
Changed paths:
M llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
M llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
M llvm/test/CodeGen/PowerPC/aix-small-local-exec-tls-char.ll
M llvm/test/CodeGen/PowerPC/aix-small-local-exec-tls-double.ll
M llvm/test/CodeGen/PowerPC/aix-small-local-exec-tls-float.ll
M llvm/test/CodeGen/PowerPC/aix-small-local-exec-tls-int.ll
M llvm/test/CodeGen/PowerPC/aix-small-local-exec-tls-largeaccess.ll
A llvm/test/CodeGen/PowerPC/aix-small-local-exec-tls-largeaccess2.ll
M llvm/test/CodeGen/PowerPC/aix-small-local-exec-tls-short.ll
Log Message:
-----------
[AIX][TLS] Optimize the small local-exec access sequence for non-zero offsets (#71485)
This patch utilizes the -maix-small-local-exec-tls option to produce a
faster, non-TOC-based access sequence for the local-exec TLS model,
specifically for cases where the offset from the TLS variable is
non-zero.
In particular, this patch produces either a single:
- addi/la with a displacement off of R13 plus a non-zero offset, for
when an address is calculated, or
- load or store off of R13 plus a non-zero offset, for when an address
is calculated and used for further access,
where R13 is the thread pointer.
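As a rough illustration of the two cases above, the following C++
sketch shows source-level accesses at a non-zero offset from a
thread-local variable. The names tls_values, elem_addr, load_elem and
store_elem are hypothetical and do not come from the patch. Whether a
compiler emits the single-instruction sequences described here depends
on the target (AIX on PowerPC), the TLS model in effect (local-exec),
and the -maix-small-local-exec-tls option.

```cpp
// Illustrative only: all names here are hypothetical, not from the patch.
// A thread-local array; element 3 sits at a non-zero offset
// (3 * sizeof(int) bytes) from the start of the variable.
static thread_local int tls_values[16];

// Case 1: an address is calculated (variable address + non-zero offset).
int *elem_addr() { return &tls_values[3]; }

// Case 2: the address is calculated and used for a further access,
// that is, a load or a store at a non-zero offset from the variable.
int load_elem() { return tls_values[3]; }
void store_elem(int v) { tls_values[3] = v; }
```

Accesses to tls_values[0] would correspond to the zero-offset case
instead.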
In order to produce a single addi or load/store off of the thread
pointer with a non-zero offset, this patch also adds the necessary
support in the assembly printer when printing these instructions.
Specifically:
- The non-zero offset is added to the TLS variable address when the
address of the TLS variable + its offset is less than 32KB.
- Otherwise, when the address of the TLS variable + its offset is
32KB or greater, the non-zero offset (and a multiple of 64KB) is
subtracted from the TLS address.
This handling in the assembly printer is necessary to ensure that the
TLS address + the non-zero offset lies in the range [-32768, 32768),
so that the total displacement can fit within the addi/load/store
instructions.
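The range requirement can be sketched with a little arithmetic. The
C++ below is a minimal sketch and not the code from
PPCAsmPrinter.cpp. It assumes the displacement is the TLS variable
address plus the offset, with a multiple of 64KB subtracted whenever
that sum is 32KB or more. The function name adjusted_displacement and
the rounding formula used to pick the multiple are assumptions made
for illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::int64_t kHalfRange = 32768;  // 32KB
constexpr std::int64_t kFullRange = 65536;  // 64KB

// Hypothetical helper: returns a displacement in [-32768, 32768) for a
// non-negative TLS variable address and a non-negative offset.
std::int64_t adjusted_displacement(std::int64_t tls_address,
                                   std::int64_t offset) {
  assert(tls_address >= 0 && offset >= 0);
  const std::int64_t final_address = tls_address + offset;

  // Below 32KB the sum already fits, so it is used unchanged.
  if (final_address < kHalfRange)
    return final_address;

  // Otherwise subtract a multiple of 64KB chosen so that the result
  // lands in [-32768, 32768). This formula is an assumption.
  const std::int64_t delta =
      ((final_address + kHalfRange) / kFullRange) * kFullRange;
  const std::int64_t disp = final_address - delta;
  assert(disp >= -kHalfRange && disp < kHalfRange);
  return disp;
}
```

For example, an address of 100 with an offset of 8 stays at 108, while
an address of 32760 with an offset of 8 sums to 32768, so 65536 is
subtracted and the displacement becomes -32768.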
This patch is meant to be a follow-up to
3f46e5453d9310b15d974e876f6132e3cf50c4b1 (where the optimization
occurs for when the offset is zero).