[all-commits] [llvm/llvm-project] 2a5092: [AIX][TLS] Optimize the small local-exec access se...

Amy Kwan via All-commits all-commits at lists.llvm.org
Thu Feb 1 06:29:34 PST 2024


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 2a50921553798d2db52ca6330c89f0f8a5bc2215
      https://github.com/llvm/llvm-project/commit/2a50921553798d2db52ca6330c89f0f8a5bc2215
  Author: Amy Kwan <amy.kwan1 at ibm.com>
  Date:   2024-02-01 (Thu, 01 Feb 2024)

  Changed paths:
    M llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
    M llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
    M llvm/test/CodeGen/PowerPC/aix-small-local-exec-tls-char.ll
    M llvm/test/CodeGen/PowerPC/aix-small-local-exec-tls-double.ll
    M llvm/test/CodeGen/PowerPC/aix-small-local-exec-tls-float.ll
    M llvm/test/CodeGen/PowerPC/aix-small-local-exec-tls-int.ll
    M llvm/test/CodeGen/PowerPC/aix-small-local-exec-tls-largeaccess.ll
    A llvm/test/CodeGen/PowerPC/aix-small-local-exec-tls-largeaccess2.ll
    M llvm/test/CodeGen/PowerPC/aix-small-local-exec-tls-short.ll

  Log Message:
  -----------
  [AIX][TLS] Optimize the small local-exec access sequence for non-zero offsets (#71485)

This patch utilizes the -maix-small-local-exec-tls option to produce a
faster, non-TOC-based access sequence for the local-exec TLS model,
specifically for accesses where the offset from the TLS variable is
non-zero.

In particular, this patch produces either a single:
- addi/la with a displacement off of R13 plus a non-zero offset, for
  when an address is calculated, or
- load or store off of R13 plus a non-zero offset, for when an address
  is calculated and used for further access,

where R13 is the thread pointer.
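
As an illustration of the kind of source that leads to a non-zero offset
from a TLS variable, here is a minimal sketch. The names tls_values,
addr_third, load_third and store_third are hypothetical and do not come
from the patch or its tests. Element 2 of an int array sits 8 bytes past
the start of the variable, so, assuming a 64-bit AIX target built with
-ftls-model=local-exec and -maix-small-local-exec-tls, each function is
the sort of access the commit message describes. The code actually
generated depends on the compiler version and optimization level.

```cpp
// Hypothetical thread-local array; index 2 is at byte offset
// 2 * sizeof(int) from the start of the TLS variable.
thread_local int tls_values[8];

// Address calculation only: the "addi/la off of R13" case.
int *addr_third() { return &tls_values[2]; }

// Address calculated and used for a load: the "load off of R13" case.
int load_third() { return tls_values[2]; }

// Address calculated and used for a store: the "store off of R13" case.
void store_third(int v) { tls_values[2] = v; }
```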

In order to produce a single addi or load/store off of the thread
pointer with a non-zero offset, this patch also adds the necessary
support in the assembly printer when printing these instructions.

Specifically:
- The non-zero offset is added to the TLS variable address when the
  address of the TLS variable + its offset is less than 32KB.
- Otherwise, when the address of the TLS variable + its offset is 32KB
  or greater, the non-zero offset is added and a multiple of 64KB is
  subtracted from the TLS address.

This handling in the assembly printer is necessary to ensure that the
TLS address + the non-zero offset falls within [-32768, 32768), so that
the total displacement can fit within the addi/load/store instructions.

This patch is meant to be a follow-up to
3f46e5453d9310b15d974e876f6132e3cf50c4b1 (which applies the
optimization when the offset is zero).



