[PATCH] D120000: [1/2] TLS loads opimization (hoist)

Phoebe Wang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Feb 20 23:38:40 PST 2022


pengfei added inline comments.


================
Comment at: llvm/include/llvm/CodeGen/CodeGenPassBuilder.h:54
 #include "llvm/Transforms/Scalar/ScalarizeMaskedMemIntrin.h"
+//#include "llvm/Transforms/Scalar/Intel_TLSVariableControl.h" //zxzx
 #include "llvm/Transforms/Utils.h"
----------------
Remove.


================
Comment at: llvm/include/llvm/IR/Module.h:913-915
+   /// Get/set the model of TLS address loading.
+   StringRef getTlsAddrLoadControl() const;
+   void setTlsAddrLoadControl(StringRef Model);
----------------
Remove extra space.


================
Comment at: llvm/include/llvm/Transforms/Scalar/TLSVariableControl.h:1
+//==- TLSVariableControl.h - Remove Redundant TLS Loads --*- C++ -*-==//
+//
----------------
Less than 80 columns.


================
Comment at: llvm/include/llvm/Transforms/Scalar/TLSVariableControl.h:9
+//
+// This pass identifies/eliminate Redundant TLS Loads if related option is set.
+// For example:
----------------
eliminates


================
Comment at: llvm/include/llvm/Transforms/Scalar/TLSVariableControl.h:20
+//
+// will generated Redundant TLS Loads by compiling it with
+// Clang++ -fPIC -ftls-model=global-dynamic -O2 -S
----------------
generate


================
Comment at: llvm/include/llvm/Transforms/Scalar/TLSVariableControl.h:98
+struct TLSCandidate {
+  TLSUseListType Uses;
+  GlobalVariable *GV;
----------------
Why don't use `SmallVector<TLSUser, 8>` directly?


================
Comment at: llvm/include/llvm/Transforms/Scalar/TLSVariableControl.h:109
+
+using TLSCandidateListType = SmallVector<TLSCandidate, 8>;
+
----------------
No user.


================
Comment at: llvm/include/llvm/Transforms/Scalar/TLSVariableControl.h:124
+  bool tryReplaceTLSCandidate(Function &Fn, tlscontrol::TLSCandidate &Cand);
+  Instruction *findMatInsertPt(Instruction *Inst, unsigned Idx = ~0U) const;
+
----------------
No user.


================
Comment at: llvm/lib/Passes/PassBuilder.cpp:216
 #include "llvm/Transforms/Scalar/TailRecursionElimination.h"
+#include "llvm/Transforms/Scalar/TLSVariableControl.h"
 #include "llvm/Transforms/Scalar/WarnMissedTransforms.h"
----------------
`L` is before `a`, so move it ahead.


================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:1
+//===- TLSVariableControl.cpp - Remove Redundant TLS Loads --------===//
+//
----------------
Less than 80.


================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:9-58
+// This pass identifies/eliminate Redundant TLS Loads if related option is set.
+// For exmaple:
+// static __thread int x;
+// int g();
+// int f(int c) {
+//   int *px = &x;
+//   while (c--)
----------------
Duplicated with header file.


================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:94
+             "tls-load-control=Optimize: Eleminate remove redundant TLS load(s)."
+             "tls-load-control=Strict: Strictly load TLS address before every use."
+             "tls-load-control=Non-Optimize: Generally load TLS before use(s)."),
----------------
`Strict` is not handled?


================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:115
+  void getAnalysisUsage(AnalysisUsage &AU) const override {
+    AU.setPreservesCFG();
+    AU.addRequired<DominatorTreeWrapperPass>();
----------------
setPreservesAll?


================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:148-149
+  bool MadeChange =
+      Impl.runImpl(Fn, getAnalysis<DominatorTreeWrapperPass>().getDomTree(),
+                   getAnalysis<LoopInfoWrapperPass>().getLoopInfo());
+
----------------
Move `getAnalysis` later in the function.


================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:176
+    if (TLSCandMap.count(GV) == 0) {
+      tlscontrol::TLSCandidate Candidate(GV);
+      Candidate.addUser(Inst, Idx);
----------------
No need namespace.


================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:178
+      Candidate.addUser(Inst, Idx);
+      TLSCandMap[GV] = Candidate;
+      GVs.push_back(GV);
----------------
GV has been used as Key, no need store in Candidate.


================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:200
+    return;
+
+  TLSCandMap.clear();
----------------
Better postpone getting `LI` and `DT` here.


================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:215
+
+static bool OneUseOutsideLoop(tlscontrol::TLSCandidate &Cand, LoopInfo *LI) {
+  if (Cand.Uses.size() != 1)
----------------
No need namespace, the same below.


================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:260-262
+  for (auto *GV : GVs) {
+    Replaced = tryReplaceTLSCandidate(Fn, TLSCandMap[GV]) || Replaced;
+  }
----------------
No need parentheses.


================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:294-296
+  PreservedAnalyses PA;
+  PA.preserveSet<CFGAnalyses>();
+  return PA;
----------------
Do we break any analyse info? Can we return `PreservedAnalyses::all()`?


================
Comment at: llvm/test/CodeGen/X86/tls-loads-control.ll:1
+; RUN: opt -S -mtriple=x86_64-unknown-unknown -tlscontrol --relocation-model=pic --tls-load-control=Optimize -o - %s | FileCheck %s --check-prefixes=CONTROL0
+; RUN: opt -S -mtriple=x86_64-unknown-unknown -tlscontrol --relocation-model=pic -o - %s | FileCheck %s --check-prefixes=CONTROL0
----------------
Don't need since both check the same.


================
Comment at: llvm/test/CodeGen/X86/tls-loads-control.ll:2
+; RUN: opt -S -mtriple=x86_64-unknown-unknown -tlscontrol --relocation-model=pic --tls-load-control=Optimize -o - %s | FileCheck %s --check-prefixes=CONTROL0
+; RUN: opt -S -mtriple=x86_64-unknown-unknown -tlscontrol --relocation-model=pic -o - %s | FileCheck %s --check-prefixes=CONTROL0
+
----------------
ditto.


================
Comment at: llvm/test/CodeGen/X86/tls-loads-control.ll:11-13
+$_ZTW5thl_x = comdat any
+
+$_ZTW6thl_x2 = comdat any
----------------
What are they used for?


================
Comment at: llvm/test/CodeGen/X86/tls-loads-control.ll:155-163
+; Function Attrs: uwtable
+define weak_odr hidden i32* @_ZTW5thl_x() local_unnamed_addr #2 comdat {
+  ret i32* @thl_x
+}
+
+; Function Attrs: uwtable
+define weak_odr hidden i32* @_ZTW6thl_x2() local_unnamed_addr #2 comdat {
----------------
What are them used for?


================
Comment at: llvm/test/CodeGen/X86/tls-loads-control.ll:169-186
+!llvm.module.flags = !{!0, !1, !2, !3}
+!llvm.ident = !{!4}
+
+!0 = !{i32 1, !"wchar_size", i32 4}
+!1 = !{i32 7, !"PIC Level", i32 2}
+!2 = !{i32 7, !"uwtable", i32 1}
+!3 = !{i32 1, !"tls-load-control", !"Optimize"}
----------------
Remove meta data other than `tls-load-control`


================
Comment at: llvm/test/CodeGen/X86/tls-loads-control2.ll:1
+; RUN: opt -S -mtriple=x86_64-unknown-unknown -tlscontrol --relocation-model=pic --tls-load-control=Optimize -o - %s | FileCheck %s --check-prefixes=CONTROL0
+; RUN: opt -S -mtriple=x86_64-unknown-unknown -tlscontrol --relocation-model=pic --tls-load-control=Non-Optimize -o - %s | FileCheck %s --check-prefixes=CONTROL2
----------------
Use `--check-prefix` for single prefix. The same below.


================
Comment at: llvm/test/CodeGen/X86/tls-loads-control2.ll:38-41
+; Function Attrs: uwtable
+define weak_odr hidden i32* @_ZTW5thl_x() local_unnamed_addr #2 comdat {
+  ret i32* @thl_x
+}
----------------
What's it used for?


================
Comment at: llvm/test/CodeGen/X86/tls-loads-control2.ll:47-57
+!llvm.module.flags = !{!0, !1, !2}
+!llvm.ident = !{!3}
+
+!0 = !{i32 1, !"wchar_size", i32 4}
+!1 = !{i32 7, !"PIC Level", i32 2}
+!2 = !{i32 7, !"uwtable", i32 1}
+!3 = !{!"Intel(R) oneAPI DPC++/C++ Compiler 2022.1.0 (2022.x.0.YYYYMMDD)"}
----------------
Remove them.


================
Comment at: llvm/test/CodeGen/X86/tls-loads-control3.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=x86_64-unknown-unknown -O2 --relocation-model=pic --tls-load-control=Optimize -o - %s | FileCheck %s --check-prefixes=CONTROL0
+; RUN: llc -mtriple=x86_64-unknown-unknown -O2 --relocation-model=pic --tls-load-control=Non-Optimize -o - %s | FileCheck %s --check-prefixes=CONTROL2
----------------
ditto.


================
Comment at: llvm/test/CodeGen/X86/tls-loads-control3.ll:12-14
+$_ZTW5thl_x = comdat any
+
+$_ZTW6thl_x2 = comdat any
----------------
No use.


================
Comment at: llvm/test/CodeGen/X86/tls-loads-control3.ll:350
+
+attributes #0 = { mustprogress uwtable "frame-pointer"="none" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+attributes #1 = { "frame-pointer"="none" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
----------------
Add `nounwind` to avoid dfi directives.


================
Comment at: llvm/test/CodeGen/X86/tls-loads-control3.ll:354-370
+!llvm.module.flags = !{!0, !1, !2}
+!llvm.ident = !{!3}
+
+!0 = !{i32 1, !"wchar_size", i32 4}
+!1 = !{i32 7, !"PIC Level", i32 2}
+!2 = !{i32 7, !"uwtable", i32 1}
+!3 = !{!"Intel(R) oneAPI DPC++/C++ Compiler 2022.1.0 (2022.x.0.YYYYMMDD)"}
----------------
Remove.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120000/new/

https://reviews.llvm.org/D120000



More information about the llvm-commits mailing list