[PATCH] D120000: [1/2] TLS loads opimization (hoist)
Phoebe Wang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Feb 20 23:38:40 PST 2022
pengfei added inline comments.
================
Comment at: llvm/include/llvm/CodeGen/CodeGenPassBuilder.h:54
#include "llvm/Transforms/Scalar/ScalarizeMaskedMemIntrin.h"
+//#include "llvm/Transforms/Scalar/Intel_TLSVariableControl.h" //zxzx
#include "llvm/Transforms/Utils.h"
----------------
Remove.
================
Comment at: llvm/include/llvm/IR/Module.h:913-915
+ /// Get/set the model of TLS address loading.
+ StringRef getTlsAddrLoadControl() const;
+ void setTlsAddrLoadControl(StringRef Model);
----------------
Remove extra space.
================
Comment at: llvm/include/llvm/Transforms/Scalar/TLSVariableControl.h:1
+//==- TLSVariableControl.h - Remove Redundant TLS Loads --*- C++ -*-==//
+//
----------------
Less than 80 columns.
================
Comment at: llvm/include/llvm/Transforms/Scalar/TLSVariableControl.h:9
+//
+// This pass identifies/eliminate Redundant TLS Loads if related option is set.
+// For example:
----------------
eliminates
================
Comment at: llvm/include/llvm/Transforms/Scalar/TLSVariableControl.h:20
+//
+// will generated Redundant TLS Loads by compiling it with
+// Clang++ -fPIC -ftls-model=global-dynamic -O2 -S
----------------
generate
================
Comment at: llvm/include/llvm/Transforms/Scalar/TLSVariableControl.h:98
+struct TLSCandidate {
+ TLSUseListType Uses;
+ GlobalVariable *GV;
----------------
Why don't use `SmallVector<TLSUser, 8>` directly?
================
Comment at: llvm/include/llvm/Transforms/Scalar/TLSVariableControl.h:109
+
+using TLSCandidateListType = SmallVector<TLSCandidate, 8>;
+
----------------
No user.
================
Comment at: llvm/include/llvm/Transforms/Scalar/TLSVariableControl.h:124
+ bool tryReplaceTLSCandidate(Function &Fn, tlscontrol::TLSCandidate &Cand);
+ Instruction *findMatInsertPt(Instruction *Inst, unsigned Idx = ~0U) const;
+
----------------
No user.
================
Comment at: llvm/lib/Passes/PassBuilder.cpp:216
#include "llvm/Transforms/Scalar/TailRecursionElimination.h"
+#include "llvm/Transforms/Scalar/TLSVariableControl.h"
#include "llvm/Transforms/Scalar/WarnMissedTransforms.h"
----------------
`L` is before `a`, so move it ahead.
================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:1
+//===- TLSVariableControl.cpp - Remove Redundant TLS Loads --------===//
+//
----------------
Less than 80.
================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:9-58
+// This pass identifies/eliminate Redundant TLS Loads if related option is set.
+// For exmaple:
+// static __thread int x;
+// int g();
+// int f(int c) {
+// int *px = &x;
+// while (c--)
----------------
Duplicated with header file.
================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:94
+ "tls-load-control=Optimize: Eleminate remove redundant TLS load(s)."
+ "tls-load-control=Strict: Strictly load TLS address before every use."
+ "tls-load-control=Non-Optimize: Generally load TLS before use(s)."),
----------------
`Strict` is not handled?
================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:115
+ void getAnalysisUsage(AnalysisUsage &AU) const override {
+ AU.setPreservesCFG();
+ AU.addRequired<DominatorTreeWrapperPass>();
----------------
setPreservesAll?
================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:148-149
+ bool MadeChange =
+ Impl.runImpl(Fn, getAnalysis<DominatorTreeWrapperPass>().getDomTree(),
+ getAnalysis<LoopInfoWrapperPass>().getLoopInfo());
+
----------------
Move `getAnalysis` later in the function.
================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:176
+ if (TLSCandMap.count(GV) == 0) {
+ tlscontrol::TLSCandidate Candidate(GV);
+ Candidate.addUser(Inst, Idx);
----------------
No need namespace.
================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:178
+ Candidate.addUser(Inst, Idx);
+ TLSCandMap[GV] = Candidate;
+ GVs.push_back(GV);
----------------
GV has been used as Key, no need store in Candidate.
================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:200
+ return;
+
+ TLSCandMap.clear();
----------------
Better postpone getting `LI` and `DT` here.
================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:215
+
+static bool OneUseOutsideLoop(tlscontrol::TLSCandidate &Cand, LoopInfo *LI) {
+ if (Cand.Uses.size() != 1)
----------------
No need namespace, the same below.
================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:260-262
+ for (auto *GV : GVs) {
+ Replaced = tryReplaceTLSCandidate(Fn, TLSCandMap[GV]) || Replaced;
+ }
----------------
No need parentheses.
================
Comment at: llvm/lib/Transforms/Scalar/TLSVariableControl.cpp:294-296
+ PreservedAnalyses PA;
+ PA.preserveSet<CFGAnalyses>();
+ return PA;
----------------
Do we break any analyse info? Can we return `PreservedAnalyses::all()`?
================
Comment at: llvm/test/CodeGen/X86/tls-loads-control.ll:1
+; RUN: opt -S -mtriple=x86_64-unknown-unknown -tlscontrol --relocation-model=pic --tls-load-control=Optimize -o - %s | FileCheck %s --check-prefixes=CONTROL0
+; RUN: opt -S -mtriple=x86_64-unknown-unknown -tlscontrol --relocation-model=pic -o - %s | FileCheck %s --check-prefixes=CONTROL0
----------------
Don't need since both check the same.
================
Comment at: llvm/test/CodeGen/X86/tls-loads-control.ll:2
+; RUN: opt -S -mtriple=x86_64-unknown-unknown -tlscontrol --relocation-model=pic --tls-load-control=Optimize -o - %s | FileCheck %s --check-prefixes=CONTROL0
+; RUN: opt -S -mtriple=x86_64-unknown-unknown -tlscontrol --relocation-model=pic -o - %s | FileCheck %s --check-prefixes=CONTROL0
+
----------------
ditto.
================
Comment at: llvm/test/CodeGen/X86/tls-loads-control.ll:11-13
+$_ZTW5thl_x = comdat any
+
+$_ZTW6thl_x2 = comdat any
----------------
What are they used for?
================
Comment at: llvm/test/CodeGen/X86/tls-loads-control.ll:155-163
+; Function Attrs: uwtable
+define weak_odr hidden i32* @_ZTW5thl_x() local_unnamed_addr #2 comdat {
+ ret i32* @thl_x
+}
+
+; Function Attrs: uwtable
+define weak_odr hidden i32* @_ZTW6thl_x2() local_unnamed_addr #2 comdat {
----------------
What are them used for?
================
Comment at: llvm/test/CodeGen/X86/tls-loads-control.ll:169-186
+!llvm.module.flags = !{!0, !1, !2, !3}
+!llvm.ident = !{!4}
+
+!0 = !{i32 1, !"wchar_size", i32 4}
+!1 = !{i32 7, !"PIC Level", i32 2}
+!2 = !{i32 7, !"uwtable", i32 1}
+!3 = !{i32 1, !"tls-load-control", !"Optimize"}
----------------
Remove meta data other than `tls-load-control`
================
Comment at: llvm/test/CodeGen/X86/tls-loads-control2.ll:1
+; RUN: opt -S -mtriple=x86_64-unknown-unknown -tlscontrol --relocation-model=pic --tls-load-control=Optimize -o - %s | FileCheck %s --check-prefixes=CONTROL0
+; RUN: opt -S -mtriple=x86_64-unknown-unknown -tlscontrol --relocation-model=pic --tls-load-control=Non-Optimize -o - %s | FileCheck %s --check-prefixes=CONTROL2
----------------
Use `--check-prefix` for single prefix. The same below.
================
Comment at: llvm/test/CodeGen/X86/tls-loads-control2.ll:38-41
+; Function Attrs: uwtable
+define weak_odr hidden i32* @_ZTW5thl_x() local_unnamed_addr #2 comdat {
+ ret i32* @thl_x
+}
----------------
What's it used for?
================
Comment at: llvm/test/CodeGen/X86/tls-loads-control2.ll:47-57
+!llvm.module.flags = !{!0, !1, !2}
+!llvm.ident = !{!3}
+
+!0 = !{i32 1, !"wchar_size", i32 4}
+!1 = !{i32 7, !"PIC Level", i32 2}
+!2 = !{i32 7, !"uwtable", i32 1}
+!3 = !{!"Intel(R) oneAPI DPC++/C++ Compiler 2022.1.0 (2022.x.0.YYYYMMDD)"}
----------------
Remove them.
================
Comment at: llvm/test/CodeGen/X86/tls-loads-control3.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=x86_64-unknown-unknown -O2 --relocation-model=pic --tls-load-control=Optimize -o - %s | FileCheck %s --check-prefixes=CONTROL0
+; RUN: llc -mtriple=x86_64-unknown-unknown -O2 --relocation-model=pic --tls-load-control=Non-Optimize -o - %s | FileCheck %s --check-prefixes=CONTROL2
----------------
ditto.
================
Comment at: llvm/test/CodeGen/X86/tls-loads-control3.ll:12-14
+$_ZTW5thl_x = comdat any
+
+$_ZTW6thl_x2 = comdat any
----------------
No use.
================
Comment at: llvm/test/CodeGen/X86/tls-loads-control3.ll:350
+
+attributes #0 = { mustprogress uwtable "frame-pointer"="none" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+attributes #1 = { "frame-pointer"="none" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
----------------
Add `nounwind` to avoid dfi directives.
================
Comment at: llvm/test/CodeGen/X86/tls-loads-control3.ll:354-370
+!llvm.module.flags = !{!0, !1, !2}
+!llvm.ident = !{!3}
+
+!0 = !{i32 1, !"wchar_size", i32 4}
+!1 = !{i32 7, !"PIC Level", i32 2}
+!2 = !{i32 7, !"uwtable", i32 1}
+!3 = !{!"Intel(R) oneAPI DPC++/C++ Compiler 2022.1.0 (2022.x.0.YYYYMMDD)"}
----------------
Remove.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D120000/new/
https://reviews.llvm.org/D120000
More information about the llvm-commits
mailing list