[Patch] Use address-taken to disambiguate global var and indirect accesses
Shuxin Yang
shuxin.llvm at gmail.com
Tue Oct 15 14:39:51 PDT 2013
Hi,
The attached patch takes advantage of address-taken information to
disambiguate global variable accesses from indirect memory accesses.
The motivation
===========
I was asked to investigate a case where the load of a static variable
is not hoisted out of a loop as a loop invariant:
---------------
static int xyz;
void foo(int *p) {
  for (int i = 0; i < xyz; i++)
    *p++ = ....
}
-----------------
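For reference, a hand-written sketch of the transformation being asked for. This is not part of the patch. The stored value `i` is a placeholder because the original elides the right-hand side, and `foo_hoisted` and `set_bound` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

static int xyz; // its address is never taken in this file

// As written: xyz must be reloaded on every iteration unless the compiler
// can prove that the store through p cannot modify it.
void foo(int *p) {
  for (int i = 0; i < xyz; i++)
    *p++ = i; // placeholder store; the original elides the value
}

// Hand-hoisted equivalent: valid only because p cannot point at xyz.
void foo_hoisted(int *p) {
  const int n = xyz; // single load, outside the loop
  for (int i = 0; i < n; i++)
    *p++ = i;
}

// Hypothetical helper so callers can set the bound without taking &xyz.
void set_bound(int n) { xyz = n; }
```

Both functions write the same values as long as no pointer to xyz exists, which is exactly the fact the optimizer needs to establish.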
The compiler does have a concept called "addr-capture". However, I
don't think it can be used to disambiguate global variable accesses from
indirect accesses. The reason is that a variable not having its address
*CAPTURED* does not necessarily mean that the variable cannot be
accessed indirectly.
So, I rely on "address taken" instead.
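A minimal sketch of the distinction, assuming the usual meaning of "captured" (the pointer is stored somewhere that outlives the call). The names `counter`, `bump`, and `demo` are hypothetical and not taken from the patch.

```cpp
#include <cassert>

static int counter = 0;

// q is only dereferenced and never stored anywhere, so the pointer is
// not captured by this function.
void bump(int *q) { *q += 1; }

int demo() {
  int before = counter; // direct load of the global
  bump(&counter);       // address taken, not captured; the store hits counter
  return counter - before;
}
```

Here the address of `counter` is never captured, yet the global is still modified through a pointer, so "not captured" alone is not enough to rule out indirect accesses.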
How it works
========
1. In globalopt, when a global var is identified as not-addr-taken,
   cache the result in GlobalVariable::notAddrTaken.
2. In the alias analyzer, suppose the mem-ops involved are m1 and m2.
   Let o1 and o2 be the "objects" (obtained via get_underlying_object())
   of m1 and m2 respectively.
   If o1 != o2 and either of them is a global variable without its
   address taken, then m1 and m2 are disjoint accesses.
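The rule in step 2 can be sketched as a toy model. This is an illustration only, not LLVM code: `Object`, `AliasResult`, and `aliasByAddressTaken` are hypothetical stand-ins, and object identity is modelled by pointer equality.

```cpp
#include <cassert>

enum class AliasResult { NoAlias, MayAlias };

// Hypothetical stand-in for the underlying object of a memory access.
struct Object {
  bool isGlobalVariable;
  bool notAddrTaken; // cached by an earlier analysis; false means "unknown"
};

// Returns NoAlias only when the two accesses are based on different objects
// and at least one of them is a global whose address is known not to be taken.
AliasResult aliasByAddressTaken(const Object *o1, const Object *o2) {
  if (o1 == o2)
    return AliasResult::MayAlias; // same object: this rule says nothing
  if (o1->isGlobalVariable && o1->notAddrTaken)
    return AliasResult::NoAlias;
  if (o2->isGlobalVariable && o2->notAddrTaken)
    return AliasResult::NoAlias;
  return AliasResult::MayAlias; // fall back to the conservative answer
}
```

The check is symmetric in o1 and o2, and an unset flag never produces NoAlias.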
Misc:
=========
Note that I *cache* the not-addr-taken result. Unlike for a function,
it is far more expensive to figure out whether a global variable has its
address taken, so it is not appropriate to analyze address-taken on the
fly.
On the other hand, I personally think the not-addr-taken flag is
almost maintenance free. (FIXME) Only a few optimizations could make a
not-addr-taken variable become addr-taken (say, outlining), and I don't
think we currently have such passes (FIXME again!). If such a rare case
does take place, it is up to the pass to reset the not-addr-taken flag.
Of course, a variable previously considered addr-taken may later be
proved to be not-addr-taken. In that case, the compiler does not have to
update the flag -- leaving it unset is conservatively correct.
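The maintenance contract described above can be sketched as follows. This is a toy model under the assumption that the flag is a plain boolean whose unset state is always safe; `GlobalVar` and the three helper names are hypothetical and do not appear in the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for a global variable carrying the cached flag.
struct GlobalVar {
  bool notAddrTaken = false; // false = "unknown or taken", always safe
};

// What an analysis would do after proving there is no address-taking use.
void markNotAddressTaken(GlobalVar &g) { g.notAddrTaken = true; }

// What a transformation must do if it introduces an address-taking use.
void noteAddressTaken(GlobalVar &g) { g.notAddrTaken = false; }

// Alias analysis may exploit the flag only while it is set.
bool canAssumeNoIndirectAccess(const GlobalVar &g) { return g.notAddrTaken; }
```

Only the transition from set to unset is mandatory; failing to set the flag merely loses an optimization opportunity.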
Performance impact
=============
Measured on an oldish Mac Tower with 2x 2.26GHz Quad-Core Xeon. Both
the baseline and the change were measured a couple of times. I did take
a look at why Olden/power is sped up: the loads of static variables "P"
and "Q" are promoted in many places. I have not yet had a chance to
investigate why the speedup of pairlocalalign at O3 disappears at
O3+LTO.
o. test-suite w/ O3:
-------------------
                                   baseline   patched   change (%)
Benchmarks/Olden/power/power         1.6129    1.354    -16.0518321036642
Benchmarks/mafft/pairlocalalign     31.4211   26.5794   -15.4090722476298
Benchmarks/Ptrdist/yacr2/yacr2       0.881     0.804    -8.74006810442678
o. test-suite w/ O3 + LTO
-------------------------
                                   baseline   patched   change (%)
Benchmarks/Olden/power/power         1.6143    1.3419   -16.8741869540978
Applications/spiff/spiff             2.9203    2.849    -2.44152997979659
o. spec2kint w/ O3+LTO
----------------------
                                   baseline   patched   change (%)
bzip2                               75.02     73.92     -1.4
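The last column of the test-suite rows appears to be the relative change of the second number against the first, in percent. That reading is inferred from the figures rather than stated in the mail; `percentChange` is a hypothetical helper that reproduces it.

```cpp
#include <cassert>
#include <cmath>

// Relative change of the patched measurement against the baseline, in percent.
// Negative values mean the patched build was faster.
double percentChange(double baseline, double patched) {
  return (patched - baseline) / baseline * 100.0;
}
```

For example, percentChange(1.6129, 1.354) gives roughly -16.05, matching the Olden/power row at O3.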
Thanks
Shuxin
-------------- next part --------------
Index: include/llvm/IR/GlobalVariable.h
===================================================================
--- include/llvm/IR/GlobalVariable.h (revision 192719)
+++ include/llvm/IR/GlobalVariable.h (working copy)
@@ -48,6 +48,7 @@
// can change from its initial
// value before global
// initializers are run?
+ bool notAddrTaken : 1; // Does not have its address taken.
public:
// allocate space for exactly one operand
@@ -174,6 +175,9 @@
isExternallyInitializedConstant = Val;
}
+ void setNotAddressTaken(bool Val) { notAddrTaken = Val; }
+ bool notAddressTaken(void) const { return notAddrTaken; }
+
/// copyAttributesFrom - copy all additional attributes (those not needed to
/// create a GlobalVariable) from the GlobalVariable Src to this one.
void copyAttributesFrom(const GlobalValue *Src);
Index: docs/LangRef.rst
===================================================================
--- docs/LangRef.rst (revision 192719)
+++ docs/LangRef.rst (working copy)
@@ -514,6 +514,9 @@
``@llvm.used``. This assumption may be suppressed by marking the
variable with ``externally_initialized``.
+If a global variable does not have its address taken, it may optionally be
+flagged ``notaddrtaken``.
+
An explicit alignment may be specified for a global, which must be a
power of 2. If not present, or if the alignment is set to zero, the
alignment of the global is set by the target to whatever it feels
Index: lib/Analysis/BasicAliasAnalysis.cpp
===================================================================
--- lib/Analysis/BasicAliasAnalysis.cpp (revision 192719)
+++ lib/Analysis/BasicAliasAnalysis.cpp (working copy)
@@ -1238,6 +1238,14 @@
return NoAlias;
if (isEscapeSource(O2) && isNonEscapingLocalObject(O1))
return NoAlias;
+
+ if (const GlobalVariable *GV = dyn_cast<GlobalVariable>(O1))
+ if (GV->notAddressTaken())
+ return NoAlias;
+
+ if (const GlobalVariable *GV = dyn_cast<GlobalVariable>(O2))
+ if (GV->notAddressTaken())
+ return NoAlias;
}
// If the size of one access is larger than the entire object on the other
Index: lib/AsmParser/LLParser.cpp
===================================================================
--- lib/AsmParser/LLParser.cpp (revision 192719)
+++ lib/AsmParser/LLParser.cpp (working copy)
@@ -704,7 +704,7 @@
unsigned Linkage, bool HasLinkage,
unsigned Visibility) {
unsigned AddrSpace;
- bool IsConstant, UnnamedAddr, IsExternallyInitialized;
+ bool IsConstant, UnnamedAddr, IsExternallyInitialized, notAddrTaken;
GlobalVariable::ThreadLocalMode TLM;
LocTy UnnamedAddrLoc;
LocTy IsExternallyInitializedLoc;
@@ -719,6 +719,7 @@
IsExternallyInitialized,
&IsExternallyInitializedLoc) ||
ParseGlobalType(IsConstant) ||
+ ParseOptionalToken(lltok::kw_notaddrtaken, notAddrTaken) ||
ParseType(Ty, TyLoc))
return true;
@@ -776,6 +777,7 @@
GV->setLinkage((GlobalValue::LinkageTypes)Linkage);
GV->setVisibility((GlobalValue::VisibilityTypes)Visibility);
GV->setExternallyInitialized(IsExternallyInitialized);
+ GV->setNotAddressTaken(notAddrTaken);
GV->setThreadLocalMode(TLM);
GV->setUnnamedAddr(UnnamedAddr);
Index: lib/AsmParser/LLLexer.cpp
===================================================================
--- lib/AsmParser/LLLexer.cpp (revision 192719)
+++ lib/AsmParser/LLLexer.cpp (working copy)
@@ -504,6 +504,7 @@
KEYWORD(zeroinitializer);
KEYWORD(undef);
KEYWORD(null);
+ KEYWORD(notaddrtaken);
KEYWORD(to);
KEYWORD(tail);
KEYWORD(target);
Index: lib/AsmParser/LLToken.h
===================================================================
--- lib/AsmParser/LLToken.h (revision 192719)
+++ lib/AsmParser/LLToken.h (working copy)
@@ -51,6 +51,7 @@
kw_localdynamic, kw_initialexec, kw_localexec,
kw_zeroinitializer,
kw_undef, kw_null,
+ kw_notaddrtaken,
kw_to,
kw_tail,
kw_target,
Index: lib/Transforms/IPO/GlobalOpt.cpp
===================================================================
--- lib/Transforms/IPO/GlobalOpt.cpp (revision 192719)
+++ lib/Transforms/IPO/GlobalOpt.cpp (working copy)
@@ -1922,6 +1922,8 @@
if (AnalyzeGlobal(GV, GS, PHIUsers))
return false;
+ GV->setNotAddressTaken(true);
+
if (!GS.isCompared && !GV->hasUnnamedAddr()) {
GV->setUnnamedAddr(true);
NumUnnamed++;
Index: lib/Bitcode/Reader/BitcodeReader.cpp
===================================================================
--- lib/Bitcode/Reader/BitcodeReader.cpp (revision 192719)
+++ lib/Bitcode/Reader/BitcodeReader.cpp (working copy)
@@ -1848,6 +1848,9 @@
new GlobalVariable(*TheModule, Ty, isConstant, Linkage, 0, "", 0,
TLM, AddressSpace, ExternallyInitialized);
NewGV->setAlignment(Alignment);
+ if (Record.size() > 10)
+ NewGV->setNotAddressTaken(Record[10]);
+
if (!Section.empty())
NewGV->setSection(Section);
NewGV->setVisibility(Visibility);
Index: lib/Bitcode/Writer/BitcodeWriter.cpp
===================================================================
--- lib/Bitcode/Writer/BitcodeWriter.cpp (revision 192719)
+++ lib/Bitcode/Writer/BitcodeWriter.cpp (working copy)
@@ -616,11 +616,13 @@
Vals.push_back(GV->hasSection() ? SectionMap[GV->getSection()] : 0);
if (GV->isThreadLocal() ||
GV->getVisibility() != GlobalValue::DefaultVisibility ||
- GV->hasUnnamedAddr() || GV->isExternallyInitialized()) {
+ GV->hasUnnamedAddr() || GV->isExternallyInitialized() ||
+ GV->notAddressTaken()) {
Vals.push_back(getEncodedVisibility(GV));
Vals.push_back(getEncodedThreadLocalMode(GV));
Vals.push_back(GV->hasUnnamedAddr());
Vals.push_back(GV->isExternallyInitialized());
+ Vals.push_back(GV->notAddressTaken());
} else {
AbbrevToUse = SimpleGVarAbbrev;
}
Index: lib/IR/AsmWriter.cpp
===================================================================
--- lib/IR/AsmWriter.cpp (revision 192719)
+++ lib/IR/AsmWriter.cpp (working copy)
@@ -1459,6 +1459,7 @@
if (GV->hasUnnamedAddr()) Out << "unnamed_addr ";
if (GV->isExternallyInitialized()) Out << "externally_initialized ";
Out << (GV->isConstant() ? "constant " : "global ");
+ if (GV->notAddressTaken()) Out << "notaddrtaken ";
TypePrinter.print(GV->getType()->getElementType(), Out);
if (GV->hasInitializer()) {
Index: lib/IR/Globals.cpp
===================================================================
--- lib/IR/Globals.cpp (revision 192719)
+++ lib/IR/Globals.cpp (working copy)
@@ -99,6 +99,7 @@
}
LeakDetector::addGarbageObject(this);
+ setNotAddressTaken(false);
}
GlobalVariable::GlobalVariable(Module &M, Type *Ty, bool constant,
@@ -125,6 +126,7 @@
Before->getParent()->getGlobalList().insert(Before, this);
else
M.getGlobalList().push_back(this);
+ setNotAddressTaken(false);
}
void GlobalVariable::setParent(Module *parent) {
@@ -185,6 +187,7 @@
GlobalValue::copyAttributesFrom(Src);
const GlobalVariable *SrcVar = cast<GlobalVariable>(Src);
setThreadLocal(SrcVar->isThreadLocal());
+ setNotAddressTaken(SrcVar->notAddressTaken());
}