[Patch] Use address-taken to disambiguate global var and indirect accesses

Shuxin Yang shuxin.llvm at gmail.com
Tue Oct 15 14:39:51 PDT 2013


Hi,

     The attached patch takes advantage of address-taken information to
disambiguate global-variable accesses from indirect memory accesses.

  The motivation
  ===========
      I was asked to investigate a problem where a static variable is not
hoisted as a loop invariant:

     ---------------
      static int xyz;
      void foo(int *p) {
          for (int i = 0; i < xyz; i++)
             *p++ = ....
      }
    -----------------
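
    For illustration, here is a minimal sketch of what hoisting the load of
xyz amounts to when done by hand. The initial value of xyz, the stored value
`i` (a stand-in for the elided right-hand side), and the name foo_hoisted are
assumptions made only for this example. The transformation is legal here only
because nothing in the file takes the address of xyz, so the store through p
cannot modify it.

```cpp
#include <cassert>

// File-local variable whose address is never taken in this file.
// The initial value 4 is an assumption for the example.
static int xyz = 4;

// Original shape: without extra information the compiler must assume the
// store through p may modify xyz, so xyz is reloaded on every iteration.
void foo(int *p) {
  for (int i = 0; i < xyz; i++)
    *p++ = i;  // placeholder for the elided right-hand side
}

// Hand-hoisted shape: xyz is loaded once before the loop.
void foo_hoisted(int *p) {
  const int n = xyz;  // the loop-invariant load, hoisted out of the loop
  for (int i = 0; i < n; i++)
    *p++ = i;
}
```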

    The compiler does have a concept called "addr-capture". However, I don't
think it can be used to disambiguate global-variable and indirect accesses.
The reason is that a variable not having its address *CAPTURED* does not
necessarily mean that the variable cannot be accessed indirectly.

    So, I rely on "address taken" instead.

  How it works
  ========
       1. In globalopt, when a global var is identified as not-addr-taken,
          cache the result in GlobalVariable::notAddrTaken.

       2. In the alias analyzer, suppose the mem-ops involved are m1 and m2.
          Let O1 and O2 be the "objects" (obtained via get_underlying_object())
          of m1 and m2 respectively.

          If O1 != O2 && one of them is a global variable without its address
          taken, then m1 and m2 are disjoint accesses.
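
       The rule in step 2 can be sketched in isolation as follows. This is a
toy model, not the LLVM code: Object, AliasResult and aliasByAddrTaken are
hypothetical names used only here, and the real check lives in
BasicAliasAnalysis.cpp in the patch below.

```cpp
#include <cassert>

// Toy stand-in for an underlying object; not an LLVM class.
struct Object {
  bool isGlobalVariable;
  bool notAddrTaken;  // cached flag; false means "unknown or address taken"
};

enum AliasResult { NoAlias, MayAlias };

// O1 and O2 are the underlying objects of the two memory operations.
AliasResult aliasByAddrTaken(const Object *O1, const Object *O2) {
  if (O1 != O2) {
    // A global whose address is never taken can only be accessed directly,
    // so an access based on any other object cannot touch it.
    if (O1->isGlobalVariable && O1->notAddrTaken)
      return NoAlias;
    if (O2->isGlobalVariable && O2->notAddrTaken)
      return NoAlias;
  }
  return MayAlias;  // this rule alone proves nothing in the remaining cases
}
```

       Returning MayAlias here only means that this particular rule does not
apply; other rules in the analyzer may still prove the accesses disjoint.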

  Misc:
  =========
       Note that I *cache* the not-addr-taken result. Unlike for a function,
it is far more expensive to figure out whether a globalvar has its address
taken, so it is not appropriate to analyze address-taken on the fly.

      On the other hand, I personally think the not-addr-taken flag is almost
maintenance free. (FIXME) Only a few optimizations could make a not-addr-taken
variable become addr-taken (say, outlining), and I don't think we currently
have such passes (FIXME again!). In the rare case that this does happen, it is
up to the pass to reset the not-addr-taken flag.

     Of course, a variable previously considered addr-taken may later be
proved to be not-addr-taken. In that case, the compiler does not have to
update the flag: leaving it unset is conservatively correct.
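
     A small sketch of the intended life cycle of the cached flag, under the
assumptions above. Global, recordAnalysis and onAddressTakenIntroduced are
hypothetical names for this example only; in the patch the flag is set in
GlobalOpt.cpp and would be cleared via setNotAddressTaken(false).

```cpp
#include <cassert>

// Toy stand-in for a global variable; not an LLVM class.
struct Global {
  // Conservative default: "address may be taken".
  bool notAddrTaken = false;
};

// Called once by the (expensive) whole-module analysis to cache its result.
void recordAnalysis(Global &G, bool addressIsTaken) {
  if (!addressIsTaken)
    G.notAddrTaken = true;
  // If the address is taken, leave the flag unset. A stale "false" is
  // conservatively correct, so it never needs to be upgraded later.
}

// Must be called by any pass that introduces an address-taking use.
void onAddressTakenIntroduced(Global &G) {
  G.notAddrTaken = false;
}
```

     Only the transition from not-addr-taken to addr-taken has to be tracked;
the opposite direction can be ignored without affecting correctness.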

  Performance impact
=============
   Measured on an oldish Mac Tower with 2x 2.26 GHz quad-core Xeons. Both the
baseline and the change were measured a couple of times. I did take a look at
why Olden/power is sped up: the loads of the static variables "P" and "Q" are
promoted in many places. I have not yet had a chance to investigate why the
pairlocalalign speedup seen with O3 disappears with O3+LTO.


o. test-suite w/ O3:
-------------------
Benchmark                          baseline   patched   change (%)
Benchmarks/Olden/power/power       1.6129     1.354     -16.0518321036642
Benchmarks/mafft/pairlocalalign    31.4211    26.5794   -15.4090722476298
Benchmarks/Ptrdist/yacr2/yacr2     0.881      0.804     -8.74006810442678

o. test-suite w/ O3 + LTO
-------------------------
Benchmark                          baseline   patched   change (%)
Benchmarks/Olden/power/power       1.6143     1.3419    -16.8741869540978
Applications/spiff/spiff           2.9203     2.849     -2.44152997979659

o. spec2kint w/ O3+LTO
----------------------
Benchmark                          baseline   patched   change (%)
bzip2                              75.02      73.92     -1.4

Thanks
Shuxin


-------------- next part --------------
Index: include/llvm/IR/GlobalVariable.h
===================================================================
--- include/llvm/IR/GlobalVariable.h	(revision 192719)
+++ include/llvm/IR/GlobalVariable.h	(working copy)
@@ -48,6 +48,7 @@
                                                // can change from its initial
                                                // value before global
                                                // initializers are run?
+  bool notAddrTaken : 1;                       // Does not have its address taken.
 
 public:
   // allocate space for exactly one operand
@@ -174,6 +175,9 @@
     isExternallyInitializedConstant = Val;
   }
 
+  void setNotAddressTaken(bool Val) { notAddrTaken = Val; }
+  bool notAddressTaken(void) const { return notAddrTaken; }
+
   /// copyAttributesFrom - copy all additional attributes (those not needed to
   /// create a GlobalVariable) from the GlobalVariable Src to this one.
   void copyAttributesFrom(const GlobalValue *Src);
Index: docs/LangRef.rst
===================================================================
--- docs/LangRef.rst	(revision 192719)
+++ docs/LangRef.rst	(working copy)
@@ -514,6 +514,9 @@
 ``@llvm.used``. This assumption may be suppressed by marking the
 variable with ``externally_initialized``.
 
+If a global variable does not have its address taken, it may optionally be
+flagged ``notaddrtaken``.
+
 An explicit alignment may be specified for a global, which must be a
 power of 2. If not present, or if the alignment is set to zero, the
 alignment of the global is set by the target to whatever it feels
Index: lib/Analysis/BasicAliasAnalysis.cpp
===================================================================
--- lib/Analysis/BasicAliasAnalysis.cpp	(revision 192719)
+++ lib/Analysis/BasicAliasAnalysis.cpp	(working copy)
@@ -1238,6 +1238,14 @@
       return NoAlias;
     if (isEscapeSource(O2) && isNonEscapingLocalObject(O1))
       return NoAlias;
+
+    if (const GlobalVariable *GV = dyn_cast<GlobalVariable>(O1))
+      if (GV->notAddressTaken())
+        return NoAlias;
+
+    if (const GlobalVariable *GV = dyn_cast<GlobalVariable>(O2))
+      if (GV->notAddressTaken())
+        return NoAlias;
   }
 
   // If the size of one access is larger than the entire object on the other
Index: lib/AsmParser/LLParser.cpp
===================================================================
--- lib/AsmParser/LLParser.cpp	(revision 192719)
+++ lib/AsmParser/LLParser.cpp	(working copy)
@@ -704,7 +704,7 @@
                            unsigned Linkage, bool HasLinkage,
                            unsigned Visibility) {
   unsigned AddrSpace;
-  bool IsConstant, UnnamedAddr, IsExternallyInitialized;
+  bool IsConstant, UnnamedAddr, IsExternallyInitialized, notAddrTaken;
   GlobalVariable::ThreadLocalMode TLM;
   LocTy UnnamedAddrLoc;
   LocTy IsExternallyInitializedLoc;
@@ -719,6 +719,7 @@
                          IsExternallyInitialized,
                          &IsExternallyInitializedLoc) ||
       ParseGlobalType(IsConstant) ||
+      ParseOptionalToken(lltok::kw_notaddrtaken, notAddrTaken) ||
       ParseType(Ty, TyLoc))
     return true;
 
@@ -776,6 +777,7 @@
   GV->setLinkage((GlobalValue::LinkageTypes)Linkage);
   GV->setVisibility((GlobalValue::VisibilityTypes)Visibility);
   GV->setExternallyInitialized(IsExternallyInitialized);
+  GV->setNotAddressTaken(notAddrTaken);
   GV->setThreadLocalMode(TLM);
   GV->setUnnamedAddr(UnnamedAddr);
 
Index: lib/AsmParser/LLLexer.cpp
===================================================================
--- lib/AsmParser/LLLexer.cpp	(revision 192719)
+++ lib/AsmParser/LLLexer.cpp	(working copy)
@@ -504,6 +504,7 @@
   KEYWORD(zeroinitializer);
   KEYWORD(undef);
   KEYWORD(null);
+  KEYWORD(notaddrtaken);
   KEYWORD(to);
   KEYWORD(tail);
   KEYWORD(target);
Index: lib/AsmParser/LLToken.h
===================================================================
--- lib/AsmParser/LLToken.h	(revision 192719)
+++ lib/AsmParser/LLToken.h	(working copy)
@@ -51,6 +51,7 @@
     kw_localdynamic, kw_initialexec, kw_localexec,
     kw_zeroinitializer,
     kw_undef, kw_null,
+    kw_notaddrtaken,
     kw_to,
     kw_tail,
     kw_target,
Index: lib/Transforms/IPO/GlobalOpt.cpp
===================================================================
--- lib/Transforms/IPO/GlobalOpt.cpp	(revision 192719)
+++ lib/Transforms/IPO/GlobalOpt.cpp	(working copy)
@@ -1922,6 +1922,8 @@
   if (AnalyzeGlobal(GV, GS, PHIUsers))
     return false;
 
+  GV->setNotAddressTaken(true);
+
   if (!GS.isCompared && !GV->hasUnnamedAddr()) {
     GV->setUnnamedAddr(true);
     NumUnnamed++;
Index: lib/Bitcode/Reader/BitcodeReader.cpp
===================================================================
--- lib/Bitcode/Reader/BitcodeReader.cpp	(revision 192719)
+++ lib/Bitcode/Reader/BitcodeReader.cpp	(working copy)
@@ -1848,6 +1848,9 @@
         new GlobalVariable(*TheModule, Ty, isConstant, Linkage, 0, "", 0,
                            TLM, AddressSpace, ExternallyInitialized);
       NewGV->setAlignment(Alignment);
+      if (Record.size() > 10)
+        NewGV->setNotAddressTaken(Record[10]);
+
       if (!Section.empty())
         NewGV->setSection(Section);
       NewGV->setVisibility(Visibility);
Index: lib/Bitcode/Writer/BitcodeWriter.cpp
===================================================================
--- lib/Bitcode/Writer/BitcodeWriter.cpp	(revision 192719)
+++ lib/Bitcode/Writer/BitcodeWriter.cpp	(working copy)
@@ -616,11 +616,13 @@
     Vals.push_back(GV->hasSection() ? SectionMap[GV->getSection()] : 0);
     if (GV->isThreadLocal() ||
         GV->getVisibility() != GlobalValue::DefaultVisibility ||
-        GV->hasUnnamedAddr() || GV->isExternallyInitialized()) {
+        GV->hasUnnamedAddr() || GV->isExternallyInitialized() ||
+        GV->notAddressTaken()) {
       Vals.push_back(getEncodedVisibility(GV));
       Vals.push_back(getEncodedThreadLocalMode(GV));
       Vals.push_back(GV->hasUnnamedAddr());
       Vals.push_back(GV->isExternallyInitialized());
+      Vals.push_back(GV->notAddressTaken());
     } else {
       AbbrevToUse = SimpleGVarAbbrev;
     }
Index: lib/IR/AsmWriter.cpp
===================================================================
--- lib/IR/AsmWriter.cpp	(revision 192719)
+++ lib/IR/AsmWriter.cpp	(working copy)
@@ -1459,6 +1459,7 @@
   if (GV->hasUnnamedAddr()) Out << "unnamed_addr ";
   if (GV->isExternallyInitialized()) Out << "externally_initialized ";
   Out << (GV->isConstant() ? "constant " : "global ");
+  if (GV->notAddressTaken()) Out << "notaddrtaken ";
   TypePrinter.print(GV->getType()->getElementType(), Out);
 
   if (GV->hasInitializer()) {
Index: lib/IR/Globals.cpp
===================================================================
--- lib/IR/Globals.cpp	(revision 192719)
+++ lib/IR/Globals.cpp	(working copy)
@@ -99,6 +99,7 @@
   }
 
   LeakDetector::addGarbageObject(this);
+  setNotAddressTaken(false);
 }
 
 GlobalVariable::GlobalVariable(Module &M, Type *Ty, bool constant,
@@ -125,6 +126,7 @@
     Before->getParent()->getGlobalList().insert(Before, this);
   else
     M.getGlobalList().push_back(this);
+  setNotAddressTaken(false);
 }
 
 void GlobalVariable::setParent(Module *parent) {
@@ -185,6 +187,7 @@
   GlobalValue::copyAttributesFrom(Src);
   const GlobalVariable *SrcVar = cast<GlobalVariable>(Src);
   setThreadLocal(SrcVar->isThreadLocal());
+  setNotAddressTaken(SrcVar->notAddressTaken());
 }
 
 

