[LLVMdev] Re: __main() function and AliasSet

Nai Xia nelson.xia at gmail.com
Tue May 16 23:36:30 PDT 2006


On Tuesday 16 May 2006 03:19, Chris Lattner wrote:
> On Mon, 15 May 2006, Nai Xia wrote:
> 
> > In other words, if I only use -steens-aa and the data_XXXs are all 
> > external global variables( and so inComplete ),
> 
> Sounds right!
> 
> > the call to printf will 
> > make the same effect, which I have tested it.
> >
> > Am I right ? :)
> 
> If you've tested it then, yes you're right :).  I haven't played with this 
> stuff for a long time, but I was pretty sure that steens-aa and ds-aa had 
> a special case for printf so that it thinks that printf does not mod/ref 
> *anything*, including external globals.

Unfortunately, I did not locate the lines in steens-aa for "printf" special case.
In ds-aa, I found the lines below: 
--------------------------------
if (F->getName() == "printf" || F->getName() == "fprintf" ||
                   F->getName() == "sprintf") {
          CallSite::arg_iterator AI = CS.arg_begin(), E = CS.arg_end();

          if (F->getName() == "fprintf") {
            // fprintf reads and writes the FILE argument, and applies the type
            // to it.
            DSNodeHandle H = getValueDest(**AI);
            if (DSNode *N = H.getNode()) {
              N->setModifiedMarker();
              const Type *ArgTy = (*AI)->getType();
              if (const PointerType *PTy = dyn_cast<PointerType>(ArgTy))
                N->mergeTypeInfo(PTy->getElementType(), H.getOffset());
            }
          } else if (F->getName() == "sprintf") {
            // sprintf writes the first string argument.
            DSNodeHandle H = getValueDest(**AI++);
            if (DSNode *N = H.getNode()) {
              N->setModifiedMarker();
              const Type *ArgTy = (*AI)->getType();
              if (const PointerType *PTy = dyn_cast<PointerType>(ArgTy))
                N->mergeTypeInfo(PTy->getElementType(), H.getOffset());
            }
          }

          for (; AI != E; ++AI) {
            // printf reads all pointer arguments.
            if (isPointerType((*AI)->getType()))
              if (DSNode *N = getValueDest(**AI).getNode())
                N->setReadMarker();
          }
-----------------------------
So from my point of view, the ds-aa thinks "printf reads all pointer arguments", 
but does not take into account the "%n" format string , which may cause "printf" MOD to a specific address.

As for steens-aa, I think the logic for mod/ref is very clear:
---------------------
AliasAnalysis::ModRefResult
Steens::getModRefInfo(CallSite CS, Value *P, unsigned Size) {
  AliasAnalysis::ModRefResult Result = ModRef;

  // Find the node in question.
  DSGraph::ScalarMapTy &GSM = ResultGraph->getScalarMap();
  DSGraph::ScalarMapTy::iterator I = GSM.find(P);

  if (I != GSM.end() && !I->second.isNull()) {
    DSNode *N = I->second.getNode();
    if (N->isComplete()) {
      // If this is a direct call to an external function, and if the pointer
      // points to a complete node, the external function cannot modify or read
      // the value (we know it's not passed out of the program!).
      if (Function *F = CS.getCalledFunction())
        if (F->isExternal())
          return NoModRef;

      // Otherwise, if the node is complete, but it is only M or R, return this.
      // This can be useful for globals that should be marked const but are not.
      if (!N->isModified())
        Result = (ModRefResult)(Result & ~Mod);
      if (!N->isRead())
        Result = (ModRefResult)(Result & ~Ref);
    }
  }

  return (ModRefResult)(Result & AliasAnalysis::getModRefInfo(CS, P, Size));
}
---------------------
So steens-aa alone thinks "printf" ModRef *anything* that is inComplete(e.g. external globalVar) .

Did I missed anything? 
Again, thanks for your patience :)

> 
> My guess is that steens-aa gives up earlier because it's an external 
> global and it's an external function call or something.  You'd have to 
> trace through the logic of the code to be sure.
> 
> -Chris
> 
> > On Monday 15 May 2006 12:52, Chris Lattner wrote:
> >> On Mon, 15 May 2006, Nai Xia wrote:
> >>> Thank you very much for your detailed help.
> >>> You are definitely a good man. :)
> >>> I feel so much to learn.
> >>
> >> Happy to help!
> >>
> >> -Chris
> >>
> >>> On Monday 15 May 2006 04:07, you wrote:
> >>>> On Sun, 14 May 2006, Nai XIA wrote:
> >>>>> Oh, I appologize that I should not have asked about __main() ---- it appears
> >>>>> in FAQ.
> >>>>> But the question remains that why call to __main() can alias stack location?
> >>>>> I think the memory location pointed by data_X pointers are not visible to
> >>>>> __main().
> >>>>> In comparison, calls to printf() do not have similar effect.
> >>>>
> >>>> First, some background: -steens-aa and -anders-aa work reasonable well,
> >>>> but aren't production quality.  In particular, they both assume that
> >>>> printf doesn't have side effects, when (in fact) printf can on certain GNU
> >>>> systems when the right format string is used.  This is why they both think
> >>>> that printf has no side effects: they special case it.
> >>>>
> >>>> In practice, aliasing is a bunch of heuristics, and you cannot ever be
> >>>> guaranteed to get an exact answer.  As an example of this, both of these
> >>>> passes are "context insensitive".  As such, they don't know anything about
> >>>> what the effect of a call is, so the call to __main (even though they
> >>>> could theoretically know) is treated quite conservatively.
> >>>>
> >>>> There are a couple of different options you have here.  The alias passes
> >>>> can be combined together, so something like this:
> >>>>
> >>>> opt -globalsmodref-aa -steens-aa ...
> >>>>
> >>>> should be able to tell that __main has no side effects.  globalsmodref-aa
> >>>> is a production quality pass that does some simple context sensitive
> >>>> analysis (such as noticing functions with no side effects at all).
> >>>>
> >>>> Another option is the -ds-aa pass.  This pass is very powerful, but is
> >>>> also the farthest from production quality.  That said, it does get almost
> >>>> all common things right, it just has some bugs in areas like variable
> >>>> length arrays etc.
> >>>>
> >>>> -Chris
> >>>>
> >>>>
> >>>>> On 5/14/06, Nai Xia <nelson.xia at gmail.com> wrote:
> >>>>>>
> >>>>>> In a code segment of my pass plugin, I try to gather AliasSets for all
> >>>>>> StoreInst, LoadInst and CallInst instructions in a function.
> >>>>>> Some behaviors of the pass puzzled me.
> >>>>>> Below is the *.ll of the test program which I run the pass on,
> >>>>>> it was get with "llvm-gcc -Wl,--disable-opt" from a rather simple *.c
> >>>>>> program.
> >>>>>>
> >>>>>> ----------------------------------
> >>>>>> ; ModuleID = 'ptralias.bc'
> >>>>>> target endian = little
> >>>>>> target pointersize = 32
> >>>>>> target triple = "i686-pc-linux-gnu"
> >>>>>> deplibs = [ "c", "crtend" ]
> >>>>>> %.str_1 = internal constant [25 x sbyte] c"ptra=0x ptrb=0x
> >>>>>> ptrc=0x\0A\00"               ; <[25 x sbyte]*> [#uses=1]
> >>>>>> %ptr = weak global void ()* null                ; <void ()**> [#uses=0]
> >>>>>>
> >>>>>> implementation   ; Functions:
> >>>>>>
> >>>>>> declare int %printf(sbyte*, ...)
> >>>>>>
> >>>>>> void %foo1() {
> >>>>>>         ret void
> >>>>>> }
> >>>>>>
> >>>>>> void %foo2() {
> >>>>>>         ret void
> >>>>>> }
> >>>>>>
> >>>>>> int %main(int %argc, sbyte** %argv) {
> >>>>>> entry:
> >>>>>>         %data_b = alloca int            ; <int*> [#uses=2]
> >>>>>>         %data_c = alloca int            ; <int*> [#uses=1]
> >>>>>>         %data_d = alloca int            ; <int*> [#uses=3]
> >>>>>>         %data_e = alloca int            ; <int*> [#uses=2]
> >>>>>>         %data_f = alloca int            ; <int*> [#uses=2]
> >>>>>>         call void %__main( )
> >>>>>>         store int 2, int* %data_b
> >>>>>>         store int 3, int* %data_c
> >>>>>>         store int 4, int* %data_d
> >>>>>>         store int 5, int* %data_e
> >>>>>>         store int 6, int* %data_f
> >>>>>>         switch int %argc, label %switchexit [
> >>>>>>                  int 3, label %label.3
> >>>>>>                  int 2, label %then.2
> >>>>>>                  int 1, label %label.1
> >>>>>>                  int 0, label %endif.2
> >>>>>>         ]
> >>>>>>
> >>>>>> label.1:        ; preds = %entry
> >>>>>>         br label %switchexit
> >>>>>>
> >>>>>> label.3:                ; preds = %entry
> >>>>>>         br label %then.2
> >>>>>>
> >>>>>> switchexit:             ; preds = %label.1, %entry
> >>>>>>         %ptr_b.0 = phi int* [ %data_d, %label.1 ], [ null, %entry ]     ;
> >>>>>> <int*>  [#uses=1]
> >>>>>>         br label %endif.2
> >>>>>>
> >>>>>> then.2:         ; preds = %label.3, %entry
> >>>>>>         %ptr_a.1.0 = phi int* [ %data_f, %label.3 ], [ %data_e, %entry
> >>>>>> ]                ; <int*> [#uses=1]
> >>>>>>         store int 0, int* %ptr_a.1.0
> >>>>>>         br label %then.3
> >>>>>>
> >>>>>> endif.2:                ; preds = %switchexit, %entry
> >>>>>>         %ptr_b.0.1 = phi int* [ %ptr_b.0, %switchexit ], [ %data_b, %entry
> >>>>>> ]            ; <int*> [#uses=2]
> >>>>>>         %tmp.12 = seteq int* %ptr_b.0.1, null           ; <bool> [#uses=1]
> >>>>>>         br bool %tmp.12, label %then.4, label %then.3
> >>>>>>
> >>>>>> then.3:         ; preds = %endif.2, %then.2
> >>>>>>         %ptr_b.0.2 = phi int* [ %data_d, %then.2 ], [ %ptr_b.0.1, %endif.2
> >>>>>> ]            ; <int*> [#uses=1]
> >>>>>>         store int 0, int* %ptr_b.0.2
> >>>>>>         %tmp.1913 = call int (sbyte*, ...)* %printf( sbyte* getelementptr
> >>>>>> ([25 x sbyte]* %.str_1, int 0, int 0) )               ; <int> [#uses=0]
> >>>>>>         ret int 0
> >>>>>>
> >>>>>> then.4:         ; preds = %endif.2
> >>>>>>         %tmp.19 = call int (sbyte*, ...)* %printf( sbyte* getelementptr
> >>>>>> ([25 x sbyte]* %.str_1, int 0, int 0) )         ; <int> [#uses=0]
> >>>>>>         ret int 0
> >>>>>> }
> >>>>>>
> >>>>>> void %__main() {
> >>>>>> entry:
> >>>>>>         ret void
> >>>>>> }
> >>>>>>
> >>>>>> ----------------------------------
> >>>>>> I think the right AliasSet information calculated for this program should
> >>>>>> be
> >>>>>>
> >>>>>>   Information for alias set0:
> >>>>>>       pointer0=data_b
> >>>>>>       pointer1=data_d
> >>>>>>       pointer2=ptr_b.0.2
> >>>>>>   Information for alias set1:
> >>>>>>       pointer0=data_c
> >>>>>>   Information for alias set2:
> >>>>>>   Information for alias set3:
> >>>>>>       pointer0=data_e
> >>>>>>       pointer1=data_f
> >>>>>>       pointer2=ptr_a.1.0
> >>>>>>   Information for alias set4:
> >>>>>>   Information for alias set5:
> >>>>>>
> >>>>>> ,where the empty AliasSets I think should be "Forwarded".
> >>>>>>
> >>>>>> However, the result of the pass was:
> >>>>>>
> >>>>>>   Information for alias set0:
> >>>>>>     pointer0=data_b
> >>>>>>     pointer1=data_d
> >>>>>>     pointer2=data_e
> >>>>>>     pointer3=data_f
> >>>>>>     pointer4=ptr_a.1.0
> >>>>>>     pointer5=ptr_b.0.2
> >>>>>>   Information for alias set1:
> >>>>>>     pointer0=data_c
> >>>>>> After I deleted "call void %__main( )" in %main(), the pass get the right
> >>>>>> answer.
> >>>>>>
> >>>>>> So my question is:
> >>>>>>
> >>>>>> 1. What is the purpose for call to __main() ? __main() is just a empty
> >>>>>> function. My .c program only contains main().
> >>>>>> 2. My explanation for this behavior is that the CallSite of "call void
> >>>>>> %__main( )" alias some pointers in the program,
> >>>>>>    however, __main() is obviously empty, why should the AliasAnalysis
> >>>>>> think that it may/must Mod/Ref any stack location of main()?
> >>>>>>
> >>>>>> btw: the AliasAnalysis pass I used is -steens-aa and it also the same with
> >>>>>> -anders-aa.
> >>>>>>
> >>>>>> --
> >>>>>> Regards,
> >>>>>> Nai
> >>>>>>
> >>>>>
> >>>>
> >>>> -Chris
> >>>>
> >>>
> >>>
> >>
> >> -Chris
> >>
> >
> >
> 
> -Chris
> 

-- 
Regards,
Nai




More information about the llvm-dev mailing list