[LLVMdev] Re: __main() function and AliasSet

Nai Xia nelson.xia at gmail.com
Sat May 20 21:11:12 PDT 2006


On Thursday 18 May 2006 00:51, Chris Lattner wrote:
> On Wed, 17 May 2006, Nai Xia wrote:
> > Unfortunately, I could not locate the lines in steens-aa for the "printf" special case.
> > In ds-aa, I found the lines below:
> 
> Right, steens-aa and ds-aa share code for "local analysis", they just 
> stitch it together into an interprocedural analysis in different ways. 
> The code below is used for steens-aa.
> 
> > --------------------------------
> > if (F->getName() == "printf" || F->getName() == "fprintf" ||
> >                   F->getName() == "sprintf") {
> >          CallSite::arg_iterator AI = CS.arg_begin(), E = CS.arg_end();
> ...
> >          for (; AI != E; ++AI) {
> >            // printf reads all pointer arguments.
> >            if (isPointerType((*AI)->getType()))
> >              if (DSNode *N = getValueDest(**AI).getNode())
> >                N->setReadMarker();
> >          }
> > -----------------------------
> > So from my point of view, the ds-aa thinks "printf reads all pointer arguments",
> 
> Right.
> 
> > but does not take into account the "%n" format specifier, which can make
> > "printf" MOD a specific address.
> 
> Right.  As I mentioned before, steens-aa is not "production quality".  For 
> example, it doesn't handle this.  The right approach would change this 
> code to:
> 
> 1. Check to see if there is a constant format string.  If not, handle it 
> conservatively.
> 2. Scan the format string, matching up format specs to the arguments.  Set 
> the mod/ref bits for each thing as appropriate.
> 
> In addition to getting the mod bit right for %n, this would allow the ref 
> bit to not be set for %p, for example.

Thanks for your advice. :-)
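
To check that I understand the two steps above, here is a minimal sketch of
the format-string scan in plain C++. It is not LLVM code: ArgEffect and
classifyPrintfArgs are names I made up for illustration, and it assumes the
constant format string has already been extracted from the call. For each
variadic argument it records whether printf would read through the pointer
(%s), write through it (%n), or only use the value itself (%d, %p, and so
on). Anything it does not recognize makes it return false, so the caller can
fall back to the conservative handling.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical names for illustration only; none of this is LLVM API.
enum class ArgEffect { None, Ref, Mod };

// Scans a constant printf format string and appends one entry to Effects per
// variadic argument it consumes. Returns false if the string contains a
// conversion this sketch does not understand (including a truncated spec or
// positional "$" arguments); the caller should then treat the call
// conservatively.
bool classifyPrintfArgs(const std::string &Fmt, std::vector<ArgEffect> &Effects) {
  const std::string FlagsWidthPrec = "-+ #0123456789.*";
  const std::string LengthMods = "hlLqjzt";
  for (std::size_t I = 0; I < Fmt.size(); ++I) {
    if (Fmt[I] != '%')
      continue;
    ++I;
    if (I < Fmt.size() && Fmt[I] == '%')
      continue; // "%%" prints a literal percent sign and consumes no argument
    // Skip flags, width and precision; each '*' consumes an int argument.
    while (I < Fmt.size() && FlagsWidthPrec.find(Fmt[I]) != std::string::npos) {
      if (Fmt[I] == '*')
        Effects.push_back(ArgEffect::None);
      ++I;
    }
    // Skip length modifiers such as "l", "ll" or "h".
    while (I < Fmt.size() && LengthMods.find(Fmt[I]) != std::string::npos)
      ++I;
    if (I >= Fmt.size())
      return false; // format string ends in the middle of a spec
    switch (Fmt[I]) {
    case 's':
      Effects.push_back(ArgEffect::Ref); // reads the pointed-to string
      break;
    case 'n':
      Effects.push_back(ArgEffect::Mod); // stores the output count through the pointer
      break;
    case 'd': case 'i': case 'o': case 'u': case 'x': case 'X':
    case 'c': case 'e': case 'E': case 'f': case 'F':
    case 'g': case 'G': case 'a': case 'A': case 'p':
      Effects.push_back(ArgEffect::None); // only the value is used; %p is not dereferenced
      break;
    default:
      return false; // unknown conversion: be conservative
    }
  }
  return true;
}
```

For "%d %s %p %n" this gives None, Ref, None, Mod, so only the %s argument
would need the read marker and only the %n argument would be marked as
modified.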

> 
> > As for steens-aa, I think the logic for mod/ref is very clear:
> 
> Right, this is the logic for querying the graph built by steens.
> 
> > So steens-aa alone thinks "printf" may ModRef *anything* that is incomplete (e.g. an external global variable).
> 
> Makes sense.
> 
> -Chris
> 
> >>
> >> My guess is that steens-aa gives up earlier because it's an external
> >> global and it's an external function call or something.  You'd have to
> >> trace through the logic of the code to be sure.
> >>
> >> -Chris
> >>
> >>> On Monday 15 May 2006 12:52, Chris Lattner wrote:
> >>>> On Mon, 15 May 2006, Nai Xia wrote:
> >>>>> Thank you very much for your detailed help.
> >>>>> You are definitely a good man. :)
> >>>>> I feel I have so much to learn.
> >>>>
> >>>> Happy to help!
> >>>>
> >>>> -Chris
> >>>>
> >>>>> On Monday 15 May 2006 04:07, you wrote:
> >>>>>> On Sun, 14 May 2006, Nai XIA wrote:
> >>>>>>> Oh, I apologize; I should not have asked about __main(), since it is
> >>>>>>> covered in the FAQ.
> >>>>>>> But the question remains: why can a call to __main() alias a stack location?
> >>>>>>> I think the memory locations pointed to by the data_X pointers are not
> >>>>>>> visible to __main().
> >>>>>>> In comparison, calls to printf() do not have a similar effect.
> >>>>>>
> >>>>>> First, some background: -steens-aa and -anders-aa work reasonably well,
> >>>>>> but aren't production quality.  In particular, they both special-case
> >>>>>> printf and assume that it has no side effects, when (in fact) printf can
> >>>>>> have side effects on certain GNU systems when the right format string is
> >>>>>> used.
> >>>>>>
> >>>>>> In practice, aliasing is a bunch of heuristics, and you cannot ever be
> >>>>>> guaranteed to get an exact answer.  As an example of this, both of these
> >>>>>> passes are "context insensitive".  As such, they don't know anything about
> >>>>>> what the effect of a call is, so the call to __main (even though they
> >>>>>> could theoretically know) is treated quite conservatively.
> >>>>>>
> >>>>>> There are a couple of different options you have here.  The alias passes
> >>>>>> can be combined together, so something like this:
> >>>>>>
> >>>>>> opt -globalsmodref-aa -steens-aa ...
> >>>>>>
> >>>>>> should be able to tell that __main has no side effects.  globalsmodref-aa
> >>>>>> is a production quality pass that does some simple context sensitive
> >>>>>> analysis (such as noticing functions with no side effects at all).
> >>>>>>
> >>>>>> Another option is the -ds-aa pass.  This pass is very powerful, but is
> >>>>>> also the farthest from production quality.  That said, it does get almost
> >>>>>> all common things right, it just has some bugs in areas like variable
> >>>>>> length arrays etc.
> >>>>>>
> >>>>>> -Chris
> >>>>>>
> >>>>>>
> >>>>>>> On 5/14/06, Nai Xia <nelson.xia at gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> In a code segment of my pass plugin, I try to gather AliasSets for all
> >>>>>>>> StoreInst, LoadInst and CallInst instructions in a function.
> >>>>>>>> Some behaviors of the pass puzzled me.
> >>>>>>>> Below is the *.ll of the test program on which I ran the pass;
> >>>>>>>> it was obtained with "llvm-gcc -Wl,--disable-opt" from a rather simple *.c
> >>>>>>>> program.
> >>>>>>>>
> >>>>>>>> ----------------------------------
> >>>>>>>> ; ModuleID = 'ptralias.bc'
> >>>>>>>> target endian = little
> >>>>>>>> target pointersize = 32
> >>>>>>>> target triple = "i686-pc-linux-gnu"
> >>>>>>>> deplibs = [ "c", "crtend" ]
> >>>>>>>> %.str_1 = internal constant [25 x sbyte] c"ptra=0x ptrb=0x ptrc=0x\0A\00"               ; <[25 x sbyte]*> [#uses=1]
> >>>>>>>> %ptr = weak global void ()* null                ; <void ()**> [#uses=0]
> >>>>>>>>
> >>>>>>>> implementation   ; Functions:
> >>>>>>>>
> >>>>>>>> declare int %printf(sbyte*, ...)
> >>>>>>>>
> >>>>>>>> void %foo1() {
> >>>>>>>>         ret void
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> void %foo2() {
> >>>>>>>>         ret void
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> int %main(int %argc, sbyte** %argv) {
> >>>>>>>> entry:
> >>>>>>>>         %data_b = alloca int            ; <int*> [#uses=2]
> >>>>>>>>         %data_c = alloca int            ; <int*> [#uses=1]
> >>>>>>>>         %data_d = alloca int            ; <int*> [#uses=3]
> >>>>>>>>         %data_e = alloca int            ; <int*> [#uses=2]
> >>>>>>>>         %data_f = alloca int            ; <int*> [#uses=2]
> >>>>>>>>         call void %__main( )
> >>>>>>>>         store int 2, int* %data_b
> >>>>>>>>         store int 3, int* %data_c
> >>>>>>>>         store int 4, int* %data_d
> >>>>>>>>         store int 5, int* %data_e
> >>>>>>>>         store int 6, int* %data_f
> >>>>>>>>         switch int %argc, label %switchexit [
> >>>>>>>>                  int 3, label %label.3
> >>>>>>>>                  int 2, label %then.2
> >>>>>>>>                  int 1, label %label.1
> >>>>>>>>                  int 0, label %endif.2
> >>>>>>>>         ]
> >>>>>>>>
> >>>>>>>> label.1:        ; preds = %entry
> >>>>>>>>         br label %switchexit
> >>>>>>>>
> >>>>>>>> label.3:                ; preds = %entry
> >>>>>>>>         br label %then.2
> >>>>>>>>
> >>>>>>>> switchexit:             ; preds = %label.1, %entry
> >>>>>>>>         %ptr_b.0 = phi int* [ %data_d, %label.1 ], [ null, %entry ]     ; <int*>  [#uses=1]
> >>>>>>>>         br label %endif.2
> >>>>>>>>
> >>>>>>>> then.2:         ; preds = %label.3, %entry
> >>>>>>>>         %ptr_a.1.0 = phi int* [ %data_f, %label.3 ], [ %data_e, %entry
> >>>>>>>> ]                ; <int*> [#uses=1]
> >>>>>>>>         store int 0, int* %ptr_a.1.0
> >>>>>>>>         br label %then.3
> >>>>>>>>
> >>>>>>>> endif.2:                ; preds = %switchexit, %entry
> >>>>>>>>         %ptr_b.0.1 = phi int* [ %ptr_b.0, %switchexit ], [ %data_b, %entry
> >>>>>>>> ]            ; <int*> [#uses=2]
> >>>>>>>>         %tmp.12 = seteq int* %ptr_b.0.1, null           ; <bool> [#uses=1]
> >>>>>>>>         br bool %tmp.12, label %then.4, label %then.3
> >>>>>>>>
> >>>>>>>> then.3:         ; preds = %endif.2, %then.2
> >>>>>>>>         %ptr_b.0.2 = phi int* [ %data_d, %then.2 ], [ %ptr_b.0.1, %endif.2
> >>>>>>>> ]            ; <int*> [#uses=1]
> >>>>>>>>         store int 0, int* %ptr_b.0.2
> >>>>>>>>         %tmp.1913 = call int (sbyte*, ...)* %printf( sbyte* getelementptr
> >>>>>>>> ([25 x sbyte]* %.str_1, int 0, int 0) )               ; <int> [#uses=0]
> >>>>>>>>         ret int 0
> >>>>>>>>
> >>>>>>>> then.4:         ; preds = %endif.2
> >>>>>>>>         %tmp.19 = call int (sbyte*, ...)* %printf( sbyte* getelementptr
> >>>>>>>> ([25 x sbyte]* %.str_1, int 0, int 0) )         ; <int> [#uses=0]
> >>>>>>>>         ret int 0
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> void %__main() {
> >>>>>>>> entry:
> >>>>>>>>         ret void
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> ----------------------------------
> >>>>>>>> I think the right AliasSet information calculated for this program should
> >>>>>>>> be
> >>>>>>>>
> >>>>>>>>   Information for alias set0:
> >>>>>>>>       pointer0=data_b
> >>>>>>>>       pointer1=data_d
> >>>>>>>>       pointer2=ptr_b.0.2
> >>>>>>>>   Information for alias set1:
> >>>>>>>>       pointer0=data_c
> >>>>>>>>   Information for alias set2:
> >>>>>>>>   Information for alias set3:
> >>>>>>>>       pointer0=data_e
> >>>>>>>>       pointer1=data_f
> >>>>>>>>       pointer2=ptr_a.1.0
> >>>>>>>>   Information for alias set4:
> >>>>>>>>   Information for alias set5:
> >>>>>>>>
> >>>>>>>> where I think the empty AliasSets should be "Forwarded".
> >>>>>>>>
> >>>>>>>> However, the result of the pass was:
> >>>>>>>>
> >>>>>>>>   Information for alias set0:
> >>>>>>>>     pointer0=data_b
> >>>>>>>>     pointer1=data_d
> >>>>>>>>     pointer2=data_e
> >>>>>>>>     pointer3=data_f
> >>>>>>>>     pointer4=ptr_a.1.0
> >>>>>>>>     pointer5=ptr_b.0.2
> >>>>>>>>   Information for alias set1:
> >>>>>>>>     pointer0=data_c
> >>>>>>>> After I deleted "call void %__main( )" in %main(), the pass gave the right
> >>>>>>>> answer.
> >>>>>>>>
> >>>>>>>> So my question is:
> >>>>>>>>
> >>>>>>>> 1. What is the purpose of the call to __main()? __main() is just an empty
> >>>>>>>> function. My .c program only contains main().
> >>>>>>>> 2. My explanation for this behavior is that the CallSite of "call void
> >>>>>>>> %__main( )" aliases some pointers in the program.
> >>>>>>>>    However, __main() is obviously empty, so why should the AliasAnalysis
> >>>>>>>> think that it may/must Mod/Ref any stack location of main()?
> >>>>>>>>
> >>>>>>>> By the way, the AliasAnalysis pass I used is -steens-aa, and the result is
> >>>>>>>> the same with -anders-aa.
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Regards,
> >>>>>>>> Nai
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> -Chris
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>> -Chris
> >>>>
> >>>
> >>>
> >>
> >> -Chris
> >>
> >
> >
> 
> -Chris
> 

-- 
Regards,
Nai



