[LLVMdev] Re: __main() function and AliasSet
Chris Lattner
sabre at nondot.org
Wed May 17 09:51:35 PDT 2006
On Wed, 17 May 2006, Nai Xia wrote:
> Unfortunately, I did not locate the lines in steens-aa for "printf" special case.
> In ds-aa, I found the lines below:
Right, steens-aa and ds-aa share code for "local analysis", they just
stitch it together into an interprocedural analysis in different ways.
The code below is used for steens-aa.
> --------------------------------
> if (F->getName() == "printf" || F->getName() == "fprintf" ||
> F->getName() == "sprintf") {
> CallSite::arg_iterator AI = CS.arg_begin(), E = CS.arg_end();
...
> for (; AI != E; ++AI) {
> // printf reads all pointer arguments.
> if (isPointerType((*AI)->getType()))
> if (DSNode *N = getValueDest(**AI).getNode())
> N->setReadMarker();
> }
> -----------------------------
> So from my point of view, the ds-aa thinks "printf reads all pointer arguments",
Right.
> but does not take into account the "%n" format string , which may cause
> "printf" MOD to a specific address.
Right. As I mentioned before, steens-aa is not "production quality". For
example, it doesn't handle this. The right approach would change this
code to:
1. Check to see if there is a constant format string. If not, handle it
conservatively.
2. Scan the format string, matching up format specs to the arguments. Set
the mod/ref bits for each thing as appropriate.
In addition to getting the mod bit right for %n, this would allow the ref
bit to not be set for %p, for example.
> As for steens-aa, I think the logic for mod/ref is very clear:
Right, this is the logic for querying the graph built by steens.
> So steens-aa alone thinks "printf" ModRef *anything* that is inComplete(e.g. external globalVar) .
Makes sense.
-Chris
>>
>> My guess is that steens-aa gives up earlier because it's an external
>> global and it's an external function call or something. You'd have to
>> trace through the logic of the code to be sure.
>>
>> -Chris
>>
>>> On Monday 15 May 2006 12:52, Chris Lattner wrote:
>>>> On Mon, 15 May 2006, Nai Xia wrote:
>>>>> Thank you very much for your detailed help.
>>>>> You are definitely a good man. :)
>>>>> I feel so much to learn.
>>>>
>>>> Happy to help!
>>>>
>>>> -Chris
>>>>
>>>>> On Monday 15 May 2006 04:07, you wrote:
>>>>>> On Sun, 14 May 2006, Nai XIA wrote:
>>>>>>> Oh, I appologize that I should not have asked about __main() ---- it appears
>>>>>>> in FAQ.
>>>>>>> But the question remains that why call to __main() can alias stack location?
>>>>>>> I think the memory location pointed by data_X pointers are not visible to
>>>>>>> __main().
>>>>>>> In comparison, calls to printf() do not have similar effect.
>>>>>>
>>>>>> First, some background: -steens-aa and -anders-aa work reasonable well,
>>>>>> but aren't production quality. In particular, they both assume that
>>>>>> printf doesn't have side effects, when (in fact) printf can on certain GNU
>>>>>> systems when the right format string is used. This is why they both think
>>>>>> that printf has no side effects: they special case it.
>>>>>>
>>>>>> In practice, aliasing is a bunch of heuristics, and you cannot ever be
>>>>>> guaranteed to get an exact answer. As an example of this, both of these
>>>>>> passes are "context insensitive". As such, they don't know anything about
>>>>>> what the effect of a call is, so the call to __main (even though they
>>>>>> could theoretically know) is treated quite conservatively.
>>>>>>
>>>>>> There are a couple of different options you have here. The alias passes
>>>>>> can be combined together, so something like this:
>>>>>>
>>>>>> opt -globalsmodref-aa -steens-aa ...
>>>>>>
>>>>>> should be able to tell that __main has no side effects. globalsmodref-aa
>>>>>> is a production quality pass that does some simple context sensitive
>>>>>> analysis (such as noticing functions with no side effects at all).
>>>>>>
>>>>>> Another option is the -ds-aa pass. This pass is very powerful, but is
>>>>>> also the farthest from production quality. That said, it does get almost
>>>>>> all common things right, it just has some bugs in areas like variable
>>>>>> length arrays etc.
>>>>>>
>>>>>> -Chris
>>>>>>
>>>>>>
>>>>>>> On 5/14/06, Nai Xia <nelson.xia at gmail.com> wrote:
>>>>>>>>
>>>>>>>> In a code segment of my pass plugin, I try to gather AliasSets for all
>>>>>>>> StoreInst, LoadInst and CallInst instructions in a function.
>>>>>>>> Some behaviors of the pass puzzled me.
>>>>>>>> Below is the *.ll of the test program which I run the pass on,
>>>>>>>> it was get with "llvm-gcc -Wl,--disable-opt" from a rather simple *.c
>>>>>>>> program.
>>>>>>>>
>>>>>>>> ----------------------------------
>>>>>>>> ; ModuleID = 'ptralias.bc'
>>>>>>>> target endian = little
>>>>>>>> target pointersize = 32
>>>>>>>> target triple = "i686-pc-linux-gnu"
>>>>>>>> deplibs = [ "c", "crtend" ]
>>>>>>>> %.str_1 = internal constant [25 x sbyte] c"ptra=0x ptrb=0x
>>>>>>>> ptrc=0x\0A\00" ; <[25 x sbyte]*> [#uses=1]
>>>>>>>> %ptr = weak global void ()* null ; <void ()**> [#uses=0]
>>>>>>>>
>>>>>>>> implementation ; Functions:
>>>>>>>>
>>>>>>>> declare int %printf(sbyte*, ...)
>>>>>>>>
>>>>>>>> void %foo1() {
>>>>>>>> ret void
>>>>>>>> }
>>>>>>>>
>>>>>>>> void %foo2() {
>>>>>>>> ret void
>>>>>>>> }
>>>>>>>>
>>>>>>>> int %main(int %argc, sbyte** %argv) {
>>>>>>>> entry:
>>>>>>>> %data_b = alloca int ; <int*> [#uses=2]
>>>>>>>> %data_c = alloca int ; <int*> [#uses=1]
>>>>>>>> %data_d = alloca int ; <int*> [#uses=3]
>>>>>>>> %data_e = alloca int ; <int*> [#uses=2]
>>>>>>>> %data_f = alloca int ; <int*> [#uses=2]
>>>>>>>> call void %__main( )
>>>>>>>> store int 2, int* %data_b
>>>>>>>> store int 3, int* %data_c
>>>>>>>> store int 4, int* %data_d
>>>>>>>> store int 5, int* %data_e
>>>>>>>> store int 6, int* %data_f
>>>>>>>> switch int %argc, label %switchexit [
>>>>>>>> int 3, label %label.3
>>>>>>>> int 2, label %then.2
>>>>>>>> int 1, label %label.1
>>>>>>>> int 0, label %endif.2
>>>>>>>> ]
>>>>>>>>
>>>>>>>> label.1: ; preds = %entry
>>>>>>>> br label %switchexit
>>>>>>>>
>>>>>>>> label.3: ; preds = %entry
>>>>>>>> br label %then.2
>>>>>>>>
>>>>>>>> switchexit: ; preds = %label.1, %entry
>>>>>>>> %ptr_b.0 = phi int* [ %data_d, %label.1 ], [ null, %entry ] ;
>>>>>>>> <int*> [#uses=1]
>>>>>>>> br label %endif.2
>>>>>>>>
>>>>>>>> then.2: ; preds = %label.3, %entry
>>>>>>>> %ptr_a.1.0 = phi int* [ %data_f, %label.3 ], [ %data_e, %entry
>>>>>>>> ] ; <int*> [#uses=1]
>>>>>>>> store int 0, int* %ptr_a.1.0
>>>>>>>> br label %then.3
>>>>>>>>
>>>>>>>> endif.2: ; preds = %switchexit, %entry
>>>>>>>> %ptr_b.0.1 = phi int* [ %ptr_b.0, %switchexit ], [ %data_b, %entry
>>>>>>>> ] ; <int*> [#uses=2]
>>>>>>>> %tmp.12 = seteq int* %ptr_b.0.1, null ; <bool> [#uses=1]
>>>>>>>> br bool %tmp.12, label %then.4, label %then.3
>>>>>>>>
>>>>>>>> then.3: ; preds = %endif.2, %then.2
>>>>>>>> %ptr_b.0.2 = phi int* [ %data_d, %then.2 ], [ %ptr_b.0.1, %endif.2
>>>>>>>> ] ; <int*> [#uses=1]
>>>>>>>> store int 0, int* %ptr_b.0.2
>>>>>>>> %tmp.1913 = call int (sbyte*, ...)* %printf( sbyte* getelementptr
>>>>>>>> ([25 x sbyte]* %.str_1, int 0, int 0) ) ; <int> [#uses=0]
>>>>>>>> ret int 0
>>>>>>>>
>>>>>>>> then.4: ; preds = %endif.2
>>>>>>>> %tmp.19 = call int (sbyte*, ...)* %printf( sbyte* getelementptr
>>>>>>>> ([25 x sbyte]* %.str_1, int 0, int 0) ) ; <int> [#uses=0]
>>>>>>>> ret int 0
>>>>>>>> }
>>>>>>>>
>>>>>>>> void %__main() {
>>>>>>>> entry:
>>>>>>>> ret void
>>>>>>>> }
>>>>>>>>
>>>>>>>> ----------------------------------
>>>>>>>> I think the right AliasSet information calculated for this program should
>>>>>>>> be
>>>>>>>>
>>>>>>>> Information for alias set0:
>>>>>>>> pointer0=data_b
>>>>>>>> pointer1=data_d
>>>>>>>> pointer2=ptr_b.0.2
>>>>>>>> Information for alias set1:
>>>>>>>> pointer0=data_c
>>>>>>>> Information for alias set2:
>>>>>>>> Information for alias set3:
>>>>>>>> pointer0=data_e
>>>>>>>> pointer1=data_f
>>>>>>>> pointer2=ptr_a.1.0
>>>>>>>> Information for alias set4:
>>>>>>>> Information for alias set5:
>>>>>>>>
>>>>>>>> ,where the empty AliasSets I think should be "Forwarded".
>>>>>>>>
>>>>>>>> However, the result of the pass was:
>>>>>>>>
>>>>>>>> Information for alias set0:
>>>>>>>> pointer0=data_b
>>>>>>>> pointer1=data_d
>>>>>>>> pointer2=data_e
>>>>>>>> pointer3=data_f
>>>>>>>> pointer4=ptr_a.1.0
>>>>>>>> pointer5=ptr_b.0.2
>>>>>>>> Information for alias set1:
>>>>>>>> pointer0=data_c
>>>>>>>> After I deleted "call void %__main( )" in %main(), the pass get the right
>>>>>>>> answer.
>>>>>>>>
>>>>>>>> So my question is:
>>>>>>>>
>>>>>>>> 1. What is the purpose for call to __main() ? __main() is just a empty
>>>>>>>> function. My .c program only contains main().
>>>>>>>> 2. My explanation for this behavior is that the CallSite of "call void
>>>>>>>> %__main( )" alias some pointers in the program,
>>>>>>>> however, __main() is obviously empty, why should the AliasAnalysis
>>>>>>>> think that it may/must Mod/Ref any stack location of main()?
>>>>>>>>
>>>>>>>> btw: the AliasAnalysis pass I used is -steens-aa and it also the same with
>>>>>>>> -anders-aa.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Regards,
>>>>>>>> Nai
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> -Chris
>>>>>>
>>>>>
>>>>>
>>>>
>>>> -Chris
>>>>
>>>
>>>
>>
>> -Chris
>>
>
>
-Chris
--
http://nondot.org/sabre/
http://llvm.org/
More information about the llvm-dev
mailing list