[LLVMdev] [Propose] Add address-taken bit to GlobalVariable for disambiguation purpose

Mon Nov 4 14:31:14 PST 2013

Hi, Hal:

     Thank you so much for the informative reply and your expertise!

Shuxin

On 11/4/13 2:14 PM, Hal Finkel wrote:
> ----- Original Message -----
>> Hi, all:
>>
>> Per Chris and Nadav's request, I have begun writing the code to
>> analyze address-taken
>> lazily. I realize an alias query can be initiated from any context
>> (*function* pass, loop pass, etc.);
>> however, the analysis for global-variable-address-taken is conducted
>> in *module* scope.
>> Is there any potential problem here? (For instance, functions
>> foo() and bar() comprise module m;
>> however, at the time the optimizer is working on foo(), bar() is not
>> physically in that module. In this case,
>> analyzing global variables on the fly doesn't make sense.)
> Shuxin,
>
> I don't see how this can work, without either a) breaking apart the current pass structure by inserting a module-level pass or b) violating the implied pass scope boundaries.
>
> FWIW, I asked Chandler last month on IRC for his thoughts about whether the new pass manager will support running different (function, basic block, etc.) passes in parallel. There is obviously a lot of work to do before this is possible, but he said that is the eventual goal. I don't yet understand what kind of locking scheme we'd need for this in practice, but it seems that something like this might get in the way.
>
> On the other hand, if we're okay with this, I'd like to do something similar so that static functions can determine aliasing information from their callers.
>
>   -Hal
>
>> Thanks in advance!
>> Shuxin
>>
>>
>> On 10/30/13 2:08 PM, Chris Lattner wrote:
>>
>>
>>
>>
>> On Oct 30, 2013, at 10:37 AM, Shuxin Yang < shuxin.llvm at gmail.com >
>> wrote:
>>
>>
>> Nadav:
>>
>> I don't think this is the right approach for engineering.
>> The time complexity of re-analyzing addr_taken for every single alias
>> query depends on:
>> 1. how many global variables there are,
>> 2. how many occurrences of these global variables there are, and
>> 3. how many queries the compiler makes.
>>
>> 3 depends on the compiler. You never know what we will have in the
>> next few years.
>> 1 and 2 depend on the program. You never know what kind of program
>> you will run into.
>> How can we use what we have today to extrapolate to the future while
>> ignoring this highly
>> unpredictable complexity?
>>
>>
>> This logic doesn't make sense to me. You can implement it both ways
>> and get empirical results on *programs we have today* and *in our
>> compiler*. This is not a theoretical exercise.
>>
>>
>> In practice, walking the use list of a global variable is very fast.
>> As you've noticed, we already use this approach (in an admittedly
>> ad-hoc and decentralized way) throughout the compiler.
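
A minimal sketch, not taken from the thread, of what "walking the use list" to decide address-taken could look like. It uses a toy model rather than the real LLVM classes: `ToyGlobal`, `ToyUse`, `UseKind`, and `isAddressTakenLazily` are hypothetical names. The assumption is that a global's address counts as "not taken" only when every use is a direct load from it or a direct store to it.

```cpp
#include <vector>

// Toy stand-ins for IR concepts; these are NOT LLVM classes.
enum class UseKind {
    LoadFrom,      // a value is loaded directly through the global's address
    StoreTo,       // a value is stored directly through the global's address
    StoredAsValue, // the address itself is stored somewhere (escapes)
    PassedToCall,  // the address is passed as a call argument (escapes)
    Other          // anything not understood: treated conservatively
};

struct ToyUse {
    UseKind kind;
};

struct ToyGlobal {
    std::vector<ToyUse> uses; // the global's use list
};

// Walk the use list once; the cost is linear in the number of uses.
bool isAddressTakenLazily(const ToyGlobal &g) {
    for (const ToyUse &u : g.uses) {
        if (u.kind != UseKind::LoadFrom && u.kind != UseKind::StoreTo)
            return true; // any other use may let the address escape
    }
    return false;
}
```

Under this model the cost of one query is proportional to the length of that global's use list, so the total cost scales with the number of queries times the uses visited per query, which is the quantity the two sides of this thread disagree about measuring.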
>>
>>
>>
>>
>> It's interesting that recently many EE magazines (Circuit Cellar,
>> Elektor, EE Times) have been
>> discussing buggy software killing people. I remember some posts complaining
>> that some buggy programs
>> have an amazingly large number of global variables. I can find one post on
>> a Chinese website:
>>
>> http://forum.xitek.com/thread-1226816-5-1-1.html
>>
>> The 1st post says, "a program has 11000 global variables"!
>>
>>
>>
>> This is just FUD and completely unrelated to the discussion.
>>
>>
>>
>> As to "Can you provide this data?": my answer is no, and I will not
>> implement the
>> on-the-fly analysis unless I'm convinced that saving an
>> addr_taken bit in llvm::GlobalVariable
>> is fundamentally flawed.
>>
>>
>> You don't have to be convinced. The burden of proof is on you - not
>> on us to convince you.
>>
>>
>> Here's the deal: there are tons of "potentially useful" things that
>> could be encoded in the IR. Each thing added to IR has a complexity
>> increase on the entire compiler. Passes that work on global
>> variables will have to reason about this bit, and transformations
>> that could invalidate it (e.g. global merging) will have to have
>> code added to update/preserve it.
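
To illustrate the maintenance cost described above, here is a toy sketch (hypothetical types and function names, not LLVM's API) contrasting a cached address-taken bit with an answer recomputed from the use list. The assumption is that some transformation adds an escaping use; if it does not also update the cached bit, the bit silently goes stale.

```cpp
#include <vector>

// Hypothetical toy model; none of these names exist in LLVM.
struct CachedGlobal {
    std::vector<bool> useEscapes; // one entry per use: does it expose the address?
    bool addrTakenBit = false;    // cached answer stored alongside the global

    // Recompute the answer from the use list (the lazy approach).
    bool computeAddrTaken() const {
        for (bool escapes : useEscapes)
            if (escapes)
                return true;
        return false;
    }
};

// A transformation that forgets to maintain the cached bit.
void addEscapingUseNaively(CachedGlobal &g) {
    g.useEscapes.push_back(true);
}

// The same transformation with the extra update code it would need.
void addEscapingUseAndUpdate(CachedGlobal &g) {
    g.useEscapes.push_back(true);
    g.addrTakenBit = true;
}

// True when the cached bit disagrees with what the use list says.
bool cacheIsStale(const CachedGlobal &g) {
    return g.addrTakenBit != g.computeAddrTaken();
}
```

The lazy version never needs the update step because it has no stored state to invalidate; the cached version trades that for a constant-time lookup.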
>>
>>
>> We are very conservative about changing IR for good reason. We don't
>> add caches to IR unless there is pretty much no other way to achieve
>> the result. In a perfect world, we would have nothing redundant in
>> the IR at all.
>>
>>
>> That said, I'm open to this attribute, because I think the semantics
>> can be nailed down tightly (though your "volatile" discussion
>> doesn't make any sense to me), it is widely useful, and I don't think
>> the burden of maintaining it will be that high. However, before we
>> do it, you need to demonstrate that lazily computing it from use-def
>> chains is *empirically worse*.
>>
>>
>> -Chris
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>