[LLVMdev] Casting between address spaces and address space semantics

Mon Ping Wang wangmp at apple.com
Tue Sep 16 15:26:51 PDT 2008

Hi Matthijs,

On Sep 15, 2008, at 5:13 AM, Matthijs Kooijman wrote:

> Hi Mon Ping,
>> If I remember correctly, I was also not fond of passing another
>> TargetAddrSpace reference to the TargetData object.  I was hoping  
>> that we
>> could encode the information as a target description string like we  
>> do for
>> ABI information.   I just don't want to end up with too many  
>> objects that
>> describe the machine. One can argue that we shouldn't pollute the
>> TargetData since it describes the ABI  alignment, size, and object
>> layoutbut I feel that this data fits naturally there.  If you and  
>> other
>> people feel it is cleaner with a separate pass, I'm fine with it.
> Perhaps encoding it in TargetData makes sense. However, I was  
> avoiding this
> for now, since Chris commented a while back that he wanted to have  
> it in
> TargetData "only if absolutely required".
> However, thinking of this a bit more I do see your point about  
> TargetData.
> Another interesting advantage is that it would be a lot easier to  
> make things
> consistent between clang and LLVM, by simply using the TargetData  
> string
> that gets embedded in the module.
> So, I guess embedding this info in TargetData makes more sense. How  
> would this
> look like? I would think of something like:
> as1:<2:=3:>4:!3
> This would mean address space 1 is a subset of 2, equivalent to 3, a  
> superset
> of 4 and disjoint with 3. A number of these could be present in a  
> TargetData
> string, to fully describe the situation. Any relations not described  
> mean
> disjoint. Relations can also be implicitely defined, ie, as1:<2- 
> as2:<3 also
> implies as1:<3.
> I'm not sure if the > should be present, since that's always  
> reversible. Also,
> ! is probably not so nice for disjoint, any other suggestions?

I'm thinking it is sufficient to just have subset and equivalence. If  
superset makes the description more concise, it might be worth to put  
in.  Since the default is disjoint, I don't think we need it.  BTW, in  
the example, I assume you are just showing syntax as the system would  
flag an error if 1 is equivalent to 3 and disjoint to 3.

> As to implementing this, I'm thinking of a equivalency table (for  
> every
> address space, store the equivalent address space with the lowest  
> id) and for
> each of those lowest id spaces in each equivalency group store a set  
> of
> address spaces that are a subset. This set should be complete, so  
> when the
> string says A > B > C, the set should store both B and C as subsets  
> of A.
> This allows for resolving the relation between two address spaces by  
> two
> lookups in the equivalency table and (one or) two lookups in the  
> subset table.
> No results in the subset table means the relation is disjoint, then.

I think this is a reasonable way of going here.

> Any comments on this? Chris, would this be acceptable?

>> I want to treat my next point with some delicacy as I don't want to  
>> start a
>> religious war.  I just want to get clarification from the community  
>> on the
>> use of multiple inheritance for the case of Phases like
>> AllDisjointAddrspaces.  From what I can gather, the use of multiple
>> inheritance is to separate the interface (TargetAddrSpace) to  
>> access data
>> from the interface of the phase (ImmutablePhase).  In this case,  
>> will we
>> ever create a concrete class from TargetAddrSpace that doesn't also  
>> derive
>> from ImmutablePass? If not, I don't think is worth using multiple
>> inheritance in this case.
> I think you are right here, changing the inheritance in this way  
> also works
> fine.
>>>> For the case of a GetElementPointer, we are replacing a bitcast  
>>>> to a
>>>> pointer, getelem with a getelem bitcast.  The assumption is the  
>>>> latter
>>>> bitcast will hopefully go away when we iterate through those uses.
>>> Uh? Is this a comment about what the current code or my patch  
>>> does, or what
>>> it should do? I don't understand what you mean here.
>> My comment was more on what I thought the patch did and I wanted to  
>> confirm
>> that it will cleanup newly generated bit cast that are created.
> In that case, yes, the newly generated bitcasts should be  
> iteratively cleaned
> up whenever possible.
>>> True, anyone actually using address space should make sure that  
>>> this info
>>> is correct anyway. So, no need for an unknown default?
>> That is my feeling.
> Ok.
>> No, you got  my point even though my example is not a good one. If  
>> the
>> address calculation was using a variable, I don't think we can fold  
>> it into
>> the GEP and we might lose this information.
> Ie, a variable that is stored to, you mean? In that case, the  
> address space is
> probably propagated until the store instruction. Perhaps it can even  
> be
> propagated through the store instruction, so it stores to a  
> bitcasted pointer
> (ie, bitcast i32 addrspace(2)* * to i32 addrspace(1) * *). I suspect  
> other
> parts of instcombine might handle this from here and change the  
> variable's
> type to i32 addrspace(1), if possible. I guess this is something for  
> later on.

I think I meant that the address space information needs to be  
propagated through any temporary that is used to hold the value that  
might be using a different address spaces.  In general, this isn't  
difficult but one just have to be aware of it when manipulating  

>> The point I was trying to make is that the information needs to be
>> propagated through any address calculation when possible.
> Yes, but that shouldn't be too hard to add to the existing code.

I agree,

   -- Mon Ping

More information about the llvm-dev mailing list