[LLVMdev] Casting between address spaces and address space semantics
Mon P Wang
monping at apple.com
Wed Aug 6 23:16:47 PDT 2008
Hi Matthijs,
>
>>> Specifically, I would like instcombining to be able to use this
>>> info to
>>> remove useless bitcasts. Also, this information is useful for
>>> clang to
>>> know when inserting an implicit cast makes sense and when it
>>> would be an
>>> error.
>> Clang should just reject implicit casts in *any* case when the addr
>> spaces differ IMO. In any case, clang has its own parameterization
>> of the target which is richer than TargetData already, so if it
>> really
>> wanted to allow implicit casts between address space subsets, it
>> could
>> do so based on its own info.
> I think that adding this info to Clang's targetinfo code would
> indeed make
> sense, though perhaps disallowing implicit casts entirely is also
> acceptable
> (provided that you can still do explicit casts).
>
As with most other type qualifiers, explicit casts are allowed. However,
if someone uses the result in a context where it is illegal, the
behavior is undefined (e.g., casting a pointer from one address space
to another and then using it where the address spaces don't overlap).
The C99 extension indicates that a cast of a pointer from one address
space to another is valid only if the object that the pointer points to
exists in both address spaces (otherwise, the result is undefined). I
think we all agree that the default behavior in clang should be to
reject implicit casts between address spaces unless clang has some
additional information to know better.
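
To make the rule concrete, here is a minimal sketch of how a front end might classify a pointer cast. Everything in it is an assumption for illustration: the address space numbers, the `isSubsetOf` relation (SubA and SubB as disjoint subsets of Super, as in the example quoted further down), and the `checkPointerCast` helper are hypothetical and are not clang's actual API.

```cpp
#include <cassert>

// Hypothetical address space numbers, not taken from any real target.
constexpr unsigned Super = 0, SubA = 1, SubB = 2;

enum class CastKind { Implicit, Explicit };
enum class Verdict { Accept, Reject, AcceptMaybeUndefined };

// Assumed relation: SubA and SubB are disjoint subsets of Super.
bool isSubsetOf(unsigned inner, unsigned outer) {
  assert(inner <= SubB && outer <= SubB && "unknown address space");
  return inner == outer || (outer == Super && inner != Super);
}

// haveSubsetInfo models "the front end has additional information".
Verdict checkPointerCast(unsigned from, unsigned to, CastKind kind,
                         bool haveSubsetInfo) {
  if (from == to)
    return Verdict::Accept;
  // A widening cast goes from a subset into its enclosing address space.
  bool widening = haveSubsetInfo && isSubsetOf(from, to);
  if (kind == CastKind::Implicit)
    return widening ? Verdict::Accept : Verdict::Reject;
  // Explicit casts are always accepted, but only a widening cast is known
  // to be defined; otherwise it depends on where the object really lives.
  return widening ? Verdict::Accept : Verdict::AcceptMaybeUndefined;
}
```

With no subset information, an implicit cast from SubA to Super is rejected, while an explicit cast from SubA to SubB is accepted but may be undefined at run time.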
> However, what I find more important (this discussion is greatly
> helping me in
> finding out what exactly it is that I find important :-p) is that
> instcombine
> (or some other pass, for all I care) can remove bitcasts that are
> not strictly
> needed.
>
> I will focus on address spaces that are nested, since that is the most
> interesting and general case. Say we have address spaces SubA, SubB
> and Super, and
> SubA and SubB are subsets of Super (but disjoint to each other).
>
> When is bitcast between address spaces really needed? I can see two
> main
> reasons.
> 1) A function gets passed pointers into both SubA and SubB (which
> then need
> to be bitcasted to pointer-into-Super).
> 2) A pointer variable is assigned pointers into both SubA and SubB.
> In LLVM
> this would probably mean a phi node that joins pointers into
> different
> address spaces (which is not possible, hence a bitcast is needed
> and the
> phi node would be of type pointer-into-super).
>
> However, after some transformations (mainly code duplication
> AFAICS), the
> above situations might get removed, making the bitcasts redundant.
> The patch I
> attached a few posts back implements this by removing the bitcast when
> possible. This does mean that any users of the bitcast now have one
> of their
> operands changed (the address space gets changed).
>
> Take for example a load instruction, that directly uses a bitcasted
> pointer.
>
> When removing the bitcast, the load instruction can be replaced by a
> new one
> that takes a pointer into another (smaller) address space. The
> result of the
> load instruction will still be the same value, but a bitcast was
> removed (and
> the load instruction was possibly improved as well).
>
> Let's look at two use cases I can see for nested address spaces.
>
> 1) Different address spaces translate to different accessing
> instructions at
> code generation time. Instructions that access smaller address
> spaces
> (SubA, SubB) can execute faster than instructions that access the
> Super
> address space. Because of this, removing unneeded bitcasts to
> Super (and
> then replacing, for example, load instructions with their SubA or
> SubB
> variants) can result in faster code. This is currently not our use
> case,
> but depending on where our hardware design goes might become so.
> 2) Having the Super address space as an address space that is usable
> in the
> input C code, but should be optimized away before it hits the
> backend.
> Removing those redundant bitcasts allows us to prevent a lot of code
> duplication or ugly macro use in our C code, while still ending up
> with
> code our backend can compile. This is our current use case.
>
> For both these use cases, it would be needed to remove redundant
> bitcasts. To be able to do that, the transformation passes (or at
> least the one that will do this) need to know the relations between
> the different address spaces.
>
I would imagine that such a cleanup phase will probably need to track
what objects a pointer can point to in some region of code so it can
either remove redundant casts or add a cast to refine the information
of what a pointer can point to.
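
As a rough illustration of the simplest case (the load through a bitcast described in the quoted text above), the sketch below works on a toy instruction list rather than real LLVM IR. The `Inst` struct, the `narrowLoads` function, and the address space numbering are all hypothetical. A real pass would also have to handle phi nodes, calls, and the other users of the cast.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical address space numbers: SubA and SubB are subsets of Super.
constexpr unsigned Super = 0, SubA = 1, SubB = 2;

bool isSubsetOf(unsigned inner, unsigned outer) {
  return inner == outer || (outer == Super && (inner == SubA || inner == SubB));
}

// Toy instruction: not an LLVM class.
struct Inst {
  enum Kind { Arg, BitCast, Load } kind;
  unsigned addrSpace; // Arg/BitCast: space of the produced pointer;
                      // Load: space of the pointer it reads through
  int operand;        // index of the pointer operand, or -1 for Arg
};

// Rewrites every load that reads through a widening bitcast so that it uses
// the original, narrower pointer. Returns the number of loads rewritten.
int narrowLoads(std::vector<Inst> &fn) {
  int rewritten = 0;
  for (Inst &inst : fn) {
    if (inst.kind != Inst::Load)
      continue;
    assert(inst.operand >= 0 &&
           static_cast<std::size_t>(inst.operand) < fn.size());
    const Inst &bc = fn[inst.operand];
    if (bc.kind != Inst::BitCast)
      continue;
    assert(bc.operand >= 0 &&
           static_cast<std::size_t>(bc.operand) < fn.size());
    const Inst &src = fn[bc.operand];
    bool widening = src.addrSpace != bc.addrSpace &&
                    isSubsetOf(src.addrSpace, bc.addrSpace);
    if (!widening)
      continue;
    inst.addrSpace = src.addrSpace; // the load now uses the smaller space
    inst.operand = bc.operand;      // and bypasses the bitcast
    ++rewritten;
  }
  return rewritten;
}
```

The bitcast itself is left in place here. Once it has no remaining users, an ordinary dead-code cleanup could delete it.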
> At the very least, redundant bitcasts should be removed. I hacked up
> a patch to make instcombine do this, but that only makes sense if
> removing those bitcasts allows instcombine to do further
> optimizations, which again allows bitcasts to be removed. I have the
> feeling that propagating address spaces might be quite orthogonal to
> the other transformations instcombine does, so a separate pass for
> just this might be better.
>
> I think the main question here is, are there any other
> transformations that
> would need (or benefit from) information on address space relations?
> If not,
> then it might be sufficient to pass an "AddressSpacesDescriptor" to
> just that
> one pass.
Alias analysis could use the information to help determine whether two
pointers alias. As you indicated, a cleanup phase could also use the
information. I don't think most other phases care, as what they want
to know is which address space a pointer points to and not the
relationships between the address spaces.
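
For the alias analysis case, a minimal sketch of the query might look like the following. The `overlaps` table, the address space numbers, and `aliasByAddressSpace` are assumptions made up for this example and are not part of LLVM's alias analysis interface. The only fact used is that pointers into disjoint address spaces cannot refer to the same object.

```cpp
#include <cassert>

// Hypothetical address spaces: SubA and SubB are disjoint subsets of Super.
constexpr unsigned NumSpaces = 3;
constexpr unsigned Super = 0, SubA = 1, SubB = 2;

// overlaps[a][b] is true when some object may live in both a and b.
constexpr bool overlaps[NumSpaces][NumSpaces] = {
    /* Super */ {true, true, true},
    /* SubA  */ {true, true, false},
    /* SubB  */ {true, false, true},
};

enum class AliasResult { NoAlias, MayAlias };

// Answers only from the address spaces; MayAlias means "ask a finer analysis".
AliasResult aliasByAddressSpace(unsigned a, unsigned b) {
  assert(a < NumSpaces && b < NumSpaces && "unknown address space");
  return overlaps[a][b] ? AliasResult::MayAlias : AliasResult::NoAlias;
}
```

A pointer into SubA and a pointer into SubB yield NoAlias, while either of them paired with a pointer into Super yields MayAlias.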
> If there are other transformations, it would be useful to have one
> central
> place to store this information (like TargetData now).
>
> I've only just realized that this is an essential question, so I
> haven't given
> it too much thought yet.
>
>
> In either case, we should wonder in what way this information is
> given to the
> tools (opt/llc). If we make it dependent on the target architecture,
> this only
> becomes an issue once we support an architecture where this is
> relevant.
> Alternatively, we could have some way of specifying the relations on
> the
> commandline (either directly or by choosing between a number of
> models). Which
> of these is chosen (if any) would not be too relevant for us, since
> we have
> our own compiler driver which can pass anything to the (instcombine
> or other)
> pass in a hardcoded fashion.
IMO, the relationship between address spaces is determined by the
target architecture, so it seems most natural to associate it with the
target.
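
One way to picture this is a per-target lookup, sketched below. It borrows the "AddressSpacesDescriptor" name from your mail. The target name "example-nested", the registry function `descriptorForTarget`, and the address space numbers are invented for the example and do not correspond to anything in TargetData or an existing backend.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Hypothetical description of how a target's address spaces nest.
struct AddressSpacesDescriptor {
  // (inner, outer) pairs: every object in `inner` also exists in `outer`.
  std::set<std::pair<unsigned, unsigned>> subsetPairs;

  void addSubset(unsigned inner, unsigned outer) {
    assert(inner != outer && "a space is trivially a subset of itself");
    subsetPairs.insert(std::make_pair(inner, outer));
  }

  bool isSubsetOf(unsigned inner, unsigned outer) const {
    return inner == outer ||
           subsetPairs.count(std::make_pair(inner, outer)) != 0;
  }
};

// Made-up target: spaces 1 and 2 are subsets of space 0.
AddressSpacesDescriptor makeNestedExample() {
  AddressSpacesDescriptor d;
  d.addSubset(1, 0);
  d.addSubset(2, 0);
  return d;
}

// Hypothetical registry: unknown targets get a descriptor with no relations,
// so every pair of distinct address spaces is treated as unrelated.
const AddressSpacesDescriptor &descriptorForTarget(const std::string &target) {
  static const AddressSpacesDescriptor nested = makeNestedExample();
  static const AddressSpacesDescriptor flat{};
  return target == "example-nested" ? nested : flat;
}
```

A pass would then ask the descriptor of the current target rather than take the relations from the command line.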
Regards,
-- Mon Ping