[LLVMdev] Reducing Generic Address Space Usage

Tue Mar 25 17:07:18 PDT 2014

On Tue, Mar 25, 2014 at 3:21 PM, Matt Arsenault
<Matthew.Arsenault at amd.com> wrote:

>  On 03/25/2014 02:31 PM, Jingyue Wu wrote:
>
>
> However, we have three concerns on this:
> a) I doubt this optimization is valid for all targets, because LLVM
> language reference (
> http://llvm.org/docs/LangRef.html#addrspacecast-to-instruction) says
> addrspacecast "can be a no-op cast or a complex value modification,
> depending on the target and the address space pair."
>
> I think most of the simple cast optimizations would be acceptable. The
> addrspacecasted pointer still needs to point to the same memory location,
> so changing an access to use a different address space would be OK. I think
> canonicalizing accesses to use the original address space of a casted
> pointer when possible would make sense.
>

The LangRef says that if "the address space conversion is legal then both
result and operand refer to the same memory location". I don't quite
understand this sentence. Does "the same memory location" mean the same
numeric value?

>
>
>   b) NVPTX and R600 have different address numbering for the generic
> address space, which makes things more complicated.
> c) We don't have a good understanding of the R600 backend.
>
>
> R600 currently does not support the flat address space instructions
> intended for use with the generic address space. I posted a patch a while
> ago that half added it, which I can try to work on finishing if it would help.
>
> I also do not understand how NVPTX uses address spaces, particularly how
> it can use 0 as the generic address space.
>

The NVPTX backend generates ld.f32 for reads from the generic address space.
Is there no special machine instruction in R600 for reading from or writing
to the generic address space?

>
>
>   2. How effective do we want this optimization to be?
>
>  In the short term, I want it to be able to eliminate unnecessary
> non-generic-to-generic addrspacecasts the front-end generates for the NVPTX
> target. For example,
>
>  %p1 = addrspacecast i32 addrspace(3)* %p0 to i32*
> %v = load i32* %p1
>
>  =>
>
>  %v = load i32 addrspace(3)* %p0
>
>  We want similar optimization for store+addrspacecast and
> gep+addrspacecast as well.
>
>  In the long term, we could certainly improve this optimization to handle
> more instructions and more patterns.
>
>   I believe most of the cast simplifications that apply to bitcasts of
> pointers also apply to addrspacecast. I have some patches waiting that
> extend some of the more basic ones to understand addrspacecast (e.g.
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140120/202296.html),
> plus a few more that I haven't posted yet. Mostly they are little cast
> simplifications like your example in instcombine, but also SROA to
> eliminate allocas that are addrspacecasted.
>

We also think InstCombine is a good place to put this optimization, if we
decide to go with a target-independent approach. Looking forward to your
patches!
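
For concreteness, here is a rough sketch of what the store+addrspacecast and
gep+addrspacecast variants mentioned above might look like, written in the
same typed-pointer IR syntax as the load example. The value names (%p0, %v,
%i, %q) and the [10 x float] type are only illustrative, and the sketch
assumes the target allows the addrspace(3)-to-generic cast to be folded away.

```llvm
; store through a casted pointer (before)
%p1 = addrspacecast i32 addrspace(3)* %p0 to i32*
store i32 %v, i32* %p1

; after: store directly to the original address space
store i32 %v, i32 addrspace(3)* %p0

; gep on a casted pointer, followed by a load (before)
%p1 = addrspacecast [10 x float] addrspace(3)* %p0 to [10 x float]*
%q = getelementptr [10 x float]* %p1, i64 0, i64 %i
%v = load float* %q

; after: the gep and the load both stay in addrspace(3)
%q = getelementptr [10 x float] addrspace(3)* %p0, i64 0, i64 %i
%v = load float addrspace(3)* %q
```

In the GEP case the rewrite has to propagate the address space through the
GEP's result type, so every user of %q needs to be updated along with it.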

>
>
> -Matt
>