[LLVMdev] Question on equivalence of pointer types

Mon Dec 15 23:48:03 PST 2014

Hi David,

It seems that your use-case is similar to mine (though
http://chericpu.org does not resolve for me :( ).

> In your original examples, on our architecture, one would copy 64
> bits, the other would copy 256 bits (and preserve the tag bit).
> Replacing one with the other would be a very bad idea.

> Mostly we've had to make changes to SCEV and the
> vectoriser (and lots to SelectionDAG) to understand that pointers are
> not always integers

I take it this means that casting a capability to and from an i256
(via ptrtoint and inttoptr) is not semantically a no-op?
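
To make the question concrete, here is a minimal sketch of the round
trip I have in mind.  addrspace(2) is only a placeholder I picked for
the capability address space, and the function name is hypothetical:

  define i8 addrspace(2)* @roundtrip(i8 addrspace(2)* %cap) {
    ; reinterpret the 256-bit capability as a plain integer ...
    %bits = ptrtoint i8 addrspace(2)* %cap to i256
    ; ... and turn the integer back into a capability
    %back = inttoptr i256 %bits to i8 addrspace(2)*
    ret i8 addrspace(2)* %back
  }

If the cast pair were a no-op, an optimizer could replace %back with
%cap; my reading of your mail is that this would not be safe once a
tag bit is involved.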

In our use case, pointers in addrspace(0) and addrspace(1) are both 64
bits wide; but as in your case, storing (loading) an addrspace(0)
pointer is semantically a different operation from storing (loading)
an addrspace(1) pointer.  I wonder if withholding pointer sizes from
target-independent optimizations will give us the strong distinction
we need, because LLVM would then have to assume that pointers in
different address spaces may, in general, have different sizes.
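
For reference, a minimal sketch of the two loads from my earlier mail
as one self-contained function, assuming a datalayout in which both
address spaces have 64-bit pointers.  The datalayout string and the
names @f, %v0 and %v1 are illustrative, not taken from our target:

  ; illustrative only: 64-bit pointers in addrspace(0) and addrspace(1)
  target datalayout = "p:64:64:64-p1:64:64:64"

  define void @f(i32 addrspace(0)* addrspace(0)* %p0) {
    ; same address, but the pointee is now viewed as an addrspace(1) pointer
    %p1 = bitcast i32 addrspace(0)* addrspace(0)* %p0
              to i32 addrspace(1)* addrspace(0)*
    ; both loads read 64 bits from the same addrspace(0) location
    %v0 = load i32 addrspace(0)* addrspace(0)* %p0
    %v1 = load i32 addrspace(1)* addrspace(0)* %p1
    ret void
  }

The bit widths match here, so the open question is only whether %v0
and %v1 may be treated as the same load.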

-- Sanjoy

On Tue, Dec 9, 2014 at 2:14 AM, David Chisnall
<David.Chisnall at cl.cam.ac.uk> wrote:
> On 9 Dec 2014, at 02:12, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>
>> In the example that you gave, the difference is between
>>
>>  load i32 addrspace(0)* null
>> and
>>  load i32 addrspace(1)* null
>>
>> (that the first one is UB, while the second one may be well-defined).
>>
>> But my question is different: are the following two loads different?
>>
>>  load i32 addrspace(0)* addrspace(0)* %p0
>>  load i32 addrspace(1)* addrspace(0)* %p1
>>
>> if %p0 == %p1 (because they're bitcasts of each other or something --
>> note that you don't need an addrspacecast to go from "i32
>> addrspace(1)* addrspace(0)*" to "i32 addrspace(0)* addrspace(0)*").
>>
>> They are different in the type system because one of them produces an
>> "i32 addrspace(0)*" while the other produces an "i32 addrspace(1)*",
>> so in general they cannot be substituted by each other, but there are
>> edge cases as in the example I started this thread with.
>>
>> In other words: the semantics of a load or store depend on the address
>> space of the pointer operand.  But in case the *value* we're storing
>> is also a pointer, do the semantics depend on the value's address
>> space too?
>>
>> I suspect the answer is no, but I may be missing something here and
>> wish to confirm.
>
> To give a concrete example, on our architecture we're using one address space for 64-bit pointers that are relative to a global capability register[1] and another address space for 256-bit capabilities.  You can bitcast a pointer-to-pointer to a pointer-to-capability (or vice versa) and the value that you load may or may not be meaningful (it probably won't be).  The instructions emitted will be different, because one is loading a 64-bit integer into an integer value, the other is loading a 256-bit capability into a capability register (and will trap if the value isn't 256-bit aligned).
>
> In your original examples, on our architecture, one would copy 64 bits, the other would copy 256 bits (and preserve the tag bit).  Replacing one with the other would be a very bad idea.
>
> We've had to fix a few things in mid-level optimisers to deal with this, but not many.  Mostly we've had to make changes to SCEV and the vectoriser (and lots to SelectionDAG) to understand that pointers are not always integers, and a few to CodeGen to understand that you can't replace an LLVM memcpy intrinsic that copies from one address space to another with a call to the memcpy library routine (actually, we fudge this with a custom lowering in the IR).
>
> David
>
> [1] You can think of memory capabilities as being something like segments and something like fat pointers, depending on how you use them.  If you really want to know more: http://chericpu.org