[LLVMdev] Proposal: New IR instruction for casting between address spaces

Thu Sep 13 12:00:04 PDT 2012

On Sep 13, 2012, at 7:51 AM, "Villmow, Micah" <Micah.Villmow at amd.com> wrote:

> 
> 
>> -----Original Message-----
>> From: Mon Ping Wang [mailto:monping at apple.com]
>> Sent: Thursday, September 13, 2012 1:55 AM
>> To: Villmow, Micah
>> Cc: llvmdev at cs.uiuc.edu
>> Subject: Re: [LLVMdev] Proposal: New IR instruction for casting between
>> address spaces
>> 
>> 
>> On Sep 12, 2012, at 2:45 PM, "Villmow, Micah" <Micah.Villmow at amd.com>
>> wrote:
>> 
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Mon P Wang [mailto:monping at apple.com]
>>>> Sent: Wednesday, September 12, 2012 1:12 PM
>>>> To: Villmow, Micah
>>>> Cc: Dan Gohman; llvmdev at cs.uiuc.edu
>>>> Subject: Re: [LLVMdev] Proposal: New IR instruction for casting
>>>> between address spaces
>>>> 
>>>> Hi,
>>>> 
>>>> On Sep 11, 2012, at 2:30 PM, Villmow, Micah wrote:
>>>> 
>>>>> 
>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Dan Gohman [mailto:gohman at apple.com]
>>>>>> Sent: Tuesday, September 11, 2012 1:28 PM
>>>>>> To: Villmow, Micah
>>>>>> Cc: llvmdev at cs.uiuc.edu
>>>>>> Subject: Re: [LLVMdev] Proposal: New IR instruction for casting
>>>>>> between address spaces
>>>>>> 
>>>>>> On Sep 11, 2012, at 1:03 PM, "Villmow, Micah"
>>>>>> <Micah.Villmow at amd.com> wrote:
>>>>>> 
>>>>>>> 
>>>>>>> From: Villmow, Micah
>>>>>>> Sent: Tuesday, September 11, 2012 12:51 PM
>>>>>>> To: llvm-commits at cs.uiuc.edu
>>>>>>> Subject: Proposal: New IR instruction for casting between address
>>>>>>> spaces
>>>>>>> 
>>>>>>> Problem:
>>>>>>> Bit casting between pointers of different address spaces only
>>>>>>> works if all address space pointers are the same size. With the
>>>>>>> changes from email chain [1][2], support for different pointer
>>>>>>> sizes breaks the bitcast instruction, since there is no guarantee
>>>>>>> that the pointers in the source and destination address spaces
>>>>>>> are the same size.
>>>>>> 
>>>>>> Can you comment on whether the need for this seems like a
>>>>>> fundamental need, in your field, or more of a limitation of the
>>>>>> current generation of architectures?
>>>>> [Villmow, Micah] I think this is a little of both. While the current
>>>>> and previous generations of GPU architectures are limited in what
>>>>> they can do by hardware restrictions, I also think there is a
>>>>> fundamental need. Not all devices will run 64bit operations at full
>>>>> speed (or even have native instructions for them), but memory sizes
>>>>> will soon eclipse what is addressable with 32bit pointers on non-PC
>>>>> systems. The result is 32bit systems that need to address 64bit
>>>>> memory, and switching over to 64bit for address calculations
>>>>> destroys the performance advantage that the segmented memory
>>>>> provides.
>>>>> In the CPU world this isn't much of a problem as far as I can tell,
>>>>> since it has already been solved (32bit vs 64bit math is 1-1 in most
>>>>> cases), but on non-CPU architectures it is a huge performance
>>>>> penalty (64bit mul runs 6x slower than 32bit mul). So being able to
>>>>> switch to 32bit where it is required and to 64bit where it is
>>>>> required is a fundamental need that I don't think will go away even
>>>>> if the architectures improve their memory infrastructure.
>>>>>> 
>>>>>>> Solution:
>>>>>>> Remove the ability of bitcast to cast between pointers of
>>>>>>> different address spaces and replace it with an instruction that
>>>>>>> handles this case explicitly.
>>>>>>> 
>>>>>>> Proposed changes:
>>>>>>> *         Add a restriction to the verifier on the bitcast
>>>>>>>           instruction, making bitcasting between address spaces
>>>>>>>           illegal.
>>>>>>> *         Change the documentation[3] to state that a bitcast
>>>>>>>           between pointers of different address spaces is illegal.
>>>>>>> *         Add a new IR node, addrspacecast, that allows conversions
>>>>>>>           between address spaces.
>>>>>>> *         Update the reader/writer to handle these cases.
>>>>>>> *         Update the documentation to include the new IR node.
>>>>>>> *         Add the following documentation:
>>>>>>> 'addrspacecast .. to' Instruction
>>>>>>> 
>>>>>>> Syntax:
>>>>>>> 
>>>>>>> <result> = addrspacecast <ty> <value> to <ty2>             ; yields ty2
>>>>>>> Overview:
>>>>>>> 
>>>>>>> The 'addrspacecast' instruction converts value to type ty2
>>>>>>> without changing any bits.
>>>>>> 
>>>>>> This is mildly imprecise, because the whole point of this
>>>>>> instruction is that it can change the bit width.
>>>>> [Villmow, Micah] Doh, cut and paste error, will fix it.
>>>>>> 
>>>>>>> 
>>>>>>> Arguments:
>>>>>>> 
>>>>>>> The 'addrspacecast' instruction takes a value to cast, which must
>>>>>>> be a non-aggregate first class value with a pointer type, and a
>>>>>>> type to cast it to, which must also be a pointer type. The pointer
>>>>>>> types of value and the destination type, ty2, must be identical,
>>>>>>> except for the address space.
>>>>>> 
>>>>>> Having a "pointer type" is sufficient to imply that it is a "non-
>>>>>> aggregate first class value".
>>>>>> 
>>>>>>> 
>>>>>>> Semantics:
>>>>>>> 
>>>>>>> The 'addrspacecast' instruction converts value to type ty2. It
>>>>>>> converts the pointer to the type implied by the address space of
>>>>>>> the destination pointer. If the destination pointer is smaller
>>>>>>> than the source pointer, the upper bits are truncated. If the
>>>>>>> destination pointer is larger, the upper bits are sign extended.
>>>>>>> Otherwise the operation is a no-op.
>>>>>> 
>>>>>> Why sign-extended? Ptrtoint/inttoptr are zero-extended, and it's
>>>>>> surprising that addrspacecast would be different from them.
>>>>> [Villmow, Micah] Take for example a pointer representing a negative
>>>>> pointer offset into a 16 bit address space. If this is converted to
>>>>> a 64bit address space with zero extension, the upper 48 bits would
>>>>> be zero and your negative offset just became positive. The
>>>>> difference between these two instruction types is that
>>>>> addrspacecast does not explicitly convert to any size, only
>>>>> implicitly, so the bits would need to be filled correctly.
>>>>>> 
>>>> 
>>>> I view a pointer as pointing to a location in memory and not as an
>>>> offset relative to some base register.  I think the proper semantic
>>>> here is the same as inttoptr where it does a zero-extension.
>>> [Villmow, Micah] Yeah, but the pointer won't point to the same
>>> location if the conversion from a smaller pointer to a larger pointer
>>> is zero extended.
>>> Take two address spaces(1 and 2) that are 16 and 64 bits in size.
>>> int(1) *a = 0xFFFFFFF9;
>>> int(2) *b = *a;
>>> Is b -7 (SExt), or is it 4294967289 (ZExt)?
>> 
>> I think you mean: is it -7 (Sext) or 65529 (Zext from 16b to 64b)?
>> 
>> I would expect the same result if I wrote
>>  int(1) *a = 0x0FFF9;
>>  int(2) *b = *a;
>> 
>> In C, integer to pointer conversions are implementation defined and
>> depend on the addressing structure of the execution environment.
>> Given the current definition of ptrtoint and inttoptr, I feel that
>> the addressing structure is a flat memory model starting from 0, and
>> the value "b" should be 65529.  In your example where we know the
>> largest pointer is 64b, I would expect the final result to be the same
>> as doing a ptrtoint from int(1)* to i64 and an inttoptr to int(2)*.
> [Villmow, Micah] So then if there is already a way to do this, what really is the benefit of adding a new instruction? 
> Also there is a typo in my example, the second assignment should not have the '*'. I can add a new instruction if that
> is the recommended behavior, but I think it would also be fine to force ptrtoint and inttoptr, although it does take one instruction more.
> 

The problem with using ptrtoint and inttoptr is that one has to pick an intermediate integer type that is safe to convert to and from.  Since the pointer size is target dependent, it seems unnatural to use ptrtoint and inttoptr for that.  
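
To make the two candidate semantics concrete, here is a minimal Python sketch of widening a 16-bit pointer value to 64 bits. It is not LLVM code. The helper names (zext, sext, as_signed) are hypothetical, and the 16-bit and 64-bit widths and the bit pattern 0xFFF9 are assumptions taken from the example quoted below. Python integers stand in for the raw pointer bits.

```python
def zext(value: int, src_bits: int, dst_bits: int) -> int:
    """Zero-extend: keep the low src_bits and fill the upper bits with zeros."""
    assert src_bits <= dst_bits
    return value & ((1 << src_bits) - 1)


def sext(value: int, src_bits: int, dst_bits: int) -> int:
    """Sign-extend: copy bit (src_bits - 1) into every upper bit."""
    assert src_bits <= dst_bits
    value &= (1 << src_bits) - 1
    if value & (1 << (src_bits - 1)):
        # Set bits src_bits .. dst_bits - 1.
        value |= ((1 << dst_bits) - 1) & ~((1 << src_bits) - 1)
    return value


def as_signed(value: int, bits: int) -> int:
    """Read a bits-wide pattern as a two's-complement integer."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


# Assumed example: a 16-bit pointer holding the bit pattern 0xFFF9.
ptr16 = 0xFFF9
widened_zext = zext(ptr16, 16, 64)
widened_sext = sext(ptr16, 16, 64)

print(hex(widened_zext), widened_zext)                 # 0xfff9 65529
print(hex(widened_sext), as_signed(widened_sext, 64))  # 0xfffffffffffffff9 -7
```

Under the flat-memory reading the zero-extended result (65529) names the same location in the wider space, while sign extension only preserves the value when the 16-bit pattern is treated as a signed offset (-7 here).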

-- Mon Ping

>> 
>> -- Mon Ping
>> 
>>> This works for inttoptr and ptrtoint because there is an assumption
>>> that the pointer is always the same size. Maybe we even need to extend
>>> ptrtoint/inttoptr to handle this case by adding unsigned versions?
>> 
>> 
>> 
>> 
>> 
>>> 
>>>> 
>>>> -- Mon Ping
>>>> 
>>>> 
>>>> 
>>>> 
>>>>>> Dan
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>> 
>>> 
>>> 
>>> 
>> 
> 
> 
> 