PATCH: fix clang to emit correct addrspacecast for CUDA

Jingyue Wu jingyue at google.com
Mon Mar 24 11:28:40 PDT 2014


I agree with your concern. However, both CUDA and OpenCL (the two most popular
users of addrspacecast, I believe) support a generic address space and could
benefit from this optimization. Would we end up with duplicated code (at
least one copy for CUDA and one for OpenCL) if we put it in the back-end?

Jingyue


On Mon, Mar 24, 2014 at 11:22 AM, Justin Holewinski
<jholewinski at nvidia.com> wrote:

>  The hard part would be making this optimization general enough to be
> target-independent.  Optimizing to non-zero address spaces may not make
> sense for all targets (or even all future versions of PTX).  I agree that
> there should be an IR-level optimization for this, but perhaps it's too
> target-specific and should actually live in the back-end.
>
>
> On 03/24/2014 01:05 PM, Jingyue Wu wrote:
>
> Right. We are aware of this issue, and think it should be addressed in the
> IR optimizer (similar to InstCombineLoadCast and InstCombineStoreToCast)
> instead of clang. Do you think this is an appropriate approach? Is this
> optimization general enough to stay in the IR optimizer, or is it
> target-dependent?
>
>  Jingyue
>
>
> On Mon, Mar 24, 2014 at 4:54 AM, Justin Holewinski <
> justin.holewinski at gmail.com> wrote:
>
>> Hi Jingyue,
>>
>>  I committed the addrspacecast isel patterns to NVPTX.  Also, I wanted
>> to point out that your changes in the last test case in this patch
>> (address-spaces.cu) represent changes that may lead to performance
>> degradation.  Specific address spaces should be used whenever possible for
>> loads/stores.  Casting everything to a generic address is still correct,
>> but may lead to additional indirections for the hardware.
>>
>>
>> On Fri, Mar 21, 2014 at 2:25 PM, Justin Holewinski <
>> jholewinski at nvidia.com> wrote:
>>
>>>  addrspacecast support in NVPTX is on my todo list.  I'll try to put
>>> something together in the next few days.
>>>
>>>
>>> On 3/21/14, 2:20 PM, Jingyue Wu wrote:
>>>
>>> Hi,
>>>
>>>  Static local variables in CUDA can be declared with address space
>>> qualifiers, such as __shared__. Therefore, the codegen needs to potentially
>>> addrspacecast a static local variable to the type expected by its
>>> declaration. Peter did something similar for global variables in r157167.
>>>
>>>  All clang tests passed.
>>>
>>>  Justin: The NVPTX backend support for addrspacecast seems
>>> incomplete. We can send you follow-up patches once this one gets in.
>>>
>>>  Jingyue
>>>
>>>
>>>
>>>   --
>>> Thanks,
>>>
>>> Justin Holewinski
>>>
>>>
>>
>>
>>
>>   --
>>
>> Thanks,
>>
>>  Justin Holewinski
>>
>
>
>
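
For illustration, the case discussed in this thread can be sketched in CUDA
as below. This is only a minimal sketch, not code from the patch or from
address-spaces.cu: the names block_sum, sum_generic, and tile are
hypothetical, and error checking on the CUDA runtime calls is omitted for
brevity. The kernel declares a function-scope __shared__ variable, accesses
it directly (where the compiler knows the memory space), and then passes it
to a helper that takes a plain pointer, which is the point where a
conversion to the generic address space is needed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: it takes a plain (generic) pointer, so by itself it
// cannot tell which memory space p refers to.
__device__ int sum_generic(const int *p, int n) {
  int total = 0;
  for (int i = 0; i < n; ++i)
    total += p[i];
  return total;
}

// Hypothetical kernel, assumed to be launched with one block of 32 threads.
__global__ void block_sum(const int *in, int *out) {
  // Function-scope variable with an address space qualifier.
  __shared__ int tile[32];

  // Direct access: the compiler knows this store targets shared memory.
  tile[threadIdx.x] = in[threadIdx.x];
  __syncthreads();

  if (threadIdx.x == 0) {
    // Passing tile as a plain int* converts it to a generic pointer.
    *out = sum_generic(tile, 32);
  }
}

int main() {
  const int n = 32;
  int host_in[n];
  for (int i = 0; i < n; ++i)
    host_in[i] = i;

  int *dev_in = nullptr;
  int *dev_out = nullptr;
  cudaMalloc(&dev_in, n * sizeof(int));
  cudaMalloc(&dev_out, sizeof(int));
  cudaMemcpy(dev_in, host_in, n * sizeof(int), cudaMemcpyHostToDevice);

  block_sum<<<1, n>>>(dev_in, dev_out);

  int host_out = 0;
  // This blocking copy also waits for the kernel to finish.
  cudaMemcpy(&host_out, dev_out, sizeof(int), cudaMemcpyDeviceToHost);
  printf("sum = %d\n", host_out);

  cudaFree(dev_in);
  cudaFree(dev_out);
  return 0;
}
```

Whether the loads inside sum_generic end up as shared-memory or generic
accesses depends on what the optimizer can prove after inlining, which is
the optimization question raised above.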