PATCH: fix clang to emit correct addrspacecast for CUDA

Justin Holewinski jholewinski at nvidia.com
Mon Mar 24 12:01:14 PDT 2014


On 03/24/2014 02:53 PM, Tom Stellard wrote:
> On Mon, Mar 24, 2014 at 02:46:06PM -0400, Justin Holewinski wrote:
>> I don't have anything against making this a target-independent
>> IR-level, as long as no-one complains about it being a core pass.
>> Perhaps the pass could only execute if a target explicitly enables a
>> flag.  Something like "preferNonGenericPointers".  The default could
>> be 'false', and the pass would only modify the IR if the target sets
>> it to 'true'.  Of course, this also assumes address space 0 is
>> generic.  This is currently true for the in-tree targets and
>> CUDA/OpenCL support in Clang, but I don't believe its a set rule
>> anywhere.
>>
> What do mean by 'generic' address space here?

In this context, 'generic' means an address space that encompasses all 
other address spaces.  This is roughly equivalent to the OpenCL 2.0 
generic address space.  Is addrspace(0) not a generic address space for 
R600?

>
> -Tom
>
>   
>> On 03/24/2014 02:28 PM, Jingyue Wu wrote:
>>> I agree with your concern. However, both CUDA and OpenCL (two most
>>> popular users of addrspacecast I believe) support generic address
>>> space, and could benefit from this optimization Would we end up
>>> with duplicated code (at least one for CUDA one for opencl) if we
>>> put it in the back-end?
>>>
>>> Jingyue
>>>
>>>
>>> On Mon, Mar 24, 2014 at 11:22 AM, Justin Holewinski
>>> <jholewinski at nvidia.com <mailto:jholewinski at nvidia.com>> wrote:
>>>
>>>     The hard part would be making this optimization general enough to
>>>     be target-independent.  Optimizing to non-zero address spaces may
>>>     not make sense for all targets (or even all future versions of
>>>     PTX).  I agree that there should be an IR-level optimization for
>>>     this, but perhaps its too target-specific and should actually live
>>>     in the back-end.
>>>
>>>
>>>     On 03/24/2014 01:05 PM, Jingyue Wu wrote:
>>>>     Right. We are aware of this issue, and think it should be
>>>>     addressed in the IR optimizer (similar to InstCombineLoadCast and
>>>>     InstCombineStoreToCast) instead of clang. Do you think this is an
>>>>     appropriate approach? Is this optimization general enough to stay
>>>>     in the IR optimizer or target-dependent?
>>>>
>>>>     Jingyue
>>>>
>>>>
>>>>     On Mon, Mar 24, 2014 at 4:54 AM, Justin Holewinski
>>>>     <justin.holewinski at gmail.com
>>>>     <mailto:justin.holewinski at gmail.com>> wrote:
>>>>
>>>>         Hi Jingyue,
>>>>
>>>>         I committed the addrspacecast isel patterns to NVPTX.  Also,
>>>>         I wanted to point out that your changes in the last test case
>>>>         in this patch (address-spaces.cu <http://address-spaces.cu>)
>>>>         represent changes that may lead to performance degradation.
>>>>          Specific address spaces should be used whenever possible for
>>>>         loads/stores.  Casting everything to a generic address is
>>>>         still correct, but may lead to additional indirections for
>>>>         the hardware.
>>>>
>>>>
>>>>         On Fri, Mar 21, 2014 at 2:25 PM, Justin Holewinski
>>>>         <jholewinski at nvidia.com <mailto:jholewinski at nvidia.com>> wrote:
>>>>
>>>>             addrspacecast support in NVPTX is on my todo list.  I'll
>>>>             try to put something together in the next few days.
>>>>
>>>>
>>>>             On 3/21/14, 2:20 PM, Jingyue Wu wrote:
>>>>>             Hi,
>>>>>
>>>>>             Static local variables in CUDA can be declared with
>>>>>             address space qualifiers, such as __shared__. Therefore,
>>>>>             the codegen needs to potentially addrspacecast a static
>>>>>             local variable to the type expected by its declaration.
>>>>>             Peter did something similar for global variables in
>>>>>             r157167.
>>>>>
>>>>>             All clang tests passed.
>>>>>
>>>>>             Justin: The NVPTX backend support for addrspacecast
>>>>>             seems not complete. We can send you follow-up patches
>>>>>             once this one gets in.
>>>>>
>>>>>             Jingyue
>>>>
>>>>             --             Thanks,
>>>>
>>>>             Justin Holewinski
>>>>
>>>>             ------------------------------------------------------------------------
>>>>             This email message is for the sole use of the intended
>>>>             recipient(s) and may contain confidential information.
>>>>             Any unauthorized review, use, disclosure or distribution
>>>>             is prohibited. If you are not the intended recipient,
>>>>             please contact the sender by reply email and destroy all
>>>>             copies of the original message.
>>>>             ------------------------------------------------------------------------
>>>>
>>>>
>>>>
>>>>
>>>>         --
>>>>
>>>>         Thanks,
>>>>
>>>>         Justin Holewinski
>>>>
>>>>
>>>
>> _______________________________________________
>> cfe-commits mailing list
>> cfe-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140324/9203d346/attachment.html>


More information about the cfe-commits mailing list