PATCH: fix clang to emit correct addrspacecast for CUDA

Mon Mar 24 11:46:06 PDT 2014

I don't have anything against making this a target-independent IR-level, 
as long as no-one complains about it being a core pass. Perhaps the pass 
could only execute if a target explicitly enables a flag.  Something 
like "preferNonGenericPointers".  The default could be 'false', and the 
pass would only modify the IR if the target sets it to 'true'.  Of 
course, this also assumes address space 0 is generic.  This is currently 
true for the in-tree targets and CUDA/OpenCL support in Clang, but I 
don't believe its a set rule anywhere.

On 03/24/2014 02:28 PM, Jingyue Wu wrote:
> I agree with your concern. However, both CUDA and OpenCL (two most 
> popular users of addrspacecast I believe) support generic address 
> space, and could benefit from this optimization Would we end up with 
> duplicated code (at least one for CUDA one for opencl) if we put it in 
> the back-end?
>
> Jingyue
>
>
> On Mon, Mar 24, 2014 at 11:22 AM, Justin Holewinski 
> <jholewinski at nvidia.com <mailto:jholewinski at nvidia.com>> wrote:
>
>     The hard part would be making this optimization general enough to
>     be target-independent.  Optimizing to non-zero address spaces may
>     not make sense for all targets (or even all future versions of
>     PTX).  I agree that there should be an IR-level optimization for
>     this, but perhaps its too target-specific and should actually live
>     in the back-end.
>
>
>     On 03/24/2014 01:05 PM, Jingyue Wu wrote:
>>     Right. We are aware of this issue, and think it should be
>>     addressed in the IR optimizer (similar to InstCombineLoadCast and
>>     InstCombineStoreToCast) instead of clang. Do you think this is an
>>     appropriate approach? Is this optimization general enough to stay
>>     in the IR optimizer or target-dependent?
>>
>>     Jingyue
>>
>>
>>     On Mon, Mar 24, 2014 at 4:54 AM, Justin Holewinski
>>     <justin.holewinski at gmail.com
>>     <mailto:justin.holewinski at gmail.com>> wrote:
>>
>>         Hi Jingyue,
>>
>>         I committed the addrspacecast isel patterns to NVPTX.  Also,
>>         I wanted to point out that your changes in the last test case
>>         in this patch (address-spaces.cu <http://address-spaces.cu>)
>>         represent changes that may lead to performance degradation.
>>          Specific address spaces should be used whenever possible for
>>         loads/stores.  Casting everything to a generic address is
>>         still correct, but may lead to additional indirections for
>>         the hardware.
>>
>>
>>         On Fri, Mar 21, 2014 at 2:25 PM, Justin Holewinski
>>         <jholewinski at nvidia.com <mailto:jholewinski at nvidia.com>> wrote:
>>
>>             addrspacecast support in NVPTX is on my todo list.  I'll
>>             try to put something together in the next few days.
>>
>>
>>             On 3/21/14, 2:20 PM, Jingyue Wu wrote:
>>>             Hi,
>>>
>>>             Static local variables in CUDA can be declared with
>>>             address space qualifiers, such as __shared__. Therefore,
>>>             the codegen needs to potentially addrspacecast a static
>>>             local variable to the type expected by its declaration.
>>>             Peter did something similar for global variables in
>>>             r157167.
>>>
>>>             All clang tests passed.
>>>
>>>             Justin: The NVPTX backend support for addrspacecast
>>>             seems not complete. We can send you follow-up patches
>>>             once this one gets in.
>>>
>>>             Jingyue
>>
>>
>>             -- 
>>             Thanks,
>>
>>             Justin Holewinski
>>
>>             ------------------------------------------------------------------------
>>             This email message is for the sole use of the intended
>>             recipient(s) and may contain confidential information.
>>             Any unauthorized review, use, disclosure or distribution
>>             is prohibited. If you are not the intended recipient,
>>             please contact the sender by reply email and destroy all
>>             copies of the original message.
>>             ------------------------------------------------------------------------
>>
>>
>>
>>
>>         -- 
>>
>>         Thanks,
>>
>>         Justin Holewinski
>>
>>
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140324/b28d4bab/attachment.html>