PATCH: fix clang to emit correct addrspacecast for CUDA

Tom Stellard tom at stellard.net
Mon Mar 24 12:08:35 PDT 2014


On Mon, Mar 24, 2014 at 03:01:14PM -0400, Justin Holewinski wrote:
> On 03/24/2014 02:53 PM, Tom Stellard wrote:
> >On Mon, Mar 24, 2014 at 02:46:06PM -0400, Justin Holewinski wrote:
> >>I don't have anything against making this a target-independent
> >>IR-level, as long as no-one complains about it being a core pass.
> >>Perhaps the pass could only execute if a target explicitly enables a
> >>flag.  Something like "preferNonGenericPointers".  The default could
> >>be 'false', and the pass would only modify the IR if the target sets
> >>it to 'true'.  Of course, this also assumes address space 0 is
> >>generic.  This is currently true for the in-tree targets and
> >>CUDA/OpenCL support in Clang, but I don't believe its a set rule
> >>anywhere.
> >>
> >What do mean by 'generic' address space here?
> 
> In this context, 'generic' means an address space that encompasses
> all other address spaces.  This is roughly equivalent to the OpenCL
> 2.0 generic address space.  Is addrspace(0) not a generic address
> space for R600?
> 

No, it's not generic.  It's what we use for OpenCL's private address space.
Also, I missed the beginning of this discussion, which target-independent pass
does this impact?

-Tom

> >
> >-Tom
> >
> >>On 03/24/2014 02:28 PM, Jingyue Wu wrote:
> >>>I agree with your concern. However, both CUDA and OpenCL (two most
> >>>popular users of addrspacecast I believe) support generic address
> >>>space, and could benefit from this optimization Would we end up
> >>>with duplicated code (at least one for CUDA one for opencl) if we
> >>>put it in the back-end?
> >>>
> >>>Jingyue
> >>>
> >>>
> >>>On Mon, Mar 24, 2014 at 11:22 AM, Justin Holewinski
> >>><jholewinski at nvidia.com <mailto:jholewinski at nvidia.com>> wrote:
> >>>
> >>>    The hard part would be making this optimization general enough to
> >>>    be target-independent.  Optimizing to non-zero address spaces may
> >>>    not make sense for all targets (or even all future versions of
> >>>    PTX).  I agree that there should be an IR-level optimization for
> >>>    this, but perhaps its too target-specific and should actually live
> >>>    in the back-end.
> >>>
> >>>
> >>>    On 03/24/2014 01:05 PM, Jingyue Wu wrote:
> >>>>    Right. We are aware of this issue, and think it should be
> >>>>    addressed in the IR optimizer (similar to InstCombineLoadCast and
> >>>>    InstCombineStoreToCast) instead of clang. Do you think this is an
> >>>>    appropriate approach? Is this optimization general enough to stay
> >>>>    in the IR optimizer or target-dependent?
> >>>>
> >>>>    Jingyue
> >>>>
> >>>>
> >>>>    On Mon, Mar 24, 2014 at 4:54 AM, Justin Holewinski
> >>>>    <justin.holewinski at gmail.com
> >>>>    <mailto:justin.holewinski at gmail.com>> wrote:
> >>>>
> >>>>        Hi Jingyue,
> >>>>
> >>>>        I committed the addrspacecast isel patterns to NVPTX.  Also,
> >>>>        I wanted to point out that your changes in the last test case
> >>>>        in this patch (address-spaces.cu <http://address-spaces.cu>)
> >>>>        represent changes that may lead to performance degradation.
> >>>>         Specific address spaces should be used whenever possible for
> >>>>        loads/stores.  Casting everything to a generic address is
> >>>>        still correct, but may lead to additional indirections for
> >>>>        the hardware.
> >>>>
> >>>>
> >>>>        On Fri, Mar 21, 2014 at 2:25 PM, Justin Holewinski
> >>>>        <jholewinski at nvidia.com <mailto:jholewinski at nvidia.com>> wrote:
> >>>>
> >>>>            addrspacecast support in NVPTX is on my todo list.  I'll
> >>>>            try to put something together in the next few days.
> >>>>
> >>>>
> >>>>            On 3/21/14, 2:20 PM, Jingyue Wu wrote:
> >>>>>            Hi,
> >>>>>
> >>>>>            Static local variables in CUDA can be declared with
> >>>>>            address space qualifiers, such as __shared__. Therefore,
> >>>>>            the codegen needs to potentially addrspacecast a static
> >>>>>            local variable to the type expected by its declaration.
> >>>>>            Peter did something similar for global variables in
> >>>>>            r157167.
> >>>>>
> >>>>>            All clang tests passed.
> >>>>>
> >>>>>            Justin: The NVPTX backend support for addrspacecast
> >>>>>            seems not complete. We can send you follow-up patches
> >>>>>            once this one gets in.
> >>>>>
> >>>>>            Jingyue
> >>>>
> >>>>            --             Thanks,
> >>>>
> >>>>            Justin Holewinski
> >>>>
> >>>>            ------------------------------------------------------------------------
> >>>>            This email message is for the sole use of the intended
> >>>>            recipient(s) and may contain confidential information.
> >>>>            Any unauthorized review, use, disclosure or distribution
> >>>>            is prohibited. If you are not the intended recipient,
> >>>>            please contact the sender by reply email and destroy all
> >>>>            copies of the original message.
> >>>>            ------------------------------------------------------------------------
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>        --
> >>>>
> >>>>        Thanks,
> >>>>
> >>>>        Justin Holewinski
> >>>>
> >>>>
> >>>
> >>_______________________________________________
> >>cfe-commits mailing list
> >>cfe-commits at cs.uiuc.edu
> >>http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
> 



More information about the cfe-commits mailing list