[cfe-dev] [LLVMdev] RFC: Representation of OpenCL Memory Spaces

Fri Oct 14 05:42:27 PDT 2011

On Thu, Oct 13, 2011 at 7:56 PM, Mon P Wang <monping at apple.com> wrote:

> Hi,
>
> Tanya and I also prefer the extended TBAA solution as it naturally fits
> with LLVM.  From my understanding of TBAA, it seems to provide the power to
> describe the relationship between address spaces for alias analysis, i.e.,
> it can describe if two address spaces are disjoint or one may nest within
> another.  For OpenCL, it is most useful to indicate that address spaces are
> disjoint from the point of view of alias analysis, even though the underlying
> memory may be the same, as on x86.  The question is whether something is
> missing in TBAA that keeps it from properly describing the semantics we
> want for an address space.
>

From what I can tell, extending TBAA is perfectly fine for the alias
problem.  I really just want to make sure we're providing enough hooks in
the front-end and IR so that any back-end can be used for OpenCL code gen.
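
To make the disjoint-versus-nested relationship concrete, here is a minimal
sketch of a TBAA-style tree query in Python.  It is only an illustration, not
how LLVM implements TBAA: the PARENT table and both helper functions are
hypothetical, the "$__global" and "$__local" names are borrowed from the
metadata quoted further down, and "float$__nested" is an invented node that
exists only to show one space nesting within another.

```python
# Hypothetical TBAA-style type tree: each tag maps to its parent tag,
# and the root maps to None.
PARENT = {
    "Simple C/C++ TBAA": None,
    "omnipotent char": "Simple C/C++ TBAA",
    "float$__global": "omnipotent char",
    "float$__local": "omnipotent char",
    # Invented node, used only to illustrate nesting under another tag.
    "float$__nested": "float$__global",
}


def ancestors(tag):
    """Return the tag followed by all of its ancestors up to the root."""
    chain = []
    while tag is not None:
        chain.append(tag)
        tag = PARENT[tag]
    return chain


def may_alias(a, b):
    """Two tagged accesses may alias only if one tag is the other tag
    or one of its ancestors in the tree."""
    return a in ancestors(b) or b in ancestors(a)
```

Under these assumptions, sibling tags such as "float$__global" and
"float$__local" are reported as disjoint, while a tag and anything nested
beneath it are still treated as possibly aliasing.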

>
>   -- Mon Ping
>
>
>
> On Oct 13, 2011, at 1:14 PM, Justin Holewinski wrote:
>
>
>
> On Thu, Oct 13, 2011 at 11:57 AM, Peter Collingbourne <peter at pcc.me.uk> wrote:
>
>> Hi Justin,
>>
>> Thanks for bringing this up, I think it's important to discuss
>> these issues here.
>>
>> On Thu, Oct 13, 2011 at 09:46:28AM -0400, Justin Holewinski wrote:
>> > It is becoming increasingly clear to me that LLVM address spaces are not
>> > the general solution to OpenCL/CUDA memory spaces. They are a convenient
>> > hack to get things working in the short term, but I think a more long-term
>> > approach should be discussed and decided upon now before the OpenCL and
>> > CUDA implementations in Clang/LLVM get too mature. To be clear, I am not
>> > advocating that *targets* change to a different method for representing
>> > device memory spaces. The current use of address spaces to represent
>> > different types of device memory is perfectly valid, IMHO. However, this
>> > knowledge should not be encoded in front-ends and pre-SelectionDAG
>> > optimization passes.
>>
>> I disagree.  The targets should expose all the address spaces they
>> provide, and the frontend should know about the various address spaces
>> it needs to know about.  It is incumbent on the frontend to deliver
>> a valid IR for a particular language implementation, and part of
>> that involves knowing about the ABI requirements for the language
>> implementation (which may involve using specific address spaces)
>> and the capabilities of each target (including the capabilities of
>> the target's address spaces), together with the language semantics.
>> It is not the job of the optimisers or backend to know the semantics
>> for a specific language, a specific implementation of that language
>> or a specific ABI.
>>
>
> But this assumes a valid one-to-one mapping between OpenCL memory spaces
> and a target's back-end address spaces.  What happens for a target such as
> x86?  Do we introduce pseudo address spaces into the back-end just to
> satisfy the front-end OpenCL requirements?
>
>
>> >
>> >
>> > *2. Solutions*
>> >
>> > A couple of solutions to this problem are presented here, with the hope
>> > that the Clang/LLVM community will offer a constructive discussion on how
>> > best to proceed with OpenCL/CUDA support in Clang/LLVM. The following list
>> > is in no way meant to be exhaustive; it merely serves as a starting basis
>> > for discussion.
>> >
>> >
>> > *2A. Extend TBAA*
>> >
>> > In theory, the type-based alias analysis pass could be extended to
>> > (properly) support aliasing queries for pointers in OpenCL kernels.
>> > Currently, it has no way of knowing if two pointers in different address
>> > spaces can alias, and in fact cannot know if this is the case given the
>> > definition of LLVM address spaces.  Instead of programming it with
>> > target-specific knowledge, it can be extended with language-specific
>> > knowledge.  Instead of considering address spaces, the Clang portion of
>> > TBAA can be programmed to use OpenCL attributes to extend its pointer
>> > metadata.  Specifically, pointers to different memory spaces are in
>> > essence different types and cannot alias.  For the kernel shown above, the
>> > resulting LLVM IR could be:
>> >
>> > ; ModuleID = 'test1.cl'
>> > target datalayout = "e-p:32:32-i64:64:64-f64:64:64-n1:8:16:32:64"
>> > target triple = "ptx32--"
>> >
>> > define ptx_kernel void @foo(float* nocapture %a, float addrspace(4)*
>> > nocapture %b) nounwind noinline {
>> > entry:
>> >   %0 = load float* %a, align 4, !tbaa !1
>> >   store float %0, float addrspace(4)* %b, align 4, !tbaa !2
>> >   ret void
>> > }
>> >
>> > !opencl.kernels = !{!0}
>> >
>> > !0 = metadata !{void (float*, float addrspace(4)*)* @foo}
>> > !1 = metadata !{metadata !"float$__global", metadata !3}
>> > !2 = metadata !{metadata !"float$__local", metadata !3}
>> > !3 = metadata !{metadata !"omnipotent char", metadata !4}
>> > !4 = metadata !{metadata !"Simple C/C++ TBAA", null}
>> >
>> > The differences are the !2 tag on the store and the definitions of !1
>> > and !2.  Here, the TBAA pass would be able to identify that the load and
>> > the store do not alias.  Of course, when compiling in non-OpenCL/CUDA
>> > mode, TBAA would work just as before.
>>
>> I have to say that I much prefer the TBAA solution, as it encodes the
>> language semantics using the existing metadata for language semantics.
>>
>
> It's certainly the easiest to implement and would have the least impact
> (practically zero) on existing passes.
>
>
>>
>> > *Pros:*
>> >
>> > Relatively easy to implement
>> >
>> > *Cons:*
>> >
>> > Does not solve the full problem, such as how to represent OpenCL memory
>> > spaces in other backends, such as X86 which uses LLVM address spaces for
>> > different purposes.
>>
>> This presupposes that we need a way of representing OpenCL address
>> spaces in IR targeting X86 (and targets which lack GPU-like address
>> spaces).  As far as I can tell, the only real representations of
>> OpenCL address spaces on such targets that we need are a way of
>> distinguishing the different address spaces for alias analysis
>> and a representation for __local variables allocated on the stack.
>> TBAA metadata would solve the first problem, and we already have
>> mechanisms in the frontend that could be used to solve the second.
>>
>
> Which mechanisms could be used to differentiate between thread-private and
> __local data?
>
>
>>
>> > I see this solution as more of a short-term hack to solve the pointer
>> > aliasing issue without actually addressing the larger issues.
>>
>> I remain to be persuaded that there are any "larger issues" to solve.
>>
>> > *2B. Emit OpenCL/CUDA-specific Metadata or Attributes*
>> >
>> > Instead of using LLVM address spaces to represent OpenCL/CUDA memory
>> > spaces, language-specific annotations can be provided on types.  This can
>> > take the form of metadata, or additional LLVM IR attributes on types and
>> > parameters, such as:
>> >
>> > ; ModuleID = 'test1.cl'
>> > target datalayout = "e-p:32:32-i64:64:64-f64:64:64-n1:8:16:32:64"
>> > target triple = "ptx32--"
>> >
>> > define ocl_kernel void @foo(float* nocapture ocl_global %a, float*
>> > nocapture ocl_local %b) nounwind noinline {
>> > entry:
>> >   %0 = load float* %a, align 4
>> >   store float %0, float* %b, align 4
>> >   ret void
>> > }
>> >
>> > Instead of extending the LLVM IR language, this information could also
>> > be encoded as metadata by either (1) emitting some global metadata that
>> > binds useful properties to globals and parameters, or (2) extending LLVM
>> > IR to allow attributes on parameters and globals.
>> >
>> > Optimization passes can make use of these additional attributes to
>> > derive useful properties, such as %a cannot alias %b. Then, back-ends can
>> > use these attributes to emit proper code sequences based on the pointer
>> > attributes.
>> >
>> > *Pros:*
>> >
>> > If done right, would solve the general problem
>> >
>> > *Cons:*
>> >
>> > Large implementation commitment; could potentially touch many parts of
>> > LLVM.
>>
>> You are being vague about what is required here.  A complete solution
>> following 2B would involve allowing these attributes on all pointer
>> types.  It would be very expensive to allow custom attributes or
>> metadata on pointer types, since they are used frequently in the IR,
>> and the common case is not to have attributes or metadata.  Also,
>> depending on how this is implemented, this would encode far too much
>> language specific information in the IR.
>>
>
> I agree that this would be expensive, and I'm not necessarily advocating
> it. If the consensus is that TBAA extensions are sufficient for all cases,
> then I'm fine with that.  It's much less work. :)
>
> I just want to make sure we're covering all of our bases before we proceed
> too far with this.
>
>
>>
>> Thanks,
>> --
>> Peter
>>
>
>
>
> --
>
> Thanks,
>
> Justin Holewinski
>
>
>
>

-- 

Thanks,

Justin Holewinski