[Parallel_libs-dev] StreamExecutor design questions

Justin Lebar jlebar at google.com
Fri Sep 2 15:00:59 PDT 2016


> we might want to have the platform pass out std::shared_ptr to Device instead of raw Device*

Maybe...but then we could easily get into a state where naively
written client code ends up causing us to eagerly destroy Device
objects...  Like, most client code probably wants the Device to live
until exit.

Also I'm a little paranoid about doing anything that would force us to
do atomic ops, since we've seen performance problems related to that.

I believe there is a CUDA "shutdown the device" function that we might
want to call atexit.  If nothing else, it flushes profiling
information, which is a good thing to do.  It would probably be nice
if we called that function automagically...

On Fri, Sep 2, 2016 at 2:27 PM, Jason Henline <jhen at google.com> wrote:
> Hi James,
>
> I'm glad to hear you are experimenting with OpenCL in SE, and I really
> appreciate your insights and questions about the design. I definitely don't
> think it's too early for questions like these, and I hope you will continue
> to ask any questions you have in the future as well.
>
> What’s the motivation for having the PlatformKernelHandle and
> PlatformStreamHandle classes, rather than just using opaque pointers? These
> classes result in a some extra subclassing when implementing an SE backend,
> just to wrap a platform specific pointer (e.g. cl_kernel or
> cl_command_queue), but maybe I’m missing the point here. It also seems to be
> slightly inconsistent with the device memory stuff, which uses raw opaque
> pointers at the PlatformDevice level.
>
> My idea with PlatformKernelHandle and PlatformStreamHandle was that their
> destructors would free any resources they held. So, for example, in the case
> of your OCLStreamHandle class, I was thinking that you would also define a
> destructor that called clReleaseCommandQueue.
>
> I agree this is inconsistent with the way device memory is currently handled
> using a void* handle, and I'm glad you pointed that out. The device memory
> object currently handles its own cleanup by making a call to
> PlatformDevice::freeDeviceMemory, so we could create consistency by
> following this model. PlatformKernelHandle and PlatformStreamHandle would
> each store a void* opaque handle to the platform-specific implementation and
> new functions would be added to the PlatformDevice interface to
> destroyKernel(PlatformKernelHandle*) and
> destroyStream(PlatformStreamHandle*). Then the destructors for
> PlatformKernelHandle and PlatformStreamHandle would call the corresponding
> Device destroy functions just like GlobalDeviceMemoryBase calls the
> freeDeviceMemory function in its destructor.
>
> Now that I've laid it out this way, I think you're right that that design
> would be better. It would be consistent and it would be easier to implement
> a platform. Does that sound like the right design to you, as well? If so,
> I'll write up some patches to make those changes.
>
> It also seems that there is no way for a user to fully release the resources
> used by a SE device, since the end-user doesn’t own the Device pointer.
> Maybe I’m worrying about nothing here, but I wonder whether there may be
> some devices that become ‘unhappy' if you don’t release their
> contexts/default queues etc before the application exits.
>
> I think you are right to say there is no way for a user to free an SE
> device. I don't know if anyone will ever need to free those resources, but I
> think we will want to add a method to the platform class to free the
> resources for a device with a given ordinal. Of course that might leave
> dangling references to a device that has been freed, so we might want to
> have the platform pass out std::shared_ptr to Device instead of raw Device*.
> In any case, I think we have a few good options here that won't be hard to
> change later. What do you think?
>
> - If you *do* want the PlatformKernelHandle class, PlatformInterfaces.cpp
> was missing its destructor implementation.
>
> Good catch. As per the discussion above, I will probably be adding it in
> with a non-default body if we switch to using void* handles.
>
> - Should Device have a getName() that forwards onto PlatformDevice?
> Otherwise there seems to be no way for the end-user to get at the name.
>
> I think that's a great idea. I can add a patch for it, or you can feel free
> to submit a patch as well and I'll be glad to review it and check it in as
> needed.
>
> - Missing LICENSE.txt at the top-level parallel-libs directory?
>
> You're probably right, but I want to check with chandlerc when he gets back
> from vacation next week. Since parallel-libs is supposed to be made up of
> different sub-projects, I think there's some possibility that there might be
> different licenses. I'll try to figure out if I should put a LICENSE.txt in
> the top-level, the streamexecutor subdirectory, or both.
>
> Thanks for all your help with this project,
>
> -Jason
>
> On Fri, Sep 2, 2016 at 12:32 PM James Price <J.Price at bristol.ac.uk> wrote:
>>
>> Hi,
>>
>> I’ve thrown together a quick and dirty OpenCL backend for SE to play
>> around with writing some application code that uses it, and as a result have
>> one or two design questions. I know it’s early days so I apologise if I’m
>> jumping the gun.
>>
>> What’s the motivation for having the PlatformKernelHandle and
>> PlatformStreamHandle classes, rather than just using opaque pointers? These
>> classes result in a some extra subclassing when implementing an SE backend,
>> just to wrap a platform specific pointer (e.g. cl_kernel or
>> cl_command_queue), but maybe I’m missing the point here. It also seems to be
>> slightly inconsistent with the device memory stuff, which uses raw opaque
>> pointers at the PlatformDevice level.
>>
>> It also seems that there is no way for a user to fully release the
>> resources used by a SE device, since the end-user doesn’t own the Device
>> pointer. Maybe I’m worrying about nothing here, but I wonder whether there
>> may be some devices that become ‘unhappy' if you don’t release their
>> contexts/default queues etc before the application exits.
>>
>> Other minor points that I came across:
>>
>> - If you *do* want the PlatformKernelHandle class, PlatformInterfaces.cpp
>> was missing its destructor implementation.
>>
>> - Should Device have a getName() that forwards onto PlatformDevice?
>> Otherwise there seems to be no way for the end-user to get at the name.
>>
>> - Missing LICENSE.txt at the top-level parallel-libs directory?
>>
>> For reference, I’ve thrown this OpenCL backend up here (it’s not great,
>> but it works):
>> https://gist.github.com/jrprice/84e3f80cf0c851041d274a663d7ef73f
>>
>> Cheers,
>>
>> James
>>
>


More information about the Parallel_libs-dev mailing list