[Parallel_libs-commits] [PATCH] D25701: Initial check-in of Acxxel (StreamExecutor renamed)

Justin Lebar via Parallel_libs-commits parallel_libs-commits at lists.llvm.org
Tue Oct 18 11:52:21 PDT 2016


jlebar added a comment.

> You were right that it did require checking the error at every line. To address this, I've added in a thread_local variable to keep track of the first error status. With this, users can make as many calls as they want without checking the error, then they can do a final check that nothing went wrong.

Should it be thread_local, or local to the stream itself?  (Are streams even thread-safe?  I would have assumed not, but if they are, we should comment that.)



================
Comment at: acxxel/acxxel.h:114
+  bool DeviceMap = false;
+  bool IOMemory = false;
+};
----------------
jhen wrote:
> jlebar wrote:
> > It's not obvious to me what these three mean; maybe they should have comments?
> I removed these and the config structs below because they were too CUDA-specific, and I don't think they were very helpful anyway.
So how do we access those configuration knobs now?  Or, is that functionality just not available to users?


================
Comment at: acxxel/acxxel.h:643
+                             Span<void *> Arguments, Span<size_t> ArgumentSizes,
+                             size_t SharedMemoryBytes = 0) {
+    return rawEnqueueKernelLaunch(TheStream.TheHandle.get(),
----------------
jhen wrote:
> jlebar wrote:
> > I am unclear how this kernel-launching stuff relates to the comment at the top of file saying we don't do this...  Is this for opencl only?
> I can't seem to find the comment at the top of the file that you mention here. Can you be a bit more specific about which comment you mean?
Sorry, I think I was thinking of the commit message.

The question stands, though, about exactly what we're saying our library does.  We can launch kernels on both CUDA and OpenCL, but only through this limited interface?


================
Comment at: acxxel/acxxel.h:1160
+        static_cast<ElementType *>(ThePointer.get())[I].~ElementType();
+      }
+    }
----------------
jhen wrote:
> jlebar wrote:
> >      T *Memory = new (MaybeMemory.getValue()) T[ElementCount];
> > 
> > Oh, hm.  Are you sure this is the right way to undo your allocation above?  Like, we don't have to somehow delete the placement new'ed array itself?  (I don't see a way to do it on cppreference.)
> My understanding is that the destructor of the `ThePointer` member will be called when this function ends. The `ThePointer` member is a `std::unique_ptr` with a custom deleter, and the custom deleter will call out to the platform to "free" the memory owned by `ThePointer`. So I don't think there is any memory leak here.
Definitely no memory leak.  The question is more whether C++ requires us to do an "array placement delete" to match the "array placement new", or whether it's OK to in-place destroy the elements piecewise like we do here.

Judging from the discussion at the link below (random people on the Internet, admittedly), using array placement new for this is dicey and may actually result in a buffer overflow.  Maybe we should just construct the elements one at a time?  Then this destruction code is clearly correct.

http://stackoverflow.com/questions/8720425/array-placement-new-requires-unspecified-overhead-in-the-buffer


================
Comment at: acxxel/acxxel.h:295
+/// A program can be created by calling Platform::createProgramFromSource, and a
+/// Kernel can be created from a program by running Platform::createKernel.
+class Program {
----------------
What is the purpose of having separate program and kernel classes?  It seems like a kernel is just a program plus a name, so could we just merge the two?  If not, maybe we can briefly explain why they are separate concepts.


================
Comment at: acxxel/acxxel.h:462
+                              this->getUnregisterHostMemoryHandleDestructor());
+  }
+
----------------
Perhaps, for consistency with the rest of the API, this should take a Span?  I don't really care either way, though; if you like it like this, that's fine.


https://reviews.llvm.org/D25701




