[Parallel_libs-commits] [PATCH] D25701: Initial check-in of Acxxel (StreamExecutor renamed)
Jason Henline via Parallel_libs-commits
parallel_libs-commits at lists.llvm.org
Tue Oct 25 11:04:44 PDT 2016
jhen added a comment.
In https://reviews.llvm.org/D25701#578210, @jlebar wrote:
> Would it be worth updating (some of?) the examples to the fluent interface?
Good idea, I updated both of them.
================
Comment at: acxxel/acxxel.cpp:25
+Expected<Platform *> getOpenCLPlatform();
+} // namespace opencl
+
----------------
jlebar wrote:
> These namespaces seem redundant with the function names? What are you trying to achieve by keeping these out of the root acxxel namespace? (I suspect you have a good reason...)
Maybe not a good reason, but the reason I did it was because I also defined functions with the same name below. The functions below are wrappers for these functions, and the wrappers are responsible for returning an error if the library was compiled without support for the given platform.
To try to remove the redundancy, I changed both of these function names to just `getPlatform` and left the namespaces in place to distinguish which platform they are for. I'm happy to rename these functions to anything else, but I just want to preserve the public names `getCUDAPlatform` and `getOpenCLPlatform` for the wrappers.
================
Comment at: acxxel/acxxel.h:227
+/// A Stream has an associated device index which is set to the platform's
+/// active device index for the thread that creates the Stream. For a thread to
+/// enqueue commands onto the Stream, its active device index in the platform
----------------
jlebar wrote:
> "platform's active device index for the thread" is awkward. This was hard for me to rephrase, but I think maybe it's better if we move the notion that it's platform-specific to the front of the paragraph, so we don't have to repeat it. Something like
>
> Each Platform has a notion of the currently active device on a particular thread (see Platform::getActiveDeviceForThread and Platform::setActiveDeviceForThread). Each Stream is associated with a specific, fixed device, set to the current thread's active device when we create the Stream. Whenever a thread enqueues commands onto a Stream, its active device must match the Stream's device.
Thanks! As you can tell, I was having a tough time explaining things there.
================
Comment at: acxxel/acxxel.h:246
+ /// Stream, or by any error in the synchronization process itself.
+ Status sync();
+
----------------
jlebar wrote:
> Upon reconsideration, I am not entirely sold on how we're handling errors.
>
> If you use the fluent API, you might reasonably write code like:
>
> Status s = stream.do_something().copy_async().sync();
> if (!s.ok()) return s;
>
> But this is subtly wrong. It will usually do what you want (return errors that occurred in any of the three operations), but if we happen to get context-switched in between the first two operations, an error from the first one may be reported when we launch the second one. In which case the sync may return OK.
>
> (You can write effectively the same code with the non-fluent API as well.)
>
> This problem is exacerbated by the fact that Status is now a warn-if-unused type, so if someone writes the above code and *doesn't* look at the return value, we'll warn them to!
>
> We could fix this by making sync() return void. Alternatively, we could make sync() implicitly do a takeStatus() (we'd want to comment this). I kind of like the latter, because, presuming that you compile with clang and synchronize your streams, this makes it hard to forget error checking. We could add the warn_unused attribute to this function and then get warnings with gcc, too. (You could do that in a separate patch if you wanted.)
>
> What do you think?
`sync()` is already doing an implicit `takeStatus()`, as you suggested. I didn't realize it wasn't mentioned in the comments. I've fixed the comments now.
I also added `warn_if_unused` the `Stream` functions returning a `Status` so that gcc can produce nice warnings too.
================
Comment at: acxxel/cuda_acxxel.cpp:536
+Expected<Platform *> getCUDAPlatform() {
+ static Expected<CUDAPlatform> MaybePlatform = [] {
+ return CUDAPlatform::create();
----------------
jlebar wrote:
> Storing it as a value type here will cause the destructor to run on program shutdown. :)
>
> Now that I see this I like even more that we pushed the encapsulation down to these functions, because it gives you the flexibility to run code on program shutdown, or not, depending on what your platform needs.
Now storing it as a raw pointer instead. Same change for OpenCL.
================
Comment at: acxxel/status.cpp:17
+// Cannot use default because the move assignment operator for std::string is
+// not marked noexcept in a common standard library implementation.
+Status &Status::operator=(Status &&That) noexcept {
----------------
jlebar wrote:
> As far as I can tell from cppreference, it's just not noexcept period.
>
> Any reason you don't want these in the header file?
Changed the comment simply to say that the move assignment operator for std::string is noexcept.
No good reason not to have these in the header. I moved them there.
================
Comment at: acxxel/status.h:22
+#ifdef __clang__
+#define ACXXEL_WARN_UNUSED_RESULT __attribute__((warn_unused_result))
+#else
----------------
jlebar wrote:
> jhen wrote:
> > jlebar wrote:
> > > gcc seems to support this, at least at head: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes
> > Unfortunately gcc only seems to support this as a function attribute (and that seems consistent with the documentation you cited). When I tried to use it as a class attribute, I got a compilation error.
> Aha, I see. That's cool, I didn't realize that clang let you warn_unused_result on a class.
>
> Maybe this macro should have a slightly different name, to address my point of confusion. Then maybe we can have a comment to the effect of your response above?
I changed the name to `ACXXEL_WARN_UNUSED_RESULT_TYPE` and added a comment explaining why it only works with clang.
https://reviews.llvm.org/D25701
More information about the Parallel_libs-commits
mailing list