[Parallel_libs-commits] [PATCH] D25701: Initial check-in of Acxxel (StreamExecutor renamed)
Justin Lebar via Parallel_libs-commits
parallel_libs-commits at lists.llvm.org
Mon Oct 24 23:38:51 PDT 2016
jlebar added a comment.
I am happy with this except for my concern about the status handling on sync.
Would it be worth updating (some of?) the examples to the fluent interface?
================
Comment at: acxxel/acxxel.cpp:25
+Expected<Platform *> getOpenCLPlatform();
+} // namespace opencl
+
----------------
These namespaces seem redundant with the function names? What are you trying to achieve by keeping these out of the root acxxel namespace? (I suspect you have a good reason...)
================
Comment at: acxxel/acxxel.h:227
+/// A Stream has an associated device index which is set to the platform's
+/// active device index for the thread that creates the Stream. For a thread to
+/// enqueue commands onto the Stream, its active device index in the platform
----------------
"platform's active device index for the thread" is awkward. This was hard for me to rephrase, but I think maybe it's better if we move the notion that it's platform-specific to the front of the paragraph, so we don't have to repeat it. Something like
Each Platform has a notion of the currently active device on a particular thread (see Platform::getActiveDeviceForThread and Platform::setActiveDeviceForThread). Each Stream is associated with a specific, fixed device, set to the current thread's active device when we create the Stream. Whenever a thread enqueues commands onto a Stream, its active device must match the Stream's device.
================
Comment at: acxxel/acxxel.h:246
+ /// Stream, or by any error in the synchronization process itself.
+ Status sync();
+
----------------
Upon reconsideration, I am not entirely sold on how we're handling errors.
If you use the fluent API, you might reasonably write code like:
Status s = stream.do_something().copy_async().sync();
if (!s.ok()) return s;
But this is subtly wrong. It will usually do what you want (return errors that occurred in any of the three operations), but if we happen to get context-switched in between the first two operations, an error from the first one may be reported when we launch the second one. In which case the sync may return OK.
(You can write effectively the same code with the non-fluent API as well.)
This problem is exacerbated by the fact that Status is now a warn-if-unused type, so if someone writes the above code and *doesn't* look at the return value, we'll warn them to!
We could fix this by making sync() return void. Alternatively, we could make sync() implicitly do a takeStatus() (we'd want to comment this). I kind of like the latter, because, presuming that you compile with clang and synchronize your streams, this makes it hard to forget error checking. We could add the warn_unused attribute to this function and then get warnings with gcc, too. (You could do that in a separate patch if you wanted.)
What do you think?
================
Comment at: acxxel/cuda_acxxel.cpp:536
+Expected<Platform *> getCUDAPlatform() {
+ static Expected<CUDAPlatform> MaybePlatform = [] {
+ return CUDAPlatform::create();
----------------
Storing it as a value type here will cause the destructor to run on program shutdown. :)
Now that I see this I like even more that we pushed the encapsulation down to these functions, because it gives you the flexibility to run code on program shutdown, or not, depending on what your platform needs.
================
Comment at: acxxel/examples/opencl_example.cpp:59
+ for (int I = 0; I < 3; ++I)
+ assert(X[I] == Expected[I]);
+}
----------------
jhen wrote:
> jlebar wrote:
> > If we're using assert(), we should #error if we're built with NDEBUG, or alternatively #undef NDEBUG as the first thing in this file (before any #includes).
> I changed assert to "check and exit" to prevent having to deal with this.
Oh, I didn't realize that this wasn't a test, duh. Sorry about that.
As you have it is fine, or asserts are also fine, since it's just an example.
================
Comment at: acxxel/opencl_acxxel.cpp:535
+ OpenCLEvent *CLEvent = static_cast<OpenCLEvent *>(Event);
+ std::unique_lock<std::mutex> Lock(CLEvent->Mutex);
+ cl_event OldEvent = CLEvent->Event;
----------------
jhen wrote:
> jlebar wrote:
> > Did we decide to make streams single-threaded to avoid this overhead?
> >
> > If so, in addition to getting rid of the mutex, we should get rid of the comment about concurrently enqueueing an event, since it's not valid to touch the stream concurrently at all.
> Thanks for catching this. I got rid of the mutex and got rid of the whole `CLEvent` struct, since it is now just a `cl_event*`.
>
> I also got rid of some comments for `addCallback` that talked about accessing the same stream from multiple threads, but I didn't see a comment that talked about `Event` objects being enqueued on streams concurrently. Please let me know if I didn't get rid of a comment I should have.
You got the comments I was after; we're good.
================
Comment at: acxxel/span.h:94
+ !std::is_base_of<SpanBase, Container>::value &&
+ std::is_convertible<decltype(&Cont[0]), pointer>::value>::type * =
+ nullptr)
----------------
jhen wrote:
> jlebar wrote:
> > Nit, could we just use `data()` in the decltype? We're going to use it in the constructor below anyway. Or is this as specified and the spec is just weird?
> I did this to try to make sure that the container supported `operator[]` (the proposed spec suggested I should check for this). Do you think there is a cleaner way to do that?
If the spec says to check for operator[], I'm down with this as-is.
================
Comment at: acxxel/status.cpp:17
+// Cannot use default because the move assignment operator for std::string is
+// not marked noexcept in a common standard library implementation.
+Status &Status::operator=(Status &&That) noexcept {
----------------
As far as I can tell from cppreference, it's just not noexcept period.
Any reason you don't want these in the header file?
================
Comment at: acxxel/status.h:104
+ }
+ That.TheState = State::MOVED;
+ return *this;
----------------
Unnecessary now that we're taking `That` by value.
================
Comment at: acxxel/status.h:22
+#ifdef __clang__
+#define ACXXEL_WARN_UNUSED_RESULT __attribute__((warn_unused_result))
+#else
----------------
jhen wrote:
> jlebar wrote:
> > gcc seems to support this, at least at head: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes
> Unfortunately gcc only seems to support this as a function attribute (and that seems consistent with the documentation you cited). When I tried to use it as a class attribute, I got a compilation error.
Aha, I see. That's cool, I didn't realize that clang let you warn_unused_result on a class.
Maybe this macro should have a slightly different name, to address my point of confusion. Then maybe we can have a comment to the effect of your response above?
================
Comment at: acxxel/status.h:57
+ return *this;
+ }
+
----------------
jhen wrote:
> jlebar wrote:
> > Can we not `=default` these four?
> I learned something sad while trying to do this. `Status` has a `std::string` field, and it seems that the move assignment operator for `std::string` is not marked `noexcept` in my standard library. This means that the default move assignment operator for `Status` is not `noexcept` and I get a compile error if I try to declare it `noexcept` and `default`.
>
> To deal with this, I chose to retain the `noexcept` and to re-implement the default move assignment operator. I was able to default the other 3 with my standard library, so I did that.
OK, I think that's safe because std::string's move constructor doesn't throw, as I read http://en.cppreference.com/w/cpp/string/basic_string/basic_string.
================
Comment at: acxxel/status.h:94
+ // Intentionally implicit.
+ Expected(T &&Value) : TheState(State::SUCCESS), TheValue(std::move(Value)) {}
+
----------------
jhen wrote:
> jlebar wrote:
> > We can do the same thing with Value -- just take it by value and std::move it.
> I think I understood you correctly here. I collapsed the `const T &Value` and `T &&Value` constructors into one constructor taking the argument by value and moving it into `TheValue`. Let me know if I was only supposed to replace one of these constructors that way.
Yep. It's slightly less efficient because we move value twice instead of just once, but I seriously doubt it's going to matter.
================
Comment at: acxxel/tests/acxxel_test.cpp:352
+ std::condition_variable GoConditionVar;
+ std::atomic<bool> GoFlag(false);
+ std::mutex GoMutex;
----------------
Because we're now reading and writing it from inside a mutex, we don't need GoFlag to be an atomic. (I mean, I suppose it doesn't hurt...)
MarkerFlag is read outside a mutex, so it still needs to be an atomic, or otherwise we could just guard everything with the one mutex -- that seems like the simplest approach to me.
================
Comment at: acxxel/tests/span_test.cpp:37
+TEST(Span, PtrSizeConstruction) {
+ int ZeroSize = 0;
+ acxxel::Span<int> Span0(nullptr, ZeroSize);
----------------
jhen wrote:
> jlebar wrote:
> > Nit, not sure why we have a variable for this. If you just want to make it clear what the second arg to the Span constructor means, you could write
> >
> > Span(nullptr, /* size = */ 0);
> >
> > I believe we even have a clang-tidy check to check comments in this style.
> This is a fun one. If I pass in a literal `0` it tells me the call is ambiguous because `Span<int>` also has a constructor with the signature `Span<int>(int *FirstElem, int *LastElem)`. It seems that C++ is helpfully willing to convert a literal `0` to an `int *` (I'm sure it has to do with NULL and C compatibility).
>
> I still thought it was a useful test, though, because the size argument could be stored in a variable as I have done in my test.
Hahaha, okay then. :)
https://reviews.llvm.org/D25701
More information about the Parallel_libs-commits
mailing list