[Parallel_libs-commits] [PATCH] D23038: [StreamExecutor] Add KernelLoaderSpec

Justin Lebar via Parallel_libs-commits parallel_libs-commits at lists.llvm.org
Tue Aug 2 14:55:15 PDT 2016


jlebar added inline comments.

================
Comment at: streamexecutor/include/streamexecutor/KernelSpec.h:29
@@ +28,3 @@
+///
+/// The loader spec classes declared here are designed primarily to be
+/// instantiated by the compiler, but they can also be instantiated directly by
----------------
lgtm

================
Comment at: streamexecutor/include/streamexecutor/KernelSpec.h:52
@@ +51,3 @@
+///    } // namespace compiler_cuda_namespace
+///    \encode
+/// 5. The host code, having known beforehand that the compiler would initialize
----------------
I have no idea what that comment was about, sorry.  I think I was just very confused by that most vexing parse.

================
Comment at: streamexecutor/include/streamexecutor/KernelSpec.h:106
@@ +105,3 @@
+  ///
+  /// A pair is used because this will be the key of a map.
+  using ComputeCapability = std::pair<int, int>;
----------------
Not sure we need this line anymore.

================
Comment at: streamexecutor/include/streamexecutor/KernelSpec.h:120
@@ +119,3 @@
+  /// Holds a reference to each passed in PTXCode memory. Does not make a copy
+  /// and does not take ownership.
+  CUDAPTXInMemorySpec(llvm::StringRef KernelName,
----------------
Here and elsewhere, I think we should change "holds a reference to".  For example, if I have a std::shared_ptr<Foo>, I "hold a (strong) reference to Foo", which is the opposite of what you mean.

Maybe just "Does not take ownership of the PTXCode pointers in SpecList's elements." or something would make it clear what we're going after.

================
Comment at: streamexecutor/include/streamexecutor/KernelSpec.h:126
@@ +125,3 @@
+  ///
+  /// Returns nullptr on failed lookup (if the requested compute capability is
+  /// not available). Matches exactly the specified compute capability. Doesn't
----------------
> I think the original design is the right trade-off because this is only meant to be called by the compiler.

sgtm

================
Comment at: streamexecutor/include/streamexecutor/KernelSpec.h:144
@@ +143,3 @@
+/// A KernelLoaderSpec for CUDA fatbin code that resides in memory as a
+/// null-terminated string.
+class CUDAFatbinInMemorySpec : public KernelLoaderSpec {
----------------
It's not a null-terminated string, I think.  I thought it was at first, but I think I was being mislead by "char" -- it's more likely that the first 4 bytes are the length (or something).

================
Comment at: streamexecutor/include/streamexecutor/KernelSpec.h:178
@@ +177,3 @@
+private:
+  const char *Text;
+
----------------
Hm...what if we used char* for text strings and uint8_t* for byte data?

================
Comment at: streamexecutor/lib/unittests/KernelSpecTest.cpp:26
@@ +25,3 @@
+  ASSERT_EQ("KernelName", Spec.getKernelName());
+  ASSERT_EQ(nullptr, Spec.getCode(1, 0));
+}
----------------
ASSERT has an implicit "return", so we usually call EXPECT unless the test is going to crash if the condition fails.

================
Comment at: streamexecutor/lib/unittests/KernelSpecTest.cpp:74
@@ +73,3 @@
+  ASSERT_DEATH(MultiSpec.getOpenCLTextInMemory(),
+               "getting spec that is not present");
+#endif
----------------
EXPECT_DEBUG_DEATH and you don't need the ifdefs.  Same below.


https://reviews.llvm.org/D23038





More information about the Parallel_libs-commits mailing list