[PATCH] D18051: [CUDA] Provide CUDA's vector types implemented using clang's vector extension.

Thu Mar 10 13:04:54 PST 2016

jlebar added inline comments.

================
Comment at: lib/Headers/__clang_cuda_runtime_wrapper.h:72
@@ -71,1 +71,3 @@
 
+#if defined(CUDA_VECTOR_TYPES)
+// Prevent inclusion of CUDA's vector_types.h
----------------
Hm, this is a surprising (to me) way of controlling this feature.  Can we use a -f flag instead, even if all that flag does is define a macro?  (In that case I'd suggest giving the macro a longer name so it's harder to collide with.)

-fsomething would be more discoverable and canonical, I think, and would be easier to document.

================
Comment at: lib/Headers/__clang_cuda_vector_types.h:76
@@ +75,3 @@
+
+__attribute__((host,device))
+struct dim3 {
----------------
I thought host/device attributes weren't needed on classes, only functions?

================
Comment at: lib/Headers/__clang_cuda_vector_types.h:80
@@ +79,3 @@
+  __attribute__((host, device))
+  dim3(unsigned __x = 1, unsigned __y = 1, unsigned __z = 1)
+      : x(__x), y(__y), z(__z) {}
----------------
Nit: double underscore is a little weird here, and sort of needlessly competes with the language-reserved __ identifier namespace.  Could we just use one underscore?

================
Comment at: lib/Headers/__clang_cuda_vector_types.h:82
@@ +81,3 @@
+      : x(__x), y(__y), z(__z) {}
+  __attribute__((host, device)) explicit dim3(uint3 __a)
+      : x(__a.x), y(__a.y), z(__a.z) {}
----------------
NVIDIA's version of this constructor is not explicit -- is this difference intentional?
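
To make the practical difference concrete, here is a minimal sketch of what callers would see. All names in it (uint3_like, dim3_explicit, dim3_implicit, volume) are hypothetical stand-ins written for illustration, not the real CUDA types or the code in this patch, and the host/device attributes are left out so it compiles as plain C++11.

```cpp
#include <type_traits>

// Hypothetical stand-in for CUDA's uint3; not the real definition.
struct uint3_like { unsigned x, y, z; };

// Shaped like the patch: the uint3 constructor is marked explicit.
struct dim3_explicit {
  unsigned x, y, z;
  dim3_explicit(unsigned x_ = 1, unsigned y_ = 1, unsigned z_ = 1)
      : x(x_), y(y_), z(z_) {}
  explicit dim3_explicit(uint3_like a) : x(a.x), y(a.y), z(a.z) {}
};

// Same struct, but with a non-explicit uint3 constructor.
struct dim3_implicit {
  unsigned x, y, z;
  dim3_implicit(unsigned x_ = 1, unsigned y_ = 1, unsigned z_ = 1)
      : x(x_), y(y_), z(z_) {}
  dim3_implicit(uint3_like a) : x(a.x), y(a.y), z(a.z) {}
};

// A function taking the dim3 by value, the way a launch parameter might.
inline unsigned volume(dim3_implicit d) { return d.x * d.y * d.z; }

// With the explicit constructor, a uint3 no longer converts implicitly.
static_assert(!std::is_convertible<uint3_like, dim3_explicit>::value,
              "explicit constructor blocks implicit conversion");
static_assert(std::is_convertible<uint3_like, dim3_implicit>::value,
              "non-explicit constructor allows implicit conversion");
```

So existing code that passes a uint3 where a dim3 is expected would stop compiling with the explicit variant unless the call site spells out the conversion.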

================
Comment at: lib/Headers/__clang_cuda_vector_types.h:84
@@ +83,3 @@
+      : x(__a.x), y(__a.y), z(__a.z) {}
+  __attribute__((host, device)) operator uint3(void) { return {x, y, z}; }
+};
----------------
The braced return ({x, y, z}) requires C++11 -- is that intentional?
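
If C++11 is not meant to be a requirement, one possible C++03-compatible spelling is sketched below. uint3_pod, dim3_cxx03 and sum3 are hypothetical names used only for this example; the host/device attributes are omitted and the operator is made const here, which the patch's version is not.

```cpp
// Hypothetical stand-in for CUDA's uint3; not the real definition.
struct uint3_pod { unsigned x, y, z; };

struct dim3_cxx03 {
  unsigned x, y, z;
  dim3_cxx03(unsigned x_ = 1, unsigned y_ = 1, unsigned z_ = 1)
      : x(x_), y(y_), z(z_) {}

  // C++11 form, as in the patch: the braced list initializes the result.
  //   operator uint3_pod() { return {x, y, z}; }

  // C++03-compatible form: fill in a named temporary and return it.
  operator uint3_pod() const {
    uint3_pod r;
    r.x = x;
    r.y = y;
    r.z = z;
    return r;
  }
};

// Takes a uint3 by value so the conversion operator is exercised implicitly.
inline unsigned sum3(uint3_pod v) { return v.x + v.y + v.z; }
```

Both forms produce the same value; the second just avoids list-initialization in the return statement.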


http://reviews.llvm.org/D18051