[PATCH] D76365: [cuda][hip] Add CUDA builtin surface/texture reference support.

Fri Mar 20 12:28:47 PDT 2020

tra added inline comments.

================
Comment at: clang/lib/CodeGen/CGCUDARuntime.h:51
+    };
+    unsigned Kind : 2;
+    unsigned Extern : 1;
----------------
This should be `DeviceVarKind`

================
Comment at: clang/lib/CodeGen/CGCUDARuntime.h:53
+    unsigned Extern : 1;
+    unsigned Constant : 2;   // Constant variable.
+    unsigned Normalized : 1; // Normalized texture.
----------------
Why does it need 2 bits?

In general, I think there's no point squeezing things into bitfields here as this struct is not going to be used all that often. I'd just use enum and bool.

================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:701-713
+  if (getLangOpts().CUDAIsDevice) {
+    // As CUDA builtin surface/texture types are replaced, skip generating TBAA
+    // access info.
+    if (AccessType->isCUDADeviceBuiltinSurfaceType()) {
+      if (getTargetCodeGenInfo().getCUDADeviceBuiltinSurfaceDeviceType() !=
+          nullptr)
+        return TBAAAccessInfo();
----------------
Would `isCUDADeviceBuiltinTextureType()` be sufficient criteria for skipping TBAA regeneration?
Or does it need to be 'it is the texture type and it will be replaced with something else'? What is 'something else' is the same type?

================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:4101-4127
+        if (const ClassTemplateSpecializationDecl *TD =
+                dyn_cast<ClassTemplateSpecializationDecl>(RD)) {
+          Linkage = llvm::GlobalValue::InternalLinkage;
+          const TemplateArgumentList &Args = TD->getTemplateInstantiationArgs();
+          if (RD->hasAttr<CUDADeviceBuiltinSurfaceTypeAttr>()) {
+            assert(Args.size() == 2 &&
+                   "Unexpcted number of template arguments of CUDA device "
----------------
This is the part I'm not comfortable with.
It's possible for the user to use the attribute on other types that do not match the expectations encoded here.
We should not be failing with an assert here because that's *user* error, not a compiler bug.

Expectations we have for the types should be enforced by Sema and compiler should produce proper diagnostics.

================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:4107
+            assert(Args.size() == 2 &&
+                   "Unexpcted number of template arguments of CUDA device "
+                   "builtin surface type.");
----------------
Nit: 'Unexp*e*cted'

================
Comment at: clang/lib/CodeGen/TargetInfo.cpp:6471-6472
+    // Lookup `addrspacecast` through the constant pointer if any.
+    if (auto *ASC = llvm::dyn_cast_or_null<llvm::AddrSpaceCastOperator>(C))
+      C = llvm::cast<llvm::Constant>(ASC->getPointerOperand());
+    if (auto *GV = llvm::dyn_cast_or_null<llvm::GlobalVariable>(C)) {
----------------
What's the expectation here? Do we care which address spaces we're casting to/from? 

================
Comment at: clang/lib/CodeGen/TargetInfo.cpp:6561
+      if (Ty->isCUDADeviceBuiltinSurfaceType())
+        return ABIArgInfo::getDirect(llvm::Type::getInt64Ty(getVMContext()));
+      if (Ty->isCUDADeviceBuiltinTextureType())
----------------
This part could use some additional comments. Why do we return an int64? Is that the size of the handle object? Is it guaranteed to always be a 64-bit int, or does it depend on particualr PTX version?

================
Comment at: clang/lib/Headers/__clang_cuda_runtime_wrapper.h:82-94
 #undef __CUDACC__
 #if CUDA_VERSION < 9000
 #define __CUDABE__
 #else
+#define __CUDACC__
 #define __CUDA_LIBDEVICE__
 #endif
----------------
Please add comments on why __CUDACC__ is needed for driver_types.h here? AFAICT, driver_types.h does not have any conditionals that depend on __CUDACC__. What happens if it's not defined.

================
Comment at: clang/lib/Sema/SemaDeclAttr.cpp:6944-6945
+    handleSimpleAttributeWithExclusions<CUDADeviceBuiltinSurfaceTypeAttr,
+                                        CUDADeviceBuiltinTextureTypeAttr>(S, D,
+                                                                          AL);
+    break;
----------------
Nit: Formatting is a bit odd here. Why is AL on a separate line?

================
Comment at: clang/test/CodeGenCUDA/surface.cu:12-14
+template<typename T, int type = 1>
+struct __attribute__((device_builtin_surface_type)) surface : public surfaceReference {
+};
----------------
Please add a test for applying the attribute to a wrong type. I.e. a non-template or a template with different number or kinds of parameters. We should have a proper syntax error and not a compiler crash or silent failure.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76365/new/

https://reviews.llvm.org/D76365