[PATCH] D38610: [AMDGPU] Lower enqueued blocks and generate runtime metadata

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 6 11:05:33 PDT 2017


arsenm added inline comments.


================
Comment at: lib/Target/AMDGPU/AMDGPUOpenCLEnqueuedBlockLowering.cpp:63
+INITIALIZE_PASS(AMDGPUOpenCLEnqueuedBlockLowering,
+                "amdgpu-lower-enqueued-block", "Lower OpenCL enqueued blocks",
+                false, false)
----------------
Re-use DEBUG_TYPE


================
Comment at: lib/Target/AMDGPU/AMDGPUOpenCLEnqueuedBlockLowering.cpp:76-78
+      if (!I.hasOneUse() || !I.user_begin()->hasOneUse() ||
+          !isa<ConstantExpr>(*I.user_begin()) ||
+          !isa<ConstantExpr>(*I.user_begin()->user_begin())) {
----------------
Looking at the users like this confuses me. Don't all uses need to be handled always? Why isn't this just a range loop over all the users?
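
To make the question concrete, here is a rough sketch of the loop shape I have in mind. It is not LLVM code: `Node` and `collectAddrCasts` are hypothetical stand-ins that only model the use-list structure (function -> bitcast constant expression -> address-cast constant expression). In the pass itself the outer loop would presumably be a range-based for over `I.users()`, with the same filtering applied to each user rather than bailing out when there is more than one.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::Value; it only models the use-list shape.
struct Node {
  std::string Name;
  bool IsConstantExpr = false;
  std::vector<Node *> Users;
};

// Visit every user of F and every user of those users, collecting the
// second-level constant expressions (the address casts in the patch).
// Users that are not constant expressions are skipped individually
// instead of causing the whole function to be skipped.
std::vector<Node *> collectAddrCasts(const Node &F) {
  std::vector<Node *> Result;
  for (Node *BitCast : F.Users) {
    if (!BitCast->IsConstantExpr)
      continue;
    for (Node *AddrCast : BitCast->Users) {
      if (AddrCast->IsConstantExpr)
        Result.push_back(AddrCast);
    }
  }
  return Result;
}
```

With this shape a function that is referenced from several places gets every reference visited, which is the behaviour I would expect if all uses need to be rewritten.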


================
Comment at: lib/Target/AMDGPU/AMDGPUOpenCLEnqueuedBlockLowering.cpp:83
+      auto *AddrCast = cast<ConstantExpr>(*BitCast->user_begin());
+      std::string RuntimeHandle = I.getName().str() + "_runtime_handle";
+      auto *GV = new GlobalVariable(
----------------
Use Twine for the name instead of building a temporary std::string


================
Comment at: lib/Target/AMDGPU/AMDGPUOpenCLEnqueuedBlockLowering.cpp:86
+          M, Type::getInt8Ty(C)->getPointerTo(AS.GLOBAL_ADDRESS),
+          /*IsConstant=*/true, GlobalValue::ExternalLinkage,
+          /*Initializer=*/nullptr, RuntimeHandle, /*InsertBefore=*/nullptr,
----------------
Why external linkage?


================
Comment at: lib/Target/AMDGPU/AMDGPUOpenCLEnqueuedBlockLowering.cpp:90
+          /*IsExternallyInitialized=*/true);
+      DEBUG(llvm::dbgs() << "runtime handle created: " << *GV << '\n');
+      auto *NewPtr = ConstantExpr::getPointerCast(GV, AddrCast->getType());
----------------
llvm:: unnecessary


================
Comment at: test/CodeGen/AMDGPU/enqueue-kernel.ll:1
+; RUN: opt -amdgpu-lower-enqueued-block -S -verify-machineinstrs < %s | FileCheck %s
+
----------------
-verify-machineinstrs doesn't make sense with an opt test


================
Comment at: test/CodeGen/AMDGPU/enqueue-kernel.ll:79
+
+attributes #0 = { nounwind "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-features"="+fp64-fp16-denormals,-fp32-denormals" "unsafe-fp-math"="false" "use-soft-float"="false" }
+attributes #1 = { norecurse nounwind "enqueued-block" }
----------------
Remove unnecessary attributes


================
Comment at: test/CodeGen/AMDGPU/enqueue-kernel.ll:83-84
+
+!llvm.module.flags = !{!0}
+!opencl.ocl.version = !{!1}
+
----------------
These should be removable


https://reviews.llvm.org/D38610




