[PATCH] D38610: [AMDGPU] Lower enqueued blocks and generate runtime metadata
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 6 11:05:33 PDT 2017
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/AMDGPUOpenCLEnqueuedBlockLowering.cpp:63
+INITIALIZE_PASS(AMDGPUOpenCLEnqueuedBlockLowering,
+ "amdgpu-lower-enqueued-block", "Lower OpenCL enqueued blocks",
+ false, false)
----------------
Re-use DEBUG_TYPE for the pass name string here.
================
Comment at: lib/Target/AMDGPU/AMDGPUOpenCLEnqueuedBlockLowering.cpp:76-78
+ if (!I.hasOneUse() || !I.user_begin()->hasOneUse() ||
+ !isa<ConstantExpr>(*I.user_begin()) ||
+ !isa<ConstantExpr>(*I.user_begin()->user_begin())) {
----------------
Looking at the users like this confuses me. Don't all uses always need to be handled? Why isn't this just a range loop over all the users?
================
Comment at: lib/Target/AMDGPU/AMDGPUOpenCLEnqueuedBlockLowering.cpp:83
+ auto *AddrCast = cast<ConstantExpr>(*BitCast->user_begin());
+ std::string RuntimeHandle = I.getName().str() + "_runtime_handle";
+ auto *GV = new GlobalVariable(
----------------
Use a Twine here instead of building a std::string.
================
Comment at: lib/Target/AMDGPU/AMDGPUOpenCLEnqueuedBlockLowering.cpp:86
+ M, Type::getInt8Ty(C)->getPointerTo(AS.GLOBAL_ADDRESS),
+ /*IsConstant=*/true, GlobalValue::ExternalLinkage,
+ /*Initializer=*/nullptr, RuntimeHandle, /*InsertBefore=*/nullptr,
----------------
Why external linkage?
================
Comment at: lib/Target/AMDGPU/AMDGPUOpenCLEnqueuedBlockLowering.cpp:90
+ /*IsExternallyInitialized=*/true);
+ DEBUG(llvm::dbgs() << "runtime handle created: " << *GV << '\n');
+ auto *NewPtr = ConstantExpr::getPointerCast(GV, AddrCast->getType());
----------------
The llvm:: qualifier on dbgs() is unnecessary.
================
Comment at: test/CodeGen/AMDGPU/enqueue-kernel.ll:1
+; RUN: opt -amdgpu-lower-enqueued-block -S -verify-machineinstrs < %s | FileCheck %s
+
----------------
-verify-machineinstrs doesn't make sense with an opt test
================
Comment at: test/CodeGen/AMDGPU/enqueue-kernel.ll:79
+
+attributes #0 = { nounwind "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-features"="+fp64-fp16-denormals,-fp32-denormals" "unsafe-fp-math"="false" "use-soft-float"="false" }
+attributes #1 = { norecurse nounwind "enqueued-block" }
----------------
Remove unnecessary attributes
================
Comment at: test/CodeGen/AMDGPU/enqueue-kernel.ll:83-84
+
+!llvm.module.flags = !{!0}
+!opencl.ocl.version = !{!1}
+
----------------
These should be removable
https://reviews.llvm.org/D38610