[PATCH] D38610: [AMDGPU] Lower enqueued blocks and generate runtime metadata

Yaxun Liu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 5 19:14:25 PDT 2017


yaxunl created this revision.
Herald added subscribers: tpr, dstuttard, mgorny, nhaehnle, wdng.

This patch adds a post-linking pass that replaces the function pointer of each
enqueued block kernel with a global variable (the runtime handle) and adds a
"runtime-handle" attribute to the enqueued block kernel.

In LLVM CodeGen, the "runtime-handle" attribute is translated to
RuntimeHandle metadata in the code object. The runtime allocates a global
buffer for each kernel that has RuntimeHandle metadata and saves the kernel
address required for the AQL packet into that buffer. The __enqueue_kernel
function in the device library knows that the invoke function pointer in the
block literal is actually a runtime handle; it loads the kernel address from
the handle and puts it into the AQL packet for dispatch.
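
The indirection described above can be modelled on the host as a rough
sketch. This is not code from the patch or from the device library: the
types RuntimeHandle, BlockLiteral and AqlPacket and the functions
runtime_init_handle, enqueue_kernel_sketch and dispatch_address_for are all
hypothetical, simplified stand-ins, and the real block literal and AQL packet
carry more fields than shown here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the per-kernel global buffer that the runtime
// allocates for a kernel with RuntimeHandle metadata.
struct RuntimeHandle {
  std::uint64_t kernel_address;  // address needed by the AQL packet
};

// Simplified block literal (assumption: only the invoke slot is modelled).
// After the lowering pass, `invoke` points to the runtime handle rather
// than to the kernel's code.
struct BlockLiteral {
  const void *invoke;
};

// Simplified dispatch packet (assumption: only the kernel address is kept).
struct AqlPacket {
  std::uint64_t kernel_address;
};

// Runtime side: save the kernel address into the handle buffer.
void runtime_init_handle(RuntimeHandle &handle, std::uint64_t kernel_address) {
  handle.kernel_address = kernel_address;
}

// Device-library side: treat the invoke pointer as a runtime handle, load
// the kernel address from it, and place that address in the packet.
AqlPacket enqueue_kernel_sketch(const BlockLiteral &block) {
  assert(block.invoke != nullptr);
  const auto *handle = static_cast<const RuntimeHandle *>(block.invoke);
  return AqlPacket{handle->kernel_address};
}

// End-to-end walk through the three steps for a single kernel.
std::uint64_t dispatch_address_for(std::uint64_t kernel_address) {
  RuntimeHandle handle{0};
  runtime_init_handle(handle, kernel_address);  // runtime fills the buffer
  BlockLiteral block{&handle};                  // invoke slot holds the handle
  return enqueue_kernel_sketch(block).kernel_address;
}
```

The point of the sketch is only that the address placed in the packet is read
through the handle at enqueue time, so it does not need to be known when the
block literal is built.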

This cannot be done in the front end (FE), since the FE cannot create a global
variable with external linkage that is unique across LLVM modules. A global
variable with internal linkage does not work either, since optimization passes
will try to replace loads of the global variable with its initializer value.


https://reviews.llvm.org/D38610

Files:
  include/llvm/Support/AMDGPUCodeObjectMetadata.h
  lib/Support/AMDGPUCodeObjectMetadata.cpp
  lib/Target/AMDGPU/AMDGPU.h
  lib/Target/AMDGPU/AMDGPUOpenCLEnqueuedBlockLowering.cpp
  lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
  lib/Target/AMDGPU/CMakeLists.txt
  lib/Target/AMDGPU/MCTargetDesc/AMDGPUCodeObjectMetadataStreamer.cpp
  test/CodeGen/AMDGPU/code-object-metadata-from-llvm-ir-full.ll
  test/CodeGen/AMDGPU/enqueue-kernel.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D38610.117949.patch
Type: text/x-patch
Size: 20915 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20171006/f03b43c9/attachment.bin>

