[PATCH] D155856: [LLVM][Opt][RFC] Add LLVM support for C++ Parallel Algorithm Offload

Alex Voicu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 20 09:24:47 PDT 2023


AlexVlx created this revision.
AlexVlx added reviewers: yaxunl, arsenm, tra, jlebar, Anastasia, pekka.jaaskelainen.
AlexVlx added projects: LLVM, clang.
Herald added a subscriber: hiraditya.
Herald added a project: All.
AlexVlx requested review of this revision.
Herald added subscribers: llvm-commits, wdng.

This patch adds the LLVM changes needed by the standard algorithm offload feature being proposed here: https://discourse.llvm.org/t/rfc-adding-c-parallel-algorithm-offload-support-to-clang-llvm/72159/1. The verbose documentation is included in the head of the patch series, with all other patches targeting Clang. What we do here is add two passes, one mandatory and one optional:

1. `StdParAcceleratorCodeSelectionPass` is mandatory, depends on CallGraphAnalysis, and implements the following transform:
  - Traverse the call-graph, and check for functions that are roots for accelerator execution (at the moment, these are GPU kernels exclusively, and would originate in the accelerator specific algorithm library the toolchain uses as an implementation detail);
  - Starting from a root, do a BFS to find all functions that are reachable (called directly or indirectly via a call chain) and record them;
  - After having done the above for all roots in the Module, we have computed the set of reachable functions, which is the union of roots and functions reachable from roots;
  - All functions that are not in the reachable set are removed; for the special case where the reachable set is empty we completely clear the module;
2. `StdParAllocationInterpositionPass` is optional, is meant as a fallback with restricted functionality for cases where on-demand paging is unavailable on a platform, and implements the following transform:
  - Iterate all functions in a Module;
  - If a function's name is in a predefined set of allocation / deallocation functions that the runtime implementation is allowed and expected to interpose, replace all its uses with the equivalent accelerator-aware function, iff the latter is available;
    - If the accelerator-aware equivalent is unavailable we warn, but compilation will go ahead, which means that it is possible to get issues around the accelerator trying to access inaccessible memory at run time;
  - We rely on direct name matching as opposed to using the new alloc-kind family of attributes and / or the LibCall analysis pass because some of the legacy functions that need replacing would not carry the former or be identified by the latter.
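
To make the two transforms above concrete, here is a minimal sketch on a toy model rather than on real LLVM IR. It is not the code from the patch: the `CallGraph` alias, the function names `reachableFrom`, `selectAcceleratorCode` and `interposeAllocations`, and the replacement table passed in by the caller are all hypothetical, chosen for illustration only. A module is modelled as a map from function name to its direct callees, and "replace all uses" is approximated by rewriting callee names. It assumes C++17.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a Module: function name -> names of its direct callees.
// (Hypothetical model; the actual pass works on llvm::Module and CallGraph.)
using CallGraph = std::map<std::string, std::vector<std::string>>;

// BFS from every root; the result is the union of the roots and everything
// reachable from them through direct or indirect calls.
std::set<std::string> reachableFrom(const CallGraph &CG,
                                    const std::vector<std::string> &Roots) {
  std::set<std::string> Reachable;
  std::queue<std::string> Worklist;
  for (const auto &R : Roots)
    if (Reachable.insert(R).second)
      Worklist.push(R);
  while (!Worklist.empty()) {
    const std::string F = Worklist.front();
    Worklist.pop();
    const auto It = CG.find(F);
    if (It == CG.end())
      continue; // no body known for F in this toy model
    for (const auto &Callee : It->second)
      if (Reachable.insert(Callee).second)
        Worklist.push(Callee);
  }
  return Reachable;
}

// Removes every function outside the reachable set. With no roots the
// reachable set is empty, so the whole toy module is cleared.
void selectAcceleratorCode(CallGraph &CG,
                           const std::vector<std::string> &Roots) {
  const auto Reachable = reachableFrom(CG, Roots);
  for (auto It = CG.begin(); It != CG.end();) {
    if (Reachable.count(It->first))
      ++It;
    else
      It = CG.erase(It);
  }
}

// Direct name matching: for each (original, replacement) pair, rewrite calls
// to the original if the replacement exists in the module. Returns the names
// that could not be interposed, for which a warning would be emitted.
std::vector<std::string>
interposeAllocations(CallGraph &CG,
                     const std::map<std::string, std::string> &Replacements) {
  std::vector<std::string> Warnings;
  for (const auto &[Name, Replacement] : Replacements) {
    if (!CG.count(Name))
      continue; // the function to interpose is not present in the module
    if (!CG.count(Replacement)) {
      Warnings.push_back(Name); // no accelerator-aware equivalent available
      continue;
    }
    for (auto &[Caller, Callees] : CG)
      std::replace(Callees.begin(), Callees.end(), Name, Replacement);
  }
  return Warnings;
}
```

In this sketch a caller would run `selectAcceleratorCode(CG, Roots)` with the kernel names as roots, and separately `interposeAllocations(CG, Table)` with a table such as `malloc` mapped to some accelerator-aware allocator; both the table contents and the order in which the passes run are assumptions here, not something this summary specifies.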

This concludes the patch set around adding support for C++ Parallel Algorithm Offload.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D155856

Files:
  llvm/include/llvm/Transforms/StdPar/StdPar.h
  llvm/lib/Passes/CMakeLists.txt
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Passes/PassBuilderPipelines.cpp
  llvm/lib/Passes/PassRegistry.def
  llvm/lib/Transforms/CMakeLists.txt
  llvm/lib/Transforms/StdPar/CMakeLists.txt
  llvm/lib/Transforms/StdPar/StdPar.cpp
  llvm/test/Transforms/StdPar/accelerator-code-selection.ll
  llvm/test/Transforms/StdPar/allocation-interposition.ll
  llvm/test/Transforms/StdPar/allocation-no-interposition.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D155856.542547.patch
Type: text/x-patch
Size: 40782 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230720/84cf3dfe/attachment.bin>

