[all-commits] [llvm/llvm-project] e2cfbf: [OpenMP] Unified entry point for SPMD & generic ke...

Johannes Doerfert via All-commits all-commits at lists.llvm.org
Sat Jul 10 15:59:00 PDT 2021


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: e2cfbfcc0c1f3a89ab79c5615f0789b6a9966dc5
      https://github.com/llvm/llvm-project/commit/e2cfbfcc0c1f3a89ab79c5615f0789b6a9966dc5
  Author: Johannes Doerfert <johannes at jdoerfert.de>
  Date:   2021-07-10 (Sat, 10 Jul 2021)

  Changed paths:
    M clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
    M clang/lib/CodeGen/CGOpenMPRuntimeGPU.h
    M clang/test/OpenMP/amdgcn_target_codegen.cpp
    M clang/test/OpenMP/assumes_include_nvptx.cpp
    M clang/test/OpenMP/declare_target_codegen_globalization.cpp
    M clang/test/OpenMP/nvptx_SPMD_codegen.cpp
    M clang/test/OpenMP/nvptx_data_sharing.cpp
    M clang/test/OpenMP/nvptx_distribute_parallel_generic_mode_codegen.cpp
    M clang/test/OpenMP/nvptx_force_full_runtime_SPMD_codegen.cpp
    M clang/test/OpenMP/nvptx_lambda_capturing.cpp
    M clang/test/OpenMP/nvptx_multi_target_parallel_codegen.cpp
    M clang/test/OpenMP/nvptx_nested_parallel_codegen.cpp
    M clang/test/OpenMP/nvptx_parallel_codegen.cpp
    M clang/test/OpenMP/nvptx_parallel_for_codegen.cpp
    M clang/test/OpenMP/nvptx_target_codegen.cpp
    M clang/test/OpenMP/nvptx_target_firstprivate_codegen.cpp
    M clang/test/OpenMP/nvptx_target_parallel_codegen.cpp
    M clang/test/OpenMP/nvptx_target_parallel_num_threads_codegen.cpp
    M clang/test/OpenMP/nvptx_target_parallel_proc_bind_codegen.cpp
    M clang/test/OpenMP/nvptx_target_parallel_reduction_codegen.cpp
    M clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
    M clang/test/OpenMP/nvptx_target_printf_codegen.c
    M clang/test/OpenMP/nvptx_target_simd_codegen.cpp
    M clang/test/OpenMP/nvptx_target_teams_codegen.cpp
    M clang/test/OpenMP/nvptx_target_teams_distribute_codegen.cpp
    M clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
    M clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp
    M clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp
    M clang/test/OpenMP/nvptx_target_teams_distribute_simd_codegen.cpp
    M clang/test/OpenMP/nvptx_teams_codegen.cpp
    M clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
    M clang/test/OpenMP/remarks_parallel_in_multiple_target_state_machines.c
    M clang/test/OpenMP/remarks_parallel_in_target_state_machine.c
    M clang/test/OpenMP/target_parallel_debug_codegen.cpp
    M clang/test/OpenMP/target_parallel_for_debug_codegen.cpp
    M llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
    M llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
    M llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
    M llvm/lib/Transforms/IPO/OpenMPOpt.cpp
    M llvm/test/Transforms/OpenMP/replace_globalization.ll
    M llvm/test/Transforms/OpenMP/single_threaded_execution.ll
    A openmp/libomptarget/deviceRTLs/common/include/target.h
    M openmp/libomptarget/deviceRTLs/common/src/omptarget.cu
    M openmp/libomptarget/deviceRTLs/common/src/parallel.cu
    M openmp/libomptarget/deviceRTLs/interface.h
    M openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu

  Log Message:
  -----------
  [OpenMP] Unified entry point for SPMD & generic kernels in the device RTL

In the spirit of TRegions [0], this patch provides a simpler and uniform
interface for a kernel to set up the device runtime. The setup code is
emitted through the OMPIRBuilder so that Flang can reuse it. A custom
state machine will be generated in the follow-up patch.

The "surplus" threads of the "master warp" no longer exit early, so we
need to use non-aligned barriers. The new runtime will not have an extra
warp but will also require these non-aligned barriers.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

Parts of this patch were extracted from D59319.

Reviewed By: ABataev, JonChesterfield

Differential Revision: https://reviews.llvm.org/D101976
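
To illustrate what a unified entry point for SPMD and generic kernels
could look like, here is a minimal host-side sketch. It is not the
device RTL's actual interface: sketch_target_init, kRunUserCode,
count_body_threads and the choice of the last thread as the "main"
thread are all hypothetical names and assumptions made for this
example. The idea shown is only that a single init call, parameterized
by the execution mode, tells each thread whether it should run the
kernel body.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the device RTL's interface.
constexpr int kRunUserCode = -1;

// Single init call shared by both execution modes. Returns kRunUserCode if
// the calling thread should execute the kernel body, otherwise the thread's
// own id (the thread only acts as a worker).
int sketch_target_init(int thread_id, int main_thread_id, bool is_spmd) {
  if (is_spmd)
    return kRunUserCode; // SPMD mode: every thread runs the region.
  if (thread_id == main_thread_id)
    return kRunUserCode; // Generic mode: only the main thread runs it.
  // Generic-mode worker: a real runtime would wait here for parallel work
  // published by the main thread; this sketch returns immediately.
  return thread_id;
}

// Count how many of num_threads threads would execute the kernel body.
int count_body_threads(int num_threads, bool is_spmd) {
  assert(num_threads > 0 && "a kernel needs at least one thread");
  // Assumption for this sketch: the last thread acts as the main thread.
  const int main_thread_id = num_threads - 1;
  int count = 0;
  for (int tid = 0; tid < num_threads; ++tid)
    if (sketch_target_init(tid, main_thread_id, is_spmd) == kRunUserCode)
      ++count;
  return count;
}
```

Under these assumptions, count_body_threads(64, true) counts all 64
threads, while count_body_threads(64, false) counts only the single
main thread; the remaining threads would wait for parallel work.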


  Commit: d9659bf6a036545125a39648b4abe838080299ec
      https://github.com/llvm/llvm-project/commit/d9659bf6a036545125a39648b4abe838080299ec
  Author: Johannes Doerfert <johannes at jdoerfert.de>
  Date:   2021-07-10 (Sat, 10 Jul 2021)

  Changed paths:
    M llvm/lib/Transforms/IPO/Attributor.cpp
    M llvm/lib/Transforms/IPO/OpenMPOpt.cpp
    A llvm/test/Transforms/OpenMP/custom_state_machines.ll
    A llvm/test/Transforms/OpenMP/custom_state_machines_remarks.ll
    M llvm/test/Transforms/OpenMP/globalization_remarks.ll
    M llvm/test/Transforms/OpenMP/remove_globalization.ll
    M llvm/test/Transforms/OpenMP/replace_globalization.ll
    M llvm/test/Transforms/OpenMP/single_threaded_execution.ll
    M llvm/test/Transforms/PhaseOrdering/openmp-opt-module.ll

  Log Message:
  -----------
  [OpenMP] Create custom state machines for generic target regions

In the spirit of TRegions [0], this patch creates a custom state
machine for a generic target region based on the potentially called
parallel regions.

The code analysis is done interprocedurally via an abstract attribute
(AAKernelInfo). All outermost parallel regions are collected and we
check if there might be unknown outermost parallel regions for which
we need an indirect call. Other AAKernelInfo extensions are expected.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

Differential Revision: https://reviews.llvm.org/D101977
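
The dispatch idea described above can be sketched on the host as
follows. This is not the code OpenMPOpt emits; region_a, region_b,
region_unknown, worker_step and the string log are hypothetical and
exist only for illustration. The sketch assumes a worker receives a
function pointer for the next parallel region, compares it against the
regions known from the analysis so that they can be called directly,
and falls back to an indirect call only if unknown parallel regions
might exist.

```cpp
#include <cassert>
#include <string>

using ParallelFn = void (*)(std::string &log);

// Hypothetical parallel regions reachable from one kernel.
void region_a(std::string &log) { log += "a"; }
void region_b(std::string &log) { log += "b"; }
// A region the (hypothetical) analysis could not identify.
void region_unknown(std::string &log) { log += "u"; }

// One iteration of a worker's custom state machine. Returns false when the
// worker receives the termination signal (a null work function).
bool worker_step(ParallelFn work_fn, bool may_have_unknown_regions,
                 std::string &log) {
  if (work_fn == nullptr)
    return false; // Kernel is done; leave the state machine.

  if (work_fn == region_a) {
    log += "direct:";
    region_a(log); // Known region: direct call.
  } else if (work_fn == region_b) {
    log += "direct:";
    region_b(log); // Known region: direct call.
  } else {
    // Fallback is only valid if unknown regions were considered possible.
    assert(may_have_unknown_regions && "all regions were claimed known");
    log += "indirect:";
    work_fn(log); // Unknown region: indirect call through the pointer.
  }
  log += ";";
  return true;
}
```

For example, calling worker_step with region_a and then region_unknown
on an empty log leaves the log as "direct:a;indirect:u;", and a
following call with a null pointer returns false without changing it.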


Compare: https://github.com/llvm/llvm-project/compare/4761d29633ac...d9659bf6a036

