[all-commits] [llvm/llvm-project] e2cfbf: [OpenMP] Unified entry point for SPMD & generic ke...
Johannes Doerfert via All-commits
all-commits at lists.llvm.org
Sat Jul 10 15:59:00 PDT 2021
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: e2cfbfcc0c1f3a89ab79c5615f0789b6a9966dc5
https://github.com/llvm/llvm-project/commit/e2cfbfcc0c1f3a89ab79c5615f0789b6a9966dc5
Author: Johannes Doerfert <johannes at jdoerfert.de>
Date: 2021-07-10 (Sat, 10 Jul 2021)
Changed paths:
M clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
M clang/lib/CodeGen/CGOpenMPRuntimeGPU.h
M clang/test/OpenMP/amdgcn_target_codegen.cpp
M clang/test/OpenMP/assumes_include_nvptx.cpp
M clang/test/OpenMP/declare_target_codegen_globalization.cpp
M clang/test/OpenMP/nvptx_SPMD_codegen.cpp
M clang/test/OpenMP/nvptx_data_sharing.cpp
M clang/test/OpenMP/nvptx_distribute_parallel_generic_mode_codegen.cpp
M clang/test/OpenMP/nvptx_force_full_runtime_SPMD_codegen.cpp
M clang/test/OpenMP/nvptx_lambda_capturing.cpp
M clang/test/OpenMP/nvptx_multi_target_parallel_codegen.cpp
M clang/test/OpenMP/nvptx_nested_parallel_codegen.cpp
M clang/test/OpenMP/nvptx_parallel_codegen.cpp
M clang/test/OpenMP/nvptx_parallel_for_codegen.cpp
M clang/test/OpenMP/nvptx_target_codegen.cpp
M clang/test/OpenMP/nvptx_target_firstprivate_codegen.cpp
M clang/test/OpenMP/nvptx_target_parallel_codegen.cpp
M clang/test/OpenMP/nvptx_target_parallel_num_threads_codegen.cpp
M clang/test/OpenMP/nvptx_target_parallel_proc_bind_codegen.cpp
M clang/test/OpenMP/nvptx_target_parallel_reduction_codegen.cpp
M clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
M clang/test/OpenMP/nvptx_target_printf_codegen.c
M clang/test/OpenMP/nvptx_target_simd_codegen.cpp
M clang/test/OpenMP/nvptx_target_teams_codegen.cpp
M clang/test/OpenMP/nvptx_target_teams_distribute_codegen.cpp
M clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
M clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp
M clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp
M clang/test/OpenMP/nvptx_target_teams_distribute_simd_codegen.cpp
M clang/test/OpenMP/nvptx_teams_codegen.cpp
M clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
M clang/test/OpenMP/remarks_parallel_in_multiple_target_state_machines.c
M clang/test/OpenMP/remarks_parallel_in_target_state_machine.c
M clang/test/OpenMP/target_parallel_debug_codegen.cpp
M clang/test/OpenMP/target_parallel_for_debug_codegen.cpp
M llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
M llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
M llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
M llvm/lib/Transforms/IPO/OpenMPOpt.cpp
M llvm/test/Transforms/OpenMP/replace_globalization.ll
M llvm/test/Transforms/OpenMP/single_threaded_execution.ll
A openmp/libomptarget/deviceRTLs/common/include/target.h
M openmp/libomptarget/deviceRTLs/common/src/omptarget.cu
M openmp/libomptarget/deviceRTLs/common/src/parallel.cu
M openmp/libomptarget/deviceRTLs/interface.h
M openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu
Log Message:
-----------
[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform
interface for a kernel to set up the device runtime. The OMPIRBuilder is
used so that the code generation can be reused by Flang. A custom state
machine will be generated in the follow-up patch.
The "surplus" threads of the "master warp" will no longer exit early,
so we need to use non-aligned barriers. The new runtime will not have an
extra warp, but it will also require these non-aligned barriers.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
This was extracted in part from D59319.
Reviewed By: ABataev, JonChesterfield
Differential Revision: https://reviews.llvm.org/D101976
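The idea of a single entry point serving both SPMD and generic kernels can be
pictured with a small host-side model. This is only a sketch under stated
assumptions: `ExecMode`, `target_init_sketch`, `count_body_threads`, and the
"-1 means run the kernel body" convention are hypothetical names and choices
made for illustration, not the device runtime's interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical execution modes; the names are illustrative only.
enum class ExecMode { Generic, SPMD };

// Hypothetical unified init call made by every thread of a kernel.
// Assumed convention for this sketch: return -1 if the calling thread should
// execute the kernel body, otherwise return the thread id (the thread is a
// worker that would wait for parallel work in generic mode).
inline std::int32_t target_init_sketch(ExecMode mode, std::int32_t thread_id,
                                       std::int32_t main_thread_id) {
  if (mode == ExecMode::SPMD)
    return -1; // SPMD: all threads run the kernel body.
  if (thread_id == main_thread_id)
    return -1; // Generic: only the main thread runs the kernel body.
  return thread_id; // Generic: every other thread is a worker.
}

// Counts how many of num_threads threads would run the kernel body.
inline std::int32_t count_body_threads(ExecMode mode, std::int32_t num_threads,
                                       std::int32_t main_thread_id) {
  assert(num_threads > 0 && main_thread_id >= 0 &&
         main_thread_id < num_threads);
  std::int32_t body_threads = 0;
  for (std::int32_t tid = 0; tid < num_threads; ++tid)
    if (target_init_sketch(mode, tid, main_thread_id) == -1)
      ++body_threads;
  return body_threads;
}
```

In this model the same call is emitted for both kernel kinds, and only the
mode argument decides which threads proceed into user code.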
Commit: d9659bf6a036545125a39648b4abe838080299ec
https://github.com/llvm/llvm-project/commit/d9659bf6a036545125a39648b4abe838080299ec
Author: Johannes Doerfert <johannes at jdoerfert.de>
Date: 2021-07-10 (Sat, 10 Jul 2021)
Changed paths:
M llvm/lib/Transforms/IPO/Attributor.cpp
M llvm/lib/Transforms/IPO/OpenMPOpt.cpp
A llvm/test/Transforms/OpenMP/custom_state_machines.ll
A llvm/test/Transforms/OpenMP/custom_state_machines_remarks.ll
M llvm/test/Transforms/OpenMP/globalization_remarks.ll
M llvm/test/Transforms/OpenMP/remove_globalization.ll
M llvm/test/Transforms/OpenMP/replace_globalization.ll
M llvm/test/Transforms/OpenMP/single_threaded_execution.ll
M llvm/test/Transforms/PhaseOrdering/openmp-opt-module.ll
Log Message:
-----------
[OpenMP] Create custom state machines for generic target regions
In the spirit of TRegions [0], this patch creates a custom state
machine for a generic target region based on the potentially called
parallel regions.
The code analysis is done interprocedurally via an abstract attribute
(AAKernelInfo). All outermost parallel regions are collected and we
check if there might be unknown outermost parallel regions for which
we need an indirect call. Other AAKernelInfo extensions are expected.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
Differential Revision: https://reviews.llvm.org/D101977
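The shape of a custom state machine, as described in the log message above,
can be sketched on the host. This is a simplified illustration under stated
assumptions, not the code OpenMPOpt generates: `parallel_region_a`,
`parallel_region_b`, `unknown_region`, `run_work_item`, and
`worker_state_machine` are hypothetical names, and a null function pointer is
assumed to be the termination signal.

```cpp
#include <cassert>
#include <vector>

using WorkFn = void (*)(int *);

// Two hypothetical parallel regions the analysis is assumed to know about.
void parallel_region_a(int *counter) { *counter += 1; }
void parallel_region_b(int *counter) { *counter += 10; }
// A hypothetical region the analysis could not identify.
void unknown_region(int *counter) { *counter += 100; }

enum class Dispatch { Direct, Indirect };

// Compares the work function against the known regions and calls them
// directly; falls back to an indirect call only for unknown regions.
Dispatch run_work_item(WorkFn fn, int *counter, bool may_have_unknown) {
  if (fn == parallel_region_a) {
    parallel_region_a(counter);
    return Dispatch::Direct;
  }
  if (fn == parallel_region_b) {
    parallel_region_b(counter);
    return Dispatch::Direct;
  }
  // The fallback exists only if unknown regions could not be ruled out.
  assert(may_have_unknown && "unexpected work function");
  fn(counter);
  return Dispatch::Indirect;
}

// Worker loop: handles work items in order until the termination signal.
// Returns the number of items that needed the indirect-call fallback.
int worker_state_machine(const std::vector<WorkFn> &queue, int *counter,
                         bool may_have_unknown) {
  int indirect_calls = 0;
  for (WorkFn fn : queue) {
    if (fn == nullptr)
      break; // Assumed termination signal: the kernel is done.
    if (run_work_item(fn, counter, may_have_unknown) == Dispatch::Indirect)
      ++indirect_calls;
  }
  return indirect_calls;
}
```

When every reachable parallel region is known, `may_have_unknown` would be
false and the indirect-call fallback could be dropped from the generated code.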
Compare: https://github.com/llvm/llvm-project/compare/4761d29633ac...d9659bf6a036