[all-commits] [llvm/llvm-project] abe71b: [OpenMP][NFC] Delete dead code
Johannes Doerfert via All-commits
all-commits at lists.llvm.org
Mon Nov 6 11:51:54 PST 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: abe71b77f9ac2fc689e899fa6aa2486120c53912
https://github.com/llvm/llvm-project/commit/abe71b77f9ac2fc689e899fa6aa2486120c53912
Author: Johannes Doerfert <johannes at jdoerfert.de>
Date: 2023-11-06 (Mon, 06 Nov 2023)
Changed paths:
M clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
Log Message:
-----------
[OpenMP][NFC] Delete dead code
Commit: 921bd299134fbe17c676b2486af269e18281def4
https://github.com/llvm/llvm-project/commit/921bd299134fbe17c676b2486af269e18281def4
Author: Johannes Doerfert <johannes at jdoerfert.de>
Date: 2023-11-06 (Mon, 06 Nov 2023)
Changed paths:
M clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
M clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
M clang/test/OpenMP/reduction_implicit_map.cpp
M clang/test/OpenMP/target_teams_generic_loop_codegen.cpp
Log Message:
-----------
[OpenMP] Remove alignment for global <-> local reduction functions
The alignment likely did not help much but increased the memory
requirement. Note that half of the affected accesses are performed
by a single thread in each block. The reads are performed by
consecutive threads in a single block.
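To illustrate why forcing an alignment on each field raises the memory requirement, here is a minimal sketch. It is not the code from CGOpenMPRuntimeGPU.cpp: the helper names, the field sizes, and the 128-byte alignment are assumptions chosen for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Round Offset up to the next multiple of Align.
constexpr std::size_t alignTo(std::size_t Offset, std::size_t Align) {
  return (Offset + Align - 1) / Align * Align;
}

// Hypothetical helper: size in bytes of a record holding scalar fields of the
// given sizes. Each field is assumed to have natural alignment equal to its
// size; ForcedAlign raises that alignment when it is larger (1 = no forcing).
inline std::size_t recordBytes(std::initializer_list<std::size_t> FieldSizes,
                               std::size_t ForcedAlign) {
  std::size_t Offset = 0;
  std::size_t MaxAlign = 1;
  for (std::size_t Size : FieldSizes) {
    std::size_t Align = std::max(Size, ForcedAlign);
    MaxAlign = std::max(MaxAlign, Align);
    Offset = alignTo(Offset, Align) + Size; // pad, then place the field
  }
  return alignTo(Offset, MaxAlign); // tail padding so records can be arrayed
}
```

Under these assumptions, a record with fields of 4, 8, and 4 bytes takes 24 bytes with natural alignment and 384 bytes when every field is forced to a 128-byte boundary.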
Commit: 3de645efe30b83ba1b6d7e500486c4f441a17a61
https://github.com/llvm/llvm-project/commit/3de645efe30b83ba1b6d7e500486c4f441a17a61
Author: Johannes Doerfert <johannes at jdoerfert.de>
Date: 2023-11-06 (Mon, 06 Nov 2023)
Changed paths:
M clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
M clang/test/OpenMP/nvptx_target_parallel_reduction_codegen.cpp
M clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
M clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
M clang/test/OpenMP/reduction_implicit_map.cpp
M clang/test/OpenMP/target_teams_generic_loop_codegen.cpp
M llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
M llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
M llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
M llvm/test/Transforms/OpenMP/add_attributes.ll
M llvm/test/Transforms/OpenMP/always_inline_device.ll
M llvm/test/Transforms/OpenMP/custom_state_machines.ll
M llvm/test/Transforms/OpenMP/custom_state_machines_pre_lto.ll
M llvm/test/Transforms/OpenMP/custom_state_machines_remarks.ll
M llvm/test/Transforms/OpenMP/deduplication_target.ll
M llvm/test/Transforms/OpenMP/get_hardware_num_threads_in_block_fold.ll
M llvm/test/Transforms/OpenMP/global_constructor.ll
M llvm/test/Transforms/OpenMP/globalization_remarks.ll
M llvm/test/Transforms/OpenMP/gpu_state_machine_function_ptr_replacement.ll
M llvm/test/Transforms/OpenMP/is_spmd_exec_mode_fold.ll
M llvm/test/Transforms/OpenMP/nested_parallelism.ll
M llvm/test/Transforms/OpenMP/parallel_level_fold.ll
M llvm/test/Transforms/OpenMP/remove_globalization.ll
M llvm/test/Transforms/OpenMP/replace_globalization.ll
M llvm/test/Transforms/OpenMP/single_threaded_execution.ll
M llvm/test/Transforms/OpenMP/spmdization.ll
M llvm/test/Transforms/OpenMP/spmdization_assumes.ll
M llvm/test/Transforms/OpenMP/spmdization_constant_prop.ll
M llvm/test/Transforms/OpenMP/spmdization_guarding.ll
M llvm/test/Transforms/OpenMP/spmdization_guarding_two_reaching_kernels.ll
M llvm/test/Transforms/OpenMP/spmdization_indirect.ll
M llvm/test/Transforms/OpenMP/spmdization_kernel_env_dep.ll
M llvm/test/Transforms/OpenMP/spmdization_no_guarding_two_reaching_kernels.ll
M llvm/test/Transforms/OpenMP/spmdization_remarks.ll
M llvm/test/Transforms/OpenMP/value-simplify-openmp-opt.ll
M llvm/test/Transforms/PhaseOrdering/openmp-opt-module.ll
M mlir/test/Target/LLVMIR/omptarget-region-device-llvm.mlir
M openmp/libomptarget/DeviceRTL/include/Interface.h
M openmp/libomptarget/DeviceRTL/src/Reduction.cpp
M openmp/libomptarget/include/Environment.h
M openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.cpp
Log Message:
-----------
[OpenMP][NFC] Split the reduction buffer size into two components
Previously, we tracked the size of the teams reduction buffer in order
to allocate it at runtime per kernel launch. This patch splits that
number into two parts: the size of the reduction data (= all reduction
variables) and the (maximal) length of the buffer. This will allow us to
allocate less if we need less, e.g., if we have fewer teams than the
maximal length. It also allows us to move code from Clang's codegen into
the runtime, as we now know how large the reduction data is.
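The split described above can be sketched as follows. This is a minimal illustration, not the actual runtime code: the struct, its field names, and both functions are hypothetical stand-ins for whatever the patch stores in the kernel environment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of a kernel's teams reduction needs.
struct ReductionInfo {
  std::uint32_t DataSize;        // bytes for one copy of all reduction variables
  std::uint32_t MaxBufferLength; // maximal number of copies (slots) in the buffer
};

// Single-number view: always the full buffer, regardless of the launch.
inline std::size_t fullBufferBytes(const ReductionInfo &Info) {
  return static_cast<std::size_t>(Info.DataSize) * Info.MaxBufferLength;
}

// Two-component view: one slot per launched team, capped at the maximal
// buffer length, so small launches can allocate less.
inline std::size_t launchBufferBytes(const ReductionInfo &Info,
                                     std::uint32_t NumTeams) {
  std::uint32_t Slots = std::min(NumTeams, Info.MaxBufferLength);
  return static_cast<std::size_t>(Info.DataSize) * Slots;
}
```

With an assumed `DataSize` of 24 bytes and `MaxBufferLength` of 1024, the full buffer is 24576 bytes, while a launch with 64 teams would need only 1536 bytes.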
Compare: https://github.com/llvm/llvm-project/compare/e28c393bd1c1...3de645efe30b