[all-commits] [llvm/llvm-project] f6c8a8: [AMDGPU] Iterative scan implementation for atomic ...
Pravin Jagtap via All-commits
all-commits at lists.llvm.org
Thu Jun 8 22:10:25 PDT 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: f6c8a8e9cb7d8b8abb382b52c88fb287a6f27a2b
https://github.com/llvm/llvm-project/commit/f6c8a8e9cb7d8b8abb382b52c88fb287a6f27a2b
Author: Pravin Jagtap <Pravin.Jagtap at amd.com>
Date: 2023-06-09 (Fri, 09 Jun 2023)
Changed paths:
M llvm/lib/Target/AMDGPU/AMDGPU.h
M llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp
M llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
M llvm/test/CodeGen/AMDGPU/GlobalISel/atomic_optimizations_mul_one.ll
M llvm/test/CodeGen/AMDGPU/atomic_optimizations_buffer.ll
M llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll
M llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
M llvm/test/CodeGen/AMDGPU/atomic_optimizations_raw_buffer.ll
M llvm/test/CodeGen/AMDGPU/atomic_optimizations_struct_buffer.ll
A llvm/test/CodeGen/AMDGPU/global_atomics_iterative_scan.ll
Log Message:
-----------
[AMDGPU] Iterative scan implementation for atomic optimizer.
This patch provides an alternative to the DPP-based implementation of scan
computations. The alternative implementation iterates over all active lanes of
the wavefront using llvm.cttz and performs the following steps for each lane:
1. Read the value that needs to be atomically incremented using the
   llvm.amdgcn.readlane intrinsic.
2. Accumulate the result.
3. Update the scan result using the llvm.amdgcn.writelane intrinsic
   if intermediate scan results are needed later in the kernel.
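
The three steps above can be sketched as a host-side C++ simulation. This is
only an illustration of the loop structure, not the IR that the pass emits.
The names kWaveSize, LaneValues, ScanResult, countTrailingZeros and
iterativeScanAdd are hypothetical, and the sketch assumes a 64-lane wavefront,
an integer add operation, and an exclusive scan (each lane receives the sum of
the active lanes before it). Array indexing stands in for
llvm.amdgcn.readlane and llvm.amdgcn.writelane, and a portable loop stands in
for llvm.cttz.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumption: a 64-lane wavefront; the lane values live in a plain array here.
constexpr unsigned kWaveSize = 64;
using LaneValues = std::array<uint32_t, kWaveSize>;

struct ScanResult {
  uint32_t reduction;   // sum over all active lanes
  LaneValues exclusive; // per-lane exclusive prefix sum; inactive lanes stay 0
};

// Portable stand-in for llvm.cttz on a nonzero 64-bit mask.
inline unsigned countTrailingZeros(uint64_t mask) {
  assert(mask != 0 && "mask must have at least one active lane");
  unsigned n = 0;
  while ((mask & 1u) == 0) {
    mask >>= 1;
    ++n;
  }
  return n;
}

// Visits the active lanes from lowest to highest index, one per iteration.
inline ScanResult iterativeScanAdd(const LaneValues &values,
                                   uint64_t activeMask) {
  ScanResult result{0, {}};
  while (activeMask != 0) {
    unsigned lane = countTrailingZeros(activeMask);

    // Step 1: read this lane's contribution (models readlane).
    uint32_t laneValue = values[lane];

    // Step 2: accumulate, remembering the total seen before this lane.
    uint32_t before = result.reduction;
    result.reduction += laneValue;

    // Step 3: record the intermediate scan result (models writelane).
    result.exclusive[lane] = before;

    // Clear the lowest set bit so the next iteration finds the next lane.
    activeMask &= activeMask - 1;
  }
  return result;
}
```

For example, with lanes 0, 3 and 10 active and holding 5, 7 and 2, the sketch
yields a reduction of 14 and exclusive results of 0, 5 and 12 for those lanes.
The loop runs once per active lane, so its cost grows with the number of
active lanes.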
Reviewed By: arsenm, cdevadas
Differential Revision: https://reviews.llvm.org/D147408