[all-commits] [llvm/llvm-project] 16cda0: [AMDGPU] V_SET_INACTIVE optimizations (#98864)
Carl Ritson via All-commits
all-commits at lists.llvm.org
Wed Sep 4 22:39:49 PDT 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 16cda01d22c0ac1713f667d501bdca91594a4e13
https://github.com/llvm/llvm-project/commit/16cda01d22c0ac1713f667d501bdca91594a4e13
Author: Carl Ritson <carl.ritson at amd.com>
Date: 2024-09-05 (Thu, 05 Sep 2024)
Changed paths:
M llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
M llvm/lib/Target/AMDGPU/SIInstrInfo.h
M llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp
M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.set.inactive.ll
M llvm/test/CodeGen/AMDGPU/amdgpu-cs-chain-cc.ll
M llvm/test/CodeGen/AMDGPU/amdgpu-cs-chain-preserve-cc.ll
M llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll
M llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
M llvm/test/CodeGen/AMDGPU/atomic_optimizations_pixelshader.ll
M llvm/test/CodeGen/AMDGPU/cse-convergent.ll
M llvm/test/CodeGen/AMDGPU/fix-wwm-vgpr-copy.ll
M llvm/test/CodeGen/AMDGPU/global_atomics_scan_fadd.ll
M llvm/test/CodeGen/AMDGPU/global_atomics_scan_fmax.ll
M llvm/test/CodeGen/AMDGPU/global_atomics_scan_fmin.ll
M llvm/test/CodeGen/AMDGPU/global_atomics_scan_fsub.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.set.inactive.chain.arg.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.set.inactive.ll
M llvm/test/CodeGen/AMDGPU/set-inactive-wwm-overwrite.ll
M llvm/test/CodeGen/AMDGPU/should-not-hoist-set-inactive.ll
M llvm/test/CodeGen/AMDGPU/wave32.ll
M llvm/test/CodeGen/AMDGPU/wqm.ll
M llvm/test/CodeGen/AMDGPU/wqm.mir
M llvm/test/CodeGen/AMDGPU/wwm-reserved-spill.ll
M llvm/test/CodeGen/AMDGPU/wwm-reserved.ll
M llvm/test/CodeGen/MIR/AMDGPU/machine-function-info.ll
Log Message:
-----------
[AMDGPU] V_SET_INACTIVE optimizations (#98864)
Optimize V_SET_INACTIVE by allow it to run in WWM.
Hence WWM sections are not broken up for inactive lane setting.
WWM V_SET_INACTIVE can typically be lower to V_CNDMASK.
Some cases require use of exec manipulation V_MOV as previous code.
GFX9 sees slight instruction count increase in edge cases due to
smaller constant bus.
Additionally avoid introducing exec manipulation and V_MOVs where
a source of V_SET_INACTIVE is the destination.
This is a common pattern as WWM register pre-allocation often
assigns the same register.
To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications
More information about the All-commits
mailing list