[all-commits] [llvm/llvm-project] 2f4d44: AMDGPU: add test to show wwm register overwrite issue
Ruiling, Song via All-commits
all-commits at lists.llvm.org
Sat Feb 5 20:38:47 PST 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 2f4d44bcd4a122f8f3c5539b08bdbdb20b72dc26
https://github.com/llvm/llvm-project/commit/2f4d44bcd4a122f8f3c5539b08bdbdb20b72dc26
Author: Ruiling Song <ruiling.song at amd.com>
Date: 2022-02-06 (Sun, 06 Feb 2022)
Changed paths:
A llvm/test/CodeGen/AMDGPU/set-inactive-wwm-overwrite.ll
Log Message:
-----------
AMDGPU: add test to show wwm register overwrite issue
Pre-commit the test to make the diff easy to read later.
Differential Revision: https://reviews.llvm.org/D117527
Commit: 0719c43735b2a40ba11f5431aaf1b64c2e1cb084
https://github.com/llvm/llvm-project/commit/0719c43735b2a40ba11f5431aaf1b64c2e1cb084
Author: Ruiling Song <ruiling.song at amd.com>
Date: 2022-02-06 (Sun, 06 Feb 2022)
Changed paths:
M llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
M llvm/lib/Target/AMDGPU/SIInstructions.td
M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.set.inactive.ll
M llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll
M llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
M llvm/test/CodeGen/AMDGPU/atomic_optimizations_pixelshader.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.set.inactive.ll
M llvm/test/CodeGen/AMDGPU/set-inactive-wwm-overwrite.ll
M llvm/test/CodeGen/AMDGPU/wqm.ll
M llvm/test/CodeGen/AMDGPU/wwm-reserved-spill.ll
M llvm/test/CodeGen/AMDGPU/wwm-reserved.ll
Log Message:
-----------
AMDGPU: Don't clobber source register for V_SET_INACTIVE_*
The WWM register has unmodeled register liveness. For v_set_inactive_*,
clobbering the source register is dangerous because it overwrites the
inactive lanes. When the source VGPR is dead at v_set_inactive_lane,
the inactive lanes may not really be dead. This can cause common
optimizations to produce wrong code.
For example in a simple if-then cfg in Machine IR:
  bb.if:
    %src =
  bb.then:
    %src1 = COPY %src
    %dst = V_SET_INACTIVE %src1(tied-def 0), %inactive
  bb.end:
    ... = PHI [0, %bb.then] [%src, %bb.if]
The register coalescer will think it is safe to optimize away
"%src1 = COPY %src" in bb.then. At the same time, there is no interference
for the PHI in bb.end, so the source and destination values of the PHI will
be assigned the same register. That single PHI register will be overwritten
by the v_set_inactive, and we would then get the wrong value in bb.end.
With this change, we copy the contents of the source register before
setting the inactive lanes; the copy is emitted after register allocation.
Yes, this makes WWM code generation slightly worse, but I don't have a
better idea for doing this correctly.
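The hazard and the fix described above can be illustrated with a small
lane-level model. This is only a sketch under stated assumptions, not the
actual SIInstrInfo.cpp expansion: a register is modeled as a list with one
value per lane, the names v0/v1, the four-lane width, and all helper
functions are hypothetical, and the PHI is reduced to "lanes that skipped
bb.then read the register that held %src".

```python
# Toy per-lane model of a VGPR file: register name -> list of lane values.
SRC = [10, 11, 12, 13]                    # value of %src in each lane
THEN_MASK = [True, True, False, False]    # lanes active inside bb.then


def set_inactive_in_place(regs, reg, inactive, exec_mask):
    """Old behavior (tied def): write `inactive` into the inactive lanes
    of the source register itself, clobbering whatever they held."""
    for lane, active in enumerate(exec_mask):
        if not active:
            regs[reg][lane] = inactive


def set_inactive_with_copy(regs, dst, src, inactive, exec_mask):
    """New behavior: build the result in a separate destination register,
    leaving every lane of the source register untouched."""
    regs[dst] = [
        regs[src][lane] if active else inactive
        for lane, active in enumerate(exec_mask)
    ]


def phi_at_bb_end(regs, src_reg, then_mask):
    """PHI [0, bb.then] [%src, bb.if]: lanes that ran bb.then get 0,
    the other lanes read the register that is supposed to hold %src."""
    return [0 if took_then else regs[src_reg][lane]
            for lane, took_then in enumerate(then_mask)]


def run_old():
    # %src, %src1 and the PHI are all coalesced into v0.
    regs = {"v0": list(SRC)}
    set_inactive_in_place(regs, "v0", 0, THEN_MASK)
    return phi_at_bb_end(regs, "v0", THEN_MASK)


def run_new():
    # %dst gets its own register v1; v0 keeps %src in all lanes.
    regs = {"v0": list(SRC)}
    set_inactive_with_copy(regs, "v1", "v0", 0, THEN_MASK)
    return phi_at_bb_end(regs, "v0", THEN_MASK), regs["v1"]
```

In this model, lanes 2 and 3 skip bb.then and should still see 12 and 13 at
bb.end. run_old() loses those values because the inactive lanes of v0 were
overwritten, while run_new() preserves them at the cost of one extra copy.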
Differential Revision: https://reviews.llvm.org/D117482
Compare: https://github.com/llvm/llvm-project/compare/6cd0015e7827...0719c43735b2