[llvm] r331633 - [AMDGPU] Don't force WQM for DS op
Tim Renouf via llvm-commits
llvm-commits at lists.llvm.org
Mon May 7 06:21:26 PDT 2018
Author: tpr
Date: Mon May 7 06:21:26 2018
New Revision: 331633
URL: http://llvm.org/viewvc/llvm-project?rev=331633&view=rev
Log:
[AMDGPU] Don't force WQM for DS op
Summary:
Previously, all DS ops forced WQM in a pixel shader. That was a hack to
let graphics frontends use ds_swizzle to implement explicit derivatives,
at least on SI/CI, where DPP is not available. But it forced WQM for
_any_ DS op.
With this commit, DS ops no longer force WQM. Both graphics frontends
(Mesa and LLPC) need to change to issue an explicit llvm.amdgcn.wqm
intrinsic call when calculating explicit derivatives.
The required Mesa change is: "amd/common: use llvm.amdgcn.wqm for
explicit derivatives".
Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D46051
Change-Id: I9b745b626fa91bbd66456e6cf41ee07eeea42f81
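The behavior change described in the summary can be sketched as a small
standalone model. This is not the code from SIWholeQuadMode.cpp: the
CallConv enum, the StateWQM value and both helper functions are
hypothetical stand-ins introduced for illustration, and the model covers
only the flag assigned to a DS instruction itself, not the separate
handling of instructions for which TII->isWQM() is true.

```cpp
#include <cassert>

// Hypothetical stand-ins for the LLVM types used by the pass; the names
// and values are illustrative assumptions, not copied from LLVM.
enum class CallConv { AMDGPU_PS, AMDGPU_CS };
constexpr char StateWQM = 0x1;

// Model of the old behavior: a DS op in a pixel shader was marked as
// needing whole quad mode, whatever the op was used for.
char dsFlagsBefore(bool IsDS, CallConv CC) {
  if (IsDS && CC == CallConv::AMDGPU_PS)
    return StateWQM;
  return 0;
}

// Model of the new behavior: being a DS op no longer requests WQM.
// A frontend that needs WQM (for example around ds_swizzle when computing
// explicit derivatives) has to ask for it with llvm.amdgcn.wqm.
char dsFlagsAfter(bool /*IsDS*/, CallConv /*CC*/) {
  return 0;
}
```

Under this model, dsFlagsBefore(true, CallConv::AMDGPU_PS) yields StateWQM
while dsFlagsAfter(true, CallConv::AMDGPU_PS) yields 0, which is the
difference a frontend now has to cover with an explicit
llvm.amdgcn.wqm call, as the spill-m0.ll test update does for its LDS
load.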
Modified:
llvm/trunk/lib/Target/AMDGPU/SIWholeQuadMode.cpp
llvm/trunk/test/CodeGen/AMDGPU/multi-divergent-exit-region.ll
llvm/trunk/test/CodeGen/AMDGPU/spill-m0.ll
Modified: llvm/trunk/lib/Target/AMDGPU/SIWholeQuadMode.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/SIWholeQuadMode.cpp?rev=331633&r1=331632&r2=331633&view=diff
==============================================================================
--- llvm/trunk/lib/Target/AMDGPU/SIWholeQuadMode.cpp (original)
+++ llvm/trunk/lib/Target/AMDGPU/SIWholeQuadMode.cpp Mon May 7 06:21:26 2018
@@ -325,9 +325,7 @@ char SIWholeQuadMode::scanInstructions(M
unsigned Opcode = MI.getOpcode();
char Flags = 0;
- if (TII->isDS(Opcode) && CallingConv == CallingConv::AMDGPU_PS) {
- Flags = StateWQM;
- } else if (TII->isWQM(Opcode)) {
+ if (TII->isWQM(Opcode)) {
// Sampling instructions don't need to produce results for all pixels
// in a quad, they just require all inputs of a quad to have been
// computed for derivatives.
Modified: llvm/trunk/test/CodeGen/AMDGPU/multi-divergent-exit-region.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/multi-divergent-exit-region.ll?rev=331633&r1=331632&r2=331633&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/AMDGPU/multi-divergent-exit-region.ll (original)
+++ llvm/trunk/test/CodeGen/AMDGPU/multi-divergent-exit-region.ll Mon May 7 06:21:26 2018
@@ -355,7 +355,7 @@ exit1:
; GCN: v_mov_b32_e32 v0, 2.0
; GCN: s_or_b64 exec, exec
-; GCN: s_and_b64 exec, exec
+; GCN-NOT: s_and_b64 exec, exec
; GCN: v_mov_b32_e32 v0, 1.0
; GCN: {{^BB[0-9]+_[0-9]+}}: ; %UnifiedReturnBlock
Modified: llvm/trunk/test/CodeGen/AMDGPU/spill-m0.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/spill-m0.ll?rev=331633&r1=331632&r2=331633&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/AMDGPU/spill-m0.ll (original)
+++ llvm/trunk/test/CodeGen/AMDGPU/spill-m0.ll Mon May 7 06:21:26 2018
@@ -95,7 +95,8 @@ main_body:
if: ; preds = %main_body
%lds_ptr = getelementptr [64 x float], [64 x float] addrspace(3)* @lds, i32 0, i32 0
- %lds_data = load float, float addrspace(3)* %lds_ptr
+ %lds_data_ = load float, float addrspace(3)* %lds_ptr
+ %lds_data = call float @llvm.amdgcn.wqm.f32(float %lds_data_)
br label %endif
else: ; preds = %main_body
@@ -208,6 +209,7 @@ declare float @llvm.amdgcn.interp.mov(i3
declare void @llvm.amdgcn.exp.f32(i32, i32, float, float, float, float, i1, i1) #0
declare void @llvm.amdgcn.exp.compr.v2f16(i32, i32, <2 x half>, <2 x half>, i1, i1) #0
declare <2 x half> @llvm.amdgcn.cvt.pkrtz(float, float) #1
+declare float @llvm.amdgcn.wqm.f32(float) #1
attributes #0 = { nounwind }
attributes #1 = { nounwind readnone }