[PATCH] D55444: AMDGPU: Fix DPP combiner

Connor Abbott via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 14 10:27:47 PST 2019


cwabbott added inline comments.


================
Comment at: lib/Target/AMDGPU/GCNDPPCombine.cpp:369
+
+	if (BoundCtrlZero && MaskAllLanes) { // [1]
+      OldOpndVGPR.Reg = AMDGPU::NoRegister;
----------------
I don't think this is correct. A lane can be set to 0 for the shift DPP controls (e.g. `DPP_ROW_SL1`) or if EXEC is zero for the lane to be read, in which case the user will do something non-trivial (e.g. OR will pass through the other operand) which we can't emulate in general.


================
Comment at: lib/Target/AMDGPU/GCNDPPCombine.cpp:385
+          assert(!BoundCtrlZero); // by check [1]
+		  // undefine old to make regalloc's life easier but this requires
+		  // bound_ctrl:0 set
----------------
Indentation is off here, seems like your editor is using hard tabs


================
Comment at: lib/Target/AMDGPU/GCNDPPCombine.cpp:391
+        } else {
+		  // bound_ctrl value ins't important here but this makes the combiner's
+          // logic easier - we check the VALU op for identity value when
----------------
Also here


================
Comment at: lib/Target/AMDGPU/GCNDPPCombine.cpp:412
+    }
+	// case when DPP mov old == DPP scr register
+    OldOpndVGPR.Reg = AMDGPU::NoRegister;
----------------
This seems incorrect to me. In the lanes where the source is invalid or the row/bank mask is 0, the DPP move will act as a no-op, here so if that lane is then added, or'd, etc. with something else, we can't emulate that with a single instruction.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55444/new/

https://reviews.llvm.org/D55444





More information about the llvm-commits mailing list