[PATCH] D55444: AMDGPU: Fix DPP combiner

Mon Jan 14 10:27:47 PST 2019

cwabbott added inline comments.

================
Comment at: lib/Target/AMDGPU/GCNDPPCombine.cpp:369
+
+	if (BoundCtrlZero && MaskAllLanes) { // [1]
+      OldOpndVGPR.Reg = AMDGPU::NoRegister;
----------------
I don't think this is correct. A lane can be set to 0 for the shift DPP controls (e.g. `DPP_ROW_SL1`) or if EXEC is zero for the lane to be read, in which case the user will do something non-trivial (e.g. OR will pass through the other operand) which we can't emulate in general.

================
Comment at: lib/Target/AMDGPU/GCNDPPCombine.cpp:385
+          assert(!BoundCtrlZero); // by check [1]
+		  // undefine old to make regalloc's life easier but this requires
+		  // bound_ctrl:0 set
----------------
Indentation is off here, seems like your editor is using hard tabs

================
Comment at: lib/Target/AMDGPU/GCNDPPCombine.cpp:391
+        } else {
+		  // bound_ctrl value ins't important here but this makes the combiner's
+          // logic easier - we check the VALU op for identity value when
----------------
Also here

================
Comment at: lib/Target/AMDGPU/GCNDPPCombine.cpp:412
+    }
+	// case when DPP mov old == DPP scr register
+    OldOpndVGPR.Reg = AMDGPU::NoRegister;
----------------
This seems incorrect to me. In the lanes where the source is invalid or the row/bank mask is 0, the DPP move will act as a no-op, here so if that lane is then added, or'd, etc. with something else, we can't emulate that with a single instruction.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55444/new/

https://reviews.llvm.org/D55444