[PATCH] D55402: [AMDGPU] Simplify negated condition

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 6 22:04:37 PST 2018


rampitec added inline comments.


================
Comment at: lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp:166
+    std::swap(Op1, Op2);
+  if (!Op1->isReg() || !Op2->isImm() || Op2->getImm() != 1)
+    return AMDGPU::NoRegister;
----------------
arsenm wrote:
> Op2 doesn't need to be an immediate 1, it just needs to match the selected operand
I do not think it will work if we have select(0, 0, cc). Anyway this tries to catch a very specific pattern coming from DAG, and that is select(0, 1, cc) there. Otherwise I would need to check that selected operand is identical to Op2 *and* src0 != src1 in select. Sounds like an overkill for the problem.


================
Comment at: lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp:186-190
+  MachineInstr *Andn2 = BuildMI(MBB, *And, And->getDebugLoc(),
+                                TII->get(Andn2Opc),
+                                And->getOperand(0).getReg())
+                                  .addReg(ExecReg)
+                                  .addReg(CCReg, CC->getSubReg());
----------------
arsenm wrote:
> Weird formatting
Processed with clang-format.


================
Comment at: lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp:237-240
+      RecalcRegs.insert(Reg);
+      RecalcRegs.insert(AMDGPU::VCC_LO);
+      RecalcRegs.insert(AMDGPU::VCC_HI);
+      RecalcRegs.insert(AMDGPU::SCC);
----------------
arsenm wrote:
> Why is just Reg itself OK? Doesn't this need to use the reg alias iterator?
Reg can or can be not a VCC. SCC is needed because s_and_b64 produces it as well. Note, RecalcRegs is a set, so if duplicated only unique values will be added. A reg unit iterator for phys is used later at the bottom of the pass to squash it all for the whole pass.


================
Comment at: test/CodeGen/AMDGPU/optimize-negated-cond-exec-masking.mir:420-423
+registers:
+  - { id: 0, class: sreg_64_xexec }
+  - { id: 1, class: vreg_128 }
+  - { id: 2, class: sreg_64_xexec }
----------------
arsenm wrote:
> You can drop all of the register sections
Indeed, thanks!


================
Comment at: test/CodeGen/AMDGPU/optimize-negated-cond.ll:10
+; GCN:   s_cbranch_vccz BB0_4
+define amdgpu_kernel void @test(float addrspace(1)* %arg1) {
+bb:
----------------
arsenm wrote:
> Can this be reduced?
It is already reduced with bug point. In presence of mir test I can drop it altogether if you want.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55402/new/

https://reviews.llvm.org/D55402





More information about the llvm-commits mailing list