[PATCH] D55402: [AMDGPU] Simplify negated condition
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 6 22:04:37 PST 2018
rampitec added inline comments.
================
Comment at: lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp:166
+ std::swap(Op1, Op2);
+ if (!Op1->isReg() || !Op2->isImm() || Op2->getImm() != 1)
+ return AMDGPU::NoRegister;
----------------
arsenm wrote:
> Op2 doesn't need to be an immediate 1, it just needs to match the selected operand
I do not think it will work if we have select(0, 0, cc). Anyway this tries to catch a very specific pattern coming from DAG, and that is select(0, 1, cc) there. Otherwise I would need to check that selected operand is identical to Op2 *and* src0 != src1 in select. Sounds like an overkill for the problem.
================
Comment at: lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp:186-190
+ MachineInstr *Andn2 = BuildMI(MBB, *And, And->getDebugLoc(),
+ TII->get(Andn2Opc),
+ And->getOperand(0).getReg())
+ .addReg(ExecReg)
+ .addReg(CCReg, CC->getSubReg());
----------------
arsenm wrote:
> Weird formatting
Processed with clang-format.
================
Comment at: lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp:237-240
+ RecalcRegs.insert(Reg);
+ RecalcRegs.insert(AMDGPU::VCC_LO);
+ RecalcRegs.insert(AMDGPU::VCC_HI);
+ RecalcRegs.insert(AMDGPU::SCC);
----------------
arsenm wrote:
> Why is just Reg itself OK? Doesn't this need to use the reg alias iterator?
Reg can or can be not a VCC. SCC is needed because s_and_b64 produces it as well. Note, RecalcRegs is a set, so if duplicated only unique values will be added. A reg unit iterator for phys is used later at the bottom of the pass to squash it all for the whole pass.
================
Comment at: test/CodeGen/AMDGPU/optimize-negated-cond-exec-masking.mir:420-423
+registers:
+ - { id: 0, class: sreg_64_xexec }
+ - { id: 1, class: vreg_128 }
+ - { id: 2, class: sreg_64_xexec }
----------------
arsenm wrote:
> You can drop all of the register sections
Indeed, thanks!
================
Comment at: test/CodeGen/AMDGPU/optimize-negated-cond.ll:10
+; GCN: s_cbranch_vccz BB0_4
+define amdgpu_kernel void @test(float addrspace(1)* %arg1) {
+bb:
----------------
arsenm wrote:
> Can this be reduced?
It is already reduced with bug point. In presence of mir test I can drop it altogether if you want.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D55402/new/
https://reviews.llvm.org/D55402
More information about the llvm-commits
mailing list