[PATCH] D90052: AMDGPU/GlobalISel: Add clamp combine
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jun 28 17:23:45 PDT 2021
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp:222
+
+ const APFloat &KO_FPImm = getConstantFPVRegVal(K0, MRI)->getValue();
+ const APFloat &K1_FPImm = getConstantFPVRegVal(K1, MRI)->getValue();
----------------
I would expect the helper above to return the two constants directly rather than having to requery the registers
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp:231-233
+ if (isKnownNeverNaN(MI.getOperand(0).getReg(), MRI) ||
+ (getIEEE() && getDX10Clamp() && isFminnumIeee(MI) &&
+ isKnownNeverSNaN(Val, MRI))) {
----------------
Should swap these conditions so the cheap checks are first
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp:334-340
+MachineInstr *AMDGPURegBankCombinerHelper::getOpSkipCopy(MachineInstr &MI,
+ unsigned Op) {
+ MachineInstr *Inst = MRI.getVRegDef(MI.getOperand(Op).getReg());
+ if (Inst->getOpcode() == TargetOpcode::COPY)
+ Inst = MRI.getVRegDef(Inst->getOperand(1).getReg());
+ return Inst;
+}
----------------
There's already a helper for this, getDefIgnoringCopies/getDefSrcRegIgnoringCopies
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D90052/new/
https://reviews.llvm.org/D90052
More information about the llvm-commits
mailing list