[PATCH] D45748: [RISCV] Separate base from offset in lowerGlobalAddress
Alex Bradbury via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 15 09:15:22 PDT 2018
asb added a comment.
Thanks for the update Sameer. I've been evaluating this patch today. I was curious about the decision to perform a peephole rather than a DAG combine. You mention that the DAG combiner doesn't work for uses spanning multiple basic blocks, but doesn't the SelectionDAG (and hence RISCVISelDAGToDAG) also work on a per-basic-block granularity?
The following IR demonstrates a missed opportunity for rewriting a global to be global+offset and deleting an addi. Specifically you end up with lui+addi+lh, which can be reduced to lui+lh by merging the offset back into the GlobalAddress and using that in the lh, thus allowing the addi to be deleted.
@foo = global [6 x i16] [i16 1, i16 2, i16 3, i16 4, i16 5, i16 0], align 2
define i32 @main() nounwind {
entry:
%0 = load i16, i16* getelementptr inbounds ([6 x i16], [6 x i16]* @foo, i32 0, i32 4), align 2
%cmp = icmp eq i16 %0, 140
br i1 %cmp, label %if.end, label %if.then
if.then:
tail call void @abort()
unreachable
if.end:
ret i32 0
}
declare void @abort()
A quick and dirty addition to doPeepholeLoadStoreADDI can resolve this (intended to be demonstrative rather than commitable code):
// Merge an ADDI into the offset of a load/store instruction where possible.
// (load (add base, off), 0) -> (load base, off)
// (store val, (add base, off)) -> (store val, base, off)
@@ -171,19 +375,51 @@ void RISCVDAGToDAGISel::doPeepholeLoadStoreADDI() {
break;
}
- // Currently, the load/store offset must be 0 to be considered for this
- // peephole optimisation.
- if (!isa<ConstantSDNode>(N->getOperand(OffsetOpIdx)) ||
- N->getConstantOperandVal(OffsetOpIdx) != 0)
- continue;
-
SDValue Base = N->getOperand(BaseOpIdx);
// If the base is an ADDI, we can merge it in to the load/store.
if (!Base.isMachineOpcode() || Base.getMachineOpcode() != RISCV::ADDI)
continue;
+ if (!isa<ConstantSDNode>(N->getOperand(OffsetOpIdx)))
+ continue;
+
+ uint64_t Offset = N->getConstantOperandVal(OffsetOpIdx);
SDValue ImmOperand = Base.getOperand(1);
+ SDValue HBase = Base.getOperand(0);
+
+ // Base is addi, offset is non-zero, addi has a global parameter, addi
+ // parent is lui, addi+lui have only one use then do the global match.
+ if (Offset != 0 && isa<GlobalAddressSDNode>(ImmOperand) &&
+ HBase.isMachineOpcode() && HBase.getMachineOpcode() == RISCV::LUI &&
+ isa<GlobalAddressSDNode>(HBase.getOperand(0)) && Base.hasOneUse() &&
+ HBase.hasOneUse()) {
+ auto GALo = cast<GlobalAddressSDNode>(ImmOperand);
+ // Change the ImmOperand to be a global address with new offset
+ SDValue GALoNew = CurDAG->getTargetGlobalAddress(
+ GALo->getGlobal(), SDLoc(ImmOperand), ImmOperand.getValueType(),
+ GALo->getOffset() + Offset, GALo->getTargetFlags());
+ // Change the HBase immoperand to be a global address with new offset
+ auto GAHi = cast<GlobalAddressSDNode>(HBase.getOperand(0));
+ SDValue GAHiNew = CurDAG->getTargetGlobalAddress(
+ GAHi->getGlobal(), SDLoc(ImmOperand), ImmOperand.getValueType(),
+ GAHi->getOffset() + Offset, GAHi->getTargetFlags());
+ if (BaseOpIdx == 0) // Load
+ CurDAG->UpdateNodeOperands(N, Base.getOperand(0), GALoNew,
+ N->getOperand(2));
+ else // Store
+ CurDAG->UpdateNodeOperands(N, N->getOperand(0), Base.getOperand(0),
+ GALoNew, N->getOperand(3));
+ CurDAG->UpdateNodeOperands(HBase.getNode(), GAHiNew);
+ // Remove dead nodes
+ CurDAG->RemoveDeadNodes();
+ continue;
+ }
+
+ // Currently, the load/store offset must be 0 to be considered for this
+ // peephole optimisation.
+ if (Offset != 0)
+ continue;
if (auto Const = dyn_cast<ConstantSDNode>(ImmOperand)) {
ImmOperand = CurDAG->getTargetConstant(
But I do see some cases where this introduces an extra instruction due to the basic-block granularity, as the global base is no longer reused across basic blocks. Hopefully this is all much easier to handle with GlobalISel.
https://reviews.llvm.org/D45748
More information about the llvm-commits
mailing list