[PATCH] D45748: [RISCV] Separate base from offset in lowerGlobalAddress

Alex Bradbury via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 15 09:15:22 PDT 2018


asb added a comment.

Thanks for the update Sameer. I've been evaluating this patch today. I was curious about the decision to perform a peephole rather than a DAG combine. You mention that the DAG combiner doesn't work for uses spanning multiple basic blocks, but doesn't the SelectionDAG (and hence RISCVISelDAGToDAG) also work on a per-basic-block granularity?

The following IR demonstrates a missed opportunity for rewriting a global to be global+offset and deleting an addi. Specifically you end up with lui+addi+lh, which can be reduced to lui+lh by merging the offset back into the GlobalAddress and using that in the lh, thus allowing the addi to be deleted.

  @foo = global [6 x i16] [i16 1, i16 2, i16 3, i16 4, i16 5, i16 0], align 2                                     
                                                                                                                  
  define i32 @main() nounwind {                                                                                   
  entry:                                                                                                          
    %0 = load i16, i16* getelementptr inbounds ([6 x i16], [6 x i16]* @foo, i32 0, i32 4), align 2                
    %cmp = icmp eq i16 %0, 140                                                                                    
    br i1 %cmp, label %if.end, label %if.then                                                                     
                                                                                                                  
  if.then:                                                                                                        
    tail call void @abort()                                                                                       
    unreachable                                                                                                   
                                                                                                                  
  if.end:                                                                                                         
    ret i32 0                                                                                                     
  }                                                                                                               
                                                                                                                  
  declare void @abort() 

A quick and dirty addition to doPeepholeLoadStoreADDI can resolve this (intended to be demonstrative rather than commitable code):

   // Merge an ADDI into the offset of a load/store instruction where possible.
   // (load (add base, off), 0) -> (load base, off)
   // (store val, (add base, off)) -> (store val, base, off)
  @@ -171,19 +375,51 @@ void RISCVDAGToDAGISel::doPeepholeLoadStoreADDI() {
         break;
       }
   
  -    // Currently, the load/store offset must be 0 to be considered for this
  -    // peephole optimisation.
  -    if (!isa<ConstantSDNode>(N->getOperand(OffsetOpIdx)) ||
  -        N->getConstantOperandVal(OffsetOpIdx) != 0)
  -      continue;
  -
       SDValue Base = N->getOperand(BaseOpIdx);
   
       // If the base is an ADDI, we can merge it in to the load/store.
       if (!Base.isMachineOpcode() || Base.getMachineOpcode() != RISCV::ADDI)
         continue;
   
  +    if (!isa<ConstantSDNode>(N->getOperand(OffsetOpIdx)))
  +     continue;
  +   
  +    uint64_t Offset = N->getConstantOperandVal(OffsetOpIdx);
       SDValue ImmOperand = Base.getOperand(1);
  +    SDValue HBase = Base.getOperand(0);
  +
  +    // Base is addi, offset is non-zero, addi has a global parameter, addi 
  +    // parent is lui, addi+lui have only one use then do the global match.
  +    if (Offset != 0 && isa<GlobalAddressSDNode>(ImmOperand) && 
  +        HBase.isMachineOpcode() && HBase.getMachineOpcode() == RISCV::LUI && 
  +        isa<GlobalAddressSDNode>(HBase.getOperand(0)) && Base.hasOneUse() && 
  +          HBase.hasOneUse()) {
  +      auto GALo = cast<GlobalAddressSDNode>(ImmOperand);
  +      // Change the ImmOperand to be a global address with new offset
  +      SDValue GALoNew = CurDAG->getTargetGlobalAddress(
  +          GALo->getGlobal(), SDLoc(ImmOperand), ImmOperand.getValueType(),
  +          GALo->getOffset() + Offset, GALo->getTargetFlags());
  +      // Change the HBase immoperand to be a global address with new offset
  +      auto GAHi = cast<GlobalAddressSDNode>(HBase.getOperand(0));
  +      SDValue GAHiNew = CurDAG->getTargetGlobalAddress(
  +          GAHi->getGlobal(), SDLoc(ImmOperand), ImmOperand.getValueType(),
  +          GAHi->getOffset() + Offset, GAHi->getTargetFlags());
  +      if (BaseOpIdx == 0) // Load
  +        CurDAG->UpdateNodeOperands(N, Base.getOperand(0), GALoNew,
  +                                 N->getOperand(2));
  +      else // Store
  +        CurDAG->UpdateNodeOperands(N, N->getOperand(0), Base.getOperand(0),
  +            GALoNew, N->getOperand(3));
  +      CurDAG->UpdateNodeOperands(HBase.getNode(), GAHiNew);
  +      // Remove dead nodes
  +      CurDAG->RemoveDeadNodes();
  +      continue;
  +    }
  +
  +    // Currently, the load/store offset must be 0 to be considered for this
  +    // peephole optimisation.
  +    if (Offset != 0)
  +      continue;
   
       if (auto Const = dyn_cast<ConstantSDNode>(ImmOperand)) {
         ImmOperand = CurDAG->getTargetConstant(

But I do see some cases where this introduces an extra instruction due to the basic-block granularity, as the global base is no longer reused across basic blocks. Hopefully this is all much easier to handle with GlobalISel.


https://reviews.llvm.org/D45748





More information about the llvm-commits mailing list