Weird handling of r+r (pre-inc) addresses

Hal Finkel hfinkel at anl.gov
Thu Mar 21 23:47:36 PDT 2013


----- Original Message -----
> From: "Ulrich Weigand" <Ulrich.Weigand at de.ibm.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: llvm-commits at cs.uiuc.edu, "Bill Schmidt" <wschmidt at us.ibm.com>
> Sent: Thursday, March 21, 2013 4:33:32 PM
> Subject: Re: Weird handling of r+r (pre-inc) addresses
> 
> Hal Finkel <hfinkel at anl.gov> wrote on 21.03.2013 09:25:18:
> 
> > > I'd try to fix all this along the lines of:
> > > - remove xaddroff in favor of ptr_rc_nor0 as discussed above
> > > - rewrite iaddroff to properly check for in-range immediates
> 
> There's another twist here: iaddroff must recognize the *output*
> of SelectAddressRegImm, which is no longer the original immediate,
> but has been converted into a truncated TargetConstant (and e.g.
> a PPCISD::Lo has been stripped off).
> 
> The simplest way seems to be to accept any TargetConstant and
> TargetGlobalAddress, and nothing else at this point.
> 
> Note that the corresponding check in PPCDAGToDAGISel::Select
> for loads likewise needs to be updated.
> 
> > > - undo the swapping of BasePtr and Offset for r+r indexed
> > > loads/stores (both when creating and using them)
> > > - swap the "offreg" and "ptrreg" names of the memrr sub-operands
> > >
> > > Does this make sense, or am I missing some reason why things have
> > > to
> > > be this way?
> >
> > I don't think that you're missing anything. I think that cleaning
> > this up and making it truly self-consistent would be great.
> 
> The attached set of four patches implements the four steps as
> listed above.  This seems to work at first glance.
> 
> However, swapping BasePtr and Offset does cause different code
> to be generated in some cases.  This seems to be caused by the
> set of checks in DAGCombiner::CombineToPreIndexedLoadStore here:
> 
>   // Try turning it into a pre-indexed load / store except when:
>   // 1) The new base ptr is a frame index.
>   // 2) If N is a store and the new base ptr is either the same as or
>   is a
>   //    predecessor of the value being stored.
>   // 3) Another use of old base ptr is a predecessor of N. If ptr is
>   folded
>   //    that would create a cycle.
>   // 4) All uses are load / store ops that use it as old base ptr.
> 
> since this list, in particular #1 and #2, explicitly refer to the
> new BasePtr as provided by target code.  Since the PowerPC target
> used to report the offset as BasePtr, but now reports the real
> base, in particular the #2 check sometimes now hits when it
> didn't before.  Not sure what to do about that ...

These patches LGTM, regarding this last point:

I think that in cases where the returned base/offset will fail these checks, we should check the swapped ordering and return that when possible. This will mean repeating the checks in the target code, but they're fairly small (pasted here for the convenience of list readers):

  // Check #1.  Preinc'ing a frame index would require copying the stack pointer
  // (plus the implicit offset) to a register to preinc anyway.
  if (isa<FrameIndexSDNode>(BasePtr) || isa<RegisterSDNode>(BasePtr))
    return false;

  // Check #2.
  if (!isLoad) {
    SDValue Val = cast<StoreSDNode>(N)->getValue();
    if (Val == BasePtr || BasePtr.getNode()->isPredecessorOf(Val.getNode()))
      return false;
  }

[At least on the A2, pre-inc formation is an important performance optimization, and I'd like to not have any performance regressions if possible].

Thanks again,
Hal

> 
> Bye,
> Ulrich
> 
> (See attached file: diff-llvm-rr-xaddroff)
> (See attached file: diff-llvm-rr-iaddroff)
> (See attached file: diff-llvm-rr-swap)
> (See attached file: diff-llvm-rr-rename)



More information about the llvm-commits mailing list