[llvm-commits] [llvm] r113670 - in /llvm/trunk: include/llvm/Target/TargetInstrInfo.h lib/CodeGen/PeepholeOptimizer.cpp lib/Target/ARM/ARMBaseInstrInfo.cpp lib/Target/ARM/ARMBaseInstrInfo.h

Gabor Greif ggreif at gmail.com
Mon Sep 13 03:27:21 PDT 2010


On Sep 12, 2:04 am, Bill Wendling <wendl... at apple.com> wrote:
> On Sep 11, 2010, at 5:43 AM, Gabor Greif wrote:
>
> > Bill,
>
> > I tried to extend your approach to TSTri (as an eliminable compare
> > instr.) and
> > ANDri (as an implicit compare) with this patch:
>
> That's what I'm attempting to do right now. :-) I want to do a bit more than just replace the TST with an AND that sets CPSR, though that's the first step. I'm introducing a separate callback to handle TST (i.e., comparisons against a non-zero value).
>
> > gabor at google8:~/llvm-build$ svn diff /home/gabor/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
> > Index: /home/gabor/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
> > ===================================================================
> > --- /home/gabor/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
> > (revision 113683)
> > +++ /home/gabor/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
> > (working copy)
> > @@ -1372,6 +1372,10 @@
> >     SrcReg = MI->getOperand(0).getReg();
> >     CmpValue = MI->getOperand(1).getImm();
> >     return true;
> > +  case ARM::TSTri:
> > +    SrcReg = MI->getOperand(0).getReg();
> > +    CmpValue = 0;
> > +    return true;
>
> I think the TST instruction can take a compare value other than 0. The CMP instructions take 0, right?
>

I'd say the opposite. The cmp instruction takes a register and an
immediate, while the tst only takes a register and tests it against 0.

>
>
> > @@ -1421,6 +1425,7 @@
> >   switch (MI->getOpcode()) {
> >   default: break;
> >   case ARM::ADDri:
> > +  case ARM::ANDri:
> >   case ARM::SUBri:
> >   case ARM::t2ADDri:
> >   case ARM::t2SUBri:
> > gabor at google8:~/llvm-build$
>
> > BUT, it does not trigger on my example, because of
> > something that appears to be a bug:
>
> >        %reg16388<def> = PHI %reg16396, <BB#0>, %reg16392, <BB#3>; GPR:%reg16388,16396,16392
> >        %reg16389<def> = PHI %reg16385, <BB#0>, %reg16391, <BB#3>; GPR:%reg16389,16385,16391
> >        %reg16397<def> = LDR %reg16387, %reg0, 4100, pred:14, pred:%reg0; mem:LD4[%scevgep1] GPR:%reg16397,16387
> >        %reg16390<def> = ANDri %reg16397, 3, pred:14, pred:%reg0, opt:%reg0; GPR:%reg16390,16397
> >        TSTri %reg16397, 3, pred:14, pred:%reg0, %CPSR<imp-def>; GPR:%reg16397
> >        Bcc <BB#3>, pred:0, pred:%CPSR
>
> > The TST seems to use the result of the LDR instead of that of the AND!
> > Very fishy...
>
> If you want the Bcc to use the result of the AND instead of the TST, you'll need to say that the AND implicitly defines CPSR. In the above case, it looks like the AND is basically dead (at least in this code snippet). The TST performs an "and" of the 3 with the result of the LDR, which is exactly what the AND instruction computes. The only difference here is that the TST sets CPSR and the AND doesn't.
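>
> To make that concrete (hand-editing your dump above, so take the exact operand printing with a grain of salt), the AND would become
>
>        %reg16390<def> = ANDri %reg16397, 3, pred:14, pred:%reg0, opt:%CPSR<def>; GPR:%reg16390,16397
>
> and the TSTri would simply go away, with the Bcc reading %CPSR as before.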

I thought the point of your patch was to visit all "cmp r, #0" and
"tst r" instructions and, whenever they act on a register defined by
an instruction that has a CPSR-setting form, to switch to that form.
This lets you elide the "tst" resp. "cmp". Did I get that right?
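
In pseudo-LLVM-C++, my mental model of the peephole step is roughly
this (only a sketch; flagSettingFormOf is a made-up helper, not the
real API):

   // CmpInstr is a "cmp SrcReg, #0" (or the matching "tst").
   MachineInstr *DefMI = MRI->getVRegDef(SrcReg);  // instr defining SrcReg
   if (DefMI) {
     // Made-up helper: the CPSR-setting opcode, or 0 if there is none.
     unsigned NewOpc = flagSettingFormOf(DefMI->getOpcode());
     if (NewOpc) {
       DefMI->setDesc(TII->get(NewOpc));  // switch to the CPSR-setting form
       CmpInstr->eraseFromParent();       // the cmp/tst is now redundant
     }
   }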

I only know PPC (well-ish), which has operations that do not update
flags, such as
   add r1, r2, r3 ; r1 = r2+r3, no flags touched
and the recording variant
   add. r1, r2, r3 ; r1 = r2+r3, update CR0

The first pass could emit the simple-minded variant, without recording
flags and with explicit compares; the peephole would then go over this,
find the shortcut opportunities, change the opcode to the recording
variant, and finally drop the explicit compare. You would normally
refrain from *always* using the recording variant: there are eight
condition register fields (CR0-CR7) available, which can be scheduled
too, and in many cases you do not want to clobber CR0 without a
reason.
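
To illustrate with a hand-written example (unscheduled, CR0 only):

   ; what the first pass would emit:
   and    r4, r5, r6     ; r4 = r5 & r6, no flags touched
   cmpwi  cr0, r4, 0     ; explicit compare against 0
   beq    cr0, .Lskip
   ; what the peephole would turn it into:
   and.   r4, r5, r6     ; r4 = r5 & r6, updates CR0
   beq    cr0, .Lskip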

Sorry if I'm being dense and you are in fact implementing something
more clever :-)

Cheers,

   Gabor

>
> > The bitcode is ok:
>
> >  %tmp2 = load i8** %scevgep1
> >  %0 = ptrtoint i8* %tmp2 to i32
> >  %and = and i32 %0, 3
> >  %tst = icmp eq i32 %and, 0
> >  br i1 %tst, label %sw.bb, label %tailrecurse.switch
>
> > What do you think? Okay to commit the above?
>
> Below is what I have in my tree right now. It's just the skeleton of how I'd like to handle this (it does nothing right now). I like it because of how it separates out the different optimization opportunities for each type of instruction, but it may be overkill...
>
> -bw
>
> Index: include/llvm/Target/TargetInstrInfo.h
> ===================================================================
> --- include/llvm/Target/TargetInstrInfo.h       (revision 113706)
> +++ include/llvm/Target/TargetInstrInfo.h       (working copy)
> @@ -595,6 +595,15 @@
>      return false;
>    }
>
> +  /// OptimizeTestInstr - See if the test instruction can be converted into
> +  /// something more efficient. Update the iterator *only* if a transformation
> +  /// took place.
> +  virtual bool OptimizeTestInstr(MachineInstr * /*CmpInstr*/,
> +                                 unsigned /*SrcReg*/, int /*CmpValue*/,
> +                                 MachineBasicBlock::iterator &MII) const {
> +    return false;
> +  }
> +
>    /// getNumMicroOps - Return the number of u-operations the given machine
>    /// instruction will be decoded to on the target cpu.
>    virtual unsigned getNumMicroOps(const MachineInstr *MI,
> Index: lib/Target/ARM/ARMBaseInstrInfo.cpp
> ===================================================================
> --- lib/Target/ARM/ARMBaseInstrInfo.cpp (revision 113706)
> +++ lib/Target/ARM/ARMBaseInstrInfo.cpp (working copy)
> @@ -1377,6 +1377,18 @@
>    return false;
>  }
>
> +/// OptimizeTestInstr - See if we can convert the TEST instruction into an
> +/// equivalent AND instruction, possibly saving an instruction or two if we can
> +/// make AND set the flags register.
> +bool ARMBaseInstrInfo::
> +OptimizeTestInstr(MachineInstr *CmpInstr, unsigned SrcReg, int CmpValue,
> +                  MachineBasicBlock::iterator &MII) const {
> +  if (CmpValue == 0)
> +    return false;
> +
> +  return false;
> +}
> +
>  /// OptimizeCompareInstr - Convert the instruction supplying the argument to the
>  /// comparison into one that sets the zero bit in the flags register. Update the
>  /// iterator *only* if a transformation took place.
> Index: lib/Target/ARM/ARMBaseInstrInfo.h
> ===================================================================
> --- lib/Target/ARM/ARMBaseInstrInfo.h   (revision 113706)
> +++ lib/Target/ARM/ARMBaseInstrInfo.h   (working copy)
> @@ -350,6 +350,13 @@
>                                      int CmpValue,
>                                      MachineBasicBlock::iterator &MII) const;
>
> +  /// OptimizeTestInstr - See if we can convert the TEST instruction into an
> +  /// equivalent AND instruction, possibly saving an instruction or two if we
> +  /// can make AND set the flags register.
> +  virtual bool OptimizeTestInstr(MachineInstr *CmpInstr, unsigned SrcReg,
> +                                 int CmpValue,
> +                                 MachineBasicBlock::iterator &MII) const;
> +
>    virtual unsigned getNumMicroOps(const MachineInstr *MI,
>                                    const InstrItineraryData *ItinData) const;
>  };
> Index: lib/CodeGen/PeepholeOptimizer.cpp
> ===================================================================
> --- lib/CodeGen/PeepholeOptimizer.cpp   (revision 113706)
> +++ lib/CodeGen/PeepholeOptimizer.cpp   (working copy)
> @@ -249,6 +249,12 @@
>      return true;
>    }
>
> +  // Attempt to optimize the test instruction.
> +  if (TII->OptimizeTestInstr(MI, SrcReg, CmpValue, NextIter)) {
> +    ++NumEliminated;
> +    return true;
> +  }
> +
>    return false;
>  }
>