[llvm-commits] [llvm] r113670 - in /llvm/trunk: include/llvm/Target/TargetInstrInfo.h lib/CodeGen/PeepholeOptimizer.cpp lib/Target/ARM/ARMBaseInstrInfo.cpp lib/Target/ARM/ARMBaseInstrInfo.h

Gabor Greif ggreif at gmail.com
Mon Sep 13 11:04:01 PDT 2010


On 12 Sep., 02:04, Bill Wendling <wendl... at apple.com> wrote:
> On Sep 11, 2010, at 5:43 AM, Gabor Greif wrote:
>
> > Bill,
>
> > I tried to extend your approach to TSTri (as an eliminable compare
> > instr.) and
> > ANDri (as an implicit compare) with this patch:
>
> That's what I'm attempting to do right now. :-) I want to do a bit more than just replace the TST with an AND that sets CPSR, though that's the first step. I'm introducing a separate callback instruction to handle TST (i.e., comparisons against a non-zero value).
>
> > gabor at google8:~/llvm-build$ svn diff /home/gabor/llvm/lib/Target/ARM/
> > ARMBaseInstrInfo.cpp
> > Index: /home/gabor/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
> > ===================================================================
> > --- /home/gabor/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
> > (revision 113683)
> > +++ /home/gabor/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
> > (working copy)
> > @@ -1372,6 +1372,10 @@
> >     SrcReg = MI->getOperand(0).getReg();
> >     CmpValue = MI->getOperand(1).getImm();
> >     return true;
> > +  case ARM::TSTri:
> > +    SrcReg = MI->getOperand(0).getReg();
> > +    CmpValue = 0;
>
> I think the TST instruction can take a compare value other than 0. The CMP instructions take 0, right?
>
>

TST instructions have 2 operands and the second is a mask. It is a
hybrid of "and" and "cmp" against 0.

>
>
>
> > @@ -1421,6 +1425,7 @@
> >   switch (MI->getOpcode()) {
> >   default: break;
> >   case ARM::ADDri:
> > +  case ARM::ANDri:
> >   case ARM::SUBri:
> >   case ARM::t2ADDri:
> >   case ARM::t2SUBri:
> > gabor at google8:~/llvm-build$
>
> > BUT, it does not trigger on my example, because of
> > something that appears to be a bug:
>
> >        %reg16388<def> = PHI %reg16396, <BB#0>, %reg16392, <BB#3>; GPR:
> > %reg16388,16396,16392
> >        %reg16389<def> = PHI %reg16385, <BB#0>, %reg16391, <BB#3>; GPR:
> > %reg16389,16385,16391
> >        %reg16397<def> = LDR %reg16387, %reg0, 4100, pred:14, pred:
> > %reg0; mem:LD4[%scevgep1] GPR:%reg16397,16387
> >        %reg16390<def> = ANDri %reg16397, 3, pred:14, pred:%reg0, opt:
> > %reg0; GPR:%reg16390,16397
> >        TSTri %reg16397, 3, pred:14, pred:%reg0, %CPSR<imp-def>; GPR:
> > %reg16397
> >        Bcc <BB#3>, pred:0, pred:%CPSR
>
> > The TST seems to use the result of the LDR instead of that of the AND!
> > Very fishy...
>
> If you want to use the result of AND instead of TST in the Bcc, you'll need to say that AND implicitly defines CSPR. In the above case, it looks like the AND is basically dead (at least in this code snippet). The TST performs an "and" of the 3 to the result of LDR, which is what the AND instruction does. The only difference here is that TST sets CPSR and the AND doesn't.
>
> > The bitcode is ok:
>
> >  %tmp2 = load i8** %scevgep1
> >  %0 = ptrtoint i8* %tmp2 to i32
> >  %and = and i32 %0, 3
> >  %tst = icmp eq i32 %and, 0
> >  br i1 %tst, label %sw.bb, label %tailrecurse.switch
>
> > What do you think? Okay to commit the above?
>
> Below is what I have in my tree right now. It's just the skeleton of how I'd like to handle this (it does nothing right now). I like it because of how it separates out different optimization potentialities for each type of instruction, but it may be overkill...

Yes, I think it is a bit of overkill. Logically TST is a compare
instruction. In my last patch I successfully handle both with one
sweep. If you think my approach is sufficient and works well on your
testcases, I gladly check it in. Then I would like to establish
something like

dyn_cast<ARMri<ARM::ANDri> >
queries (with additional intelligence) for checking opcodes and
getting operands. Would simplify my code a great amount.

Cheers,

    Gabor

>
> -bw
>
> Index: include/llvm/Target/TargetInstrInfo.h
> ===================================================================
> --- include/llvm/Target/TargetInstrInfo.h       (revision 113706)
> +++ include/llvm/Target/TargetInstrInfo.h       (working copy)
> @@ -595,6 +595,15 @@
>      return false;
>    }
>
> +  /// OptimizeTestInstr - See if the test instruction can be converted into
> +  /// something more efficient. Update the iterator *only* if a transformation
> +  /// took place.
> +  virtual bool OptimizeTestInstr(MachineInstr * /*CmpInstr*/,
> +                                 unsigned /*SrcReg*/, int /*CmpValue*/,
> +                                 MachineBasicBlock::iterator &MII) const {
> +    return false;
> +  }
> +
>    /// getNumMicroOps - Return the number of u-operations the given machine
>    /// instruction will be decoded to on the target cpu.
>    virtual unsigned getNumMicroOps(const MachineInstr *MI,
> Index: lib/Target/ARM/ARMBaseInstrInfo.cpp
> ===================================================================
> --- lib/Target/ARM/ARMBaseInstrInfo.cpp (revision 113706)
> +++ lib/Target/ARM/ARMBaseInstrInfo.cpp (working copy)
> @@ -1377,6 +1377,18 @@
>    return false;
>  }
>
> +/// OptimizeTestInstr - See if we can convert the TEST instruction into an
> +/// equivalent AND instruction, possibly saving an instruction or two if we can
> +/// make AND set the flags register.
> +bool ARMBaseInstrInfo::
> +OptimizeTestInstr(MachineInstr *CmpInstr, unsigned SrcReg, int CmpValue,
> +                      MachineBasicBlock::iterator &MII) const {
> +  if (CmpValue == 0)
> +    return false;
> +
> +  return false;
> +}
> +
>  /// OptimizeCompareInstr - Convert the instruction supplying the argument to the
>  /// comparison into one that sets the zero bit in the flags register. Update the
>  /// iterator *only* if a transformation took place.
> Index: lib/Target/ARM/ARMBaseInstrInfo.h
> ===================================================================
> --- lib/Target/ARM/ARMBaseInstrInfo.h   (revision 113706)
> +++ lib/Target/ARM/ARMBaseInstrInfo.h   (working copy)
> @@ -350,6 +350,13 @@
>                                      int CmpValue,
>                                      MachineBasicBlock::iterator &MII) const;
>
> +  /// OptimizeTestInstr - See if we can convert the TEST instruction into an
> +  /// equivalent AND instruction, possibly saving an instruction or two if we
> +  /// can make AND set the flags register.
> +  virtual bool OptimizeTestInstr(MachineInstr *CmpInstr, unsigned SrcReg,
> +                                 int CmpValue,
> +                                 MachineBasicBlock::iterator &MII) const;
> +
>    virtual unsigned getNumMicroOps(const MachineInstr *MI,
>                                    const InstrItineraryData *ItinData) const;
>  };
> Index: lib/CodeGen/PeepholeOptimizer.cpp
> ===================================================================
> --- lib/CodeGen/PeepholeOptimizer.cpp   (revision 113706)
> +++ lib/CodeGen/PeepholeOptimizer.cpp   (working copy)
> @@ -249,6 +249,12 @@
>      return true;
>    }
>
> +  // Attempt to optimize the test instruction.
> +  if (TII->OptimizeTestInstr(MI, SrcReg, CmpValue, NextIter)) {
> +    ++NumEliminated;
> +    return true;
> +  }
> +
>    return false;
>  }
>
> _______________________________________________
> llvm-commits mailing list
> llvm-comm... at cs.uiuc.eduhttp://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits




More information about the llvm-commits mailing list