[llvm] r243486 - [PeepholeOptimizer] Look through PHIs to find additional register sources

Jingyue Wu jingyue at google.com
Wed Jul 29 10:37:27 PDT 2015


Hi Bruno,

This patch broke one of our internal tests. I filed a bug at
https://llvm.org/bugs/show_bug.cgi?id=24303.

Would you mind me reverting it for now before we investigate the root
cause? I am investigating it too.

Jingyue



On Tue, Jul 28, 2015 at 2:45 PM, Bruno Cardoso Lopes <
bruno.cardoso at gmail.com> wrote:

> Author: bruno
> Date: Tue Jul 28 16:45:50 2015
> New Revision: 243486
>
> URL: http://llvm.org/viewvc/llvm-project?rev=243486&view=rev
> Log:
> [PeepholeOptimizer] Look through PHIs to find additional register sources
>
> Reapply 243271 with more fixes; although we are not handling multiple
> sources with coalescable copies, we were not properly skipping this
> case.
>
> - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI
> instructions.
> - Add findNextSourceAndRewritePHI method to lookup into multiple sources
> returnted by the ValueTracker and rewrite PHIs with new sources.
>
> With these changes we can find more register sources and rewrite more
> copies to allow coaslescing of bitcast instructions. Hence, we eliminate
> unnecessary VR64 <-> GR64 copies in x86, but it could be extended to
> other archs by marking "isBitcast" on target specific instructions. The
> x86 example follows:
>
> A:
>   psllq %mm1, %mm0
>   movd  %mm0, %r9
>   jmp C
>
> B:
>   por %mm1, %mm0
>   movd  %mm0, %r9
>   jmp C
>
> C:
>   movd  %r9, %mm0
>   pshufw  $238, %mm0, %mm0
>
> Becomes:
>
> A:
>   psllq %mm1, %mm0
>   jmp C
>
> B:
>   por %mm1, %mm0
>   jmp C
>
> C:
>   pshufw  $238, %mm0, %mm0
>
> Differential Revision: http://reviews.llvm.org/D11197
> rdar://problem/20404526
>
> Added:
>     llvm/trunk/test/CodeGen/X86/mmx-coalescing.ll
> Modified:
>     llvm/trunk/include/llvm/Target/TargetInstrInfo.h
>     llvm/trunk/lib/CodeGen/PeepholeOptimizer.cpp
>     llvm/trunk/lib/Target/X86/X86InstrMMX.td
>
> Modified: llvm/trunk/include/llvm/Target/TargetInstrInfo.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetInstrInfo.h?rev=243486&r1=243485&r2=243486&view=diff
>
> ==============================================================================
> --- llvm/trunk/include/llvm/Target/TargetInstrInfo.h (original)
> +++ llvm/trunk/include/llvm/Target/TargetInstrInfo.h Tue Jul 28 16:45:50
> 2015
> @@ -1266,6 +1266,33 @@ private:
>    unsigned CallFrameSetupOpcode, CallFrameDestroyOpcode;
>  };
>
> +/// \brief Provide DenseMapInfo for TargetInstrInfo::RegSubRegPair.
> +template<>
> +struct DenseMapInfo<TargetInstrInfo::RegSubRegPair> {
> +  typedef DenseMapInfo<unsigned> RegInfo;
> +
> +  static inline TargetInstrInfo::RegSubRegPair getEmptyKey() {
> +    return TargetInstrInfo::RegSubRegPair(RegInfo::getEmptyKey(),
> +                         RegInfo::getEmptyKey());
> +  }
> +  static inline TargetInstrInfo::RegSubRegPair getTombstoneKey() {
> +    return TargetInstrInfo::RegSubRegPair(RegInfo::getTombstoneKey(),
> +                         RegInfo::getTombstoneKey());
> +  }
> +  /// \brief Reuse getHashValue implementation from
> +  /// std::pair<unsigned, unsigned>.
> +  static unsigned getHashValue(const TargetInstrInfo::RegSubRegPair &Val)
> {
> +    std::pair<unsigned, unsigned> PairVal =
> +        std::make_pair(Val.Reg, Val.SubReg);
> +    return DenseMapInfo<std::pair<unsigned,
> unsigned>>::getHashValue(PairVal);
> +  }
> +  static bool isEqual(const TargetInstrInfo::RegSubRegPair &LHS,
> +                      const TargetInstrInfo::RegSubRegPair &RHS) {
> +    return RegInfo::isEqual(LHS.Reg, RHS.Reg) &&
> +           RegInfo::isEqual(LHS.SubReg, RHS.SubReg);
> +  }
> +};
> +
>  } // End llvm namespace
>
>  #endif
>
> Modified: llvm/trunk/lib/CodeGen/PeepholeOptimizer.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/PeepholeOptimizer.cpp?rev=243486&r1=243485&r2=243486&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/CodeGen/PeepholeOptimizer.cpp (original)
> +++ llvm/trunk/lib/CodeGen/PeepholeOptimizer.cpp Tue Jul 28 16:45:50 2015
> @@ -98,6 +98,12 @@ static cl::opt<bool>
>  DisableAdvCopyOpt("disable-adv-copy-opt", cl::Hidden, cl::init(false),
>                    cl::desc("Disable advanced copy optimization"));
>
> +// Limit the number of PHI instructions to process
> +// in PeepholeOptimizer::getNextSource.
> +static cl::opt<unsigned> RewritePHILimit(
> +    "rewrite-phi-limit", cl::Hidden, cl::init(10),
> +    cl::desc("Limit the length of PHI chains to lookup"));
> +
>  STATISTIC(NumReuse,      "Number of extension results reused");
>  STATISTIC(NumCmps,       "Number of compares eliminated");
>  STATISTIC(NumImmFold,    "Number of move immediate folded");
> @@ -132,6 +138,10 @@ namespace {
>        }
>      }
>
> +    /// \brief Track Def -> Use info used for rewriting copies.
> +    typedef SmallDenseMap<TargetInstrInfo::RegSubRegPair,
> ValueTrackerResult>
> +        RewriteMapTy;
> +
>    private:
>      bool optimizeCmpInstr(MachineInstr *MI, MachineBasicBlock *MBB);
>      bool optimizeExtInstr(MachineInstr *MI, MachineBasicBlock *MBB,
> @@ -143,7 +153,8 @@ namespace {
>      bool optimizeCoalescableCopy(MachineInstr *MI);
>      bool optimizeUncoalescableCopy(MachineInstr *MI,
>                                     SmallPtrSetImpl<MachineInstr *>
> &LocalMIs);
> -    bool findNextSource(unsigned &Reg, unsigned &SubReg);
> +    bool findNextSource(unsigned Reg, unsigned SubReg,
> +                        RewriteMapTy &RewriteMap);
>      bool isMoveImmediate(MachineInstr *MI,
>                           SmallSet<unsigned, 4> &ImmDefRegs,
>                           DenseMap<unsigned, MachineInstr*> &ImmDefMIs);
> @@ -220,6 +231,20 @@ namespace {
>        assert(Idx < getNumSources() && "SubReg source out of index");
>        return RegSrcs[Idx].SubReg;
>      }
> +
> +    bool operator==(const ValueTrackerResult &Other) {
> +      if (Other.getInst() != getInst())
> +        return false;
> +
> +      if (Other.getNumSources() != getNumSources())
> +        return false;
> +
> +      for (int i = 0, e = Other.getNumSources(); i != e; ++i)
> +        if (Other.getSrcReg(i) != getSrcReg(i) ||
> +            Other.getSrcSubReg(i) != getSrcSubReg(i))
> +          return false;
> +      return true;
> +    }
>    };
>
>    /// \brief Helper class to track the possible sources of a value
> defined by
> @@ -281,6 +306,8 @@ namespace {
>      /// \brief Specialized version of getNextSource for SubregToReg
>      /// instructions.
>      ValueTrackerResult getNextSourceFromSubregToReg();
> +    /// \brief Specialized version of getNextSource for PHI instructions.
> +    ValueTrackerResult getNextSourceFromPHI();
>
>    public:
>      /// \brief Create a ValueTracker instance for the value defined by \p
> Reg.
> @@ -583,58 +610,143 @@ static bool shareSameRegisterFile(const
>
>  /// \brief Try to find the next source that share the same register file
>  /// for the value defined by \p Reg and \p SubReg.
> -/// When true is returned, \p Reg and \p SubReg are updated with the
> -/// register number and sub-register index of the new source.
> +/// When true is returned, the \p RewriteMap can be used by the client to
> +/// retrieve all Def -> Use along the way up to the next source. Any found
> +/// Use that is not itself a key for another entry, is the next source to
> +/// use. During the search for the next source, multiple sources can be
> found
> +/// given multiple incoming sources of a PHI instruction. In this case, we
> +/// look in each PHI source for the next source; all found next sources
> must
> +/// share the same register file as \p Reg and \p SubReg. The client
> should
> +/// then be capable to rewrite all intermediate PHIs to get the next
> source.
>  /// \return False if no alternative sources are available. True otherwise.
> -bool PeepholeOptimizer::findNextSource(unsigned &Reg, unsigned &SubReg) {
> +bool PeepholeOptimizer::findNextSource(unsigned Reg, unsigned SubReg,
> +                                       RewriteMapTy &RewriteMap) {
>    // Do not try to find a new source for a physical register.
>    // So far we do not have any motivating example for doing that.
>    // Thus, instead of maintaining untested code, we will revisit that if
>    // that changes at some point.
>    if (TargetRegisterInfo::isPhysicalRegister(Reg))
>      return false;
> -
>    const TargetRegisterClass *DefRC = MRI->getRegClass(Reg);
> -  unsigned DefSubReg = SubReg;
>
> -  unsigned Src;
> -  unsigned SrcSubReg;
> +  SmallVector<TargetInstrInfo::RegSubRegPair, 4> SrcToLook;
> +  TargetInstrInfo::RegSubRegPair CurSrcPair(Reg, SubReg);
> +  SrcToLook.push_back(CurSrcPair);
>    bool ShouldRewrite = false;
>
> -  // Follow the chain of copies until we reach the top of the use-def
> chain
> -  // or find a more suitable source.
> -  ValueTracker ValTracker(Reg, DefSubReg, *MRI, !DisableAdvCopyOpt, TII);
> -  do {
> -    ValueTrackerResult Res = ValTracker.getNextSource();
> -    if (!Res.isValid())
> -      break;
> -    Src = Res.getSrcReg(0);
> -    SrcSubReg = Res.getSrcSubReg(0);
> +  unsigned PHILimit = RewritePHILimit;
> +  while (!SrcToLook.empty() && PHILimit) {
> +    TargetInstrInfo::RegSubRegPair Pair = SrcToLook.pop_back_val();
> +    // As explained above, do not handle physical registers
> +    if (TargetRegisterInfo::isPhysicalRegister(Pair.Reg))
> +      return false;
>
> -    // Do not extend the live-ranges of physical registers as they add
> -    // constraints to the register allocator.
> -    // Moreover, if we want to extend the live-range of a physical
> register,
> -    // unlike SSA virtual register, we will have to check that they are
> not
> -    // redefine before the related use.
> -    if (TargetRegisterInfo::isPhysicalRegister(Src))
> -      break;
> +    CurSrcPair = Pair;
> +    ValueTracker ValTracker(CurSrcPair.Reg, CurSrcPair.SubReg, *MRI,
> +                            !DisableAdvCopyOpt, TII);
> +    ValueTrackerResult Res;
> +
> +    do {
> +      // Follow the chain of copies until we reach the top of the use-def
> chain
> +      // or find a more suitable source.
> +      Res = ValTracker.getNextSource();
> +      if (!Res.isValid())
> +        break;
> +
> +      // Insert the Def -> Use entry for the recently found source.
> +      ValueTrackerResult CurSrcRes = RewriteMap.lookup(CurSrcPair);
> +      if (CurSrcRes.isValid()) {
> +        assert(CurSrcRes == Res && "ValueTrackerResult found must match");
> +        // An existent entry with multiple sources is a PHI cycle we must
> avoid.
> +        // Otherwise it's an entry with a valid next source we already
> found.
> +        if (CurSrcRes.getNumSources() > 1) {
> +          DEBUG(dbgs() << "findNextSource: found PHI cycle,
> aborting...\n");
> +          return false;
> +        }
> +        break;
> +      }
> +      RewriteMap.insert(std::make_pair(CurSrcPair, Res));
> +
> +      // ValueTrackerResult usually have one source unless it's the
> result from
> +      // a PHI instruction. Add the found PHI edges to be looked up
> further.
> +      unsigned NumSrcs = Res.getNumSources();
> +      if (NumSrcs > 1) {
> +        PHILimit--;
> +        for (unsigned i = 0; i < NumSrcs; ++i)
> +          SrcToLook.push_back(TargetInstrInfo::RegSubRegPair(
> +              Res.getSrcReg(i), Res.getSrcSubReg(i)));
> +        break;
> +      }
> +
> +      CurSrcPair.Reg = Res.getSrcReg(0);
> +      CurSrcPair.SubReg = Res.getSrcSubReg(0);
> +      // Do not extend the live-ranges of physical registers as they add
> +      // constraints to the register allocator. Moreover, if we want to
> extend
> +      // the live-range of a physical register, unlike SSA virtual
> register,
> +      // we will have to check that they aren't redefine before the
> related use.
> +      if (TargetRegisterInfo::isPhysicalRegister(CurSrcPair.Reg))
> +        return false;
> +
> +      const TargetRegisterClass *SrcRC = MRI->getRegClass(CurSrcPair.Reg);
> +
> +      // If this source does not incur a cross register bank copy, use it.
> +      ShouldRewrite = shareSameRegisterFile(*TRI, DefRC, SubReg, SrcRC,
> +                                           CurSrcPair.SubReg);
> +    } while (!ShouldRewrite);
>
> -    const TargetRegisterClass *SrcRC = MRI->getRegClass(Src);
> +    // Continue looking for new sources...
> +    if (Res.isValid())
> +      continue;
>
> -    // If this source does not incur a cross register bank copy, use it.
> -    ShouldRewrite = shareSameRegisterFile(*TRI, DefRC, DefSubReg, SrcRC,
> -                                          SrcSubReg);
> -  } while (!ShouldRewrite);
> +    if (!PHILimit) {
> +      DEBUG(dbgs() << "findNextSource: PHI limit reached\n");
> +      return false;
> +    }
> +
> +    // Do not continue searching for a new source if the there's at least
> +    // one use-def which cannot be rewritten.
> +    if (!ShouldRewrite)
> +      return false;
> +  }
>
>    // If we did not find a more suitable source, there is nothing to
> optimize.
> -  if (!ShouldRewrite || Src == Reg)
> +  if (CurSrcPair.Reg == Reg)
>      return false;
>
> -  Reg = Src;
> -  SubReg = SrcSubReg;
>    return true;
>  }
>
> +/// \brief Insert a PHI instruction with incoming edges \p SrcRegs that
> are
> +/// guaranteed to have the same register class. This is necessary
> whenever we
> +/// successfully traverse a PHI instruction and find suitable sources
> coming
> +/// from its edges. By inserting a new PHI, we provide a rewritten PHI def
> +/// suitable to be used in a new COPY instruction.
> +MachineInstr *
> +insertPHI(MachineRegisterInfo *MRI, const TargetInstrInfo *TII,
> +          const SmallVectorImpl<TargetInstrInfo::RegSubRegPair> &SrcRegs,
> +          MachineInstr *OrigPHI) {
> +  assert(!SrcRegs.empty() && "No sources to create a PHI instruction?");
> +
> +  const TargetRegisterClass *NewRC = MRI->getRegClass(SrcRegs[0].Reg);
> +  unsigned NewVR = MRI->createVirtualRegister(NewRC);
> +  MachineBasicBlock *MBB = OrigPHI->getParent();
> +  MachineInstrBuilder MIB = BuildMI(*MBB, OrigPHI, OrigPHI->getDebugLoc(),
> +                                    TII->get(TargetOpcode::PHI), NewVR);
> +
> +  unsigned MBBOpIdx = 2;
> +  for (auto RegPair : SrcRegs) {
> +    MIB.addReg(RegPair.Reg, 0, RegPair.SubReg);
> +    MIB.addMBB(OrigPHI->getOperand(MBBOpIdx).getMBB());
> +    // Since we're extended the lifetime of RegPair.Reg, clear the
> +    // kill flags to account for that and make RegPair.Reg reaches
> +    // the new PHI.
> +    MRI->clearKillFlags(RegPair.Reg);
> +    MBBOpIdx += 2;
> +  }
> +
> +  return MIB;
> +}
> +
>  namespace {
>  /// \brief Helper class to rewrite the arguments of a copy-like
> instruction.
>  class CopyRewriter {
> @@ -702,22 +814,78 @@ public:
>    virtual bool RewriteCurrentSource(unsigned NewReg, unsigned NewSubReg) {
>      if (!CopyLike.isCopy() || CurrentSrcIdx != 1)
>        return false;
> +    DEBUG(dbgs() << "-- RewriteCurrentSource\n");
> +    DEBUG(dbgs() << "   Replacing: " << CopyLike);
>      MachineOperand &MOSrc = CopyLike.getOperand(CurrentSrcIdx);
>      MOSrc.setReg(NewReg);
>      MOSrc.setSubReg(NewSubReg);
> +    DEBUG(dbgs() << "        With: " << CopyLike);
>      return true;
>    }
>
> -  /// \brief Rewrite the current source with \p NewSrcReg and \p
> NewSecSubReg
> -  /// by creating a new COPY instruction. \p DefReg and \p DefSubReg
> contain the
> -  /// definition to be rewritten from the original copylike instruction.
> -  /// \return The new COPY if the rewriting was possible, nullptr
> otherwise.
> -  /// This is needed to handle Uncoalescable copies, since they are copy
> -  /// like instructions that aren't recognized by the register allocator.
> -  virtual MachineInstr *RewriteCurrentSource(unsigned DefReg,
> -                                             unsigned DefSubReg,
> -                                             unsigned NewSrcReg,
> -                                             unsigned NewSrcSubReg) {
> +  /// \brief Given a \p Def.Reg and Def.SubReg  pair, use \p RewriteMap
> to find
> +  /// the new source to use for rewrite. If \p HandleMultipleSources is
> true and
> +  /// multiple sources for a given \p Def are found along the way, we
> found a
> +  /// PHI instructions that needs to be rewritten.
> +  /// TODO: HandleMultipleSources should be removed once we test PHI
> handling
> +  /// with coalescable copies.
> +  TargetInstrInfo::RegSubRegPair
> +  getNewSource(MachineRegisterInfo *MRI, const TargetInstrInfo *TII,
> +               TargetInstrInfo::RegSubRegPair Def,
> +               PeepholeOptimizer::RewriteMapTy &RewriteMap,
> +               bool HandleMultipleSources = true) {
> +
> +    TargetInstrInfo::RegSubRegPair LookupSrc(Def.Reg, Def.SubReg);
> +    do {
> +      ValueTrackerResult Res = RewriteMap.lookup(LookupSrc);
> +      // If there are no entries on the map, LookupSrc is the new source.
> +      if (!Res.isValid())
> +        return LookupSrc;
> +
> +      // There's only one source for this definition, keep searching...
> +      unsigned NumSrcs = Res.getNumSources();
> +      if (NumSrcs == 1) {
> +        LookupSrc.Reg = Res.getSrcReg(0);
> +        LookupSrc.SubReg = Res.getSrcSubReg(0);
> +        continue;
> +      }
> +
> +      // TODO: remove once multiple srcs w/ coaslescable copies are
> supported.
> +      if (!HandleMultipleSources)
> +        break;
> +
> +      // Multiple sources, recurse into each source to find a new source
> +      // for it. Then, rewrite the PHI accordingly to its new edges.
> +      SmallVector<TargetInstrInfo::RegSubRegPair, 4> NewPHISrcs;
> +      for (unsigned i = 0; i < NumSrcs; ++i) {
> +        TargetInstrInfo::RegSubRegPair PHISrc(Res.getSrcReg(i),
> +                                              Res.getSrcSubReg(i));
> +        NewPHISrcs.push_back(
> +            getNewSource(MRI, TII, PHISrc, RewriteMap,
> HandleMultipleSources));
> +      }
> +
> +      // Build the new PHI node and return its def register as the new
> source.
> +      MachineInstr *OrigPHI = const_cast<MachineInstr *>(Res.getInst());
> +      MachineInstr *NewPHI = insertPHI(MRI, TII, NewPHISrcs, OrigPHI);
> +      DEBUG(dbgs() << "-- getNewSource\n");
> +      DEBUG(dbgs() << "   Replacing: " << *OrigPHI);
> +      DEBUG(dbgs() << "        With: " << *NewPHI);
> +      const MachineOperand &MODef = NewPHI->getOperand(0);
> +      return TargetInstrInfo::RegSubRegPair(MODef.getReg(),
> MODef.getSubReg());
> +
> +    } while (1);
> +
> +    return TargetInstrInfo::RegSubRegPair(0, 0);
> +  }
> +
> +  /// \brief Rewrite the source found through \p Def, by using the \p
> RewriteMap
> +  /// and create a new COPY instruction. More info about RewriteMap in
> +  /// PeepholeOptimizer::findNextSource. Right now this is only used to
> handle
> +  /// Uncoalescable copies, since they are copy like instructions that
> aren't
> +  /// recognized by the register allocator.
> +  virtual MachineInstr *
> +  RewriteSource(TargetInstrInfo::RegSubRegPair Def,
> +                PeepholeOptimizer::RewriteMapTy &RewriteMap) {
>      return nullptr;
>    }
>  };
> @@ -765,31 +933,44 @@ public:
>      return true;
>    }
>
> -  /// \brief Rewrite the current source with \p NewSrcReg and \p
> NewSrcSubReg
> -  /// by creating a new COPY instruction. \p DefReg and \p DefSubReg
> contain the
> -  /// definition to be rewritten from the original copylike instruction.
> -  /// \return The new COPY if the rewriting was possible, nullptr
> otherwise.
> -  MachineInstr *RewriteCurrentSource(unsigned DefReg, unsigned DefSubReg,
> -                                     unsigned NewSrcReg,
> -                                     unsigned NewSrcSubReg) override {
> -    assert(!TargetRegisterInfo::isPhysicalRegister(DefReg) &&
> +  /// \brief Rewrite the source found through \p Def, by using the \p
> RewriteMap
> +  /// and create a new COPY instruction. More info about RewriteMap in
> +  /// PeepholeOptimizer::findNextSource. Right now this is only used to
> handle
> +  /// Uncoalescable copies, since they are copy like instructions that
> aren't
> +  /// recognized by the register allocator.
> +  MachineInstr *
> +  RewriteSource(TargetInstrInfo::RegSubRegPair Def,
> +                PeepholeOptimizer::RewriteMapTy &RewriteMap) override {
> +    assert(!TargetRegisterInfo::isPhysicalRegister(Def.Reg) &&
>             "We do not rewrite physical registers");
>
> -    const TargetRegisterClass *DefRC = MRI.getRegClass(DefReg);
> +    // Find the new source to use in the COPY rewrite.
> +    TargetInstrInfo::RegSubRegPair NewSrc =
> +        getNewSource(&MRI, &TII, Def, RewriteMap);
> +
> +    // Insert the COPY.
> +    const TargetRegisterClass *DefRC = MRI.getRegClass(Def.Reg);
>      unsigned NewVR = MRI.createVirtualRegister(DefRC);
>
>      MachineInstr *NewCopy =
>          BuildMI(*CopyLike.getParent(), &CopyLike, CopyLike.getDebugLoc(),
>                  TII.get(TargetOpcode::COPY), NewVR)
> -            .addReg(NewSrcReg, 0, NewSrcSubReg);
> +            .addReg(NewSrc.Reg, 0, NewSrc.SubReg);
>
> -    NewCopy->getOperand(0).setSubReg(DefSubReg);
> -    if (DefSubReg)
> +    NewCopy->getOperand(0).setSubReg(Def.SubReg);
> +    if (Def.SubReg)
>        NewCopy->getOperand(0).setIsUndef();
>
> -    MRI.replaceRegWith(DefReg, NewVR);
> +    DEBUG(dbgs() << "-- RewriteSource\n");
> +    DEBUG(dbgs() << "   Replacing: " << CopyLike);
> +    DEBUG(dbgs() << "        With: " << *NewCopy);
> +    MRI.replaceRegWith(Def.Reg, NewVR);
>      MRI.clearKillFlags(NewVR);
>
> +    // We extended the lifetime of NewSrc.Reg, clear the kill flags to
> +    // account for that.
> +    MRI.clearKillFlags(NewSrc.Reg);
> +
>      return NewCopy;
>    }
>  };
> @@ -1033,16 +1214,25 @@ bool PeepholeOptimizer::optimizeCoalesca
>    unsigned SrcReg, SrcSubReg, TrackReg, TrackSubReg;
>    while (CpyRewriter->getNextRewritableSource(SrcReg, SrcSubReg, TrackReg,
>                                                TrackSubReg)) {
> -    unsigned NewSrc = TrackReg;
> -    unsigned NewSubReg = TrackSubReg;
> +    // Keep track of PHI nodes and its incoming edges when looking for
> sources.
> +    RewriteMapTy RewriteMap;
>      // Try to find a more suitable source. If we failed to do so, or get
> the
>      // actual source, move to the next source.
> -    if (!findNextSource(NewSrc, NewSubReg) || SrcReg == NewSrc)
> +    if (!findNextSource(TrackReg, TrackSubReg, RewriteMap))
>        continue;
> +
> +    // Get the new source to rewrite. TODO: Only enable handling of
> multiple
> +    // sources (PHIs) once we have a motivating example and testcases for
> it.
> +    TargetInstrInfo::RegSubRegPair TrackPair(TrackReg, TrackSubReg);
> +    TargetInstrInfo::RegSubRegPair NewSrc = CpyRewriter->getNewSource(
> +        MRI, TII, TrackPair, RewriteMap, false /* multiple sources */);
> +    if (SrcReg == NewSrc.Reg || NewSrc.Reg == 0)
> +      continue;
> +
>      // Rewrite source.
> -    if (CpyRewriter->RewriteCurrentSource(NewSrc, NewSubReg)) {
> +    if (CpyRewriter->RewriteCurrentSource(NewSrc.Reg, NewSrc.SubReg)) {
>        // We may have extended the live-range of NewSrc, account for that.
> -      MRI->clearKillFlags(NewSrc);
> +      MRI->clearKillFlags(NewSrc.Reg);
>        Changed = true;
>      }
>    }
> @@ -1071,9 +1261,7 @@ bool PeepholeOptimizer::optimizeUncoales
>    assert(MI && isUncoalescableCopy(*MI) && "Invalid argument");
>
>    // Check if we can rewrite all the values defined by this instruction.
> -  SmallVector<
> -      std::pair<TargetInstrInfo::RegSubRegPair,
> TargetInstrInfo::RegSubRegPair>,
> -      4> RewritePairs;
> +  SmallVector<TargetInstrInfo::RegSubRegPair, 4> RewritePairs;
>    // Get the right rewriter for the current copy.
>    std::unique_ptr<CopyRewriter> CpyRewriter(getCopyRewriter(*MI, *TII,
> *MRI));
>    // If none exists, bails out.
> @@ -1083,39 +1271,32 @@ bool PeepholeOptimizer::optimizeUncoales
>    // Rewrite each rewritable source by generating new COPYs. This works
>    // differently from optimizeCoalescableCopy since it first makes sure
> that all
>    // definitions can be rewritten.
> -  unsigned SrcReg, SrcSubReg, TrackReg, TrackSubReg;
> -  while (CpyRewriter->getNextRewritableSource(SrcReg, SrcSubReg, TrackReg,
> -                                              TrackSubReg)) {
> +  RewriteMapTy RewriteMap;
> +  unsigned Reg, SubReg, CopyDefReg, CopyDefSubReg;
> +  while (CpyRewriter->getNextRewritableSource(Reg, SubReg, CopyDefReg,
> +                                              CopyDefSubReg)) {
>      // If a physical register is here, this is probably for a good reason.
>      // Do not rewrite that.
> -    if (TargetRegisterInfo::isPhysicalRegister(TrackReg))
> +    if (TargetRegisterInfo::isPhysicalRegister(CopyDefReg))
>        return false;
>
>      // If we do not know how to rewrite this definition, there is no point
>      // in trying to kill this instruction.
> -    TargetInstrInfo::RegSubRegPair Def(TrackReg, TrackSubReg);
> -    TargetInstrInfo::RegSubRegPair Src = Def;
> -    if (!findNextSource(Src.Reg, Src.SubReg))
> +    TargetInstrInfo::RegSubRegPair Def(CopyDefReg, CopyDefSubReg);
> +    if (!findNextSource(Def.Reg, Def.SubReg, RewriteMap))
>        return false;
>
> -    RewritePairs.push_back(std::make_pair(Def, Src));
> +    RewritePairs.push_back(Def);
>    }
>
>    // The change is possible for all defs, do it.
> -  for (const auto &PairDefSrc : RewritePairs) {
> -    const auto &Def = PairDefSrc.first;
> -    const auto &Src = PairDefSrc.second;
> -
> +  for (const auto &Def : RewritePairs) {
>      // Rewrite the "copy" in a way the register coalescer understands.
> -    MachineInstr *NewCopy = CpyRewriter->RewriteCurrentSource(
> -        Def.Reg, Def.SubReg, Src.Reg, Src.SubReg);
> +    MachineInstr *NewCopy = CpyRewriter->RewriteSource(Def, RewriteMap);
>      assert(NewCopy && "Should be able to always generate a new copy");
> -
> -    // We extended the lifetime of Src and clear the kill flags to
> -    // account for that.
> -    MRI->clearKillFlags(Src.Reg);
>      LocalMIs.insert(NewCopy);
>    }
> +
>    // MI is now dead.
>    MI->eraseFromParent();
>    ++NumUncoalescableCopies;
> @@ -1523,6 +1704,26 @@ ValueTrackerResult ValueTracker::getNext
>                              Def->getOperand(3).getImm());
>  }
>
> +/// \brief Explore each PHI incoming operand and return its sources
> +ValueTrackerResult ValueTracker::getNextSourceFromPHI() {
> +  assert(Def->isPHI() && "Invalid definition");
> +  ValueTrackerResult Res;
> +
> +  // If we look for a different subreg, bails as we do not
> +  // support composing subreg yet.
> +  if (Def->getOperand(0).getSubReg() != DefSubReg)
> +    return ValueTrackerResult();
> +
> +  // Return all register sources for PHI instructions.
> +  for (unsigned i = 1, e = Def->getNumOperands(); i < e; i += 2) {
> +    auto &MO = Def->getOperand(i);
> +    assert(MO.isReg() && "Invalid PHI instruction");
> +    Res.addSource(MO.getReg(), MO.getSubReg());
> +  }
> +
> +  return Res;
> +}
> +
>  ValueTrackerResult ValueTracker::getNextSourceImpl() {
>    assert(Def && "This method needs a valid definition");
>
> @@ -1545,6 +1746,8 @@ ValueTrackerResult ValueTracker::getNext
>      return getNextSourceFromExtractSubreg();
>    if (Def->isSubregToReg())
>      return getNextSourceFromSubregToReg();
> +  if (Def->isPHI())
> +    return getNextSourceFromPHI();
>    return ValueTrackerResult();
>  }
>
>
> Modified: llvm/trunk/lib/Target/X86/X86InstrMMX.td
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrMMX.td?rev=243486&r1=243485&r2=243486&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86InstrMMX.td (original)
> +++ llvm/trunk/lib/Target/X86/X86InstrMMX.td Tue Jul 28 16:45:50 2015
> @@ -249,6 +249,7 @@ def MMX_MOVD64grr : MMXI<0x7E, MRMDestRe
>                            (MMX_X86movd2w (x86mmx VR64:$src)))],
>                            IIC_MMX_MOV_REG_MM>, Sched<[WriteMove]>;
>
> +let isBitcast = 1 in
>  def MMX_MOVD64to64rr : MMXRI<0x6E, MRMSrcReg, (outs VR64:$dst), (ins
> GR64:$src),
>                               "movd\t{$src, $dst|$dst, $src}",
>                               [(set VR64:$dst, (bitconvert GR64:$src))],
> @@ -262,7 +263,7 @@ def MMX_MOVD64to64rm : MMXRI<0x6E, MRMSr
>  // These are 64 bit moves, but since the OS X assembler doesn't
>  // recognize a register-register movq, we write them as
>  // movd.
> -let SchedRW = [WriteMove] in {
> +let SchedRW = [WriteMove], isBitcast = 1 in {
>  def MMX_MOVD64from64rr : MMXRI<0x7E, MRMDestReg,
>                                 (outs GR64:$dst), (ins VR64:$src),
>                                 "movd\t{$src, $dst|$dst, $src}",
>
> Added: llvm/trunk/test/CodeGen/X86/mmx-coalescing.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/mmx-coalescing.ll?rev=243486&view=auto
>
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/mmx-coalescing.ll (added)
> +++ llvm/trunk/test/CodeGen/X86/mmx-coalescing.ll Tue Jul 28 16:45:50 2015
> @@ -0,0 +1,84 @@
> +; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+mmx,+sse2 |
> FileCheck %s
> +
> +%SA = type <{ %union.anon, i32, [4 x i8], i8*, i8*, i8*, i32, [4 x i8] }>
> +%union.anon = type { <1 x i64> }
> +
> +; Check that extra movd (copy) instructions aren't generated.
> +
> +define i32 @test(%SA* %pSA, i16* %A, i32 %B, i32 %C, i32 %D, i8* %E) {
> +entry:
> +; CHECK-LABEL: test
> +; CHECK:       # BB#0:
> +; CHECK-NEXT:  pshufw
> +; CHECK-NEXT:  movd
> +; CHECK-NOT:  movd
> +; CHECK-NEXT:  testl
> +  %shl = shl i32 1, %B
> +  %shl1 = shl i32 %C, %B
> +  %shl2 = shl i32 1, %D
> +  %v = getelementptr inbounds %SA, %SA* %pSA, i64 0, i32 0, i32 0
> +  %v0 = load <1 x i64>, <1 x i64>* %v, align 8
> +  %SA0 = getelementptr inbounds %SA, %SA* %pSA, i64 0, i32 1
> +  %v1 = load i32, i32* %SA0, align 4
> +  %SA1 = getelementptr inbounds %SA, %SA* %pSA, i64 0, i32 3
> +  %v2 = load i8*, i8** %SA1, align 8
> +  %SA2 = getelementptr inbounds %SA, %SA* %pSA, i64 0, i32 4
> +  %v3 = load i8*, i8** %SA2, align 8
> +  %v4 = bitcast <1 x i64> %v0 to <4 x i16>
> +  %v5 = bitcast <4 x i16> %v4 to x86_mmx
> +  %v6 = tail call x86_mmx @llvm.x86.sse.pshuf.w(x86_mmx %v5, i8 -18)
> +  %v7 = bitcast x86_mmx %v6 to <4 x i16>
> +  %v8 = bitcast <4 x i16> %v7 to <1 x i64>
> +  %v9 = extractelement <1 x i64> %v8, i32 0
> +  %v10 = bitcast i64 %v9 to <2 x i32>
> +  %v11 = extractelement <2 x i32> %v10, i32 0
> +  %cmp = icmp eq i32 %v11, 0
> +  br i1 %cmp, label %if.A, label %if.B
> +
> +if.A:
> +; CHECK: %if.A
> +; CHECK-NEXT:  movd
> +; CHECK-NEXT:  psllq
> +  %pa = phi <1 x i64> [ %v8, %entry ], [ %vx, %if.C ]
> +  %v17 = extractelement <1 x i64> %pa, i32 0
> +  %v18 = bitcast i64 %v17 to x86_mmx
> +  %v19 = tail call x86_mmx @llvm.x86.mmx.pslli.q(x86_mmx %v18, i32 %B) #2
> +  %v20 = bitcast x86_mmx %v19 to i64
> +  %v21 = insertelement <1 x i64> undef, i64 %v20, i32 0
> +  %cmp3 = icmp eq i64 %v20, 0
> +  br i1 %cmp3, label %if.C, label %merge
> +
> +if.B:
> +  %v34 = bitcast <1 x i64> %v8 to <4 x i16>
> +  %v35 = bitcast <4 x i16> %v34 to x86_mmx
> +  %v36 = tail call x86_mmx @llvm.x86.sse.pshuf.w(x86_mmx %v35, i8 -18)
> +  %v37 = bitcast x86_mmx %v36 to <4 x i16>
> +  %v38 = bitcast <4 x i16> %v37 to <1 x i64>
> +  br label %if.C
> +
> +if.C:
> +  %vx = phi <1 x i64> [ %v21, %if.A ], [ %v38, %if.B ]
> +  %cvt = bitcast <1 x i64> %vx to <2 x i32>
> +  %ex = extractelement <2 x i32> %cvt, i32 0
> +  %cmp2 = icmp eq i32 %ex, 0
> +  br i1 %cmp2, label %if.A, label %merge
> +
> +merge:
> +; CHECK: %merge
> +; CHECK-NOT:  movd
> +; CHECK-NEXT:  pshufw
> +  %vy = phi <1 x i64> [ %v21, %if.A ], [ %vx, %if.C ]
> +  %v130 = bitcast <1 x i64> %vy to <4 x i16>
> +  %v131 = bitcast <4 x i16> %v130 to x86_mmx
> +  %v132 = tail call x86_mmx @llvm.x86.sse.pshuf.w(x86_mmx %v131, i8 -18)
> +  %v133 = bitcast x86_mmx %v132 to <4 x i16>
> +  %v134 = bitcast <4 x i16> %v133 to <1 x i64>
> +  %v135 = extractelement <1 x i64> %v134, i32 0
> +  %v136 = bitcast i64 %v135 to <2 x i32>
> +  %v137 = extractelement <2 x i32> %v136, i32 0
> +  ret i32 %v137
> +}
> +
> +
> +declare x86_mmx @llvm.x86.sse.pshuf.w(x86_mmx, i8)
> +declare x86_mmx @llvm.x86.mmx.pslli.q(x86_mmx, i32)
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150729/76c3a95e/attachment.html>


More information about the llvm-commits mailing list