[llvm-commits] Patch: new backend for Hexagon processor

Tony Linthicum tlinth at codeaurora.org
Fri Dec 9 10:28:17 PST 2011


Hi Jakob,

I've finally got a new patch ready.  I will address each of your 
comments below.  A few of these will take a bit of work to correct and 
maintaining the patch is a bit of a maintenance headache for us.  We are 
hoping that we will be able to commit the patch without fixing all of 
these immediately.  We will, of course, move to fix them as soon as 
possible once the backend is committed.  Each one of these issues is 
noted below, so please let me know what you think.

On 11/28/2011 8:25 PM, Jakob Stoklund Olesen wrote:
>
> On Nov 28, 2011, at 3:14 PM, Tony Linthicum wrote:
>> <hexagon.patch.bz2><llvm.patch.bz2>
>
> I think you need to make a pass over the whole patch and clean up missing punctuation and capitalization in comments.  Some files have no comments at all.  Commented-out code should be deleted.

Done.

> All target-specific command line options should have help text and be named so it is obvious which target they belong to.

Done.

>
> Spurious hunk:
>
> diff --git a/include/llvm/InitializePasses.h b/include/llvm/InitializePasses.h
> index ceb1771..9edb875 100644
> --- a/include/llvm/InitializePasses.h
> +++ b/include/llvm/InitializePasses.h
> @@ -232,7 +232,6 @@ void initializeUnreachableMachineBlockElimPass(PassRegistry&);
>   void initializeVerifierPass(PassRegistry&);
>   void initializeVirtRegMapPass(PassRegistry&);
>   void initializeInstSimplifierPass(PassRegistry&);
> -
>   }
>

Done.

>
> There is a lot of commented out code (using #if 0) in HexagonAsmPrinter. Please just remove that.
>
>
> +bool HexagonInstrInfo::
> +isU6_3Immediate(const int value) const {
> +    return (0<= value&&  value<= 511);
> +}
>

Done.

> Please use the isUInt<9>  templates from MathExtras.h for these functions.
>
> Same thing here:
>
> +def s32ImmPred  : PatLeaf<(i32 imm), [{
> +  // immS16 predicate - True if the immediate fits in a 16-bit sign extended
> +  // field.
> +  int64_t v = (int64_t)N->getSExtValue();
> +  return (v<= 2147483647&&  v>= -2147483648);
> +}]>;
>

Done.

> +++ b/lib/Target/Hexagon/HexagonCFGOptimizer.cpp
>
> There is not a lot of documentation in this file.  Is it doing something that the target-independent code placement optimizers can't do?
>

The functionality here should be able to be folded into the 
target-independent code.  Can we do this later?

>
> +cl::opt<std::string>  CFGFn("cfgfn", cl::Hidden, cl::desc(""), cl::init(""),
> +                           cl::ZeroOrMore);
> +cl::opt<int>  CFGBlock("cfgblock", cl::Hidden, cl::desc(""), cl::init(-1),
> +                      cl::ZeroOrMore);
>
> Are these options leftovers from debugging? If so, just delete them. If not, please use longer option names, and add help text.
>

Yes.  Deleted.

> +static bool IsConditionalBranch(int Opc) {
> +  return (Opc == Hexagon::JMP_Pred) || (Opc == Hexagon::JMP_PredNot)
> +    || (Opc == Hexagon::JMP_PredPt) || (Opc == Hexagon::JMP_PredNotPt);
> +}
> +
> +
> +static bool IsUnconditionalJump(int Opc) {
> +  return (Opc == Hexagon::JMP);
> +}
>
> This information should already be available from MCInstrDesc. See the isBranch() and isBarrier() methods.
>

Will clean this up when we integrate with target-independent optimizer.

>
> +++ b/lib/Target/Hexagon/HexagonCallingConvLower.cpp
> @@ -0,0 +1,207 @@
> +//===-- llvm/CallingConvLower.cpp - Calling Conventions -------------------===//
>
> The comment is wrong.
>
> It is unfortunate that you had to duplicate this code. How much needs to be changed in the target-independent code for it to be useful?
>

We believe this can be folded into the target-independent code.  Can we 
do this later?

>
> --- /dev/null
> +++ b/lib/Target/Hexagon/HexagonExpandPredSpillCode.cpp
> @@ -0,0 +1,175 @@
> +//===--- HexagonExpandPredSpillCode.cpp - Expand Predicate Spill Code ----===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +
>
> I think that is the only comment in this file. Please explain what this pass does.
>
> It looks like you could just use a TII::expandPostRAPseudo() hook instead?
>

Yes, we believe we can.  Later?

>
> +++ b/lib/Target/Hexagon/HexagonFrameLowering.cpp
>
> +static cl::opt<bool>  DisableDeallocRet("disable-dealloc-ret", cl::Hidden,
> +    cl::desc("Disable Dealloc Retrun"));
>
> Please use some kind of indication that this is a target-specific command line option.

Done.

>
> +    maxCallFrameSize = (maxCallFrameSize + AlignMask)&  ~AlignMask;
>
> Please use RoundUpToAlignment().
>
> +#define ALLOCFRAME_MAX 16384
>
> Use 'const unsigned' for constants instead of preprocessor macros.
>

Done.

> +  if (RetOpcode == Hexagon::TCRETURNtg || RetOpcode == Hexagon::TCRETURNtext)
> +    return true;
> +  else
> +    return false;
>
> a.k.a. "return RetOpcode == Hexagon::TCRETURNtg || RetOpcode == Hexagon::TCRETURNtext"
>

Done.

>
> +  // We can only schedule double loads if we spill contiguous callee-saved regs
> +  // For instance, we cannot scheduled double-word loads if we spill r24,
> +  // r26, and r27.
> +  // Hexagon_TODO: We can try to double-word align odd registers for -O2 and
> +  // above
> +  bool ContiguousRegs = true;
> +
> +  for (unsigned i = 0; i<  CSI.size(); ++i) {
> +    unsigned Reg = CSI[i].getReg();
> +
> +    //
> +    // Check if we can use a double-word store
> +    //
> +    const unsigned* SuperReg = TRI->getSuperRegisters(Reg);
>
> If double-word loads and stores are always better, you can change your getCalleeSavedRegs() hook to return D-registers instead.  The register allocator understands the aliasing.
>

Double loads and stores are not always better.  We just want to use them 
if we are spilling contiguous registers that can be treated as a pair.

> +void HexagonFrameLowering::
> +processFunctionBeforeFrameFinalized(MachineFunction&MF) const {
> +  // do nothing
> +}
> +
> +void HexagonFrameLowering::
> +processFunctionBeforeCalleeSavedScan(MachineFunction&MF,
> +                                     RegScavenger* RS) const {
> +  // do nothing
> +}
>
> Just delete these empty overrides.
>

Done.

> +++ b/lib/Target/Hexagon/HexagonFrameLowering.h
> +#include "HexagonSubtarget.h"
>
> You have both and include and a forward decl of HexagonSubtarget.
>

Removed forward decl.

>
> +++ b/lib/Target/Hexagon/HexagonHardwareLoops.cpp
>
> +    void print(raw_ostream&OS, const TargetMachine *TM = 0) const {
> +      if (isReg()) { OS<<  "%reg"<<  getReg(); }
>
> Better pretty-printing of registers: OS<<  PrintReg(getReg());
>

Done.

>
> +    BuildMI(*Preheader, InsertPos, InsertPos->getDebugLoc(),
> +            TII->get(TargetOpcode::COPY), CountReg).addReg(TripCount->getReg());
> +    if (TripCount->isNeg()) {
> +      unsigned CountReg1 = CountReg;
> +      CountReg = MF->getRegInfo().createVirtualRegister(RC);
> +      BuildMI(*Preheader, InsertPos, InsertPos->getDebugLoc(),
> +              TII->get(Hexagon::NEG), CountReg).addReg(CountReg1);
> +    }
>
> This doesn't look right.  You never define CountReg1 anywhere. The machine code verifier should tell you.
>

CountReg1 is defined.  See it's declaration above under the if statement.

>
> +++ b/lib/Target/Hexagon/HexagonISelDAGToDAG.cpp
>
> +static bool FitsS11_0(int64_t v) {
> +  return (v<= 1023&&  v>= -1024);
> +}
>
> Again, please use the MathExtras.h functions for this.

Done.

>
> +  }
> +  else if (LD->getValueType(0) == MVT::i64&&
> +           LD->getExtensionType() == ISD::SEXTLOAD) {
> +    return SelectIndexedLoadSignExtend64(LD, Opcode, dl);
> +  }
> +  else { // Handle normal loads
>
> Two things: Please put else on the same line as the closing brace, and don't use else after return/continue.  This is happening in many places.
>

I scanned the code and believe I have fixed all of these occurrences. 
Please let me know if I missed any.

>
> +++ b/lib/Target/Hexagon/HexagonISelLowering.cpp
>
> +#define Hexagon_MAX_RET_SIZE 64
>
> Please use C++ constants instead of macros.
>

Done.

> +cl::opt<bool>
> +ExpandBuiltinMemcpy("expand-memcpy", cl::init(true), cl::Hidden,
> +                    cl::desc("Enable memcpy/memset builtin expansion"));
> +
> +cl::opt<bool>
> +EmitJumpTables("jump-tables", cl::init(true), cl::Hidden,
> +                    cl::desc("Emit jump tables"));
>
> Please rename these options, and make them static.
>

Done.

>
> +  if (Flag.getNode())
> +  return DAG.getNode(HexagonISD::RET_FLAG, dl, MVT::Other, Chain, Flag);
>
> Indentation.
>

Done.

> +    //setOperationAction(ISD::DBG_STOPPOINT, MVT::Other, Expand);
> +    //setOperationAction(ISD::DEBUG_LOC, MVT::Other, Expand);
> +    //setOperationAction(ISD::DBG_LABEL, MVT::Other, Expand);
> +    //setOperationAction(ISD::DECLARE, MVT::Other, Expand);
>
> Please delete commented-out code. There is a lot of it.
>

Done.  I hope I found all of it.

> +++ b/lib/Target/Hexagon/HexagonInstrInfo.cpp
>
> +bool HexagonInstrInfo::isMoveInstr(const MachineInstr&MI,
>
> This hook is obsolete and not called from anywhere.  Make sure you always generate target-independent COPY instructions before register allocation instead.
>

Deleted.

>
> +void HexagonInstrInfo::copyPhysReg(MachineBasicBlock&MBB,
> +                                 MachineBasicBlock::iterator I, DebugLoc DL,
> +                                 unsigned DestReg, unsigned SrcReg,
> +                                 bool KillSrc) const {
> +  // Hexagon_TODO: add code
>
> It seems you already did...

Deleted comment.

>
> +  if (Hexagon::IntRegsRegClass.contains(SrcReg)&&
> +      Hexagon::IntRegsRegClass.contains(DestReg)) {
>
> You can simply say contains(SrcReg, DestReg) here.

Done.

>
> +    BuildMI(MBB, I, DL, get(Hexagon::TFR), DestReg).addReg(SrcReg);
> +    return;
> +  }
> +  else if (Hexagon::DoubleRegsRegClass.contains(SrcReg)&&
>
> ... and don't else after return
>

Done.

>
> +  else if (Hexagon::DoubleRegsRegClass.contains(DestReg)&&
> +           Hexagon::IntRegsRegClass.contains(SrcReg)) {
> +    // We can have an overlap between single and double reg: r1:0 = r0
>
> This looks odd.  How can you copy between registers of different sizes?
>

I looked into this, and it appears that this is a situation in which a 
symptom was fixed rather than the root cause.  It appears that zero 
extends are being lowered incorrectly and producing these copies.  This 
code is converting them to what they should have been lowered to in the 
first place.  Can we fix this after committing?

>
> +  if (RC == Hexagon::IntRegsRegisterClass) {
> +    BuildMI(MBB, I, DL, get(Hexagon::STriw))
> +          .addFrameIndex(FI).addImm(0)
> +          .addReg(SrcReg, getKillRegState(isKill)).addMemOperand(MMO);
> +  }
>
> Here, I recommend you use Hexagon::IntRegsRegisterClass->hasSubClassEq(RC).  It has been a common bug in other targets when new sub-classes are added.
>

Done.

> +MachineInstr *HexagonInstrInfo::foldMemoryOperandImpl(MachineFunction&MF,
> +                                                    MachineInstr* MI,
> +                                          const SmallVectorImpl<unsigned>  &Ops,
> +                                                    int FI) const {
> +  // Hexagon_TODO: Implement
> +  return(0);
> +}
>
> You probably don't need to implement this hook on a load/store architecture.
>

We actually have some arithmetic operations that take memory operands.

>
> +unsigned HexagonInstrInfo::createVR(MachineFunction* MF, MVT VT) const {
>
> This function is unused?
>

Deleted.

>
> +bool HexagonInstrInfo::
> +isS12_Immediate(const int value) const {
> +  // predicate - True if the immediate fits is a 12-bit signed value
> +  return (value<= 2047&&  value>= -2048);
> +}
>
> These functions come in many copies... ;) I think one is enough.
>

Ha!  Agreed. Done.

>
> +bool HexagonInstrInfo::isPredicable(MachineInstr *MI) const {
> +  switch(Opc) {
> +  case Hexagon::TFRI:
> +    return isS12_Immediate(MI->getOperand(1).getImm());
>
> Can you make use of TargetFlags to avoid these many opcode switches?
>
>
> +bool HexagonInstrInfo::isPredicated(const MachineInstr *MI) const {
> +  switch (MI->getOpcode()) {
>
> Again, use MCInstrDesc or TargetFlags to avoid these large switches.
>

Yes, there needs to be a clean up of this code and better use of 
TargetFlags all around.  Can we do this after committing?

>
> +bool HexagonInstrInfo::
> +isValidOffset(const int Opcode, const int Offset) const {
> +
> +  // Hexagon_TODO: Ugh! hardcoding. Is there a target specific API that can be
> +  // used?
> +#define Hexagon_MEMW_OFFSET_MAX  4095
> +#define Hexagon_MEMW_OFFSET_MIN -4096
>
> I agree.  You really need to clean up your literal predicates. They are literally all over the place.
>

I've moved these to the top of the file and made them C++ constants for 
now.  I would like to do further clean up on it later.

>
> +  if (VT == MVT::i64) {
> +      return (Offset>= Hexagon_MEMD_AUTOINC_MIN&&
> +              Offset<= Hexagon_MEMD_AUTOINC_MAX&&
> +              (Offset&  0x7) == 0);
>
> You have many checks like this. I think it would be fine to add isShiftedUInt<N,S>  templates to MathExtras.h.
>

Good suggestion.  I've added the template and all our code now uses it.

>
> +bool HexagonInstrInfo::
> +isMemOp(const MachineInstr *MI) const {
> +  switch (MI->getOpcode())
> +  {
>
> Can you use MCInstrDesc::mayLoad / mayStore?
>

Yes, we can but all of our instructions that need it don't currently 
have the appropriate flag set on them.  Will fix this when I clean up 
all of the switch statements.

>
> +++ b/lib/Target/Hexagon/HexagonInstrInfo.h
>
> +/// SPII - This namespace holds all of the target specific flags that
> +/// instruction info tracks.
> +///
> +namespace HexagonII {
> +  enum {
> +    Pseudo = (1<<0),
> +    Load = (1<<1),
> +    Store = (1<<2),
> +    DelaySlot = (1<<3)
> +  };
>
> There is already target-independent flags for all of these in MCInstrDesc.
>

As above.

>
> +++ b/lib/Target/Hexagon/HexagonInstrInfo.td
>
> I like the straightforward instruction definitions in this file.
>
> Could you place all the call-related instructions together. I think one escaped.
>

I think I've got them all rounded up now. :)

>
> +++ b/lib/Target/Hexagon/HexagonIntrinsicsDerived.td
>
> One file for a single pattern??
>

This particular pattern is problematic.  It actually matches an 
intrinsic.  It should go into HexagonInstrInfo.td, but it doesn't have 
access to the intrinsic definitions there.  We could add it to the 
intrinsic td file, but that doesn't seem quite right either.  We chose 
to put it in a separate file and include it after the intrinsic td 
files.  We are certainly open to suggestions if this is too distasteful.

>
> +++ b/lib/Target/Hexagon/HexagonOptimizeSZExtends.cpp
> @@ -0,0 +1,129 @@
> +//===-- HexagonOptimizeSZExtends.cpp - Identify and remove sign and -------===//
> +//===--                                zero extends.                -------===//
>
> Can this pass do something that lib/CodeGen/PeepholeOptimizer.cpp can't?
>

No, probably not. :)  Can we fold it in later?

>
> +++ b/lib/Target/Hexagon/HexagonRegisterInfo.cpp
>
> +cl::opt<std::string>  AllocFn("alloc-fn", cl::Hidden,
> +                        cl::desc(""));
>
> !
>
>
> +const unsigned* HexagonRegisterInfo::getCalleeSavedRegs(const MachineFunction
> +                                                        *MF)
> +  const {
>
> Weird whitespace. Use:
>
> const unsigned* HexagonRegisterInfo::
> getCalleeSavedRegs(const MachineFunction *MF) const {
>

Yeah. :)  Done.

>
> +++ b/lib/Target/Hexagon/HexagonRegisterInfo.h
>
> +#define HEXAGON_RESERVED_REG_1 Hexagon::R10
> +#define HEXAGON_RESERVED_REG_2 Hexagon::R11
>
> Hmm. Does that have to go in a header file? Have you looked into using the register scavenger instead of reserving registers?
>

We are actually currently working on utilizing the register scavenger 
for this instead.  Can we submit that later when it is done?

>
> +++ b/lib/Target/Hexagon/HexagonRegisterInfo.td
>
> +// FIXME: the register order should be defined in terms of the preferred
> +// allocation order...
> +//
> +def IntRegs : RegisterClass<"Hexagon", [i32], 32, (add (sequence "R%u", 0, 9),
> +                                                       (sequence "R%u", 12, 28),
> +                                                        R10, R11, R29, R30,
> +                                                        R31)>
>
> You don't need to shuffle registers here just to put the callee-saved registers last in the allocation order. The register allocator does that automatically.
>

Those are actually reserved registers.  R10 and 11 are discussed above. 
  29 is SP, 30 is FP, 31 is LR.

>
> +// Hexagon_TODO: Do we really need to specify different classes for
> +// FP and Int registers?
>
> No, you don't.  Just specify an [i32, f32] type vector.
>
> +// Will this confuse register allocation?
>
> No, it will work, but you aren't gaining anything.  You can use multiple identical register classes if the types have different spill sizes, see the FR32, FR64, and VR128 register classes in the x86 target.
>

Removed.

>
> +// Subregister definitions. Adapted from X86RegisterInfo.td
> +//
> +//def Hexagon_subreg_loreg  :   PatLeaf<(i32 1)>;
> +//def Hexagon_subreg_hireg   :   PatLeaf<(i32 2)>;
> +//
> +//def : SubRegSet<1, [D0,  D1,  D2,  D3, D4, D5, D6, D7,
> +//                  D8,  D9,  D10, D11, D12, D13, D14, D15],
> +//                   [R0,  R2,  R4,  R6, R8, R10,R12, R14,
> +//                  R16, R18, R20, R22, R24, R26, R28, R30]>;
> +//
> +//
> +//def : SubRegSet<2, [D0,  D1,  D2,  D3, D4, D5, D6, D7,
> +//                  D8,  D9,  D10, D11, D12, D13, D14, D15],
> +//                   [R1,  R3,  R5,  R7, R9, R11, R13, R15,
> +//                  R17, R19, R21, R23, R25, R27, R29, R31]>;
>
> This is obsolete. Just delete it.
>

Done.
>
> +++ b/lib/Target/Hexagon/HexagonRemoveSZExtArgs.cpp
> @@ -0,0 +1,86 @@
> +//===-- HexagonRemoveExtendArgs.cpp - Remove unecessary sign extends ------===//
>
> This comment is wrong.  There is already a HexagonOptimizeSZExtends pass.  Are both needed?
>

This one is driven by our calling convention.  Inside the callee, it is 
free to assume that the caller has zero/sign extended arguments.  It 
should probably be folded into the other code, though, and we will do so 
when we incorporate it into the target-independent optimizer.

>
> +++ b/lib/Target/Hexagon/HexagonSplitTFRCondSets.cpp
> @@ -0,0 +1,119 @@
> +//===---- HexagonSplitTFRCondSets.cpp - split TFR condsets into xfers -----===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
>
> Missing non-trivial comments.
>

Comment describing transformation added.

Thanks so much for your comments, Jakob.

Tony


-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: hexagon.patch
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20111209/69eb9305/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: llvm.patch
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20111209/69eb9305/attachment-0001.ksh>


More information about the llvm-commits mailing list