[llvm] r174696 - When Mips16 frames grow large, the immediate field may exceed the maximum

Fri Feb 8 08:19:03 PST 2013

----- Original Message -----
> From: "Reed Kotler" <rkotler at mips.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: llvm-commits at cs.uiuc.edu
> Sent: Friday, February 8, 2013 6:43:33 AM
> Subject: Re: [llvm] r174696 - When Mips16 frames grow large, the immediate field may exceed the maximum
> 
> On 02/07/2013 10:46 PM, Hal Finkel wrote:
> > ----- Original Message -----
> >> From: "Reed Kotler" <rkotler at mips.com>
> >> To: llvm-commits at cs.uiuc.edu
> >> Sent: Thursday, February 7, 2013 9:57:41 PM
> >> Subject: [llvm] r174696 - When Mips16 frames grow large,	the
> >> immediate field may exceed the maximum
> >>
> >> Author: rkotler
> >> Date: Thu Feb  7 21:57:41 2013
> >> New Revision: 174696
> >>
> >> URL: http://llvm.org/viewvc/llvm-project?rev=174696&view=rev
> >> Log:
> >> When Mips16 frames grow large, the immediate field may exceed the
> >> maximum allowed size for the instruction. This code uses RegScavenger
> >> to fix this.
> >> We sometimes need 2 registers for Mips16, so we must handle things
> >> differently from how the register scavenger is normally used.
> > Normally the reason that you can't get more than one register from
> > the scavenger is that there is only one emergency spill slot. In
> > your case, you're getting around this problem by not using the
> > spill slot at all and simply "spilling" to other reserved
> > registers?
> >
> >   -Hal
> Exactly. Mips16 is a processor mode that uses a different instruction
> decoder in order to save space.
> The full set of Mips32 (or Mips64) registers is available, but only in a
> limited way when in Mips16 mode.
> Only 8 registers can be accessed as general registers, so there are many
> additional registers that would be free if the processor were in Mips32
> mode.
> 
> At the present time, I use these additional registers in an ad hoc way,
> as in this situation. In the future I hope to use them to avoid spills
> to memory in a more general way.
> 
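
The fallback described above (take a free Mips16 register if there is one,
otherwise park a live one in a non-Mips16 register) can be sketched outside
of LLVM as follows. This is only an illustration: the Reg enum, the Scratch
struct and pickScratch are hypothetical stand-ins, not LLVM types, and the
liveness set is supplied by hand rather than by RegScavenger.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical register names: the eight Mips16 general registers first,
// followed by two registers that Mips16 code cannot address directly.
enum Reg { V0, V1, A0, A1, A2, A3, S0, S1, T0, T1 };
constexpr std::size_t kNumMips16Regs = 8;

struct Scratch {
  Reg reg;                     // Mips16 register to use as the temporary
  std::optional<Reg> savedTo;  // where its old value was parked, if anywhere
};

// live[i] is true when Mips16 register i holds a value that is still needed.
// If no register is free, fall back to `fallback` and report that its
// current value has to be copied to `park` first (and restored afterwards).
inline Scratch pickScratch(const std::bitset<kNumMips16Regs> &live,
                           Reg fallback, Reg park) {
  for (std::size_t i = 0; i < kNumMips16Regs; ++i)
    if (!live[i])
      return {static_cast<Reg>(i), std::nullopt};
  return {fallback, park};
}
```

With every Mips16 register live, pickScratch(live, V0, T0) selects V0 and
asks for it to be saved in T0, which matches the first-register case in the
patch; no emergency stack slot is involved.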
> Interestingly enough, SP is not a Mips16 register, but there is a set of
> instructions that use it implicitly. I cannot, for example, directly add
> SP and another Mips16 register together. This makes SP complicated to use
> when you are referencing elements on the stack and have to deal with the
> more general case in which local stack offsets can be large. Gcc decides
> not to deal with any of that complexity and just copies SP to a Mips16
> register whenever it needs to do anything with the stack, such as
> referencing local stack data. This seems way too wasteful to me since
> there are so few registers, so I have gone the extra mile to deal with
> all the extra SP-induced complexities.
> 
> T8 is another non-Mips16 register, used implicitly in many conditional
> test and branch instructions. I use it mostly as if it were a condition
> code register and try to avoid copying it back to a Mips16 register. I
> had to write a lot of long patterns for things like conditional moves so
> that sets of instructions that implicitly use T8 can be strung together.
> 
> With good optimization, you should be able to save 40% on the memory
> footprint by using Mips16 mode as opposed to Mips32 mode.
> 
> You can more or less freely switch back and forth between Mips16 and
> Mips32 mode.
> 
> Micromips is another, newer encoding that offers space savings too, but
> without any of these Mips16 complexities, because it is essentially
> identical to Mips32 mode at the .s level. With Micromips, however, if you
> generate code without worrying about the differences, you will probably
> get little compression because it will be using the 32-bit variants much
> of the time.
> 
> Both Mips16 and Micromips have 16-bit and 32-bit instructions.

Thanks for the explanation!

 -Hal

> 
> 
> 
> >>
> >> Added:
> >>      llvm/trunk/test/CodeGen/Mips/largefr1.ll
> >> Modified:
> >>      llvm/trunk/lib/Target/Mips/Mips16InstrInfo.cpp
> >>      llvm/trunk/lib/Target/Mips/Mips16InstrInfo.h
> >>      llvm/trunk/lib/Target/Mips/Mips16RegisterInfo.cpp
> >>
> >> Modified: llvm/trunk/lib/Target/Mips/Mips16InstrInfo.cpp
> >> URL:
> >> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/Mips16InstrInfo.cpp?rev=174696&r1=174695&r2=174696&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/lib/Target/Mips/Mips16InstrInfo.cpp (original)
> >> +++ llvm/trunk/lib/Target/Mips/Mips16InstrInfo.cpp Thu Feb  7
> >> 21:57:41 2013
> >> @@ -19,6 +19,7 @@
> >>   #include "llvm/ADT/StringRef.h"
> >>   #include "llvm/CodeGen/MachineInstrBuilder.h"
> >>   #include "llvm/CodeGen/MachineRegisterInfo.h"
> >> +#include "llvm/CodeGen/RegisterScavenging.h"
> >>   #include "llvm/Support/CommandLine.h"
> >>   #include "llvm/Support/ErrorHandling.h"
> >>   #include "llvm/Support/TargetRegistry.h"
> >> @@ -306,11 +307,79 @@ void Mips16InstrInfo::adjustStackPtr(uns
> >>   /// This function generates the sequence of instructions needed to get the
> >>   /// result of adding register REG and immediate IMM.
> >>   unsigned
> >> -Mips16InstrInfo::loadImmediate(int64_t Imm, MachineBasicBlock &MBB,
> >> +Mips16InstrInfo::loadImmediate(unsigned FrameReg,
> >> +                               int64_t Imm, MachineBasicBlock &MBB,
> >>                                  MachineBasicBlock::iterator II, DebugLoc DL,
> >> -                               unsigned *NewImm) const {
> >> +                               unsigned &NewImm) const {
> >> +  //
> >> +  // given original instruction is:
> >> +  // Instr rx, T[offset] where offset is too big.
> >> +  //
> >> +  // lo = offset & 0xFFFF
> >> +  // hi = ((offset >> 16) + (lo >> 15)) & 0xFFFF;
> >> +  //
> >> +  // let T = temporary register
> >> +  // li T, hi
> >> +  // shl T, 16
> >> +  // add T, Rx, T
> >> +  //
> >> +  RegScavenger rs;
> >> +  int32_t lo = Imm & 0xFFFF;
> >> +  int32_t hi = ((Imm >> 16) + (lo >> 15)) & 0xFFFF;
> >> +  NewImm = lo;
> >> +  unsigned Reg =0;
> >> +  unsigned SpReg = 0;
> >> +  rs.enterBasicBlock(&MBB);
> >> +  rs.forward(II);
> >> +  //
> >> +  // we use T0 for the first register, if we need to save something away.
> >> +  // we use T1 for the second register, if we need to save something away.
> >> +  //
> >> +  unsigned FirstRegSaved =0, SecondRegSaved=0;
> >> +  unsigned FirstRegSavedTo = 0, SecondRegSavedTo = 0;
> >>   
> >> -  return 0;
> >> +  Reg = rs.FindUnusedReg(&Mips::CPU16RegsRegClass);
> >> +  if (Reg == 0) {
> >> +    FirstRegSaved = Reg = Mips::V0;
> >> +    FirstRegSavedTo = Mips::T0;
> >> +    copyPhysReg(MBB, II, DL, FirstRegSavedTo, FirstRegSaved, true);
> >> +  }
> >> +  else
> >> +    rs.setUsed(Reg);
> >> +  BuildMI(MBB, II, DL, get(Mips::LiRxImmX16), Reg).addImm(hi);
> >> +  BuildMI(MBB, II, DL, get(Mips::SllX16), Reg).addReg(Reg).
> >> +    addImm(16);
> >> +  if (FrameReg == Mips::SP) {
> >> +    SpReg = rs.FindUnusedReg(&Mips::CPU16RegsRegClass);
> >> +    if (SpReg == 0) {
> >> +      if (Reg != Mips::V1) {
> >> +        SecondRegSaved = SpReg = Mips::V1;
> >> +        SecondRegSavedTo = Mips::T1;
> >> +      }
> >> +      else {
> >> +        SecondRegSaved = SpReg = Mips::V0;
> >> +        SecondRegSavedTo = Mips::T0;
> >> +      }
> >> +      copyPhysReg(MBB, II, DL, SecondRegSavedTo, SecondRegSaved, true);
> >> +    }
> >> +    else
> >> +      rs.setUsed(SpReg);
> >> +
> >> +    copyPhysReg(MBB, II, DL, SpReg, Mips::SP, false);
> >> +    BuildMI(MBB, II, DL, get(Mips::  AdduRxRyRz16), Reg).addReg(SpReg)
> >> +      .addReg(Reg);
> >> +  }
> >> +  else
> >> +    BuildMI(MBB, II, DL, get(Mips::  AdduRxRyRz16), Reg).addReg(FrameReg)
> >> +      .addReg(Reg, RegState::Kill);
> >> +  if (FirstRegSaved || SecondRegSaved) {
> >> +    II = llvm::next(II);
> >> +    if (FirstRegSaved)
> >> +      copyPhysReg(MBB, II, DL, FirstRegSaved, FirstRegSavedTo, true);
> >> +    if (SecondRegSaved)
> >> +      copyPhysReg(MBB, II, DL, SecondRegSaved, SecondRegSavedTo, true);
> >> +  }
> >> +  return Reg;
> >>   }
> >>   
> >>   unsigned Mips16InstrInfo::GetAnalyzableBrOpc(unsigned Opc) const
> >>   {
> >>
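
The lo/hi arithmetic in loadImmediate above can be checked in isolation. The
sketch below is not LLVM code: SplitOffset, splitOffset, signExtend16 and
recombine are names made up for this illustration, and signExtend16 only
stands in for what SignExtend64<16> does to NewImm in eliminateFI.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for this illustration only.
struct SplitOffset {
  int32_t hi;  // operand of "li T, hi", which is then shifted left by 16
  int32_t lo;  // low 16 bits, left in the instruction's immediate field
};

// Same arithmetic as the patch: when bit 15 of lo is set, the 16-bit
// immediate will be sign-extended to a negative value, so hi is bumped
// by one to compensate.
inline SplitOffset splitOffset(int64_t imm) {
  int32_t lo = static_cast<int32_t>(imm & 0xFFFF);
  int32_t hi = static_cast<int32_t>(((imm >> 16) + (lo >> 15)) & 0xFFFF);
  return {hi, lo};
}

// Sign-extend the low 16 bits of x (x is expected to be in 0..0xFFFF).
inline int64_t signExtend16(int32_t x) {
  return static_cast<int64_t>(x & 0x7FFF) - static_cast<int64_t>(x & 0x8000);
}

// What the emitted li/sll/addu sequence plus the remaining immediate
// add up to, ignoring the frame register itself.
inline int64_t recombine(SplitOffset s) {
  return (static_cast<int64_t>(s.hi) << 16) + signExtend16(s.lo);
}
```

For an offset of 400016 this gives hi = 6 and lo = 6800, which is consistent
with the "li ..., 6" and "addiu ..., 6800" lines in the test below. For
0x18000 it gives hi = 2 and lo = 0x8000; the immediate then sign-extends to
-32768, and 2 * 65536 - 32768 is again 0x18000.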
> >> Modified: llvm/trunk/lib/Target/Mips/Mips16InstrInfo.h
> >> URL:
> >> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/Mips16InstrInfo.h?rev=174696&r1=174695&r2=174696&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/lib/Target/Mips/Mips16InstrInfo.h (original)
> >> +++ llvm/trunk/lib/Target/Mips/Mips16InstrInfo.h Thu Feb  7
> >> 21:57:41
> >> 2013
> >> @@ -77,12 +77,14 @@ public:
> >>     void adjustStackPtr(unsigned SP, int64_t Amount, MachineBasicBlock &MBB,
> >>                         MachineBasicBlock::iterator I) const;
> >>   
> >> -  /// Emit a series of instructions to load an immediate. If NewImm is a
> >> -  /// non-NULL parameter, the last instruction is not emitted, but instead
> >> -  /// its immediate operand is returned in NewImm.
> >> -  unsigned loadImmediate(int64_t Imm, MachineBasicBlock &MBB,
> >> +  /// Emit a series of instructions to load an immediate.
> >> +  // This is to adjust some FrameReg. We return the new register to be used
> >> +  // in place of FrameReg and the adjusted immediate field (&NewImm)
> >> +  //
> >> +  unsigned loadImmediate(unsigned FrameReg,
> >> +                         int64_t Imm, MachineBasicBlock &MBB,
> >>                            MachineBasicBlock::iterator II, DebugLoc DL,
> >> -                         unsigned *NewImm) const;
> >> +                         unsigned &NewImm) const;
> >>   
> >>   private:
> >>     virtual unsigned GetAnalyzableBrOpc(unsigned Opc) const;
> >>
> >> Modified: llvm/trunk/lib/Target/Mips/Mips16RegisterInfo.cpp
> >> URL:
> >> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/Mips16RegisterInfo.cpp?rev=174696&r1=174695&r2=174696&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/lib/Target/Mips/Mips16RegisterInfo.cpp (original)
> >> +++ llvm/trunk/lib/Target/Mips/Mips16RegisterInfo.cpp Thu Feb  7
> >> 21:57:41 2013
> >> @@ -1,3 +1,4 @@
> >> +
> >>   //===-- Mips16RegisterInfo.cpp - MIPS16 Register Information -==
> >>   ----------===//
> >>   //
> >>   //                     The LLVM Compiler Infrastructure
> >> @@ -12,6 +13,7 @@
> >>   //===----------------------------------------------------------------------===//
> >>   
> >>   #include "Mips16RegisterInfo.h"
> >> +#include "Mips16InstrInfo.h"
> >>   #include "Mips.h"
> >>   #include "Mips16InstrInfo.h"
> >>   #include "MipsAnalyzeImmediate.h"
> >> @@ -23,6 +25,7 @@
> >>   #include "llvm/CodeGen/MachineFrameInfo.h"
> >>   #include "llvm/CodeGen/MachineFunction.h"
> >>   #include "llvm/CodeGen/MachineInstrBuilder.h"
> >> +#include "llvm/CodeGen/MachineRegisterInfo.h"
> >>   #include "llvm/CodeGen/ValueTypes.h"
> >>   #include "llvm/DebugInfo.h"
> >>   #include "llvm/IR/Constants.h"
> >> @@ -140,6 +143,7 @@ void Mips16RegisterInfo::eliminateFI(Mac
> >>     //   by adding the size of the stack:
> >>     //   incoming argument, callee-saved register location or local variable.
> >>     int64_t Offset;
> >> +  bool IsKill = false;
> >>     Offset = SPOffset + (int64_t)StackSize;
> >>     Offset += MI.getOperand(OpNo + 1).getImm();
> >>   
> >> @@ -148,9 +152,14 @@ void Mips16RegisterInfo::eliminateFI(Mac
> >>   
> >>     if (!MI.isDebugValue() && ( ((FrameReg != Mips::SP) && !isInt<16>(Offset)) ||
> >>         ((FrameReg == Mips::SP) && !isInt<15>(Offset)) )) {
> >> -    llvm_unreachable("frame offset does not fit in instruction");
> >> +    MachineBasicBlock &MBB = *MI.getParent();
> >> +    DebugLoc DL = II->getDebugLoc();
> >> +    unsigned NewImm;
> >> +    FrameReg = TII.loadImmediate(FrameReg, Offset, MBB, II, DL, NewImm);
> >> +    Offset = SignExtend64<16>(NewImm);
> >> +    IsKill = true;
> >>     }
> >> -  MI.getOperand(OpNo).ChangeToRegister(FrameReg, false);
> >> +  MI.getOperand(OpNo).ChangeToRegister(FrameReg, false, false, IsKill);
> >>     MI.getOperand(OpNo + 1).ChangeToImmediate(Offset);
> >>   
> >>   
> >>
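
The condition that now triggers the new path in eliminateFI can be read as a
plain range check. A minimal standalone sketch, assuming fitsSigned<N>
behaves like LLVM's isInt<N> (a signed N-bit range test); FrameBase and
needsLargeOffsetSequence are hypothetical names, and the isDebugValue()
exclusion from the real condition is left out.

```cpp
#include <cassert>
#include <cstdint>

// True when x is representable as an N-bit signed integer.
template <unsigned N> constexpr bool fitsSigned(int64_t x) {
  static_assert(N > 0 && N < 64, "width out of range");
  return x >= -(INT64_C(1) << (N - 1)) && x < (INT64_C(1) << (N - 1));
}

// Hypothetical stand-in for "FrameReg == Mips::SP" versus anything else.
enum FrameBase { SP, OtherReg };

// SP-relative accesses get a 15-bit signed range in the patch, other
// frame registers a 16-bit one; outside that range the li/sll/addu
// sequence from loadImmediate is needed.
constexpr bool needsLargeOffsetSequence(FrameBase base, int64_t offset) {
  return base == SP ? !fitsSigned<15>(offset) : !fitsSigned<16>(offset);
}
```

So an SP-relative offset of 16383 still fits directly, while 16384 (and of
course the roughly 400000-byte offsets produced by the test's two arrays)
goes through loadImmediate.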
> >> Added: llvm/trunk/test/CodeGen/Mips/largefr1.ll
> >> URL:
> >> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Mips/largefr1.ll?rev=174696&view=auto
> >> ==============================================================================
> >> --- llvm/trunk/test/CodeGen/Mips/largefr1.ll (added)
> >> +++ llvm/trunk/test/CodeGen/Mips/largefr1.ll Thu Feb  7 21:57:41
> >> 2013
> >> @@ -0,0 +1,61 @@
> >> +; RUN: llc -march=mipsel -mcpu=mips16 -mips16-hard-float -soft-float -relocation-model=static < %s | FileCheck %s -check-prefix=1
> >> +
> >> +@i = common global i32 0, align 4
> >> +@j = common global i32 0, align 4
> >> +@.str = private unnamed_addr constant [8 x i8] c"%i %i \0A\00", align 1
> >> +
> >> +define void @foo(i32* %p, i32 %i, i32 %j) nounwind {
> >> +entry:
> >> +  %p.addr = alloca i32*, align 4
> >> +  %i.addr = alloca i32, align 4
> >> +  %j.addr = alloca i32, align 4
> >> +  store i32* %p, i32** %p.addr, align 4
> >> +  store i32 %i, i32* %i.addr, align 4
> >> +  store i32 %j, i32* %j.addr, align 4
> >> +  %0 = load i32* %j.addr, align 4
> >> +  %1 = load i32** %p.addr, align 4
> >> +  %2 = load i32* %i.addr, align 4
> >> +  %add.ptr = getelementptr inbounds i32* %1, i32 %2
> >> +  store i32 %0, i32* %add.ptr, align 4
> >> +  ret void
> >> +}
> >> +
> >> +define i32 @main() nounwind {
> >> +entry:
> >> +; 1: main:
> >> +; 1: 1: 	.word	-797992
> >> +; 1:            li ${{[0-9]+}}, 12
> >> +; 1:            sll ${{[0-9]+}}, ${{[0-9]+}}, 16
> >> +; 1:            addu ${{[0-9]+}}, ${{[0-9]+}}, ${{[0-9]+}}
> >> +; 2:            move $sp, ${{[0-9]+}}
> >> +; 2:            addu ${{[0-9]+}}, ${{[0-9]+}}, ${{[0-9]+}}
> >> +; 1:            li ${{[0-9]+}}, 6
> >> +; 1:            sll ${{[0-9]+}}, ${{[0-9]+}}, 16
> >> +; 1:            addu ${{[0-9]+}}, ${{[0-9]+}}, ${{[0-9]+}}
> >> +; 2:            move $sp, ${{[0-9]+}}
> >> +; 2:            addu ${{[0-9]+}}, ${{[0-9]+}}, ${{[0-9]+}}
> >> +; 1:          	addiu	${{[0-9]+}}, ${{[0-9]+}}, 6800
> >> +; 1: 	        li	${{[0-9]+}}, 1
> >> +; 1:	        sll	${{[0-9]+}}, ${{[0-9]+}}, 16
> >> +; 2: 	        li	${{[0-9]+}}, 34463
> >> +  %retval = alloca i32, align 4
> >> +  %one = alloca [100000 x i32], align 4
> >> +  %two = alloca [100000 x i32], align 4
> >> +  store i32 0, i32* %retval
> >> +  %arrayidx = getelementptr inbounds [100000 x i32]* %one, i32 0, i32 0
> >> +  call void @foo(i32* %arrayidx, i32 50, i32 9999)
> >> +  %arrayidx1 = getelementptr inbounds [100000 x i32]* %two, i32 0, i32 0
> >> +  call void @foo(i32* %arrayidx1, i32 99999, i32 5555)
> >> +  %arrayidx2 = getelementptr inbounds [100000 x i32]* %one, i32 0, i32 50
> >> +  %0 = load i32* %arrayidx2, align 4
> >> +  store i32 %0, i32* @i, align 4
> >> +  %arrayidx3 = getelementptr inbounds [100000 x i32]* %two, i32 0, i32 99999
> >> +  %1 = load i32* %arrayidx3, align 4
> >> +  store i32 %1, i32* @j, align 4
> >> +  %2 = load i32* @i, align 4
> >> +  %3 = load i32* @j, align 4
> >> +  %call = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([8 x i8]* @.str, i32 0, i32 0), i32 %2, i32 %3)
> >> +  ret i32 0
> >> +}
> >> +
> >> +declare i32 @printf(i8*, ...)
> >>
> >>
> >> _______________________________________________
> >> llvm-commits mailing list
> >> llvm-commits at cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >>
> 
>