[llvm-dev] Extending Register Rematerialization
Hal Finkel via llvm-dev
llvm-dev at lists.llvm.org
Fri Dec 2 12:36:26 PST 2016
----- Original Message -----
> From: "Gerolf Hoflehner" <ghoflehner at apple.com>
> To: "Nirav Rana" <nirav076 at gmail.com>
> Cc: hfinkel at anl.gov, llvm-dev at lists.llvm.org, "Pandya Vivek"
> <h2015078 at pilani.bits-pilani.ac.in>,
> h2015089 at pilani.bits-pilani.ac.in, h2015172 at pilani.bits-pilani.ac.in
> Sent: Thursday, December 1, 2016 6:14:06 PM
> Subject: Re: [llvm-dev] Extending Register Rematerialization
> On which targets & apps/benchmarks do you expect a speed-up? In
> practice I expect spills/fills to be hard to beat by longer remat
> sequences.
Why?
Perhaps it depends on how you define "longer." A larger out-of-order (OOO) core with multiple pipelines can often execute a materialization sequence consisting of several instructions faster than it can get data from the L1 cache. If the code is already putting pressure on the load/store units, then the extra spill/restore code can be noticeably worse.
Note the following from AArch64InstrInfo.td:
let isReMaterializable = 1, isCodeGenOnly = 1, isMoveImm = 1,
    isAsCheapAsAMove = 1 in {
// FIXME: The following pseudo instructions are only needed because remat
// cannot handle multiple instructions. When that changes, we can select
// directly to the real instructions and get rid of these pseudos.
def MOVi32imm
    : Pseudo<(outs GPR32:$dst), (ins i32imm:$src),
             [(set GPR32:$dst, imm:$src)]>,
      Sched<[WriteImm]>;
def MOVi64imm
    : Pseudo<(outs GPR64:$dst), (ins i64imm:$src),
             [(set GPR64:$dst, imm:$src)]>,
      Sched<[WriteImm]>;
} // isReMaterializable, isCodeGenOnly
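
To make the "multiple instructions" in that FIXME concrete, here is a rough sketch of how many real instructions one of these pseudos can stand for. It is not the actual AArch64 expansion code: countMovzMovk is a hypothetical helper, and it assumes only MOVZ/MOVK are used, ignoring the MOVN and ORR-with-logical-immediate forms that can be shorter.

```cpp
#include <cstdint>

// Simplified count of MOVZ/MOVK instructions needed to build a 64-bit
// immediate: one per non-zero 16-bit chunk, with a minimum of one
// (a single MOVZ #0 for the value zero).
// Assumption: MOVN and ORR (logical immediate) encodings are ignored.
unsigned countMovzMovk(uint64_t Imm) {
  unsigned Count = 0;
  for (unsigned Shift = 0; Shift < 64; Shift += 16)
    if ((Imm >> Shift) & 0xFFFFu)
      ++Count;
  return Count == 0 ? 1 : Count;
}
```

Under this simplification, a 64-bit constant needs between one and four instructions, which is the kind of short sequence a multi-instruction remat would have to re-emit.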
Also, Ivan Baev's talk at the developers' meeting a couple of years ago gives some good hints about where this might matter: http://llvm.org/devmtg/2014-10/#talk20
Thanks again,
Hal
> Thanks
> Gerolf
> > On Nov 27, 2016, at 12:37 PM, Nirav Rana via llvm-dev <
> > llvm-dev at lists.llvm.org > wrote:
>
> > Hello LLVM Developers,
>
> > We are working on extending the currently available register
> > rematerialization to include cases where a sequence of multiple
> > instructions is required to rematerialize a value.
>
> > We had a discussion on this on the community mailing list; the
> > link is here:
>
> > http://lists.llvm.org/pipermail/llvm-dev/2016-September/subject.html#104777
>
> > From the above discussion and from studying the code, we believe
> > that the extension can be implemented in the same flow as the
> > current remat. What we understood is that RegAlloc<>.cpp will try
> > to allocate a register to a live range and, if that is not possible,
> > will call InlineSpiller.cpp to spill the live range.
> > InlineSpiller.cpp will first try to rematerialize the register
> > value, if possible, with the help of LiveRangeEdit.cpp, which
> > provides various methods for checking whether a value is
> > rematerializable.
>
> > So we have added a new function in LiveRangeEdit that recursively
> > traverses the sequence of instructions in the use-def chain (instead
> > of considering only the current instruction) up to depth 6
> > (arbitrarily chosen for testing) to check whether the value can be
> > rematerialized with that sequence of instructions.
>
> > Here is the code:
>
> > // New function added for checking complex multi-instruction-sequence
> > // rematerializable
> > bool LiveRangeEdit::checkComplexRematerializable(VNInfo *VNI,
> >                                                  const MachineInstr *DefMI,
> >                                                  unsigned int depth,
> >                                                  AliasAnalysis *aa) {
> >   // Bail out if the early check says DefMI is definitely not rematable.
> >   if (!TII.isReMaterializablePossible(*DefMI, aa))
> >     return false;
> >   DEBUG(dbgs() << " ComplexRemat MI: " << *DefMI);
> >   for (unsigned i = 0, e = DefMI->getNumOperands(); i != e; ++i) {
> >     const MachineOperand &MO = DefMI->getOperand(i);
> >     if (!MO.isReg() || !MO.getReg() || !MO.readsReg())
> >       continue;
> >     if (TargetRegisterInfo::isPhysicalRegister(MO.getReg())) {
> >       if (MRI.isConstantPhysReg(MO.getReg(),
> >                                 *DefMI->getParent()->getParent()))
> >         continue;
> >       // If not constant then check its def
> >       if (depth > 6)
> >         return false;
> >       LiveInterval &li = LIS.getInterval(MO.getReg());
> >       SlotIndex UseIdx = LIS.getInstructionIndex(*DefMI);
> >       VNInfo *UseVNInfo = li.getVNInfoAt(UseIdx);
> >       MachineInstr *NewDefMI =
> >           LIS.getInstructionFromIndex(UseVNInfo->def);
> >       if (!checkComplexRematerializable(UseVNInfo, NewDefMI, depth + 1, aa))
> >         return false;
> >     }
> >   }
> >   Remattable.insert(VNI); // May have to add new data structure
> >   return true;
> > }
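
For what it's worth, the depth-bounded use-def walk described above can be sketched outside of LLVM on a toy IR. Everything here is hypothetical (ToyInstr, collectRematSequence, the SafeToDuplicate flag standing in for the early check); it only illustrates the recursion and one possible way to record the sequence, and it ignores whether the operands are still available at the point of use.

```cpp
#include <map>
#include <vector>

// Hypothetical toy IR: each register has exactly one defining instruction.
struct ToyInstr {
  bool SafeToDuplicate;      // stands in for the early "possible" check
  std::vector<int> UsedRegs; // registers read by this instruction
};

// Walks the use-def chain of Reg depth-first. On success, appends the
// defining registers to Seq in dependency order (operands before users).
// Seq is only meaningful when true is returned, and a register reached
// along two paths is recorded twice in this simple version.
bool collectRematSequence(const std::map<int, ToyInstr> &Defs, int Reg,
                          unsigned Depth, unsigned MaxDepth,
                          std::vector<int> &Seq) {
  if (Depth > MaxDepth)
    return false; // chain too long, give up and spill instead
  auto It = Defs.find(Reg);
  if (It == Defs.end() || !It->second.SafeToDuplicate)
    return false; // unknown def or unsafe instruction
  for (int Op : It->second.UsedRegs)
    if (!collectRematSequence(Defs, Op, Depth + 1, MaxDepth, Seq))
      return false;
  Seq.push_back(Reg);
  return true;
}
```

Returning the sequence in dependency order means the transformation could re-emit it front to back at the use point.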
>
> > In the above function we call a new function,
> > TII.isReMaterializablePossible(*DefMI, aa), which acts as an early
> > heuristic: it returns false if the instruction is definitely not
> > rematerializable. We took some of the checks from
> > TargetInstrInfo::isReallyTriviallyReMaterializableGeneric; the code
> > is here:
>
> > bool TargetInstrInfo::isReMaterializablePossible(
> >     const MachineInstr &MI, AliasAnalysis *AA) const {
> >   const MachineFunction &MF = *MI.getParent()->getParent();
> >   const MachineRegisterInfo &MRI = MF.getRegInfo();
> >   // Remat clients assume operand 0 is the defined register.
> >   if (!MI.getNumOperands() || !MI.getOperand(0).isReg())
> >     return false;
> >   unsigned DefReg = MI.getOperand(0).getReg();
> >   // A sub-register definition can only be rematerialized if the
> >   // instruction doesn't read the other parts of the register.
> >   // Otherwise it is really a read-modify-write operation on the full
> >   // virtual register which cannot be moved safely.
> >   if (TargetRegisterInfo::isVirtualRegister(DefReg) &&
> >       MI.getOperand(0).getSubReg() && MI.readsVirtualRegister(DefReg))
> >     return false;
> >   // Avoid instructions obviously unsafe for remat.
> >   if (MI.isNotDuplicable() || MI.mayStore() ||
> >       MI.hasUnmodeledSideEffects())
> >     return false;
> >   // Don't remat inline asm. We have no idea how expensive it is
> >   // even if it's side effect free.
> >   if (MI.isInlineAsm())
> >     return false;
>
> >   // None of the early-out checks fired, so remat may be possible.
> >   return true;
> > }
>
> > We have the following questions and would appreciate guidance and
> > suggestions on how to move ahead:
>
> > 1. Is the approach we are following feasible?
>
> > 2. What would be a suitable way to store the sequence of
> > instructions for recomputing the value, so that it can be used
> > during the transformation?
>
> > 3. How should we decide the termination condition for walking the
> > use-def chain? It should stop once remat becomes costlier than a
> > spill.
>
> > 4. What other cases or instructions could be included in the
> > isReMaterializablePossible() function? Suggestions for directions
> > to look in would be welcome.
>
> > Any other suggestions will also help us move in the right
> > direction.
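
On question 3, one possible shape for the termination condition is a running cost comparison rather than a fixed depth. The sketch below is only an illustration: rematCheaperThanReload is a hypothetical helper, and the latencies are caller-supplied assumptions (for example, taken from a scheduling model), not measured values.

```cpp
#include <vector>

// Hypothetical cost check: remat is considered worthwhile only while the
// summed latency of the sequence stays below the estimated reload cost.
// Assumption: latencies are added serially, which is pessimistic for an
// out-of-order core that can overlap independent instructions.
bool rematCheaperThanReload(const std::vector<unsigned> &InstrLatencies,
                            unsigned ReloadLatency) {
  unsigned Total = 0;
  for (unsigned L : InstrLatencies) {
    Total += L;
    if (Total >= ReloadLatency)
      return false; // stop early: already as costly as a reload
  }
  return true;
}
```

The same accumulation could be done inside the use-def walk itself, so that the traversal stops as soon as the budget is exhausted.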
>
> > - Nirav
>
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory