[llvm] r176498 - R600: initial scheduler code
Vincent Lejeune
vljn at ovi.com
Tue Mar 5 15:12:15 PST 2013
Hi Andy,
Thanks for your review.
I'd like to add a few more features to this scheduler before writing unit tests.
Currently ScheduleDAGInstrs enforces dependencies between instructions that write/read the same reg, even when they do not access the same subreg.
Our arch has no constraints on accessing different subregs of a 128-bit reg, so the schedule pass has to remove such deps.
In a previous patch this was done in the initialize function, but I think it would be better if ScheduleDAGInstrs supported this directly via a target hook.
I'm working on this atm; currently I'm experimenting in R600MachineRegister.cpp, following the ScheduleDAGInstrs algorithm:
http://cgit.freedesktop.org/~vlj/llvm/commit/?h=codesize4&id=62a4109bd126efc1f267843402b624b7729c79f7
This is WIP; I'm evaluating different ideas (using a VReg2SUnit structure that would store several SUnit*, one for each accessed subreg, or
having a SubVReg2SUnit structure that is similar to the current VReg2SUnit but also stores the subreg).
Changing ScheduleDAG also means changing the schedule pass behavior, and I don't know how to properly catch this in a test case; I have a few tests, but
they check the expected result of the complete schedule strategy (i.e. without deps between subregs).
Vincent
----- Mail original -----
> From: Andrew Trick <atrick at apple.com>
> To: Vincent Lejeune <vljn at ovi.com>
> Cc: llvm-commits at cs.uiuc.edu
> Sent: Tuesday, March 5, 2013, 8:29 PM
> Subject: Re: [llvm] r176498 - R600: initial scheduler code
>
>
> On Mar 5, 2013, at 10:41 AM, Vincent Lejeune <vljn at ovi.com> wrote:
>
>> Author: vljn
>> Date: Tue Mar 5 12:41:32 2013
>> New Revision: 176498
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=176498&view=rev
>> Log:
>> R600: initial scheduler code
>>
>> This is a skeleton for a pre-RA MachineInstr scheduler strategy. Currently
>> it only tries to expose more parallelism for ALU instructions (this also
>> makes the distribution of GPR channels more uniform and increases the
>> chances of ALU instructions to be packed together in a single VLIW group).
>> Also it tries to reduce clause switching by grouping instructions of the
>> same kind (ALU/FETCH/CF) together.
>>
>> Vincent Lejeune:
>> - Support for VLIW4 slot assignment
>> - Recomputation of ScheduleDAG to get more parallelism opportunities
>>
>> Tom Stellard:
>> - Fix assertion failure when trying to determine an instruction's slot
>> based on its destination register's class
>> - Fix some compiler warnings
>>
>> Vincent Lejeune: [v2]
>> - Remove recomputation of ScheduleDAG (will be provided in a later patch)
>> - Improve estimation of an ALU clause size so that the heuristic does not emit
>> cf instructions at the wrong position.
>> - Make schedule heuristic smarter using SUnit Depth
>> - Take constant read limitations into account
>>
>> Vincent Lejeune: [v3]
>> - Fix some uninitialized values in ConstPair
>> - Add asserts to ensure an ALU slot is always populated
>>
>> Added:
>> llvm/trunk/lib/Target/R600/R600MachineScheduler.cpp
>> llvm/trunk/lib/Target/R600/R600MachineScheduler.h
>> Modified:
>> llvm/trunk/lib/Target/R600/AMDGPUTargetMachine.cpp
>>
>> Modified: llvm/trunk/lib/Target/R600/AMDGPUTargetMachine.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/R600/AMDGPUTargetMachine.cpp?rev=176498&r1=176497&r2=176498&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/lib/Target/R600/AMDGPUTargetMachine.cpp (original)
>> +++ llvm/trunk/lib/Target/R600/AMDGPUTargetMachine.cpp Tue Mar 5 12:41:32 2013
>> @@ -17,6 +17,7 @@
>> #include "AMDGPU.h"
>> #include "R600ISelLowering.h"
>> #include "R600InstrInfo.h"
>> +#include "R600MachineScheduler.h"
>> #include "SIISelLowering.h"
>> #include "SIInstrInfo.h"
>> #include "llvm/Analysis/Passes.h"
>> @@ -39,6 +40,14 @@ extern "C" void LLVMInitializeR600Target
>> RegisterTargetMachine<AMDGPUTargetMachine> X(TheAMDGPUTarget);
>> }
>>
>> +static ScheduleDAGInstrs *createR600MachineScheduler(MachineSchedContext *C) {
>> + return new ScheduleDAGMI(C, new R600SchedStrategy());
>> +}
>> +
>> +static MachineSchedRegistry
>> +SchedCustomRegistry("r600", "Run R600's custom scheduler",
>> + createR600MachineScheduler);
>> +
>> AMDGPUTargetMachine::AMDGPUTargetMachine(const Target &T, StringRef TT,
>> StringRef CPU, StringRef FS,
>> TargetOptions Options,
>> @@ -70,7 +79,13 @@ namespace {
>> class AMDGPUPassConfig : public TargetPassConfig {
>> public:
>> AMDGPUPassConfig(AMDGPUTargetMachine *TM, PassManagerBase &PM)
>> - : TargetPassConfig(TM, PM) {}
>> + : TargetPassConfig(TM, PM) {
>> + const AMDGPUSubtarget &ST = TM->getSubtarget<AMDGPUSubtarget>();
>> + if (ST.device()->getGeneration() <= AMDGPUDeviceInfo::HD6XXX) {
>> + enablePass(&MachineSchedulerID);
>> + MachineSchedRegistry::setDefault(createR600MachineScheduler);
>> + }
>> + }
>>
>> AMDGPUTargetMachine &getAMDGPUTargetMachine() const {
>> return getTM<AMDGPUTargetMachine>();
>>
>> Added: llvm/trunk/lib/Target/R600/R600MachineScheduler.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/R600/R600MachineScheduler.cpp?rev=176498&view=auto
>>
>> ==============================================================================
>> --- llvm/trunk/lib/Target/R600/R600MachineScheduler.cpp (added)
>> +++ llvm/trunk/lib/Target/R600/R600MachineScheduler.cpp Tue Mar 5 12:41:32 2013
>> @@ -0,0 +1,487 @@
>> +//===-- R600MachineScheduler.cpp - R600 Scheduler Interface -*- C++ -*-----===//
>> +//
>> +// The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +//
>> +/// \file
>> +/// \brief R600 Machine Scheduler interface
>> +// TODO: Scheduling is optimised for VLIW4 arch, modify it to support TRANS slot
>> +//
>> +//===----------------------------------------------------------------------===//
>> +
>> +#define DEBUG_TYPE "misched"
>> +
>> +#include "R600MachineScheduler.h"
>> +#include "llvm/CodeGen/MachineRegisterInfo.h"
>> +#include "llvm/CodeGen/LiveIntervalAnalysis.h"
>> +#include "llvm/Pass.h"
>> +#include "llvm/PassManager.h"
>> +#include <set>
>> +#include <iostream>
>> +using namespace llvm;
>> +
>> +void R600SchedStrategy::initialize(ScheduleDAGMI *dag) {
>> +
>> + DAG = dag;
>> + TII = static_cast<const R600InstrInfo*>(DAG->TII);
>> + TRI = static_cast<const R600RegisterInfo*>(DAG->TRI);
>> + MRI = &DAG->MRI;
>> + Available[IDAlu]->clear();
>> + Available[IDFetch]->clear();
>> + Available[IDOther]->clear();
>> + CurInstKind = IDOther;
>> + CurEmitted = 0;
>> + OccupedSlotsMask = 15;
>> + memset(InstructionsGroupCandidate, 0, sizeof(InstructionsGroupCandidate));
>> + InstKindLimit[IDAlu] = 120; // 120 minus 8 for security
>> +
>> +
>> + const AMDGPUSubtarget &ST = DAG->TM.getSubtarget<AMDGPUSubtarget>();
>> + if (ST.device()->getGeneration() <= AMDGPUDeviceInfo::HD5XXX) {
>> + InstKindLimit[IDFetch] = 7; // 8 minus 1 for security
>> + } else {
>> + InstKindLimit[IDFetch] = 15; // 16 minus 1 for security
>> + }
>> +}
>> +
>> +void R600SchedStrategy::MoveUnits(ReadyQueue *QSrc, ReadyQueue *QDst)
>> +{
>> + if (QSrc->empty())
>> + return;
>> + for (ReadyQueue::iterator I = QSrc->begin(),
>> + E = QSrc->end(); I != E; ++I) {
>> + (*I)->NodeQueueId &= ~QSrc->getID();
>> + QDst->push(*I);
>> + }
>> + QSrc->clear();
>> +}
>> +
>> +SUnit* R600SchedStrategy::pickNode(bool &IsTopNode) {
>> + SUnit *SU = 0;
>> + IsTopNode = true;
>> + NextInstKind = IDOther;
>> +
>> + // check if we might want to switch current clause type
>> + bool AllowSwitchToAlu = (CurInstKind == IDOther) ||
>> + (CurEmitted > InstKindLimit[CurInstKind]) ||
>> + (Available[CurInstKind]->empty());
>> + bool AllowSwitchFromAlu = (CurEmitted > InstKindLimit[CurInstKind]) &&
>> + (!Available[IDFetch]->empty() || !Available[IDOther]->empty());
>> +
>> + if ((AllowSwitchToAlu && CurInstKind != IDAlu) ||
>> + (!AllowSwitchFromAlu && CurInstKind == IDAlu)) {
>> + // try to pick ALU
>> + SU = pickAlu();
>> + if (SU) {
>> + if (CurEmitted > InstKindLimit[IDAlu])
>> + CurEmitted = 0;
>> + NextInstKind = IDAlu;
>> + }
>> + }
>> +
>> + if (!SU) {
>> + // try to pick FETCH
>> + SU = pickOther(IDFetch);
>> + if (SU)
>> + NextInstKind = IDFetch;
>> + }
>> +
>> + // try to pick other
>> + if (!SU) {
>> + SU = pickOther(IDOther);
>> + if (SU)
>> + NextInstKind = IDOther;
>> + }
>> +
>> + DEBUG(
>> + if (SU) {
>> + dbgs() << "picked node: ";
>> + SU->dump(DAG);
>> + } else {
>> + dbgs() << "NO NODE ";
>> + for (int i = 0; i < IDLast; ++i) {
>> + Available[i]->dump();
>> + Pending[i]->dump();
>> + }
>> + for (unsigned i = 0; i < DAG->SUnits.size(); i++) {
>> + const SUnit &S = DAG->SUnits[i];
>> + if (!S.isScheduled)
>> + S.dump(DAG);
>> + }
>> + }
>> + );
>> +
>> + return SU;
>> +}
>> +
>> +void R600SchedStrategy::schedNode(SUnit *SU, bool IsTopNode) {
>> +
>> + DEBUG(dbgs() << "scheduled: ");
>> + DEBUG(SU->dump(DAG));
>> +
>> + if (NextInstKind != CurInstKind) {
>> + DEBUG(dbgs() << "Instruction Type Switch\n");
>> + if (NextInstKind != IDAlu)
>> + OccupedSlotsMask = 15;
>> + CurEmitted = 0;
>> + CurInstKind = NextInstKind;
>> + }
>> +
>> + if (CurInstKind == IDAlu) {
>> + switch (getAluKind(SU)) {
>> + case AluT_XYZW:
>> + CurEmitted += 4;
>> + break;
>> + case AluDiscarded:
>> + break;
>> + default: {
>> + ++CurEmitted;
>> + for (MachineInstr::mop_iterator It = SU->getInstr()->operands_begin(),
>> + E = SU->getInstr()->operands_end(); It != E; ++It) {
>> + MachineOperand &MO = *It;
>> + if (MO.isReg() && MO.getReg() == AMDGPU::ALU_LITERAL_X)
>> + ++CurEmitted;
>> + }
>> + }
>> + }
>> + } else {
>> + ++CurEmitted;
>> + }
>> +
>> +
>> + DEBUG(dbgs() << CurEmitted << " Instructions Emitted in this clause\n");
>> +
>> + if (CurInstKind != IDFetch) {
>> + MoveUnits(Pending[IDFetch], Available[IDFetch]);
>> + }
>> + MoveUnits(Pending[IDOther], Available[IDOther]);
>> +}
>> +
>> +void R600SchedStrategy::releaseTopNode(SUnit *SU) {
>> + int IK = getInstKind(SU);
>> +
>> + DEBUG(dbgs() << IK << " <= ");
>> + DEBUG(SU->dump(DAG));
>> +
>> + Pending[IK]->push(SU);
>> +}
>> +
>> +void R600SchedStrategy::releaseBottomNode(SUnit *SU) {
>> +}
>> +
>> +bool R600SchedStrategy::regBelongsToClass(unsigned Reg,
>> + const TargetRegisterClass *RC) const {
>> + if (!TargetRegisterInfo::isVirtualRegister(Reg)) {
>> + return RC->contains(Reg);
>> + } else {
>> + return MRI->getRegClass(Reg) == RC;
>> + }
>> +}
>> +
>> +R600SchedStrategy::AluKind R600SchedStrategy::getAluKind(SUnit *SU) const {
>> + MachineInstr *MI = SU->getInstr();
>> +
>> + switch (MI->getOpcode()) {
>> + case AMDGPU::INTERP_PAIR_XY:
>> + case AMDGPU::INTERP_PAIR_ZW:
>> + case AMDGPU::INTERP_VEC_LOAD:
>> + return AluT_XYZW;
>> + case AMDGPU::COPY:
>> + if (TargetRegisterInfo::isPhysicalRegister(MI->getOperand(1).getReg())) {
>> + // %vregX = COPY Tn_X is likely to be discarded in favor of an
>> + // assignment of Tn_X to %vregX; don't consider it in scheduling
>> + return AluDiscarded;
>> + }
>> + else if (MI->getOperand(1).isUndef()) {
>> + // MI will become a KILL; don't consider it in scheduling
>> + return AluDiscarded;
>> + }
>> + default:
>> + break;
>> + }
>> +
>> + // Does the instruction take a whole IG ?
>> + if(TII->isVector(*MI) ||
>> + TII->isCubeOp(MI->getOpcode()) ||
>> + TII->isReductionOp(MI->getOpcode()))
>> + return AluT_XYZW;
>> +
>> + // Is the result already assigned to a channel ?
>> + unsigned DestSubReg = MI->getOperand(0).getSubReg();
>> + switch (DestSubReg) {
>> + case AMDGPU::sub0:
>> + return AluT_X;
>> + case AMDGPU::sub1:
>> + return AluT_Y;
>> + case AMDGPU::sub2:
>> + return AluT_Z;
>> + case AMDGPU::sub3:
>> + return AluT_W;
>> + default:
>> + break;
>> + }
>> +
>> + // Is the result already member of a X/Y/Z/W class ?
>> + unsigned DestReg = MI->getOperand(0).getReg();
>> + if (regBelongsToClass(DestReg, &AMDGPU::R600_TReg32_XRegClass) ||
>> + regBelongsToClass(DestReg, &AMDGPU::R600_AddrRegClass))
>> + return AluT_X;
>> + if (regBelongsToClass(DestReg, &AMDGPU::R600_TReg32_YRegClass))
>> + return AluT_Y;
>> + if (regBelongsToClass(DestReg, &AMDGPU::R600_TReg32_ZRegClass))
>> + return AluT_Z;
>> + if (regBelongsToClass(DestReg, &AMDGPU::R600_TReg32_WRegClass))
>> + return AluT_W;
>> + if (regBelongsToClass(DestReg, &AMDGPU::R600_Reg128RegClass))
>> + return AluT_XYZW;
>> +
>> + return AluAny;
>> +
>> +}
>> +
>> +int R600SchedStrategy::getInstKind(SUnit* SU) {
>> + int Opcode = SU->getInstr()->getOpcode();
>> +
>> + if (TII->isALUInstr(Opcode)) {
>> + return IDAlu;
>> + }
>> +
>> + switch (Opcode) {
>> + case AMDGPU::COPY:
>> + case AMDGPU::CONST_COPY:
>> + case AMDGPU::INTERP_PAIR_XY:
>> + case AMDGPU::INTERP_PAIR_ZW:
>> + case AMDGPU::INTERP_VEC_LOAD:
>> + case AMDGPU::DOT4_eg_pseudo:
>> + case AMDGPU::DOT4_r600_pseudo:
>> + return IDAlu;
>> + case AMDGPU::TEX_VTX_CONSTBUF:
>> + case AMDGPU::TEX_VTX_TEXBUF:
>> + case AMDGPU::TEX_LD:
>> + case AMDGPU::TEX_GET_TEXTURE_RESINFO:
>> + case AMDGPU::TEX_GET_GRADIENTS_H:
>> + case AMDGPU::TEX_GET_GRADIENTS_V:
>> + case AMDGPU::TEX_SET_GRADIENTS_H:
>> + case AMDGPU::TEX_SET_GRADIENTS_V:
>> + case AMDGPU::TEX_SAMPLE:
>> + case AMDGPU::TEX_SAMPLE_C:
>> + case AMDGPU::TEX_SAMPLE_L:
>> + case AMDGPU::TEX_SAMPLE_C_L:
>> + case AMDGPU::TEX_SAMPLE_LB:
>> + case AMDGPU::TEX_SAMPLE_C_LB:
>> + case AMDGPU::TEX_SAMPLE_G:
>> + case AMDGPU::TEX_SAMPLE_C_G:
>> + case AMDGPU::TXD:
>> + case AMDGPU::TXD_SHADOW:
>> + return IDFetch;
>> + default:
>> + DEBUG(
>> + dbgs() << "other inst: ";
>> + SU->dump(DAG);
>> + );
>> + return IDOther;
>> + }
>> +}
>> +
>> +class ConstPairs {
>> +private:
>> + unsigned XYPair;
>> + unsigned ZWPair;
>> +public:
>> + ConstPairs(unsigned ReadConst[3]) : XYPair(0), ZWPair(0) {
>> + for (unsigned i = 0; i < 3; i++) {
>> + unsigned ReadConstChan = ReadConst[i] & 3;
>> + unsigned ReadConstIndex = ReadConst[i] & (~3);
>> + if (ReadConstChan < 2) {
>> + if (!XYPair) {
>> + XYPair = ReadConstIndex;
>> + }
>> + } else {
>> + if (!ZWPair) {
>> + ZWPair = ReadConstIndex;
>> + }
>> + }
>> + }
>> + }
>> +
>> + bool isCompatibleWith(const ConstPairs& CP) const {
>> + return (!XYPair || !CP.XYPair || CP.XYPair == XYPair) &&
>> + (!ZWPair || !CP.ZWPair || CP.ZWPair == ZWPair);
>> + }
>> +};
>> +
>> +static
>> +const ConstPairs getPairs(const R600InstrInfo *TII, const MachineInstr& MI) {
>> + unsigned ReadConsts[3] = {0, 0, 0};
>> + R600Operands::Ops OpTable[3][2] = {
>> + {R600Operands::SRC0, R600Operands::SRC0_SEL},
>> + {R600Operands::SRC1, R600Operands::SRC1_SEL},
>> + {R600Operands::SRC2, R600Operands::SRC2_SEL},
>> + };
>> +
>> + if (!TII->isALUInstr(MI.getOpcode()))
>> + return ConstPairs(ReadConsts);
>> +
>> + for (unsigned i = 0; i < 3; i++) {
>> + int SrcIdx = TII->getOperandIdx(MI.getOpcode(), OpTable[i][0]);
>> + if (SrcIdx < 0)
>> + break;
>> + if (MI.getOperand(SrcIdx).getReg() == AMDGPU::ALU_CONST)
>> + ReadConsts[i] = MI.getOperand(
>> + TII->getOperandIdx(MI.getOpcode(), OpTable[i][1])).getImm();
>> + }
>> + return ConstPairs(ReadConsts);
>> +}
>> +
>> +bool
>> +R600SchedStrategy::isBundleable(const MachineInstr& MI) {
>> + const ConstPairs &MIPair = getPairs(TII, MI);
>> + for (unsigned i = 0; i < 4; i++) {
>> + if (!InstructionsGroupCandidate[i])
>> + continue;
>> + const ConstPairs &IGPair = getPairs(TII,
>> + *InstructionsGroupCandidate[i]->getInstr());
>> + if (!IGPair.isCompatibleWith(MIPair))
>> + return false;
>> + }
>> + return true;
>> +}
>> +
>> +SUnit *R600SchedStrategy::PopInst(std::multiset<SUnit *, CompareSUnit> &Q) {
>> + if (Q.empty())
>> + return NULL;
>> + for (std::set<SUnit *, CompareSUnit>::iterator It = Q.begin(), E = Q.end();
>> + It != E; ++It) {
>> + SUnit *SU = *It;
>> + if (isBundleable(*SU->getInstr())) {
>> + Q.erase(It);
>> + return SU;
>> + }
>> + }
>> + return NULL;
>> +}
>> +
>> +void R600SchedStrategy::LoadAlu() {
>> + ReadyQueue *QSrc = Pending[IDAlu];
>> + for (ReadyQueue::iterator I = QSrc->begin(),
>> + E = QSrc->end(); I != E; ++I) {
>> + (*I)->NodeQueueId &= ~QSrc->getID();
>> + AluKind AK = getAluKind(*I);
>> + AvailableAlus[AK].insert(*I);
>> + }
>> + QSrc->clear();
>> +}
>> +
>> +void R600SchedStrategy::PrepareNextSlot() {
>> + DEBUG(dbgs() << "New Slot\n");
>> + assert (OccupedSlotsMask && "Slot wasn't filled");
>> + OccupedSlotsMask = 0;
>> + memset(InstructionsGroupCandidate, 0, sizeof(InstructionsGroupCandidate));
>> + LoadAlu();
>> +}
>> +
>> +void R600SchedStrategy::AssignSlot(MachineInstr* MI, unsigned Slot) {
>> + unsigned DestReg = MI->getOperand(0).getReg();
>> + // PressureRegister crashes if an operand is def and used in the same inst
>> + // and we try to constraint its regclass
>> + for (MachineInstr::mop_iterator It = MI->operands_begin(),
>> + E = MI->operands_end(); It != E; ++It) {
>> + MachineOperand &MO = *It;
>> + if (MO.isReg() && !MO.isDef() &&
>> + MO.getReg() == MI->getOperand(0).getReg())
>> + return;
>> + }
>> + // Constrains the regclass of DestReg to assign it to Slot
>> + switch (Slot) {
>> + case 0:
>> + MRI->constrainRegClass(DestReg, &AMDGPU::R600_TReg32_XRegClass);
>> + break;
>> + case 1:
>> + MRI->constrainRegClass(DestReg, &AMDGPU::R600_TReg32_YRegClass);
>> + break;
>> + case 2:
>> + MRI->constrainRegClass(DestReg, &AMDGPU::R600_TReg32_ZRegClass);
>> + break;
>> + case 3:
>> + MRI->constrainRegClass(DestReg, &AMDGPU::R600_TReg32_WRegClass);
>> + break;
>> + }
>> +}
>> +
>> +SUnit *R600SchedStrategy::AttemptFillSlot(unsigned Slot) {
>> + static const AluKind IndexToID[] = {AluT_X, AluT_Y, AluT_Z, AluT_W};
>> + SUnit *SlotedSU = PopInst(AvailableAlus[IndexToID[Slot]]);
>> + SUnit *UnslotedSU = PopInst(AvailableAlus[AluAny]);
>> + if (!UnslotedSU) {
>> + return SlotedSU;
>> + } else if (!SlotedSU) {
>> + AssignSlot(UnslotedSU->getInstr(), Slot);
>> + return UnslotedSU;
>> + } else {
>> + //Determine which one to pick (the lesser one)
>> + if (CompareSUnit()(SlotedSU, UnslotedSU)) {
>> + AvailableAlus[AluAny].insert(UnslotedSU);
>> + return SlotedSU;
>> + } else {
>> + AvailableAlus[IndexToID[Slot]].insert(SlotedSU);
>> + AssignSlot(UnslotedSU->getInstr(), Slot);
>> + return UnslotedSU;
>> + }
>> + }
>> +}
>> +
>> +bool R600SchedStrategy::isAvailablesAluEmpty() const {
>> + return Pending[IDAlu]->empty() && AvailableAlus[AluAny].empty() &&
>> + AvailableAlus[AluT_XYZW].empty() && AvailableAlus[AluT_X].empty() &&
>> + AvailableAlus[AluT_Y].empty() && AvailableAlus[AluT_Z].empty() &&
>> + AvailableAlus[AluT_W].empty() && AvailableAlus[AluDiscarded].empty();
>> +}
>> +
>> +SUnit* R600SchedStrategy::pickAlu() {
>> + while (!isAvailablesAluEmpty()) {
>> + if (!OccupedSlotsMask) {
>> + // Flush physical reg copies (RA will discard them)
>> + if (!AvailableAlus[AluDiscarded].empty()) {
>> + OccupedSlotsMask = 15;
>> + return PopInst(AvailableAlus[AluDiscarded]);
>> + }
>> + // If there is a T_XYZW alu available, use it
>> + if (!AvailableAlus[AluT_XYZW].empty()) {
>> + OccupedSlotsMask = 15;
>> + return PopInst(AvailableAlus[AluT_XYZW]);
>> + }
>> + }
>> + for (unsigned Chan = 0; Chan < 4; ++Chan) {
>> + bool isOccupied = OccupedSlotsMask & (1 << Chan);
>> + if (!isOccupied) {
>> + SUnit *SU = AttemptFillSlot(Chan);
>> + if (SU) {
>> + OccupedSlotsMask |= (1 << Chan);
>> + InstructionsGroupCandidate[Chan] = SU;
>> + return SU;
>> + }
>> + }
>> + }
>> + PrepareNextSlot();
>> + }
>> + return NULL;
>> +}
>> +
>> +SUnit* R600SchedStrategy::pickOther(int QID) {
>> + SUnit *SU = 0;
>> + ReadyQueue *AQ = Available[QID];
>> +
>> + if (AQ->empty()) {
>> + MoveUnits(Pending[QID], AQ);
>> + }
>> + if (!AQ->empty()) {
>> + SU = *AQ->begin();
>> + AQ->remove(AQ->begin());
>> + }
>> + return SU;
>> +}
>> +
>>
>> Added: llvm/trunk/lib/Target/R600/R600MachineScheduler.h
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/R600/R600MachineScheduler.h?rev=176498&view=auto
>>
>> ==============================================================================
>> --- llvm/trunk/lib/Target/R600/R600MachineScheduler.h (added)
>> +++ llvm/trunk/lib/Target/R600/R600MachineScheduler.h Tue Mar 5 12:41:32 2013
>> @@ -0,0 +1,121 @@
>> +//===-- R600MachineScheduler.h - R600 Scheduler Interface -*- C++ -*-------===//
>> +//
>> +// The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +//
>> +/// \file
>> +/// \brief R600 Machine Scheduler interface
>> +//
>> +//===----------------------------------------------------------------------===//
>> +
>> +#ifndef R600MACHINESCHEDULER_H_
>> +#define R600MACHINESCHEDULER_H_
>> +
>> +#include "R600InstrInfo.h"
>> +#include "llvm/CodeGen/MachineScheduler.h"
>> +#include "llvm/Support/Debug.h"
>> +#include "llvm/ADT/PriorityQueue.h"
>> +
>> +using namespace llvm;
>> +
>> +namespace llvm {
>> +
>> +class CompareSUnit {
>> +public:
>> + bool operator()(const SUnit *S1, const SUnit *S2) {
>> + return S1->getDepth() > S2->getDepth();
>> + }
>> +};
>> +
>> +class R600SchedStrategy : public MachineSchedStrategy {
>> +
>> + const ScheduleDAGMI *DAG;
>> + const R600InstrInfo *TII;
>> + const R600RegisterInfo *TRI;
>> + MachineRegisterInfo *MRI;
>> +
>> + enum InstQueue {
>> + QAlu = 1,
>> + QFetch = 2,
>> + QOther = 4
>> + };
>> +
>> + enum InstKind {
>> + IDAlu,
>> + IDFetch,
>> + IDOther,
>> + IDLast
>> + };
>> +
>> + enum AluKind {
>> + AluAny,
>> + AluT_X,
>> + AluT_Y,
>> + AluT_Z,
>> + AluT_W,
>> + AluT_XYZW,
>> + AluDiscarded, // LLVM Instructions that are going to be eliminated
>> + AluLast
>> + };
>> +
>> + ReadyQueue *Available[IDLast], *Pending[IDLast];
>> + std::multiset<SUnit *, CompareSUnit> AvailableAlus[AluLast];
>> +
>> + InstKind CurInstKind;
>> + int CurEmitted;
>> + InstKind NextInstKind;
>> +
>> + int InstKindLimit[IDLast];
>> +
>> + int OccupedSlotsMask;
>> +
>> +public:
>> + R600SchedStrategy() :
>> + DAG(0), TII(0), TRI(0), MRI(0) {
>> + Available[IDAlu] = new ReadyQueue(QAlu, "AAlu");
>> + Available[IDFetch] = new ReadyQueue(QFetch, "AFetch");
>> + Available[IDOther] = new ReadyQueue(QOther, "AOther");
>> + Pending[IDAlu] = new ReadyQueue(QAlu<<4, "PAlu");
>> + Pending[IDFetch] = new ReadyQueue(QFetch<<4, "PFetch");
>> + Pending[IDOther] = new ReadyQueue(QOther<<4, "POther");
>> + }
>> +
>> + virtual ~R600SchedStrategy() {
>> + for (unsigned I = 0; I < IDLast; ++I) {
>> + delete Available[I];
>> + delete Pending[I];
>> + }
>> + }
>> +
>> + virtual void initialize(ScheduleDAGMI *dag);
>> + virtual SUnit *pickNode(bool &IsTopNode);
>> + virtual void schedNode(SUnit *SU, bool IsTopNode);
>> + virtual void releaseTopNode(SUnit *SU);
>> + virtual void releaseBottomNode(SUnit *SU);
>> +
>> +private:
>> + SUnit *InstructionsGroupCandidate[4];
>> +
>> + int getInstKind(SUnit *SU);
>> + bool regBelongsToClass(unsigned Reg, const TargetRegisterClass *RC) const;
>> + AluKind getAluKind(SUnit *SU) const;
>> + void LoadAlu();
>> + bool isAvailablesAluEmpty() const;
>> + SUnit *AttemptFillSlot (unsigned Slot);
>> + void PrepareNextSlot();
>> + SUnit *PopInst(std::multiset<SUnit *, CompareSUnit> &Q);
>> +
>> + void AssignSlot(MachineInstr *MI, unsigned Slot);
>> + SUnit* pickAlu();
>> + SUnit* pickOther(int QID);
>> + bool isBundleable(const MachineInstr& MI);
>> + void MoveUnits(ReadyQueue *QSrc, ReadyQueue *QDst);
>> +};
>> +
>> +} // namespace llvm
>> +
>> +#endif /* R600MACHINESCHEDULER_H_ */
>
> Hi Vincent,
>
> I only did a quick review, but it looks great.
>
> Please keep in mind that the MachineScheduler is still under development. Unit
> tests will prevent future changes from breaking your target too. I know it's
> extremely hard to write scheduler unit tests but you may be able to come up with
> a few.
>
> -Andy
>