[llvm-dev] new backend in llvm

Thu Oct 24 09:34:26 PDT 2019

Hi,

On Thu, 24 Oct 2019 at 07:11, 林政宗 via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I notice that not all backends are included in CPUType. For example, Sparc is included, but MIPS is not. What is this CPUType used for? How could I decide whether I need to add Cpu0 to it?

The MachO format is only used by Apple, so you can safely ignore it.
For informational purposes, though: the CPUType identifies which target
a MachO file is for. It's primarily used by the linker and the dynamic
loader (dyld) to determine which version of a file they should read.
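
As a rough illustration of that selection step, here is a minimal C++
sketch. It is not dyld's actual code: the Slice struct and pickSlice
function are made up for this example, and the constants are written as
unsigned integers where the real header uses a signed cpu_type_t. Only
the numeric CPU_TYPE_* values are taken from Apple's <mach/machine.h>.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Numeric values as defined in Apple's <mach/machine.h>.
constexpr std::uint32_t CPU_ARCH_ABI64  = 0x01000000;
constexpr std::uint32_t CPU_TYPE_X86    = 7;
constexpr std::uint32_t CPU_TYPE_X86_64 = CPU_TYPE_X86 | CPU_ARCH_ABI64;
constexpr std::uint32_t CPU_TYPE_ARM    = 12;
constexpr std::uint32_t CPU_TYPE_ARM64  = CPU_TYPE_ARM | CPU_ARCH_ABI64;

// Hypothetical, simplified stand-in for one architecture slice of a
// universal ("fat") file; the real record has more fields.
struct Slice {
  std::uint32_t cputype; // which CPU this slice was built for
  std::uint32_t offset;  // where the slice starts in the file
};

// Return the index of the first slice built for hostCpu, or -1 if none.
int pickSlice(const std::vector<Slice> &slices, std::uint32_t hostCpu) {
  for (std::size_t i = 0; i < slices.size(); ++i)
    if (slices[i].cputype == hostCpu)
      return static_cast<int>(i);
  return -1;
}

// Usage sketch: a two-slice universal file examined on different hosts.
void sliceExample() {
  std::vector<Slice> fat = {{CPU_TYPE_X86_64, 0x4000},
                            {CPU_TYPE_ARM64, 0x8000}};
  assert(pickSlice(fat, CPU_TYPE_ARM64) == 1); // arm64 host: second slice
  assert(pickSlice(fat, CPU_TYPE_ARM) == -1);  // 32-bit ARM: nothing fits
}
```

A target that never produces MachO files never reaches a check like
this, which is why a backend such as Cpu0 does not need a CPUType entry.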

> 2) About include/llvm/BinaryFormat/ELFRelocs/*.def, how could we decide which items are needed and which are not? What materials do you suggest I read?

This is ultimately guided by what you and other compiler writers need
to be able to fix up at link time or later. Some features are common:
almost all targets have a relocation that fills a whole word with an
address, for example. But most relocations exist because the CPU has
some special instruction that needs an address inserted into
particular bits later, possibly PC-relative.
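
To make the two flavours concrete, here is a small C++ sketch of a
linker applying them. Everything in it is an assumption for
illustration: the target is an imaginary little-endian 32-bit machine,
and the relocation kinds Abs32 and PCRel16 are invented names, not the
real R_CPU0_* entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical relocation kinds for an imaginary 32-bit target.
enum class RelocKind {
  Abs32,   // fill a whole 32-bit word with the symbol's address
  PCRel16, // put (symbol - place) into the low 16 bits of an instruction
};

struct Reloc {
  RelocKind kind;
  std::size_t offset;   // byte offset of the word to patch
  std::uint32_t symbol; // final address of the symbol, known at link time
};

// Read a little-endian 32-bit word from the section contents.
std::uint32_t read32(const std::vector<std::uint8_t> &b, std::size_t off) {
  std::uint32_t v = 0;
  for (int i = 0; i < 4; ++i)
    v |= static_cast<std::uint32_t>(b[off + i]) << (8 * i);
  return v;
}

// Write a little-endian 32-bit word into the section contents.
void write32(std::vector<std::uint8_t> &b, std::size_t off, std::uint32_t v) {
  for (int i = 0; i < 4; ++i)
    b[off + i] = static_cast<std::uint8_t>(v >> (8 * i));
}

// Patch one relocation; sectionAddr is where the section ends up in memory.
void applyReloc(std::vector<std::uint8_t> &bytes, std::uint32_t sectionAddr,
                const Reloc &r) {
  assert(r.offset + 4 <= bytes.size());
  std::uint32_t word = read32(bytes, r.offset);
  switch (r.kind) {
  case RelocKind::Abs32:
    word = r.symbol; // the whole word becomes the address
    break;
  case RelocKind::PCRel16: {
    std::uint32_t place = sectionAddr + static_cast<std::uint32_t>(r.offset);
    std::uint32_t delta = r.symbol - place; // wraps modulo 2^32
    std::int32_t d = static_cast<std::int32_t>(delta);
    assert(d >= -32768 && d <= 32767);      // must fit the 16-bit field
    word = (word & 0xFFFF0000u) | (delta & 0xFFFFu); // keep the opcode bits
    break;
  }
  }
  write32(bytes, r.offset, word);
}

// Usage sketch: one instruction (upper bits 0xAB00) followed by a data word.
void relocExample() {
  std::vector<std::uint8_t> text = {0x00, 0x00, 0x00, 0xAB, 0, 0, 0, 0};
  applyReloc(text, 0x1000, {RelocKind::PCRel16, 0, 0x1010});
  applyReloc(text, 0x1000, {RelocKind::Abs32, 4, 0x2000});
  assert(read32(text, 0) == 0xAB000010u); // only the low 16 bits changed
  assert(read32(text, 4) == 0x00002000u); // the whole word was replaced
}
```

Each distinct "which bits, computed how" combination that your
instructions need is what ends up as one entry in the .def file.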

I think you should probably follow the tutorial as a baseline (the
R_CPU0_* entries in https://jonathan2251.github.io/lbd/elf.html) and
observe how each one is used later on to get a better understanding.
You could also do it the other way round, of course: implement each
one when you first need it, while the use case is fresh in your mind.

> 3) In Cpu0 backend tutorial(llvm 3.9), there is a file named Cpu0Schedule.td:
> //===----------------------------------------------------------------------===//
> // Functional units across Cpu0 chips sets. Based on GCC/Cpu0 backend files.
> //===----------------------------------------------------------------------===//
> def ALU     : FuncUnit;
> def IMULDIV : FuncUnit;
>
> //===----------------------------------------------------------------------===//
> // Instruction Itinerary classes used for Cpu0
> //===----------------------------------------------------------------------===//
> def IIAlu              : InstrItinClass;
> def II_CLO             : InstrItinClass;
> def II_CLZ             : InstrItinClass;
> def IILoad             : InstrItinClass;
> def IIStore            : InstrItinClass;
> //#if CH >= CH4_1 1
> def IIHiLo             : InstrItinClass;
> def IIImul             : InstrItinClass;
> def IIIdiv             : InstrItinClass;
> //#endif
> def IIBranch           : InstrItinClass;
>
> def IIPseudo           : InstrItinClass;
>
> //===----------------------------------------------------------------------===//
> // Cpu0 Generic instruction itineraries.
> //===----------------------------------------------------------------------===//
> //@ http://llvm.org/docs/doxygen/html/structllvm_1_1InstrStage.html
> def Cpu0GenericItineraries : ProcessorItineraries<[ALU, IMULDIV], [], [
> //@2
>   InstrItinData<IIAlu              , [InstrStage<1,  [ALU]>]>,
>   InstrItinData<II_CLO             , [InstrStage<1,  [ALU]>]>,
>   InstrItinData<II_CLZ             , [InstrStage<1,  [ALU]>]>,
>   InstrItinData<IILoad             , [InstrStage<3,  [ALU]>]>,
>   InstrItinData<IIStore            , [InstrStage<1,  [ALU]>]>,
> //#if CH >= CH4_1 2
>   InstrItinData<IIHiLo             , [InstrStage<1,  [IMULDIV]>]>,
>   InstrItinData<IIImul             , [InstrStage<17, [IMULDIV]>]>,
>   InstrItinData<IIIdiv             , [InstrStage<38, [IMULDIV]>]>,
> //#endif
>   InstrItinData<IIBranch           , [InstrStage<1,  [ALU]>]>
> ]>;
>
> There are 10 InstrItinClass definitions. When should we add a new InstrItinClass definition for a new CPU or chip?
> I read the comment in TargetItinerary.td:
>
> // Instruction itinerary classes - These values represent 'named' instruction
> // itinerary.  Using named itineraries simplifies managing groups of
> // instructions across chip sets.  An instruction uses the same itinerary class
> // across all chip sets.  Thus a new chip set can be added without modifying
> // instruction information.
> //
> class InstrItinClass;
> def NoItinerary : InstrItinClass;
> But I still can not understand it. Are there any materials?

As far as I know there's no large body of documentation on LLVM
schedulers. It's not really my area, but I think there are two
different ways to define schedules, and the InstrItin* variant is the
obsolete one. The newer method associates instructions with their
timing properties in the scheduler definition; InstRW seems to be the
key class there, and AArch64 has schedulers using it that you could
look at for inspiration.

But to answer your question anyway: going by that description, you'd
add a new InstrItinClass when there is a CPU you care about that
handles an instruction differently enough that you want to schedule it
specially. You can see why that isn't ideal: just one weird CPU
could force the instruction's class to be split for everyone.
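
Here is a toy C++ model of that problem. It is not LLVM code, only a
sketch of the itinerary idea: the instruction table, the opcode-to-class
assignments and the "slow shift" CPU are hypothetical. The three cycle
counts for the generic CPU are the ones from the Cpu0Schedule.td you
quoted.

```cpp
#include <cassert>
#include <map>
#include <string>

// Shared across all CPUs: every instruction names exactly one class.
enum class ItinClass { IIAlu, IILoad, IIImul };

// Hypothetical instruction table; opcode names are illustrative only.
const std::map<std::string, ItinClass> kInstrClass = {
    {"ADD", ItinClass::IIAlu},
    {"SHL", ItinClass::IIAlu},
    {"LD", ItinClass::IILoad},
    {"MUL", ItinClass::IIImul},
};

// Per-CPU itinerary: one cycle count per class.
using Itinerary = std::map<ItinClass, unsigned>;

// Cycle counts copied from the quoted Cpu0GenericItineraries.
const Itinerary kGeneric = {
    {ItinClass::IIAlu, 1},
    {ItinClass::IILoad, 3},
    {ItinClass::IIImul, 17},
};

// Look up how many cycles an opcode takes on a given CPU.
unsigned cycles(const Itinerary &cpu, const std::string &opcode) {
  return cpu.at(kInstrClass.at(opcode));
}

// Usage sketch: a hypothetical CPU whose shifter takes 4 cycles.
void itineraryExample() {
  assert(cycles(kGeneric, "ADD") == 1);
  assert(cycles(kGeneric, "MUL") == 17);

  Itinerary slowShift = kGeneric;
  slowShift[ItinClass::IIAlu] = 4;       // meant for SHL only...
  assert(cycles(slowShift, "SHL") == 4);
  assert(cycles(slowShift, "ADD") == 4); // ...but ADD got slower too
}
```

The only way to give SHL its own timing in this model is to add a new
class and edit the shared instruction table, and that edit touches
every other CPU's itinerary as well.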

Cheers.

Tim.