[llvm] [LLVM][MDL] First integration of MDL with LLVM (PR #78002)

Wed Dec 11 16:48:56 PST 2024

reidtatge wrote:

Thanks for taking a look, Craig! I know it's a lot of stuff, and you're no
doubt otherwise occupied :-).
I responded to your questions in the PR.

Happy to answer any more questions!

On Wed, Dec 11, 2024 at 4:18 PM Craig Topper ***@***.***>
wrote:

> ***@***.**** commented on this pull request.
> ------------------------------
>
> In llvm/docs/Mdl/MachineDescriptionNotes.md
> <https://github.com/llvm/llvm-project/pull/78002#discussion_r1881168738>:
>
> > +The details of the artifacts are described later in this document.
> +
> +_Note: A full language grammar description is provided in an appendix.  Snippets of grammar throughout the document only provide the pertinent section of the grammar, see the Appendix A for the full grammar._
> +
> +The proposed language can be thought of as an _optional extension to the LLVM machine description_. For most upstream architectures, the new language offers minimal benefit other than a much more succinct way to specify the architecture vs Schedules and Itineraries.  But for accelerator-class architectures, it provides a level of detail and capability not available in the existing tablegen approaches.
> +
> +### **Background**
> +
> +Processor families evolve over time. They accrete new instructions, and pipelines change - often in subtle ways - as they accumulate more functional units and registers; encoding rules change; issue rules change. Understanding, encoding, and using all of this information - over time, for many subtargets - can be daunting.  When the description language isn’t sufficient to model the architecture, the back-end modeling evolves towards heuristics, and leads to performance issues or bugs in the compiler. And it certainly ends with large amounts of target specific code to handle “special cases”.
> +
> +LLVM uses the [TableGen](https://llvm.org/docs/TableGen/index.html) language to describe a processor, and this is quite sufficient for handling most general purpose architectures - there are 20+ processor families currently upstreamed in LLVM! In fact, it is very good at modeling instruction definitions, register classes, and calling conventions.  However, there are “features” of modern accelerator micro-architectures which are difficult or impossible to model in tablegen.
> +
> +We would like to easily handle:
> +
> +*   Complex pipeline behaviors
> +    *   An instruction may have different latencies, resource usage, and/or register constraints on different functional units or different operand values.
>
> or different operand values.
>
> Are these runtime values? For example, division often has variable latency
> based on the inputs. How can the compiler know the value to make use of
> this?
>
> Or is this referring to immediate values that would be known at compile
> time?
> ------------------------------
>
> In llvm/docs/Mdl/MachineDescriptionNotes.md
> <https://github.com/llvm/llvm-project/pull/78002#discussion_r1881177611>:
>
> > +
> +**Defining an ISA**
> +
> +We need to map a microarchitecture model back to LLVM instruction, operand, and register definitions.  So, the MDL contains constructs for defining instructions, operands, registers, and register classes.
> +
> +When writing a target machine description, its not necessary to write descriptions for instructions, operands, and registers - we scrape all of this information about the CPU ISA from the tablegen output as part of the build process, and produce an MDL file which contains these definitions. The machine description compiler uses these definitions to tie architectural information back to LLVM instructions, operands, and register classes.
> +
> +We will describe these language features here, primarily for completeness.
> +
> +### **Defining Instructions**
> +
> +Instruction definitions are scraped from tablegen files, and provide the following information to the MDL compiler for each instruction:
> +
> +*   The instruction’s name (as defined in the td files)
> +*   Its operands, with the operand type and name provided in the order they are declared, and indicating whether each is an input or output of the instruction.
> +*   A set of “legal” subunit definitions (a “subunit” is described later in this document)
>
> Are the subunits based on something already existing in tablegen or is
> this new?
> ------------------------------
>
> In llvm/docs/Mdl/MachineDescriptionNotes.md
> <https://github.com/llvm/llvm-project/pull/78002#discussion_r1881177827>:
>
> > +**Defining an ISA**
> +
> +We need to map a microarchitecture model back to LLVM instruction, operand, and register definitions.  So, the MDL contains constructs for defining instructions, operands, registers, and register classes.
> +
> +When writing a target machine description, its not necessary to write descriptions for instructions, operands, and registers - we scrape all of this information about the CPU ISA from the tablegen output as part of the build process, and produce an MDL file which contains these definitions. The machine description compiler uses these definitions to tie architectural information back to LLVM instructions, operands, and register classes.
> +
> +We will describe these language features here, primarily for completeness.
> +
> +### **Defining Instructions**
> +
> +Instruction definitions are scraped from tablegen files, and provide the following information to the MDL compiler for each instruction:
> +
> +*   The instruction’s name (as defined in the td files)
> +*   Its operands, with the operand type and name provided in the order they are declared, and indicating whether each is an input or output of the instruction.
> +*   A set of “legal” subunit definitions (a “subunit” is described later in this document)
> +*   An optional list of instructions derived from this one.
>
> What does it mean for an instruction to be derived from another one? Is
> this something already in tablegen or something new?
> ------------------------------
>
> In llvm/docs/Mdl/MachineDescriptionNotes.md
> <https://github.com/llvm/llvm-project/pull/78002#discussion_r1881181317>:
>
> > +
> +```
> +    attribute my_attr = 5 if address;    // if operand is a relocatable address
> +    attribute my_attr = 2 if label;      // if operand is a code address
> +    attribute my_attr = 3 if lit;        // if operand is any literal constant
> +```
> +
> +
> +Predicates for literal constants can also take an optional list of “predicate values”, where each predicate value is either an integer, a range of integers, or a “mask”. Mask predicate values are explicitly checking for non-zero bits:
> +
> +```
> +    attribute my_attr = 5 if lit [1, 2, 4, 8];    // looking for specific values
> +    attribute my_attr = 12 if lit [100..200];     // looking for a range of values
> +    attribute my_attr = 1 if lit [{0x0000FFFF}];  // looking for a 16 bit number
> +    attribute my_attr = 2 if lit [{0x00FFFF00}];  // also a 16-bit number!
> +    attribute my_attr = 3 if lit [1, 4, 10..14, 0x3F800000, {0xFF00FF00}];
>
> What if you need to describe something that can't be easily expressed with
> ranges, like a 32-bit even number?
>

https://github.com/llvm/llvm-project/pull/78002