[llvm] [LLVM][MDL] First integration of MDL with LLVM (PR #78002)

Thu Dec 12 13:14:21 PST 2024

================
@@ -0,0 +1,2483 @@
+
+# MPACT Microarchitecture Description Language
+
+Reid Tatge          [tatge at google.com](mailto:tatge at google.com)
+
+
+## **Goals for a Machine Description Language**
+
+Modern processors are complex: multiple execution pipelines, dynamically dispatched, out-of-order execution, register renaming, forwarding networks, and (often) undocumented micro-operations. Instruction behaviors, including micro-operations, often can’t be _statically_ modeled in an accurate way, but only _statistically_ modeled. In these cases, the compiler’s model of a microarchitecture (Schedules and Itineraries in LLVM) is effectively closer to a heuristic than a formal model. And this works quite well for general purpose microprocessors.
+
+However, modern accelerators have different and/or additional dimensions of complexity: VLIW instruction issue, unprotected pipelines, tensor/vector ALUs, software-managed memory hierarchies. And it's more critical that compilers can precisely model the details of that complexity. Currently, LLVM’s Schedules and Itineraries aren’t adequate for directly modeling many accelerator architectural features.
+
+So we have several goals:
+
+1. We want a first-class, purpose-built, intuitive language that captures all the scheduling and latency details of the architecture - much like Schedules and Itineraries - that works well for all current targets, but also for a large class of accelerator architectures..
+2. The complexity of the specification should scale with the complexity of the hardware. 
+3. The description should be succinct, avoiding duplicated information, while reflecting the way things are defined in a hardware architecture specification.
+4. We want to generate artifacts that can be used in a machine-independent way for back-end optimization, register allocation, instruction scheduling, etc - anything that depends on the behavior and constraints of instructions.
+5. We want to support a much larger class of architectures in one uniform manner.
+
+For this document (and language), the term “instructions” refers to the documented instruction set of the machine, as represented by LLVM instructions descriptions, rather than undocumented micro-operations used by many modern microprocessors. 
+
+The process of compiling a processor’s machine description creates several primary artifacts:
+
+*   For each target instruction (described in td files), we create an object that describes the detailed behaviors of the instruction in any legal context (for example, on any functional unit, on any processor)
+*   A set of methods with machine independent APIs that leverage the information associated with instructions to inform and guide back-end optimization passes.  
+
+The details of the artifacts are described later in this document.
+
+_Note: A full language grammar description is provided in an appendix.  Snippets of grammar throughout the document only provide the pertinent section of the grammar, see the Appendix A for the full grammar._
+
+The proposed language can be thought of as an _optional extension to the LLVM machine description_. For most upstream architectures, the new language offers minimal benefit other than a much more succinct way to specify the architecture vs Schedules and Itineraries.  But for accelerator-class architectures, it provides a level of detail and capability not available in the existing tablegen approaches.
+
+### **Background**
+
+Processor families evolve over time. They accrete new instructions, and pipelines change - often in subtle ways - as they accumulate more functional units and registers; encoding rules change; issue rules change. Understanding, encoding, and using all of this information - over time, for many subtargets - can be daunting.  When the description language isn’t sufficient to model the architecture, the back-end modeling evolves towards heuristics, and leads to performance issues or bugs in the compiler. And it certainly ends with large amounts of target specific code to handle “special cases”. 
+
+LLVM uses the [TableGen](https://llvm.org/docs/TableGen/index.html) language to describe a processor, and this is quite sufficient for handling most general purpose architectures - there are 20+ processor families currently upstreamed in LLVM! In fact, it is very good at modeling instruction definitions, register classes, and calling conventions.  However, there are “features” of modern accelerator micro-architectures which are difficult or impossible to model in tablegen.
+
+We would like to easily handle:
+
+*   Complex pipeline behaviors
+    *   An instruction may have different latencies, resource usage, and/or register constraints on different functional units or different operand values.
----------------
reidtatge wrote:

So in accelerators, its not uncommon to have versions of an instruction that have different latencies based on the kind of an operand - register, vs immediate, vs memory, etc.  (Its not common on most RISC processors.) These cases will have distinct instructions in LLVM (since they have different operand types), but we wanted a way to model groups of similar instructions that share a lot of pipeline attributes.  So you can - optionally - set latencies based on operand types. That could include register id, or specific constant values.  Again, we wanted to easily support (almost) anything, no matter how weird. And things like TPUs are pretty weird. And we wanted to be able to handle the weirdness without making simple devices hard to model.


https://github.com/llvm/llvm-project/pull/78002