[PATCH] D117451: [MCA] Proposing the InstrPostProcess:modifyInstrDesc() method.

Mon Jan 17 05:00:33 PST 2022

andreadb added a comment.

> One thing that I find potentially unsettling about allowing developers to modify this object is that the Writes and Reads vectors are built in a specific way. For example, in the Writes vector, the explicit defs come first, then the implicit defs, then optional defs, then variadic defs. I'm not positive whether or not this order is important during mca's simulations, but I did add a comment within the generic modifyInstrDesc definition that says to maintain that order.

Field `OpIndex` in `ReadDescriptor`/`WriteDescriptor` stores the `MCOperand` index from the original `MCInst`. Changing that value might break the logic in `InstrBuilder` that peeks into the original `MCInst`.

For implicit reads and writes, the register definition is already encoded in tablegen, and therefore `InstrBuilder` doesn't need to peek the actual register identifier from the `MCOperand`. The register value is already known statically, and its value is stored in field `RegisterID` (see for example `ReadDescriptor::RegisterID`). This effectively means that, at least for implicit reads/writes, we don't need to worry about whether `OpIndex` is a valid `MCOperand` index or not. That being said, some debug printing logic might still try to decode the original operand index when pretty-printing defs/uses.

Example: this is how we lower reads:

  MCPhysReg RegID = 0;
  for (const ReadDescriptor &RD : D.Reads) {
    if (!RD.isImplicitRead()) {
      // explicit read.
      const MCOperand &Op = MCI.getOperand(RD.OpIndex);
      // Skip non-register operands.
      if (!Op.isReg())
        continue;
      RegID = Op.getReg();
    } else {
      // Implicit read.
      RegID = RD.RegisterID;
    }

Changing the value of field `ReadDescriptor::UseIndex` has the potential of breaking the way how read-advance entries are resolved during the simulation. If possible, those indices should never be touched.

It would be better if we could be more specific about what can be post-processed, and possibly define a more restrictive API. Toggling some feature flags ON/OFF is generally safe. Changes to `RetireOOO` and `MayRead`/`MayWrite` don't have the potential of breaking the simulation.

Changing `OpIndex` and `UseIndex` are likely to break the lowering logic in `InstrBuilder`, and introduce bugs in the read-advance resolution logic which are very hard to triage.

Things get much more complicated (and potentially problematic) if we decide to let users tweak register defs/uses. However, it should be fairly simple to allow a few basic changes.

For example, users should be allowed to:

1. Eliminate/disable specific register defs/uses.
2. Add new IMPLICIT defs/uses.

About 1. We could introduce a special `Ignored` flag in the read/write descriptor. `InstrBuilder` would then skip operands whose read/write descriptor has been specifically marked as `Ignored`. This is just an idea; you can probably achieve this same behaviour in other ways.

About 2: Implicit defs/uses encode their modified/read register identifier in field `RegisterID`. Adding implicit reads/writes should be relatively easy because InstrBuilder doesn't require a valid OpIndex for those. For those new operands, `OpIndex` could reference a "special invalid index" (since it won't be used). For the purpose of ReadAdvance, UseIndex should default to an invalid operand index. I don't think that there is a good way to encode read-advance for custom new operands.

Things gets much more complicated if we allow adding "explicit" register defs/uses. I don't think that there is a good way to add support for those. In most cases, being able to add implicit defs/uses is already more than enough. There may be specific cases which would require support for adding explicit register uses. For example this one: https://github.com/llvm/llvm-project/issues/38161, which is about instructions that have an un-modelled false dependency on a register definition. To fix those cases, we may need to introduce a special "aliasing register use" which behaves like a normal explicit use, with the difference that the register index would be interpreted as an alias of an already defined read/write operand in mca::Instruction. This is just an idea. But I think it would be enough to fix that bug.

  To account for this, modifyInstrDesc returns a bool which is true if and only if the instruction has been modified. If this method returns true, then we print a message stating that the instruction has been modified and then we re-print all of this information. Because re-printing this information would require the same LLVM_DEBUG code, I took this code and made functions for each of them that are then called from their respective places. Also, for these functions to be able to tell what type of def/use a particular write/read is, I had to add the IsVariadicDef and IsVariadicUse attributes to the WriteDescriptor and ReadDescriptor structs.

Sounds good.

  This above paragraph is something that I'm not sure I did properly (best practice wise). As far as I understand, the LLVM_DEBUG() blocks are code that will be omitted when building llvm in Release mode. I surrounded the bodies of these new functions with the LLVM_DEBUG blocks and I surrounded the calls to these new functions with the LLVM_DEBUG blocks, but I did not surround the actual definitions of these functions. Should I?

I agree with Min on this. You should wrap definitions with NDEBUG if those are only meant to be used in debug.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117451/new/

https://reviews.llvm.org/D117451