[PATCH] D86644: [llvm-mca] Initial implementation of output serialization using JSON

Wed Aug 26 12:26:38 PDT 2020

andreadb added a comment.

Thanks Wolfgang for working on this.

I suggest a few changes to the JSON ouput:

1. Have a single (top level) JSON object for the assembly code sequence (see example below):

  "InstructionSequence": [
    "addl\t%eax, %eax",
    "addl\t%ebx, %ebx",
    "addl\t%ecx, %ecx",
    "addl\t%edx, %edx"
  ]

2. Have Views reference code strings by simply using indices "InstructionSequence". This is to avoid to have to unnecessarily print the same instruction over and over (since different views may need to reference specific instructions).

Example (based on the code snippet from your test):

Before:

  {
    "Instruction": "addl\t%ecx, %ecx",
    "Latency": 1,
    "NumMicroOpcodes": 1,
    "RThroughput": "0.5",
    <...>
  }

After:

  {
    "Instruction": 1
    "Latency": 1,
    "Latency": 1,
    "NumMicroOpcodes": 1,
    "RThroughput": "0.5",
  }

3. Have a top level object describing the simulated cpu. That object would need two fields:
  - A cpu name (in your test, it would report "haswell").
  - An array of processor resource units (strings).

The principle is the same as for the code sequence: if a View needs to reference a processor resource unit, then just use an index to that array.

4. (optional) Change the ResourcePressureView object by splitting the array into multiple arrays (i.e. one per each instruction in the code sequence).

Point 4. is optional, but only if point 3 is implemented. Without point 3. there is no correspondence between the resource cycles and the (names of) consumed resources.
You can still infer the number of resources by dividing the number of elements in the array by the number of instructions in the code sequence.
Splitting that single array into a sequence of arrays would give a bit more structure and make (at least in my opinion) the entire output a bit more readable.

-

More in general: I suggest to remove the version number from the JSON format. We can add it in future if we need to.

About the test: I agree with Roman. We want to check both the structure and the numbers. So an autogenerated test that doesn't omit details is much much better.
Also (as Roman already suggested), please limit the number of iterations to 1. Otherwise your json output would be really too big (especially if you print the timeline!).

-Andrea

P.s.: in a future patch, we need a way to print out warnings (generated by mca during a simulation) as part of the json output.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86644/new/

https://reviews.llvm.org/D86644