[PATCH] D115138: [llvm-mca] Compare multiple files

Thu Dec 9 06:47:25 PST 2021

andreadb added inline comments.

================
Comment at: llvm/utils/llvm-mca-compare-multiple-files.py:34
+# Parse the program arguments.
+def parse_program_args(parser):
+    parser.add_argument(
----------------
I am not entirely sure if this is the right approach. I suggest to delegate the parsing of llvm-mca specific arguments to the llvm-mca driver itself. I would only parse here flags that are specific to your script only.

In one of my previous posts, I was essentially suggesting to only parse fine names, and simply forward all other options to each llvm-mca invocation. Basically, llvm-mca arguments could be simply treated like a generic string to append to the llvm-mca invocation. You don't need to know what is being passed in input. If the llvm-mca run is successful, then those args made sense. You can always inspect the json file for the full set of simulation parameters and the summary view information (for the number of iterations).

The way how it is now is that we parse most llvm-mca arguments here, and then we let llvm-mca do the same again. This design is not ideal because if forces us to mirror changes to (non-view related)  llvm-mca flags to this script too. This script doesn't need to know about the semantic of llvm-mca specific flags. Once the set of input code snippets is known, we simply forward everything else to llvm-mca.

That way, you avoid to duplicate the logic that parses arguments, and you can probably get rid of `verify_program_inputs`.

================
Comment at: llvm/utils/llvm-mca-compare-multiple-files.py:160-162
+    if opts.v:
+        print("run: $ " + llvm_mca_cmd + "\n")
+
----------------
We should get this information from the json output.

See json object "SimulationParameters". Example:

```
  "SimulationParameters": {
    "-march": "x86_64",
    "-mcpu": "znver1",
    "-mtriple": "x86_64-unknown-linux-gnu"
  },
```

That json object also stores information about flags like -noalias and -dispatch. Example:

```
  "SimulationParameters": {
    "-dispatch": 2,
    "-march": "x86_64",
    "-mcpu": "znver1",
    "-mtriple": "x86_64-unknown-linux-gnu",
    "-noalias": false
  },
```

The advantage of doing things this way, is that we know exactly which "native" subtarget was selected (for the case where no -mcpu is passed in input).

================
Comment at: llvm/utils/llvm-mca-compare-multiple-files.py:181-192
+    return Summary(
+        opts.file_names[0],
+        json_parsed["CodeRegions"][0]["SummaryView"]["BlockRThroughput"],
+        json_parsed["CodeRegions"][0]["SummaryView"]["DispatchWidth"],
+        json_parsed["CodeRegions"][0]["SummaryView"]["IPC"],
+        json_parsed["CodeRegions"][0]["SummaryView"]["Instructions"],
+        json_parsed["CodeRegions"][0]["SummaryView"]["Iterations"],
----------------
What happens with asm files declaring multiple code regions? Your script only processes the first code region. I am not sure how we should deal with those.

If the idea is that this script shouldn't support multiple regions, then we should warn if one of the json outputs provides information about more than one code region.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115138/new/

https://reviews.llvm.org/D115138