[PATCH] D66369: [TableGen] Make MCInst decoding more table-driven
Nicolas Guillemot via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon May 16 14:43:47 PDT 2022
nlguillemot added a comment.
Here's how I tested the performance:
1. I made the following modification, which deliberately exaggerates the cost of disassembly so that decoding dominates the measured time:
diff --git a/llvm/tools/llvm-mc/Disassembler.cpp b/llvm/tools/llvm-mc/Disassembler.cpp
index 16ab99548adf..1a584e4023c1 100644
--- a/llvm/tools/llvm-mc/Disassembler.cpp
+++ b/llvm/tools/llvm-mc/Disassembler.cpp
@@ -46,7 +46,8 @@ static bool PrintInsts(const MCDisassembler &DisAsm,
     MCInst Inst;

     MCDisassembler::DecodeStatus S;
-    S = DisAsm.getInstruction(Inst, Size, Data.slice(Index), Index, nulls());
+    for (int i = 0; i < 200; i++)
+      S = DisAsm.getInstruction(Inst, Size, Data.slice(Index), Index, nulls());
     switch (S) {
     case MCDisassembler::Fail:
       SM.PrintMessage(SMLoc::getFromPointer(Bytes.second[Index]),
2. I ran llvm-lit on the `llvm/test/MC/Disassembler` folder and measured the time it takes to run the tests. (Note: The modification above causes some tests to fail, but most of them still pass.)
I closed all other apps on my computer and let the machine cool down a bit between runs, so hopefully the measurements are fair and roughly stable.
The results of running llvm-lit on this folder before and after the patch are as follows.
Before:
0m42.260s
0m43.443s
0m43.443s
0m45.445s
0m44.963s
0m43.998s
0m45.456s
0m43.990s
0m44.779s
0m44.253s
Average: 0m44.2031s
After:
0m43.732s
0m43.697s
0m44.078s
0m44.273s
0m44.415s
0m45.005s
0m44.738s
0m44.630s
0m44.466s
0m45.041s
Average: 0m44.4075s
Based on these results, it looks like there might be a relatively small runtime regression: the averages differ by about 0.2s out of roughly 44s, i.e. around 0.5% (<1%).
I did some experiments locally and found that the performance could be improved further in two ways:
1. Use LEB128 less where possible, since decoding LEB128 is slow (see the first sketch after this list).
2. Add specialized bit extractor functions for some common arrangements of bit extractor parameters, and dispatch to them by adding more enumerators to DecoderCodeletID (see the second sketch below). This improves performance a bit, but it increases the complexity of the implementation, so I'm not sure it's worth it, since this code is usually not a bottleneck anyway.
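For what it's worth, the reason LEB128 decoding is slow is that each value has to be reassembled a byte at a time in a data-dependent loop, whereas a fixed-width field is a single load plus shift-and-mask. Here is a minimal sketch of an unsigned LEB128 decoder, just to illustrate the shape of the work; it is not the exact helper the generated decoders call, and it assumes the value fits in 64 bits with no bounds checking:

#include <cstdint>

// Illustrative unsigned LEB128 decode: each byte carries 7 payload bits,
// and the high bit says whether another byte follows, so the loop length
// (and its branch) depends on the data being decoded.
static uint64_t decodeULEB128Sketch(const uint8_t *Ptr, unsigned &NumRead) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  NumRead = 0;
  uint8_t Byte;
  do {
    Byte = Ptr[NumRead++];
    Value |= uint64_t(Byte & 0x7f) << Shift;
    Shift += 7;
  } while (Byte & 0x80);
  return Value;
}

By contrast, a fixed-width table entry is read branch-free, which is why keeping the hot fields out of LEB128 helps.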
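To make the second idea concrete, here is a rough sketch of the kind of dispatch I mean. The enum below stands in for the patch's DecoderCodeletID; its entries and the helper names are hypothetical, purely for illustration:

#include <cstdint>

// Hypothetical specialization scheme: keep the generic, fully parameterized
// extractor, but add enumerators for a few common (Start, Width) pairs so
// the hot cases use constant shifts/masks and skip decoding the parameters.
enum CodeletID : uint8_t {      // stand-in for DecoderCodeletID; entries made up
  ExtractGeneric,               // Start/Width operands follow in the table
  ExtractBits4To0,              // specialized for bits [4:0]
  ExtractBits20To16,            // specialized for bits [20:16]
};

// Generic extractor; assumes 0 < Width < 64.
static uint64_t extractField(uint64_t Insn, unsigned Start, unsigned Width) {
  return (Insn >> Start) & ((uint64_t(1) << Width) - 1);
}

static uint64_t runCodelet(CodeletID ID, uint64_t Insn,
                           unsigned Start = 0, unsigned Width = 0) {
  switch (ID) {
  case ExtractBits4To0:
    return Insn & 0x1f;         // constants folded, no table operands needed
  case ExtractBits20To16:
    return (Insn >> 16) & 0x1f; // constants folded, no table operands needed
  case ExtractGeneric:
    return extractField(Insn, Start, Width);
  }
  return 0; // unreachable for valid IDs
}

The specialized cases avoid reading (and LEB128-decoding) the Start/Width operands from the table, at the cost of more enumerators and a bigger switch, which is the complexity trade-off mentioned above.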
Disclaimer: These measurements are from a long time ago, so I don't know whether they would still come out the same way now.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D66369/new/
https://reviews.llvm.org/D66369