[llvm] [NFC][TableGen] Emit more readable builtin string table. (PR #105445)
Rahul Joshi via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 21 20:49:40 PDT 2024
================
@@ -637,15 +636,17 @@ void IntrinsicEmitter::EmitIntrinsicToBuiltinMap(
// Populate the string table with the names of all the builtins after
// removing this common prefix.
- StringToOffsetTable Table;
+ SequenceToOffsetTable<StringRef> Table;
----------------
jurahul wrote:
I was finally able to run the experiments and, as you suspected, `SequenceToOffsetTable` is slower. I ran two experiments:
1. I measured the time spent in `EmitIntrinsicToBuiltinMap` when `IsClang = true` (as that's where we handle the largest number of intrinsics) for both cases. For `SequenceToOffsetTable` the min, max, avg were 87, 130, 95.57, and for `StringToOffsetTable` they were 88, 102, 92.28, so `StringToOffsetTable` is slightly faster. These numbers are in units of 0.1 ms (the 88 here is 8.8 ms), so the average slowdown is 9.557 - 9.228 = ~0.33 ms. This is still not the e2e runtime of `llvm-tblgen -gen-intrinsic-impl` (which is the only option that exercises this code); out of its total execution time of ~0.2 s, that's a ~0.15% slowdown, and the e2e impact is smaller still. Since this command line is executed just once during the entire LLVM build, I'd say the compile-time impact is negligible.
2. I set up a microbenchmark in `EmitIntrinsicToBuiltinMap` (see code below), and in that I see that `SequenceToOffsetTable` is about 3.4x slower than `StringToOffsetTable`.
Benchmark code (added to EmitIntrinsicToBuiltinMap):
```C++
// Set up a specific benchmark.
RecordKeeper *R = const_cast<RecordKeeper *>(&Records);
constexpr int N = 100;
size_t Offset = 0;
if (IsClang) {
  R->startTimer("SequenceToOffsetTable");
  for (int I = 0; I < N; ++I) {
    SequenceToOffsetTable<StringRef> Table;
    for (const auto &[TargetPrefix, Entry] : BuiltinMap) {
      auto &[Map, CommonPrefix] = Entry;
      for (auto &[BuiltinName, EnumName] : Map) {
        StringRef Suffix = BuiltinName.substr(CommonPrefix->size());
        Table.add(Suffix);
      }
    }
    Table.layout();
    for (const auto &[TargetPrefix, Entry] : BuiltinMap) {
      auto &[Map, CommonPrefix] = Entry;
      for (auto &[BuiltinName, EnumName] : Map) {
        StringRef Suffix = BuiltinName.substr(CommonPrefix->size());
        Offset += Table.get(Suffix);
      }
    }
  }
  R->stopTimer();
}
if (IsClang) {
  R->startTimer("StringToOffsetTable");
  for (int I = 0; I < N; ++I) {
    StringToOffsetTable Table;
    for (const auto &[TargetPrefix, Entry] : BuiltinMap) {
      auto &[Map, CommonPrefix] = Entry;
      for (auto &[BuiltinName, EnumName] : Map) {
        StringRef Suffix = BuiltinName.substr(CommonPrefix->size());
        Table.GetOrAddStringOffset(Suffix);
      }
    }
    for (const auto &[TargetPrefix, Entry] : BuiltinMap) {
      auto &[Map, CommonPrefix] = Entry;
      for (auto &[BuiltinName, EnumName] : Map) {
        StringRef Suffix = BuiltinName.substr(CommonPrefix->size());
        Offset += Table.GetOrAddStringOffset(Suffix);
      }
    }
  }
  R->stopTimer();
}
errs() << Offset << "\n";
```
So in the targeted microbenchmark, `SequenceToOffsetTable` is 3.4x slower, but in e2e tests it's a 0.15% slowdown in that one particular `llvm-tblgen` command line that is executed just once, so much less on the total build time.
https://github.com/llvm/llvm-project/pull/105445