[llvm-dev] [tablegen] table readability / performance
Luke Drummond via llvm-dev
llvm-dev at lists.llvm.org
Tue Jan 14 04:59:33 PST 2020
Hello
I've been looking at the tables generated by
`SequenceToOffsetTable::emit`, and noticed that when the generated data
are strings, the output is basically un-grep-able and very tricky to
read, because it is emitted as an array of comma-separated char literals:
extern const char HexagonInstrNameData[] = {
/* 0 */ 'G', '_', 'F', 'L', 'O', 'G', '1', '0', 0,
/* 9 */ 'E', 'N', 'D', 'L', 'O', 'O', 'P', '0', 0,
/* 18 */ 'V', '6', '_', 'v', 'd', 'd', '0', 0,
/* 26 */ 'P', 'S', '_', 'v', 'd', 'd', '0', 0,
/* 34 */ 'V', '6', '_', 'l', 'd', '0', 0,
/* 41 */ 'V', '6', '_', 'z', 'l', 'd', '0', 0,
[...]
};
As far as I can see, this makes the output harder to read than necessary
in at least the following cases:
Target AsmStrs
Target InstrNameData
Target RegStrings
Target RegClassStrings
I hacked together a fix for the above cases locally, and found that, at
least for clang and gcc, the compile time for the generated tables is
significantly reduced when emitting string literals, and the user can
grep the name tables without huge effort. The above table is now:
extern const char HexagonInstrNameData[] = {
/* 0 */ "G_FLOG10\0"
/* 9 */ "ENDLOOP0\0"
/* 18 */ "V6_vdd0\0"
/* 26 */ "PS_vdd0\0"
/* 34 */ "V6_ld0\0"
/* 41 */ "V6_zld0\0"
[...]
};
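As a sanity check that the two encodings describe the same data, here is a minimal sketch using only the first two entries of the table. The names `CharForm`, `StrForm`, `sameBytes` and `nameAt` are made up for illustration and do not appear in the generated code. The one difference it shows is that the string-literal form carries a single extra implicit terminator at the very end of the array; the offsets of all entries are unchanged.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical two-entry excerpts of the same table in both encodings.
static const char CharForm[] = {
  /* 0 */ 'G', '_', 'F', 'L', 'O', 'G', '1', '0', 0,
  /* 9 */ 'E', 'N', 'D', 'L', 'O', 'O', 'P', '0', 0,
};
static const char StrForm[] = {
  /* 0 */ "G_FLOG10\0"
  /* 9 */ "ENDLOOP0\0"
};

// True when both encodings hold the same bytes, ignoring the one extra
// terminator that the string-literal form appends implicitly.
static bool sameBytes() {
  return sizeof(StrForm) == sizeof(CharForm) + 1 &&
         std::memcmp(CharForm, StrForm, sizeof(CharForm)) == 0;
}

// Resolve a name from its byte offset, as the offset comments suggest.
static const char *nameAt(const char *Table, unsigned Offset) {
  return Table + Offset;
}
```

Because adjacent string literals are concatenated only after escape sequences have been processed, the `\0` that ends one entry cannot merge with a digit that starts the next entry.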
My question then is: is there a specific technical reason that we should
avoid emitting concatenated string literals rather than an array of
comma-separated char literals for "string-like" data?
If not, I can probably post a patch, which I feel will make the output
from tablegen much easier to understand and will help the compilation
speed of generated tables.
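For illustration only, a rough sketch of how one entry might be rendered as a string literal. `toStringLiteral` is a hypothetical helper, not the existing `SequenceToOffsetTable` code and not the local change described above. It assumes entries are plain byte strings and writes any non-printable byte as a three-digit octal escape, so that a digit following it cannot be read as part of the escape.

```cpp
#include <string>

// Hypothetical helper, not part of TableGen: renders one table entry as a
// C string literal with an explicit terminator, e.g. "G_FLOG10\0".
static std::string toStringLiteral(const std::string &Name) {
  static const char Oct[] = "01234567";
  std::string Out = "\"";
  for (unsigned char C : Name) {
    if (C == '"' || C == '\\') {
      // Quotes and backslashes need a leading backslash.
      Out += '\\';
      Out += static_cast<char>(C);
    } else if (C >= 0x20 && C < 0x7f) {
      // Printable ASCII is emitted unchanged.
      Out += static_cast<char>(C);
    } else {
      // Three-digit octal escape so a following digit cannot extend it.
      Out += '\\';
      Out += Oct[(C >> 6) & 7];
      Out += Oct[(C >> 3) & 7];
      Out += Oct[C & 7];
    }
  }
  // Explicit terminator, then the closing quote.
  Out += "\\0\"";
  return Out;
}
```

Calling `toStringLiteral("G_FLOG10")` yields the text `"G_FLOG10\0"`, matching the entries in the table above.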
Any thoughts appreciated.
All the Best
Luke
--
Codeplay Software Ltd.
Company registered in England and Wales, number: 04567874
Registered office: Regent House, 316 Beulah Hill, London, SE19 3HF