[PATCH] D140271: [NFCI][llvm-exegesis] Benchmark: parallelize codegen (5x ... 8x less wallclock)
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 5 13:54:30 PST 2023
lebedev.ri added a comment.
To answer the question I know will come up: yes, `thread-batch-size` is somewhat useful.
The problem is that the final snippet to be measured can vary greatly in size,
depending on the unroll factor. And when you are trying to measure many instructions,
e.g. all 20k of them, and end up with ~40k snippets, if each one takes 1 MB
(worst-case scenario), you already need 40 GB of RAM.
We don't really know beforehand how many bytes any particular snippet will take,
so directly limiting the amount of memory used seems less feasible than limiting the batch size.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D140271/new/
https://reviews.llvm.org/D140271