[PATCH] D140271: [NFCI][llvm-exegesis] Benchmark: parallelize codegen (5x ... 8x less wallclock)
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 5 13:54:30 PST 2023
lebedev.ri added a comment.
To answer the question I know will come up: yes, `thread-batch-size` is somewhat useful.
The problem is that the final snippet to be measured can vary greatly in size,
depending on the unroll factor. And when you are trying to measure many instructions,
e.g. all 20k of them, and end up with ~40k snippets, if each one takes 1 MB
(worst-case scenario), you already need 40 GB of RAM.
We don't really know beforehand how many bytes any particular snippet will take,
so directly limiting the amount of memory used seems less feasible than limiting the batch size.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D140271/new/
https://reviews.llvm.org/D140271