[PATCH] D84609: [MemDepAnalysis] Cut-off threshold reshuffling
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Jul 26 15:31:14 PDT 2020
lebedev.ri created this revision.
lebedev.ri added reviewers: nikic, fhahn, void, joerg, hfinkel, asbirlea, ebrevnov, jdoerfert.
lebedev.ri added a project: LLVM.
Herald added subscribers: bmahjour, hiraditya.
As we have established in D84108 <https://reviews.llvm.org/D84108>, the cut-off thresholds for Memory Dependence Analysis may need tuning.
In particular, currently they are:
- Scan up to 100 instructions per block. IMO this may be a bit low. (rL179713 <https://reviews.llvm.org/rL179713>, changed from `500`)
- Scan up to 1000 (*sic*) blocks. That seems somewhat high. (D16123 <https://reviews.llvm.org/D16123>)
- There is no cumulative limit on scanned instructions, so one query can end up scanning up to 100 instructions/block * 1k blocks = 100k instructions. As @nikic noted in https://reviews.llvm.org/D84108#2174113, that is frankly insane. (A sketch of how these knobs are exposed follows this list.)
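For reference, both existing cut-offs are exposed as hidden command-line knobs in MemoryDependenceAnalysis.cpp. Here is a minimal sketch of their shape; the option names are the ones MemDep uses, but the descriptions are paraphrased rather than copied from the source:

```
#include "llvm/Support/CommandLine.h"
using namespace llvm;

// Per-block cut-off: stop scanning a single block after this many instructions.
static cl::opt<unsigned> BlockScanLimit(
    "memdep-block-scan-limit", cl::Hidden, cl::init(100),
    cl::desc("Max instructions to scan per block in memory "
             "dependency analysis (default = 100)"));

// Per-query cut-off: stop a non-local query after visiting this many blocks.
static cl::opt<unsigned> BlockNumberLimit(
    "memdep-block-number-limit", cl::Hidden, cl::init(1000),
    cl::desc("Max blocks to scan per query in memory "
             "dependency analysis (default = 1000)"));
```

Since these are cl::opt's, the thresholds can be experimented with via `-memdep-block-scan-limit=N` / `-memdep-block-number-limit=N` (through `-mllvm` when invoking clang) without rebuilding.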
I've collected some numbers (with the limits lifted, on test-suite + RawSpeed + darktable)
showing the maximal value each statistic reaches for a single query, per TU:
Number of instructions scanned in block
=======================================
percentile:      50%     75%     80%     90%     91%     92%     93%     94%     95%     96%     97%     98%     99%   99.5%   99.9%  99.95%    100%
value:         30.00   65.00  114.00  176.30  209.00  246.24  293.21  358.18  388.60  430.00  480.00  593.00  920.00 2077.23 6087.00 6222.00 6222.00
Proposed action: no change, keep it at `100`, which is roughly the 80th percentile; the cut-off is actively helpful.
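For readers less familiar with how the per-block limit takes effect, here is a minimal sketch of the usual pattern: a budget counter is decremented per visited instruction, and the scan conservatively gives up once it hits zero. This is illustrative only, not the code touched by this patch.

```
#include "llvm/Analysis/MemoryDependenceAnalysis.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Instruction.h"
using namespace llvm;

// Walk the block backwards from ScanIt, spending one unit of Limit per
// instruction; the actual dependence checks (def/clobber detection) are elided.
static MemDepResult scanBlockWithBudget(BasicBlock::iterator ScanIt,
                                        BasicBlock *BB, unsigned &Limit) {
  while (ScanIt != BB->begin()) {
    Instruction &Inst = *--ScanIt;
    (void)Inst; // dependence checks against the queried location go here

    // Per-block cut-off: once the budget is exhausted, conservatively report
    // an unknown dependence instead of scanning further.
    if (--Limit == 0)
      return MemDepResult::getUnknown();
  }
  // Nothing found in this block; the query continues into predecessors.
  return MemDepResult::getNonLocal();
}
```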
Number of blocks scanned
========================
percentile:      50%     75%     80%     90%     91%     92%     93%     94%     95%     96%     97%     98%     99%   99.5%   99.9%  99.95%    100%
value:          10.0    39.0    55.0    61.0    67.0    77.0    88.0    94.0   104.0   125.0   145.0   179.0   278.0   355.5   670.5  1045.4  1908.0
The current limit is `1000`, which is roughly the 99.95th percentile,
i.e. we aren't really enforcing anything with it.
I propose to lower it to roughly the 99th percentile (`250`).
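To illustrate what that limit guards against, here is a rough sketch (again illustrative only, not this patch's diff) of a non-local query popping predecessor blocks off a worklist and bailing out once the proposed budget of 250 blocks is spent:

```
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/CFG.h"
using namespace llvm;

static bool visitBlocksWithinLimit(BasicBlock *Start,
                                   unsigned BlockNumberLimit = 250) {
  SmallVector<BasicBlock *, 16> Worklist;
  SmallPtrSet<BasicBlock *, 32> Visited;
  Worklist.push_back(Start);

  unsigned NumBlocksVisited = 0;
  while (!Worklist.empty()) {
    BasicBlock *BB = Worklist.pop_back_val();
    if (!Visited.insert(BB).second)
      continue; // already scanned this block

    // Block-number cut-off: give up on the whole query rather than walking
    // arbitrarily far up the CFG.
    if (++NumBlocksVisited > BlockNumberLimit)
      return false;

    // ... the per-block scan (with its own instruction budget) goes here ...

    for (BasicBlock *Pred : predecessors(BB))
      Worklist.push_back(Pred);
  }
  return true;
}
```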
Total instructions scanned in a single query
============================================
percentile:      50%     75%     80%     90%     91%     92%     93%     94%     95%     96%     97%     98%     99%   99.5%   99.9%  99.95%    100%
value:          66.0   318.0   336.0   463.0   498.0   518.0   571.0   683.0   791.0   986.0  1321.0  2241.0  3872.0  4847.5  8744.3  8828.0 16250.0
I propose to go with the 96th-97th percentile and establish a new cumulative limit of `1000`
instructions per query, which is a 100x reduction from the current implicit limit
and 25x below the new implied limit.
In other words, we will now be willing to process 4x fewer blocks per query,
and 100x fewer instructions per query in total.
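Spelling out the arithmetic behind those ratios (plain constants matching the numbers above, not the actual option variables from the patch):

```
constexpr unsigned PerBlockInstrLimit = 100;  // unchanged
constexpr unsigned OldBlockLimit = 1000;      // current block-number limit
constexpr unsigned NewBlockLimit = 250;       // proposed block-number limit
constexpr unsigned NewQueryInstrLimit = 1000; // proposed cumulative limit

// Old implicit per-query cap vs. the proposed cumulative limit: 100x reduction.
static_assert(PerBlockInstrLimit * OldBlockLimit / NewQueryInstrLimit == 100,
              "100'000 -> 1'000 instructions per query");
// New implied per-query cap (from the block limit alone) vs. the cumulative
// limit: 25x.
static_assert(PerBlockInstrLimit * NewBlockLimit / NewQueryInstrLimit == 25,
              "25'000 -> 1'000 instructions per query");
// Blocks per query: 4x fewer.
static_assert(OldBlockLimit / NewBlockLimit == 4, "1'000 -> 250 blocks");
```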
This does help somewhat with the lencod context_ini compile-time regression from D84108 <https://reviews.llvm.org/D84108>.
That being said, these are rather arbitrary cut-offs that happen to seem sane
on the code bases examined, so inevitably there is some code somewhere that doesn't
fit within them and may therefore be penalized by the lower thresholds.
But the compile-time budget is finite, so we have to draw the line somewhere.
llvm-compile-time-tracker is currently running; no results yet:
http://llvm-compile-time-tracker.com/compare.php?from=b1210c059d1ef084ecd275ed0ffb8343ac3cdfad&to=e3c7a0b96f8800ab21e2d5a8be414e5f83fa1aed&stat=instructions
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D84609
Files:
llvm/include/llvm/Analysis/MemoryDependenceAnalysis.h
llvm/lib/Analysis/MemoryDependenceAnalysis.cpp
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D84609.280757.patch
Type: text/x-patch
Size: 14754 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200726/13edf71f/attachment.bin>