[PATCH] D126236: [CaptureTracking] Increase limit but use it for all visited uses.

Nikita Popov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 2 06:09:03 PDT 2022


nikic accepted this revision.
nikic added a comment.
This revision is now accepted and ready to land.

In D126236#3553122 <https://reviews.llvm.org/D126236#3553122>, @fhahn wrote:

> In D126236#3546453 <https://reviews.llvm.org/D126236#3546453>, @nikic wrote:
>
>>> Good point. Below I added tables with NumCaptured, NumNotcaptured, the sum and NotCaptured/Sum. I've only included SPEC2006/SPEC2017, as those have the largest number of capture queries.
>>
>> Thanks! I've put the data in a spreadsheet for easy comparison: https://docs.google.com/spreadsheets/d/13cwt-aWjAb_a1Ri9vqS4RgCslfDkEwSqYoYpEtuoeeI/edit?usp=sharing. For Limit=60 we're doing slightly better than baseline on average (1.1x), with two outlier regressions. For Limit=80 we avoid one outlier and slightly improve the average (1.2x). With Limit=100 we avoid both outliers and again slightly improve the average (1.3x).
>
> I suppose the question is how far we want to go with respect to avoiding any outlier regressions :) I think either 60, 80 or 100 are clear improvements for opaque pointers.
>
> When comparing the numbers on the compile-time-tracker for 60 and 100, the majority of the difference is down to `tramp3d-v4`: https://llvm-compile-time-tracker.com/compare.php?from=520563fdc146319aae90d06f88d87f2e9e1247b7&to=818719fad01d472412c963629671a81a8703b25b&stat=instructions
>
> But I don't think the difference is mostly caused by time spent in capture tracking itself; rather, it comes from additional optimizations (the binary size difference is about 1.2% for LTO). Looking at the stats, there's an increase in the number of dead stores removed and in the number of instructions removed by GVN.
>
> Given that, I'd be slightly inclined to go for the largest of the options.

Yeah, I think I agree.
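For readers following along, the patch title's approach (one budget shared across *all* visited uses, rather than a cap applied per value) can be sketched as a toy use-graph walk. This is an illustrative model only, not LLVM's actual CaptureTracking implementation; the graph encoding and names are invented for the example:

```python
from collections import deque

def is_captured(value, use_graph, limit=100):
    """Toy capture-tracking walk. `use_graph` maps a value to the
    uses that propagate it; the sentinel string "capture" marks a
    use that escapes. A single `visited` budget covers every use
    seen during the whole walk; exhausting it conservatively
    answers "captured"."""
    worklist = deque([value])
    seen = set()
    visited = 0
    while worklist:
        v = worklist.popleft()
        for use in use_graph.get(v, []):
            visited += 1
            if visited > limit:
                return True  # budget exhausted: give up conservatively
            if use == "capture":
                return True  # the pointer escapes through this use
            if use not in seen:
                seen.add(use)
                worklist.append(use)
    return False  # every reachable use inspected, none captures
```

A higher limit lets the walk inspect more uses before bailing out conservatively, which is where the extra DSE/GVN wins discussed above come from.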

>>> It might be worth exploring different limits for different clients, but I think another major client is GVN (via MemDepAnalysis), and exploring a sufficient number of uses there can be quite important (the regression I was investigating was due to missed load elimination by GVN).
>>
>> Unfortunately MemDepAnalysis is exactly the area where cost increases are most problematic :( In part because cross-BB MDA uses completely ridiculous cutoffs -- we should probably go ahead and reduce those, and not wait for the magic bullet of NewGVN or MSSA. But that's neither here nor there.
>
> Yeah, that's unfortunate. But I just noticed that MemDepAnalysis appears to use `BatchAA` for most queries already, so it should also cache some of the capture results, right?

Yes, it does use BatchAA, but only within one query. Each query uses a separate BatchAA instance. So the impact of caching is rather limited.
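Why per-query instances limit the benefit can be seen with a toy model of batch caching. This is not LLVM's `BatchAAResults` API, just a minimal sketch of the caching behavior described above; the class and counter names are invented:

```python
class BatchAA:
    """Toy stand-in for a batched alias-analysis wrapper: memoizes
    expensive (e.g. capture) queries for the lifetime of one batch."""
    def __init__(self, stats):
        self.cache = {}
        self.stats = stats

    def pointer_captured(self, v):
        if v not in self.cache:
            self.stats["expensive_walks"] += 1  # simulate the costly use walk
            self.cache[v] = False
        return self.cache[v]

stats = {"expensive_walks": 0}

# One fresh batch per MDA query: lookups *within* a query hit the
# cache, but every new query recomputes the walk from scratch.
for _query in range(3):
    batch = BatchAA(stats)           # separate instance per query
    for _ in range(5):
        batch.pointer_captured("p")  # 5 lookups, only 1 walk per query
```

With a separate instance per query the walk runs once per query (3 times here); a single instance shared across all queries would run it only once.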


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126236/new/

https://reviews.llvm.org/D126236


