[PATCH] D79036: [SCCP] Switch to widen at PHIs, stores and call edges.

Wed Apr 29 14:33:34 PDT 2020

nikic added a comment.

In D79036#2010532 <https://reviews.llvm.org/D79036#2010532>, @fhahn wrote:

> In terms of compile-time, it seems to have a relatively small negative impact (worst is +0.20%) according to http://llvm-compile-time-tracker.com/compare.php?from=209ab6d8835cd88320ceb814893759cfbda91d15&to=6ab3496169adf3115d3476b1cdde30cc4d0614f1&stat=instructions
>
> Comparing cycles, it looks mostly positive, with  a few -2%+ improvements in compile-time

Pretty sure those cycles are some unlucky noise. From the instruction counts, the impact looks fine. Looking into the per-file breakdown, there is one case that may be worth double checking: `libclamav_upx.c` from the ClamAV benchmark has a 4-5% increase in the ThinLTO configuration (which only runs mid-level optimizer, so SCCP stands out more). Possibly that exposes some pathological case.

================
Comment at: llvm/lib/Transforms/Scalar/SCCP.cpp:741
       continue;
+    NumActiveIncoming++;

----------------
The lazy counting of active values might be problematic: It works fine if phi args become feasible from left to right, but not if it happens from right to left. In that case, upon hitting the newly active argument, we will always set MaxWidenSteps to 2, regardless of how many of the later arguments are already active.

================
Comment at: llvm/lib/Transforms/Scalar/SCCP.cpp:746
+        Res.mergeIn(IV, ValueLatticeElement::MergeOptions().setMaxWidenSteps(
+                            NumActiveIncoming + 1));
     if (Res.isOverdefined())
----------------
The main problem with this heuristic is that it does not distinguish which incoming value contributes to the widening. Intuitively, what we want is to allow widening once per operand, but what could end up happening is that we have a phi node with 64 arguments, 63 of them are constant, and the last one is an addrec. In this case, we would end up executing the loop 64 times. Of course, it's something of a degenerate case...

I don't have an immediate suggestion on how to improve this at the same level of complexity though.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79036/new/

https://reviews.llvm.org/D79036