[PATCH] D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable.

Chandler Carruth via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 28 03:16:44 PST 2017


chandlerc marked 2 inline comments as done.
chandlerc added a comment.

Thanks for all the feedback, landing with suggested fixes.

I've also run SPEC with this change and saw no significant changes so it seems safe performance wise. It also didn't regress a wide range of benchmarks internally.

This change alone is however worth between 2% and 4% latency on one very unusually important benchmark: what is for us the hottest path through tcmalloc code. It helps optimize the size class computation that has long been part of tcmalloc:
https://github.com/gperftools/gperftools/blob/master/src/common.h#L204-L210

I also have at least one improvement on it that I'm also working on but it should be a good follow-up patch. Nothing really wrong as-is, and good to start breaking things into better increments.



================
Comment at: llvm/lib/Transforms/Scalar/SpeculateAroundPHIs.cpp:53-58
+  auto *PhiBB = PN.getParent();
+  SmallPtrSet<Instruction *, 4> Visited;
+  SmallVector<std::pair<Instruction *, User::value_op_iterator>, 16> DFSStack;
+
+  // Walk each user of the PHI node.
+  for (auto &U : PN.uses()) {
----------------
MatzeB wrote:
> Too much `auto` for my taste. I think writing the types is friendlier to code readers (unless you have something like a `cast<XXX>` on the RHS which makes the type obvious.
The RHS says `uses()` and the type is `Use`. That seemed sufficiently redundant to me? The tree has a bit of a mixture, combined with places that are just amazingly confusing. I even have `Use` elsewhere in this file, so happily changed.


================
Comment at: llvm/test/Transforms/SpeculateAroundPHIs/X86/lit.local.cfg:1-2
+if not 'X86' in config.root.targets:
+    config.unsupported = True
----------------
MatzeB wrote:
> As this is only a single test, you could use `REQUIRES: x86-registered-target` in the test itself instead of creating an x86 specific subdirectory.
Ah, yeah, for something this tiny that's likely to be way cleaner long term.


https://reviews.llvm.org/D37467





More information about the llvm-commits mailing list