[PATCH] D137505: [SCEV] Cache ZExt SCEV expressions.

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Nov 27 16:13:31 PST 2022


fhahn marked 3 inline comments as done.
fhahn added inline comments.


================
Comment at: llvm/include/llvm/Analysis/ScalarEvolution.h:73
+
+class FoldID {
+  SmallVector<unsigned, 4> Bits;
----------------
mkazantsev wrote:
> It looks general enough to be moved to some utils, no? Just something to think about for the future.
It is similar to `FoldingSetNodeID`, which is already used in multiple places. I experimented with reusing it quite a bit, but didn't manage to reach the same performance as with the dedicated version inlined here.

Having this separate here is not ideal, but at the moment I don't see an easy way to unify/share the code.
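
For reference, a rough sketch of what the `FoldingSetNodeID`-based alternative looks like (illustrative only; the helper name and the exact key contents are made up for this example):

  #include "llvm/ADT/FoldingSet.h"
  #include "llvm/Analysis/ScalarEvolutionExpressions.h"

  // Hypothetical helper: build a key for a (fold kind, operand, destination
  // type) triple via the generic FoldingSetNodeID.
  static unsigned hashZExtKey(const llvm::SCEV *Op, llvm::Type *Ty) {
    llvm::FoldingSetNodeID ID;
    ID.AddInteger(unsigned(llvm::scZeroExtend)); // fold kind tag
    ID.AddPointer(Op);                           // operand SCEV
    ID.AddPointer(Ty);                           // destination type
    return ID.ComputeHash();
  }

This goes through FoldingSetNodeID's more general buffering and hashing path, which is presumably where the performance difference comes from.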


================
Comment at: llvm/include/llvm/Analysis/ScalarEvolution.h:79
+  void addInteger(unsigned I) { Bits.push_back(I); }
+  void addInteger(int I) { Bits.push_back(I); }
+
----------------
mkazantsev wrote:
> The fact that we have different APIs for signed and unsigned leads me to think that we might be able to distinguish between them, e.g. distinguish a set with a single `(int)1` from a set with a single `(unsigned)1`, but in fact we can't. What was the intention?
> 
> Maybe it's better to just remove the signed version.
The main reason for the signed overload is to avoid ambiguous overload resolution when `addInteger` is called with signed ints. In the end the signedness doesn't really matter, as we only look at the bits.
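
A small illustration of the ambiguity the extra overload avoids (hypothetical call site; this assumes the class also has the `uint64_t` overload shown further down in the patch):

  FoldID ID;
  int NumOperands = 3;
  ID.addInteger(NumOperands); // With only addInteger(unsigned) and
                              // addInteger(uint64_t), this plain-int call
                              // would be ambiguous; the int overload makes
                              // it well-formed.
  ID.addInteger(42u);         // Still resolves to addInteger(unsigned).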


================
Comment at: llvm/include/llvm/Analysis/ScalarEvolution.h:84
+    addInteger(unsigned(I >> 32));
+  }
+
----------------
mkazantsev wrote:
> What worries me is how easy it is to get a collision here. Basically, `{0, 0, ~0, ~0, 0, 0}` will collide with `{0ULL, ~0ULL, 0ULL}`. In your use case it should really be fine, because we add a lot of pointers, but if this is to be made more universal, we need to think about making it harder to cheat (mix types into the hash?).
Yeah, unfortunately this is a potential issue. I went with your suggestion to include the number of entries in the vector in the hash, which should hopefully help a bit.
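
To make the collision concrete (illustrative only; this assumes the `uint64_t` overload splits its argument into two 32-bit words, as in the snippet above):

  FoldID A, B;
  A.addInteger(0u);  A.addInteger(0u);
  A.addInteger(~0u); A.addInteger(~0u);
  A.addInteger(0u);  A.addInteger(0u);   // Bits == {0, 0, ~0, ~0, 0, 0}

  B.addInteger(uint64_t(0));             // appends {0, 0}
  B.addInteger(~uint64_t(0));            // appends {~0, ~0}
  B.addInteger(uint64_t(0));             // appends {0, 0}
  // B ends up with the same Bits contents, so the two keys collide. Seeding
  // the hash with Bits.size() only helps when the total word counts differ;
  // for SCEV's pointer-heavy keys the remaining collisions are theoretical.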


================
Comment at: llvm/include/llvm/Analysis/ScalarEvolution.h:97
+  unsigned computeHash() const {
+    unsigned Hash = Bits[0];
+    for (unsigned I = 1; I != Bits.size(); ++I)
----------------
mkazantsev wrote:
> `assert(!Bits.empty())`?
> Otherwise, if we want to allow empty sets (dunno why, just for generality), maybe start with `Hash = Bits.size()` and then mix in all elements?
> Otherwise, if we want to allow empty sets (dunno why, just for generality), maybe start with `Hash = Bits.size()` and then mix in all elements?

I went with adding `Hash = Bits.size()`. 
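
A sketch of how that looks (the mixing step is elided in the snippet above; `llvm::detail::combineHashValue` is used here purely for illustration):

  unsigned computeHash() const {
    // Seed with the element count so an empty key is well-defined and keys
    // with different word counts are less likely to collide.
    unsigned Hash = Bits.size();
    for (unsigned Word : Bits)
      Hash = llvm::detail::combineHashValue(Hash, Word);
    return Hash;
  }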


================
Comment at: llvm/include/llvm/Analysis/ScalarEvolution.h:125
+  static unsigned getHashValue(const FoldID &Val) {
+    return Val.computeHash() & ~(1 << (8 * sizeof(unsigned) - 1));
+  }
----------------
mkazantsev wrote:
> is it just `& std::numeric_limits<int>::max()`? Not asking for a change, just trying to understand what the intention was.
The main reason was to avoid potential conflicts with the empty/tombstone values, although that probably doesn't help much in practice. Also, now that the hash is seeded with `Hash = Bits.size()`, it should be even less of an issue.
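
For reference, the mask just clears the top bit of the unsigned hash, i.e. it is equivalent to masking with `std::numeric_limits<int>::max()` (quick illustrative check):

  #include <limits>

  static_assert(~(1u << (8 * sizeof(unsigned) - 1)) ==
                    unsigned(std::numeric_limits<int>::max()),
                "the mask clears exactly the sign bit");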


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137505/new/

https://reviews.llvm.org/D137505


