[PATCH] D158250: [IR] Add more details to StructuralHash

Aiden Grossman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 21 23:36:19 PDT 2023


aidengrossman added inline comments.


================
Comment at: llvm/lib/IR/StructuralHash.cpp:48
+    // get decent coverage over the function.
+    if (ConstantInt *ConstInt = dyn_cast<ConstantInt>(Operand)) {
+      if (ConstInt->getBitWidth() <= 64) {
----------------
aidengrossman wrote:
> aeubanks wrote:
> > we really shouldn't have to re-implement `llvm::hash_value`. IIUC the reason we're not is because 32-bit vs 64-bit hashes are different which causes MergeFunctions to act differently because it sorts hashes. but IMO MergeFunctions shouldn't be sorting by hash, that's kinda bad. I think we should go back to the original version of https://reviews.llvm.org/D158217 and fix MergeFunctions to be more deterministic and not rely on hash values. It seems like we're just currently lucky in that the hashes for test cases in MergeFunctions are ordered consistently between 32/64 bit.
> I agree that it shouldn't be sorting by hash, but I'd rather not touch that if I don't have to. When I relanded that differential in 64da0be1fc06ee2199bd27c980736986e0eccd9d I updated it to use what `MergeFunctions` did originally and use `hash_16_bytes` along with a `uint64_t` rather than `size_t` (forgot to update the differential with the changes). The MergeFunction test was consistent before https://reviews.llvm.org/D158217 as they just directly called `hash_16_bytes` and stored everything as a `uint64_t`.
Sorry for not reading your message close enough initially. I missed basically the entire point. I've switched over to using the native `hash_value` implementations for `APInt` and `APFloat` to avoid the reimplementation. I think within `StructuralHash` we can guarantee the same hash between 32 bit and 64 bit platforms outside of the detailed hash case, but inside of the detailed hash case we don't make that guarantee. It would be good to order by something other than hash within `MergeFunctions`, but the ordering needs to be deterministic and ideally fast and hashes make quite a bit of sense for that (other than the significant churn when they change).

I'm thinking keeping the determinism between 64-bit/32-bit makes sense to preserve the MergeFunctions semantics (at least for now) for the non-detailed case and the detailed case can take advantage of `hash_value` implementations where appropriate. What do you think?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158250/new/

https://reviews.llvm.org/D158250



More information about the llvm-commits mailing list