[PATCH] D158250: [IR] Add more details to StructuralHash

Aiden Grossman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Aug 18 14:31:36 PDT 2023


aidengrossman added inline comments.


================
Comment at: llvm/lib/IR/StructuralHash.cpp:83-91
+          if (const IntrinsicInst *InstrinsicInstruction =
+                  dyn_cast<IntrinsicInst>(&Inst))
+            hash(InstrinsicInstruction->getIntrinsicID());
+          if (const CallInst *CallInstruction = dyn_cast<CallInst>(&Inst))
+            hash(CallInstruction->getCalledFunction()->getName());
+
+          for (unsigned I = 0; I < Inst.getNumOperands(); ++I) {
----------------
nikic wrote:
> aidengrossman wrote:
> > nikic wrote:
> > > This seems like a very random collection of things to add to the hash. Why isn't this just hashing all the operands? That should cover the operand types, the called function and the intrinsic ID.
> > I was under the impression that it wasn't possible to just hash a value. I can hash the pointer, but I'm not sure that would be correct in all cases (unless everything is uniqued appropriately).
> > 
> > https://github.com/llvm/llvm-project/blob/d9cb76bc4d5e903fe045c58a42fc791d0c70172b/llvm/include/llvm/Analysis/IRSimilarityIdentifier.h#L261 implements logic that seems to follow those assumptions (and is a similar implementation to what is here).
> > 
> > Definitely could be that my assumptions are incorrect here though.
> You are right that we can't "just" hash the operand pointers, but I'd still use that as the general approach. If the operand is a `Constant` you should be able to hash the pointer as those are uniqued, for Arguments you can take the argument number, and for Instructions you could use only the type for now (to handle those we'd have to number instructions).
Also, re: pointers: my intention is for this hash to be stable across different modules given the same function (I'm interested in looking at function deduplication across modules), so just hashing pointers would produce incorrect results.

I'm planning to add support for other constant types, and for instruction operands through numbering, in follow-up patches. My intention with the `Detailed` flag currently isn't to make every semantically meaningful difference produce a different hash, but to capture the most common cases and be "good enough" in most cases.
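
To make the operand-hashing idea concrete, here is a minimal standalone sketch, not the code in this patch. It assumes a toy `Operand` struct in place of LLVM's `Value` hierarchy; `OperandKind`, `Operand`, `combine`, `hashOperand`, and `hashOperands` are all hypothetical names invented for illustration. It hashes constants by value rather than by pointer, arguments by argument number, and instruction operands by type only, so the result does not depend on addresses.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for IR operand kinds; these are NOT LLVM classes.
enum class OperandKind { ConstantInt, Argument, Instruction };

struct Operand {
  OperandKind Kind;
  unsigned TypeID;  // stand-in for the operand's type
  uint64_t Payload; // constant value or argument number; ignored for instructions
};

// FNV-1a-style mixing step (an assumption; any stable combiner would do).
inline uint64_t combine(uint64_t Seed, uint64_t V) {
  return (Seed ^ V) * 1099511628211ULL;
}

// Hash one operand using only data that is identical across modules.
uint64_t hashOperand(uint64_t Seed, const Operand &Op) {
  Seed = combine(Seed, static_cast<uint64_t>(Op.Kind));
  Seed = combine(Seed, Op.TypeID);
  switch (Op.Kind) {
  case OperandKind::ConstantInt: // hash the value, never the address
  case OperandKind::Argument:    // hash the argument number
    Seed = combine(Seed, Op.Payload);
    break;
  case OperandKind::Instruction:
    // Type only for now; distinguishing instruction operands would
    // require numbering the instructions first.
    break;
  }
  return Seed;
}

// Fold all operands of one instruction into a single hash.
uint64_t hashOperands(const std::vector<Operand> &Ops) {
  uint64_t H = 14695981039346656037ULL; // FNV offset basis
  for (const Operand &Op : Ops)
    H = hashOperand(H, Op);
  return H;
}
```

Two operand lists built independently with the same contents hash identically here, which is the cross-module stability property I'm after. Two instruction operands of the same type collide by design until numbering is added.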


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158250/new/

https://reviews.llvm.org/D158250


