[llvm] [IR2Vec] Restructuring Vocabulary (PR #145119)

Aiden Grossman via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 3 20:14:57 PDT 2025


================
@@ -343,26 +431,80 @@ Error IR2VecVocabAnalysis::readVocabulary() {
     return createStringError(errc::illegal_byte_sequence,
                              "Vocabulary sections have different dimensions");
 
-  auto scaleVocabSection = [](ir2vec::Vocab &Vocab, double Weight) {
-    for (auto &Entry : Vocab)
-      Entry.second *= Weight;
-  };
-  scaleVocabSection(OpcodeVocab, OpcWeight);
-  scaleVocabSection(TypeVocab, TypeWeight);
-  scaleVocabSection(ArgVocab, ArgWeight);
-
-  Vocabulary.insert(OpcodeVocab.begin(), OpcodeVocab.end());
-  Vocabulary.insert(TypeVocab.begin(), TypeVocab.end());
-  Vocabulary.insert(ArgVocab.begin(), ArgVocab.end());
+  Dim = OpcodeDim; // All sections have the same dimension
 
   return Error::success();
 }
 
-IR2VecVocabAnalysis::IR2VecVocabAnalysis(const Vocab &Vocabulary)
-    : Vocabulary(Vocabulary) {}
+void IR2VecVocabAnalysis::generateNumMappedVocab() {
+
+// Placeholder for handling missing entities in the vocabulary.
+// Currently, we use a zero vector. In the future, we will throw an error to
+// ensure that *all* known entities are present in the vocabulary.
+#define HANDLE_MISSING_ENTITY(VAL)                                             \
----------------
boomanaiden154 wrote:

This might be cleaner as a helper function rather than a macro?
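
As a rough sketch of the helper-function shape (not the PR's actual code): the macro body is cut off in the hunk above, so this only assumes the behavior described in its comment, namely falling back to a zero vector when an entity is missing. `Embedding`, `VocabMap`, and `lookupOrZero` are hypothetical names built on standard containers rather than the real `ir2vec` types.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for the PR's types; not the real ir2vec API.
using Embedding = std::vector<double>;
using VocabMap = std::map<std::string, Embedding>;

// Returns the embedding stored for Key, or a zero vector of length Dim when
// Key is absent. The zero-vector fallback follows the placeholder behavior
// described in the macro's comment; the lookup itself is an assumption.
static Embedding lookupOrZero(const VocabMap &Vocab, const std::string &Key,
                              unsigned Dim) {
  assert(Dim > 0 && "dimension must be known before lookup");
  auto It = Vocab.find(Key);
  if (It == Vocab.end())
    return Embedding(Dim, 0.0); // Missing entity: zero vector for now.
  return It->second;
}
```

A function like this is type-checked, visible in a debugger, and gives a single place to swap the zero-vector fallback for an error later, which is harder to do cleanly with a macro.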

https://github.com/llvm/llvm-project/pull/145119