[llvm] Refactoring llvm-ir2vec.cpp for better separation of concerns in the Tooling classes (PR #170078)

Mon Dec 8 21:46:01 PST 2025

================
@@ -282,45 +372,31 @@ class IR2VecTool {
 
   /// Generate embeddings for a single function
   void generateEmbeddings(const Function &F, raw_ostream &OS) const {
+    assert(Vocab && Vocab->isValid() && "Vocabulary not initialized");
     if (F.isDeclaration()) {
       OS << "Function " << F.getName() << " is a declaration, skipping.\n";
       return;
     }
 
-    // Create embedder for this function
-    assert(Vocab->isValid() && "Vocabulary is not valid");
-    auto Emb = Embedder::create(IR2VecEmbeddingKind, F, *Vocab);
-    if (!Emb) {
-      WithColor::error(errs(), ToolName)
-          << "Failed to create embedder for function " << F.getName() << "\n";
-      return;
-    }
-
     OS << "Function: " << F.getName() << "\n";
----------------
nishant-sachdeva wrote:

While I'm at it, I noticed that the two `generateEmbeddings(...)` functions handle their vocabulary checks differently. This is not ideal, since a caller can invoke either one independently.

It would be better to use one standard error message for the same error. I'll push this change as well, unless there are any objections.
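
A minimal sketch of what a shared check could look like, assuming both entry points route through one helper. Everything here is illustrative: `SketchVocab`, `checkVocab`, `generateForFunction`, and `generateForModule` are hypothetical names, the message text is made up, and plain `std::ostream` stands in for `raw_ostream` so the snippet compiles without LLVM headers. It is not the actual llvm-ir2vec code.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the real vocabulary type; only isValid() matters.
struct SketchVocab {
  bool Valid = false;
  bool isValid() const { return Valid; }
};

// Hypothetical shared helper. Both entry points call it, so a missing or
// invalid vocabulary always produces the same diagnostic.
inline bool checkVocab(const SketchVocab *Vocab, std::ostream &Err) {
  if (Vocab && Vocab->isValid())
    return true;
  Err << "error: vocabulary is not initialized or is invalid\n";
  return false;
}

// Function-level entry point (stands in for one generateEmbeddings variant).
inline bool generateForFunction(const SketchVocab *Vocab,
                                const std::string &Name, std::ostream &OS,
                                std::ostream &Err) {
  if (!checkVocab(Vocab, Err))
    return false;
  OS << "Function: " << Name << "\n";
  return true;
}

// Module-level entry point (stands in for the other variant). It checks once
// up front, so it can still be called without going through the function-level
// path first.
inline bool generateForModule(const SketchVocab *Vocab,
                              const std::vector<std::string> &Names,
                              std::ostream &OS, std::ostream &Err) {
  if (!checkVocab(Vocab, Err))
    return false;
  for (const std::string &Name : Names)
    generateForFunction(Vocab, Name, OS, Err);
  return true;
}
```

With this shape, a caller that reaches either entry point with a bad vocabulary sees the same message, and no output is written before the check passes.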

https://github.com/llvm/llvm-project/pull/170078