[llvm] [llvm-ir2vec] Adding initEmbedding API to ir2vec python bindings (PR #177092)

Nishant Sachdeva via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 21 01:47:37 PST 2026


nishant-sachdeva wrote:

@mtrofin, as for the suggestion:

> can't the python side present the tool as an object, wrapping the C++ object that holds on the path?

Yes, that's the idea. Currently the `llvm-ir2vec.cpp` tool acts as a wrapper around the IR2Vec Analysis class and invokes the relevant methods to return the embeddings.

The plan is to have the `PyIR2VecTool` class develop into a similar wrapper. With this in mind, in an earlier PR, we split the llvm-ir2vec.cpp into a lib/ object.

However, the IR2Vec Vocab reading pipeline is tightly linked with the pass CLI tool. Currently, this binding module attempts a hacky fix to override that vocab path and enable reading vocab files per the relevant user args. 

But yes, going forward, I'm thinking we can plan a larger refactor of the Vocabulary reading pipeline. I think https://github.com/llvm/llvm-project/issues/159427 raised some similar points?
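
For illustration only, the "tool as an object that holds on to the path" idea could look roughly like the sketch below. It is plain Python, not the actual bindings. `VocabHolder`, `ToolSketch`, and `lookup` are hypothetical names, and the JSON key-to-vector vocab layout is an assumption made for the example. None of this is the API added in this PR.

```python
import json
from pathlib import Path


class VocabHolder:
    """Hypothetical stand-in for the C++ object that owns the vocab path."""

    def __init__(self, vocab_path):
        self.vocab_path = Path(vocab_path)
        self._vocab = None  # loaded lazily, at most once per object

    def vocab(self):
        # Read the file on first use instead of relying on a global CLI option.
        if self._vocab is None:
            with self.vocab_path.open() as f:
                self._vocab = json.load(f)
        return self._vocab


class ToolSketch:
    """Hypothetical Python-facing tool object (not the real PyIR2VecTool)."""

    def __init__(self, vocab_path):
        # Each tool instance carries its own vocab path, so two tools with
        # different vocab files can coexist in one process.
        self._holder = VocabHolder(vocab_path)

    def lookup(self, key):
        # Assumed layout: a JSON object mapping entity names to vectors.
        return self._holder.vocab()[key]
```

The point of the sketch is that the path lives on the object rather than in process-wide state, which is what would remove the need to override the CLI option from the binding module.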

CC - @svkeerthy 

https://github.com/llvm/llvm-project/pull/177092

