[llvm] [MIR2Vec] Refactor MIR vocabulary to use opcode-based indexing (PR #161713)

James Y Knight via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 8 14:19:30 PDT 2025


jyknight wrote:

This introduces undefined behavior (UB). In the error case, the nullptr in:
```
frame #1: 0x000055555ad9f2ee llc`::getMIR2VecVocabulary() at MIR2Vec.cpp:244:14
   241    if (StrVocabMap.empty()) {
   242      if (Error Err = readVocabulary()) {
   243        emitError(std::move(Err), M.getContext());
-> 244        return mir2vec::MIRVocabulary(std::move(StrVocabMap), nullptr);
   245      }
   246    }
   247 
```
is passed to the constructor, which immediately dereferences it to initialize a reference member (undefined behavior when the pointer is null):
```
   frame #0: 0x000055555ad9e01d llc`::MIRVocabulary() at MIR2Vec.cpp:0:1
   50  
   51   MIRVocabulary::MIRVocabulary(VocabMap &&OpcodeEntries,
   52                                const TargetInstrInfo *TII)
-> 53       : TII(*TII) {
   54     // Fixme: Use static factory methods for creating vocabularies instead of
   55     // public constructors
   56     // Early return for invalid inputs - creates empty/invalid vocabulary
```

Either the TII member needs to be a pointer, or else the constructor can't be called with nullptr.
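
To make the two options concrete, here is a minimal sketch using made-up stand-in types. `FakeInstrInfo`, `VocabWithPointer`, `VocabWithReference`, and `makeVocab` are hypothetical names for illustration only; they are not the actual MIR2Vec classes or any LLVM API, and the real fix may look different.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for TargetInstrInfo; not the real LLVM class.
struct FakeInstrInfo {
  unsigned NumOpcodes = 0;
};

// Option 1: store a pointer, so the "invalid vocabulary" state
// (null TII) is representable without dereferencing anything.
class VocabWithPointer {
  const FakeInstrInfo *TII; // may be null for an invalid vocabulary

public:
  explicit VocabWithPointer(const FakeInstrInfo *TII) : TII(TII) {}
  bool isValid() const { return TII != nullptr; }
  unsigned numOpcodes() const { return TII ? TII->NumOpcodes : 0; }
};

// Option 2: keep the reference member, but take a reference in the
// constructor so a null pointer cannot reach it.
class VocabWithReference {
  const FakeInstrInfo &TII;

public:
  explicit VocabWithReference(const FakeInstrInfo &TII) : TII(TII) {}
  unsigned numOpcodes() const { return TII.NumOpcodes; }
};

// Hypothetical factory for option 2: the null check happens before any
// dereference, and the error path returns no object at all.
std::unique_ptr<VocabWithReference> makeVocab(const FakeInstrInfo *TII) {
  if (!TII)
    return nullptr;
  return std::make_unique<VocabWithReference>(*TII);
}
```

With option 1 the error path can keep passing nullptr, but every use of the member must then tolerate null. With option 2 the error path has to stop constructing a vocabulary object, which fits the factory-method direction suggested by the Fixme comment in the constructor.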

https://github.com/llvm/llvm-project/pull/161713

