[llvm] [DWARF] Speedup .gdb_index dumping (PR #151806)

Wed Aug 6 10:54:11 PDT 2025

dwblaikie wrote:

> > Myself, I've worked on various indexing solutions at Google due to the large size of single binaries we have
> 
> Although I'm not particularly interested in gdb_index, I am very much interested in another index format: GSYM. Given your vast experience with different indexes, maybe you know places/folks to ask questions/request reviews about it?

Yeah, that's definitely @clayborg's wheelhouse.

(thanks for the other context on your use cases)

> > Hmm, actually at a high level: I guess this ConstantPoolVectors isn't sorted, is it? So we can't do a binary search... could we sort it? I guess not - since we do want to dump it in a way that matches the input too (in case the on-disk ordering is important to debugging the data at some point)?
> 
> I think it is sorted by construction, but given that we look for an exact match, for big enough vectors it would likely still be faster to put offset->id in a hash map rather than do a binary search.

Hmm - is it? It looked like the offsets were read in from the file in the SymTableSize loop in parseImpl - that doesn't look like it's necessarily ordered...
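
If the ordering isn't guaranteed, the hash map approach sidesteps the question. Roughly what I have in mind is below. This is only a sketch, not the patch's code: `PoolEntry`, `buildOffsetIndex` and `findByOffset` are made-up names, and I'm assuming ConstantPoolVectors is effectively a vector of (offset, values) pairs kept in the order they were read.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for one constant pool entry: (file offset, values),
// stored in the order it was read, which may or may not be sorted.
using PoolEntry = std::pair<uint32_t, std::vector<uint32_t>>;

// Build a side table mapping each offset to its position in the pool.
// The pool itself is not reordered, so dumping still follows on-disk order.
std::unordered_map<uint32_t, size_t>
buildOffsetIndex(const std::vector<PoolEntry> &Pool) {
  std::unordered_map<uint32_t, size_t> Index;
  Index.reserve(Pool.size());
  for (size_t I = 0; I < Pool.size(); ++I)
    Index.emplace(Pool[I].first, I); // emplace keeps the first entry on duplicates
  return Index;
}

// Exact-match lookup; returns std::nullopt if no entry has this offset.
std::optional<size_t>
findByOffset(const std::unordered_map<uint32_t, size_t> &Index,
             uint32_t Offset) {
  auto It = Index.find(Offset);
  if (It == Index.end())
    return std::nullopt;
  return It->second;
}
```

Building the table is one linear pass and each lookup is expected constant time, so it doesn't matter whether the input happened to be sorted. In-tree, something like llvm::DenseMap would presumably be the more natural container than std::unordered_map.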

https://github.com/llvm/llvm-project/pull/151806