[llvm-dev] Implementation of DWARF expression parser

Gwynne Raskind via llvm-dev llvm-dev at lists.llvm.org
Sat Feb 20 22:50:03 PST 2016


Hi,

This is my first post to this list, so I apologize in advance if I mess up on any list etiquette. Jumping right in, I’m making use of the DebugInfo/DWARF APIs to get debugging information out of binaries (what else!). One of the bits of data I need is the location information stored in the location list section as well as inline in DW_AT_location attributes and similar.

So far I’ve succeeded in making enough sense of the API to actually extract the raw data, though I’ve found the API somewhat confusing - for example, as far as I can tell there are two different code paths that the DWARFUnit implementation takes to extract DIEs which both seem to do basically the same thing? I’m well versed in the DWARF format, but not so much in LLVM itself as of yet; it wouldn’t take much for me to be just plain wrong about what’s going on. (That being said, I could lose myself in the LLVM code base for a long time if I let myself; it’s already taught me more about compiler design and even the C++ language than my entire career up to now!)

To give some detail, the code paths I saw were DWARFCompileUnit::getNumDIEs(), which parses all DIEs via the extractDIEsToVector() path, and DWARFCompileUnit::extract(), which parses just one specified DIE via the extractImpl() path. It’s not obvious to me why the extractImpl() code exists alongside DWARFDebugInfoEntryMinimal::extractFast(). In general, the "extract" paradigm is difficult to get a handle on, and I’ve yet to find any documentation. Another example is the DWARFContext::getCompileUnitForAddress() API, which is private; I couldn’t find a way to invoke its logic (aside from iterating over all the units and using DWARFDebugInfoEntryMinimal::getInlinedChainForAddress() on each). I have the sense that I’m misunderstanding the intended usage pattern for these objects entirely, or that at the very least I’m thinking at the wrong abstraction layer.

Unfortunately, I’m stuck here. As far as I can tell, there is no implementation of a DWARF expression parser, per §2.5 of the DWARF 4 standard, which is necessary for making sense of DWARF location information. I don’t mind building one myself, but before I do that I’d like to know if I’m duplicating effort. If there is in fact an expression parser in the LLVM core, and not just in places where it would obviously be needed (such as LLDB), I haven’t found it, and I’d appreciate a pointer to what I missed. Even if someone is only working on such a parser at the moment, that would be great to know.
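
For illustration, here is a minimal sketch of what decoding such an expression involves: walking the byte string, reading each opcode, and pulling out its ULEB128, SLEB128, or fixed-size operand. It handles only a small subset of the DWARF 4 operations, assumes little-endian target data, and only decodes; it does not evaluate anything. The names Op, Reader, and decodeExpr are hypothetical and are not LLVM API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical types for illustration only; none of this is LLVM API.
struct Op {
  uint8_t Opcode;
  std::string Name;
  int64_t Operand; // 0 when the operation takes no operand
};

// Bounds-checked cursor over the raw expression bytes.
class Reader {
  const std::vector<uint8_t> &Data;
  size_t Pos = 0;
  bool Failed = false;

public:
  explicit Reader(const std::vector<uint8_t> &D) : Data(D) {}
  bool atEnd() const { return Pos >= Data.size(); }
  bool failed() const { return Failed; }

  uint8_t u8() {
    if (Pos >= Data.size()) { Failed = true; return 0; }
    return Data[Pos++];
  }
  // Fixed-size unsigned value; assumes little-endian data.
  uint64_t fixed(unsigned Bytes) {
    uint64_t V = 0;
    for (unsigned I = 0; I < Bytes; ++I) V |= uint64_t(u8()) << (8 * I);
    return V;
  }
  // Unsigned LEB128: 7 payload bits per byte, high bit means "more follows".
  uint64_t uleb() {
    uint64_t V = 0;
    unsigned Shift = 0;
    for (;;) {
      uint8_t B = u8();
      if (Failed) return 0;
      if (Shift < 64) V |= uint64_t(B & 0x7f) << Shift;
      Shift += 7;
      if (!(B & 0x80)) return V;
    }
  }
  // Signed LEB128: as above, then sign-extend from bit 6 of the last byte.
  int64_t sleb() {
    uint64_t V = 0;
    unsigned Shift = 0;
    uint8_t B;
    do {
      B = u8();
      if (Failed) return 0;
      if (Shift < 64) V |= uint64_t(B & 0x7f) << Shift;
      Shift += 7;
    } while (B & 0x80);
    if (Shift < 64 && (B & 0x40)) V |= ~uint64_t(0) << Shift;
    return int64_t(V);
  }
};

// Decodes Bytes into Out. Returns false on truncated input or on an opcode
// outside the small subset handled here.
bool decodeExpr(const std::vector<uint8_t> &Bytes, std::vector<Op> &Out,
                unsigned AddrSize = 8) {
  Out.clear();
  Reader R(Bytes);
  while (!R.atEnd()) {
    uint8_t C = R.u8();
    Op O{C, "", 0};
    if (C == 0x03) { O.Name = "DW_OP_addr"; O.Operand = int64_t(R.fixed(AddrSize)); }
    else if (C == 0x06) O.Name = "DW_OP_deref";
    else if (C == 0x22) O.Name = "DW_OP_plus";
    else if (C == 0x23) { O.Name = "DW_OP_plus_uconst"; O.Operand = int64_t(R.uleb()); }
    else if (C >= 0x30 && C <= 0x4f) O.Name = "DW_OP_lit" + std::to_string(C - 0x30);
    else if (C >= 0x50 && C <= 0x6f) O.Name = "DW_OP_reg" + std::to_string(C - 0x50);
    else if (C >= 0x70 && C <= 0x8f) {
      O.Name = "DW_OP_breg" + std::to_string(C - 0x70);
      O.Operand = R.sleb(); // signed offset from the register's value
    }
    else if (C == 0x91) { O.Name = "DW_OP_fbreg"; O.Operand = R.sleb(); }
    else if (C == 0x93) { O.Name = "DW_OP_piece"; O.Operand = int64_t(R.uleb()); }
    else if (C == 0x9c) O.Name = "DW_OP_call_frame_cfa";
    else if (C == 0x9f) O.Name = "DW_OP_stack_value";
    else return false; // not in this sketch's subset
    if (R.failed()) return false; // operand ran past the end of the data
    Out.push_back(O);
  }
  return true;
}
```

For example, the two bytes 0x91 0x70, a common DW_AT_location for a stack variable, decode to a single DW_OP_fbreg with operand -16. A real parser would also need the remaining opcodes, big-endian support, and an evaluator with access to register and memory state.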

If there are other resources I could use besides the source code and the mailing list to get answers for questions like this, I’d be grateful for pointers to those too; I haven’t managed to find much useful information from simple Web searches thus far. I’m also curious if this is the sort of thing for which filing a feature request would be appropriate.

Thanks in advance for any help!

— Gwynne Raskind


