[llvm-dev] A query language for LLVM IR (XPath)

Sean Silva via llvm-dev llvm-dev at lists.llvm.org
Tue Oct 31 10:00:26 PDT 2017


This is so cool! I once had a similar idea but the way I was thinking about
it ended up more complex than I had time to implement (I sketched it here:
http://lists.llvm.org/pipermail/llvm-dev/2013-November/067720.html).

Good idea using XPath to simplify the implementation and reuse existing
languages/libraries as a starting point!

On Oct 29, 2017 6:47 AM, "Alessandro Di Federico via llvm-dev" <
llvm-dev at lists.llvm.org> wrote:

> Hi, when dealing with LLVM IR, getting to a desired point in the code
> is sometimes a bit cumbersome, in particular if you're instrumenting
> existing code: it takes a lot of nested loops and if checks.
>
> Maybe all of this could be avoided by employing a query language. Since
> an LLVM module can be seen as a sort of tree with attributes, I think
> that reusing an existing query language for XML would be appropriate.
>
> In particular I chose XPath [1] since it's more expressive than, say,
> CSS selectors (e.g., you can move from the current element to the
> parent).
>
> Therefore, in a spare night, I took pugixml [2], a lightweight XML parser
> with XPath support, stripped away everything that was XML-specific and
> adapted it so that it can query an arbitrary tree, as long as a class
> providing certain traits is supplied.
>
> Attached you can find the class to query an LLVM module and an example
> LLVM module (using LLVM 3.8, but newer versions should do too).
>
> The current implementation pretends that a module looks like the
> following XML tree (more or less):
>
>     <main.ll>
>       <main>
>         <basicblock1>
>           <alloca />
>           <alloca />
>           ...
>         </basicblock1>
>         ...
>       </main>
>     </main.ll>
>
> Additional information could be encoded in attributes.
> Please note that the queries are done on the LLVM IR directly, no XML
> tree is materialized.
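
A minimal sketch of what "a class providing certain traits" could look like,
assuming a toy Node type in place of real LLVM classes. TreeTraits and
selectPath are hypothetical names, not taken from pugixml or from the
attached code, and only plain child steps and "*" are handled. The point is
that the walker only ever asks the traits for a name and for the children,
so nothing XML-like has to be built.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for the module/function/basic block/instruction hierarchy.
// This is NOT an LLVM class; it only mimics the tree shape described above.
struct Node {
  std::string Name;
  std::vector<Node> Children;
};

// Hypothetical traits: the only things the query engine asks of a tree.
template <typename T> struct TreeTraits;

template <> struct TreeTraits<Node> {
  static const std::string &name(const Node &N) { return N.Name; }
  static const std::vector<Node> &children(const Node &N) { return N.Children; }
};

// Evaluate an absolute path made of child steps, e.g. "/main/*/store".
// Assumes the path starts with '/'; predicates and other axes are not handled.
template <typename T>
std::vector<const T *> selectPath(const T &Root, const std::string &Path) {
  std::vector<const T *> Current{&Root};
  std::size_t Pos = 0;
  while (Pos < Path.size()) {
    // A step starts right after the '/' at Pos and ends before the next '/'.
    std::size_t End = Path.find('/', Pos + 1);
    if (End == std::string::npos)
      End = Path.size();
    const std::string Step = Path.substr(Pos + 1, End - Pos - 1);

    // Walk the underlying tree through the traits; no copy is made.
    std::vector<const T *> Next;
    for (const T *N : Current)
      for (const T &Child : TreeTraits<T>::children(*N))
        if (Step == "*" || TreeTraits<T>::name(Child) == Step)
          Next.push_back(&Child);

    Current = std::move(Next);
    Pos = End;
  }
  return Current;
}
```

With a module node as the root, selectPath(Module, "/*/*/store") would return
pointers to the store nodes of every block of every function.
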
>
> In the following you can find some examples:
>
>     $ # Find all the basic blocks containing at least one alloca
>     $ llvm-xpath '/main/*[count(alloca) > 0]' main.ll
>
>       %1 = alloca i32, align 4
>       %2 = alloca i32, align 4
>       %i = alloca i32, align 4
>       store i32 0, i32* %1, align 4
>       store i32 %argc, i32* %2, align 4
>       %3 = load i32, i32* %2, align 4
>       store i32 %3, i32* %i, align 4
>       br label %4
>
>     $ # Find all store instructions
>     $ llvm-xpath '/*/*/store' main.ll
>       store i32 0, i32* %1, align 4
>       store i32 %argc, i32* %2, align 4
>       store i32 %3, i32* %i, align 4
>       store i32 %6, i32* %i, align 4
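
For comparison, here is roughly what the first query, '/main/*[count(alloca) > 0]',
amounts to when written by hand. This is a sketch over a toy Node type rather
than real LLVM classes, and countChildren and blocksContaining are
hypothetical helper names. It shows the nested loops and checks that the
query expression replaces.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy tree node (not an LLVM class): a name plus ordered children.
struct Node {
  std::string Name;
  std::vector<Node> Children;
};

// Counterpart of count(Name): number of direct children called Name.
std::size_t countChildren(const Node &N, const std::string &Name) {
  return static_cast<std::size_t>(
      std::count_if(N.Children.begin(), N.Children.end(),
                    [&](const Node &C) { return C.Name == Name; }));
}

// Rough equivalent of '/<Function>/*[count(<Child>) > 0]': every block of the
// named function that has at least one direct child called Child.
std::vector<const Node *> blocksContaining(const Node &Module,
                                           const std::string &Function,
                                           const std::string &Child) {
  std::vector<const Node *> Result;
  for (const Node &F : Module.Children) {
    if (F.Name != Function)
      continue;
    for (const Node &BB : F.Children)
      if (countChildren(BB, Child) > 0)
        Result.push_back(&BB);
  }
  return Result;
}
```
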
>
> Obviously this doesn't have to be exclusively a command line tool, but
> we could have something like:
>
>     for (auto *Store : TheModule.xpath<StoreInst>("/*/*/store"))
>       /* ... */
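
One way the typed result in that loop could work, as a sketch only: run the
query, then keep the matches that really are of the requested class. The
classes below are toy stand-ins (not LLVM's Instruction or StoreInst), the
query is reduced to a plain element name, select is a hypothetical method
name, and dynamic_cast is used where LLVM code would normally use dyn_cast.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy instruction hierarchy, standing in for the real one.
struct Instruction {
  explicit Instruction(std::string Op) : Opcode(std::move(Op)) {}
  virtual ~Instruction() = default;
  std::string Opcode;
};
struct StoreInst : Instruction {
  StoreInst() : Instruction("store") {}
};
struct AllocaInst : Instruction {
  AllocaInst() : Instruction("alloca") {}
};

// Toy module holding a flat list of instructions.
struct Module {
  std::vector<std::unique_ptr<Instruction>> Insts;

  // Hypothetical typed query: match by element name (a stand-in for a full
  // XPath expression), then keep only the results that are of type T.
  template <typename T>
  std::vector<T *> select(const std::string &Name) const {
    std::vector<T *> Result;
    for (const auto &I : Insts)
      if (I->Opcode == Name)
        if (auto *Typed = dynamic_cast<T *>(I.get()))
          Result.push_back(Typed);
    return Result;
  }
};
```

A caller could then write "for (StoreInst *Store : M.select<StoreInst>("store"))"
and never see a match of the wrong type.
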
>
> I'm not releasing the full code yet since it's very much work in
> progress, but if anyone is interested in such a thing, just ping me.
> The applications could range from using it in existing code to just
> providing it for fast prototyping, e.g., in llvmcpy [3].
>
> Obviously there are some open questions, such as how to deal with
> operands, which could lead to an infinite tree, or how to organize
> attributes. But it should be doable.
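
To make the operand problem concrete: if operands are exposed as children,
two values that use each other (as can happen through a phi in a loop)
unfold forever. One possible way out, sketched here purely as an assumption
and not as what the attached code does, is to bound how deep operands are
expanded. Value and expandOperands are toy, hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy value (not llvm::Value): a name plus the values it uses as operands.
struct Value {
  std::string Name;
  std::vector<const Value *> Operands;
};

// Unfold the operand graph into a tree, visiting at most MaxDepth levels
// below V. Without the bound, a use cycle would recurse forever.
void expandOperands(const Value &V, unsigned MaxDepth,
                    std::vector<std::string> &Out) {
  Out.push_back(V.Name);
  if (MaxDepth == 0)
    return;
  for (const Value *Op : V.Operands)
    expandOperands(*Op, MaxDepth - 1, Out);
}
```

A depth bound keeps every query finite but makes results depend on the chosen
limit; tracking already-visited values would be another option.
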
>
> ---
> Alessandro Di Federico
> PhD student at Politecnico di Milano
>
> [1] https://en.wikipedia.org/wiki/XPath
> [2] https://pugixml.org/
> [3] https://github.com/revng/llvmcpy