[LLVMdev] New kind of metadata to capture LLVM IR linking structure

David Blaikie dblaikie at gmail.com
Fri Mar 20 11:39:39 PDT 2015


On Fri, Mar 20, 2015 at 11:30 AM, Khilan Gudka <Khilan.Gudka at cl.cam.ac.uk>
wrote:

> Hello all
>
> llvm-link merges together the metadata from the IR files being linked
> together. This means that when linking different libraries together (i.e.
> multiple source files that have been compiled into a single LLVM IR file)
> it can be hard or impossible to identify the library boundaries.
>
> We're using LLVM to do static analysis of applications (together with
> their dependent libraries) and have found it useful to be able to determine
> which library a particular Instruction* or GlobalVariable* came from (e.g.
> so that we can ignore some of them, or focus analysis on particular ones).
>
> To preserve this information across linking, I've implemented a new kind
> of metadata node MDLLVMModule that records:
>
> 1) Which LLVM modules (i.e. LLVM IR files) have been linked into this LLVM
> module
> 2) Which compilation units directly contribute to this LLVM module (i.e.
> that are not part of an LLVM submodule)
>
> The format of the metadata looks like this:
>
> !llvm.module = !{!0}
>
> !0 = !MDLLVMModule(name: "test123.bc", modules: !1, cus: !24)
> !1 = !{!2}
> !2 = !MDLLVMModule(name: "test12.bc", cus: !3)
> !3 = !{!4, !18}
> !4 = !MDCompileUnit(... filename: "test1.c" ...)
> !18 = !MDCompileUnit(... filename: "test2.c" ...)
> !24 = !{!25}
> !25 = !MDCompileUnit(... filename: "test3.c" ...)
>
> Each linked LLVM module has the named metadata node "llvm.module", which
> points to its own MDLLVMModule node. In this example, the metadata
> describes LLVM module "test123.bc", which is built by linking module
> "test12.bc" with the compilation unit "test3.c". Module "test12.bc" is
> itself built by linking the compilation units "test1.c" and "test2.c".
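
The nesting described above can be modelled outside LLVM to show what a
consumer could recover from it. The following is only a minimal sketch in
plain Python, not the LLVM API and not the code in the patch: ModuleNode and
owning_modules are hypothetical names, and compile units are reduced to their
filenames. It walks the module tree and records, for each compile unit, the
chain of modules that contain it.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ModuleNode:
    """Hypothetical stand-in for one MDLLVMModule node (not an LLVM class)."""
    name: str
    # Submodules that were linked into this module ("modules:" in the post).
    modules: List["ModuleNode"] = field(default_factory=list)
    # Filenames of compile units that contribute directly ("cus:" in the post).
    cus: List[str] = field(default_factory=list)


def owning_modules(root: ModuleNode) -> Dict[str, List[str]]:
    """Map each compile-unit filename to its enclosing modules, outermost first."""
    result: Dict[str, List[str]] = {}

    def visit(node: ModuleNode, path: List[str]) -> None:
        here = path + [node.name]
        for cu in node.cus:
            result[cu] = here
        for sub in node.modules:
            visit(sub, here)

    visit(root, [])
    return result


# The example from the post: test123.bc links test12.bc (test1.c, test2.c)
# together with the compile unit test3.c.
test12 = ModuleNode("test12.bc", cus=["test1.c", "test2.c"])
test123 = ModuleNode("test123.bc", modules=[test12], cus=["test3.c"])

owners = owning_modules(test123)
```

Here owners["test1.c"] is the chain test123.bc, test12.bc, while
owners["test3.c"] contains only test123.bc, which is the library-boundary
information that a flat list of compile units would not carry.
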
>
> The name of a module defaults to the base filename of the output file, but
> this can be overridden with the (also new) command-line flag -module-name
> to llvm-link, as in:
>
> llvm-link -module-name=mytest -o test.bc <files>
>
> I thought this might be useful to the wider LLVM community and would like
> to see this added to LLVM.
>
> I have attached a patch that I produced against r232466. I've also added a
> corresponding DILLVMModule class.
>

What's the benefit/purpose of MDLLVMModule over just having the
MDCompileUnits themselves? I would imagine the user cares about which
source file the problem was in (obtained from the MDCompileUnit), not the
sequence of bitcode modules it may have been linked into?


>
> I would be very grateful if someone could review this.
>
> Thanks
>
> --
> Khilan Gudka
> Research Associate
> Security Group
> Computer Laboratory
> University of Cambridge
> http://www.cl.cam.ac.uk/~kg365/