[LLVMdev] How to reduce the footprint of MDNodes? (About the comment you made at BOF LTO)

Chris Lattner clattner at apple.com
Tue Nov 12 16:14:28 PST 2013


On Nov 12, 2013, at 1:28 PM, Chandler Carruth <chandlerc at google.com> wrote:
> On Mon, Nov 11, 2013 at 11:29 PM, Chris Lattner <clattner at apple.com> wrote:
> > Hi Manman (and llvmdev),
> > 
> > I filed these two bugs to track the ideas that I was cooking:
> > 
> > http://llvm.org/bugs/show_bug.cgi?id=17891
> > http://llvm.org/bugs/show_bug.cgi?id=17892
> > 
> > TL;DR: I'm saying we should go from:
> > 
> >         !14 = metadata !{i32 786445, metadata !1, metadata !10, metadata !"y", i32 3, i64 32, i64 32, i64 32, i32 0, metadata !13}
> > to:
> >         !14 = metadata !"v12,14,y,3,0,32,32,32"(metadata !1, metadata !13)
> 
> So, I like where you're going here, but a few tweaks.
> 
> First, there are two things going on here: removing an indirection through a referenced metadata node, and flattening N values into a string inclusion. Removing the indirection seems like obvious, strict goodness; my comments are about the second part.

David pointed out that PR17892 is a micro-optimization that may not even be worthwhile.  I think that flattening into a string is the important part.

> I'm moderately opposed to just encoding these in a string format. I think we can do something substantially better for space, time, and readability. Fundamentally, there is no reason the original metadata node you describe can't *encode* its operands into a dense bit-packed blob of memory. We can still expose APIs that manipulate them as separate entities, and have the AsmPrinter and AsmParser map back and forth with nice human-readable forms. But even a simple varint encoding will be both smaller and faster than ASCII.

I guess you could make it work, but would that actually be simpler than what is proposed?  If it is denser, how much denser would it have to be to justify the complexity?
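
To make the density question concrete, here is a minimal sketch of the "simple varint encoding" idea next to the comma-separated ASCII form. It assumes an unsigned LEB128-style varint (7 payload bits per byte), which is only one possible reading of "varint". The function names (appendVarint, readVarint, packFields, joinFields) are hypothetical and not part of any LLVM API, and the field list is just the integers from the example node, with the name "y" left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Append V as an unsigned LEB128-style varint: 7 payload bits per byte,
// with the high bit set on every byte except the last one.
inline void appendVarint(std::vector<uint8_t> &Out, uint64_t V) {
  do {
    uint8_t Byte = V & 0x7f;
    V >>= 7;
    if (V != 0)
      Byte |= 0x80; // more bytes follow
    Out.push_back(Byte);
  } while (V != 0);
}

// Decode one varint starting at Pos and advance Pos past it.
inline uint64_t readVarint(const std::vector<uint8_t> &In, size_t &Pos) {
  uint64_t V = 0;
  unsigned Shift = 0;
  while (true) {
    assert(Pos < In.size() && "truncated varint");
    assert(Shift < 64 && "varint too long");
    uint8_t Byte = In[Pos++];
    V |= uint64_t(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      return V;
    Shift += 7;
  }
}

// Hypothetical packed form: the integer fields of a node, back to back.
inline std::vector<uint8_t> packFields(const std::vector<uint64_t> &Fields) {
  std::vector<uint8_t> Out;
  for (uint64_t F : Fields)
    appendVarint(Out, F);
  return Out;
}

// The ASCII alternative: the same fields as comma-separated decimal text.
inline std::string joinFields(const std::vector<uint64_t> &Fields) {
  std::string S;
  for (size_t I = 0; I < Fields.size(); ++I) {
    if (I != 0)
      S += ',';
    S += std::to_string(Fields[I]);
  }
  return S;
}
```

For the integer fields {12, 14, 3, 0, 32, 32, 32}, packFields yields 7 bytes and joinFields yields an 18-character string. That comparison covers size only; it says nothing about the extra accessor, printer, and parser code a packed form would need.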

> Just to be clear, I still want the nice format (much like your proposed format, but maybe with the numbers outside of the "s) in the textual IR, I just think we should use a more direct and efficient in-memory encoding (and in-bitcode encoding if that isn't already suitably dense).

Where would the encoding schema be specified?

Note that there are simple things that can be done to make MDNodes more efficient in common cases.  The CallbackVH is only necessary for operands that are not MDNodes, MDStrings, or non-GlobalValue Constants.  If we make MDNode detect when it has “all-immortal” operands (like most debug info nodes) then we could just store Value*s directly.  This would be a completely invisible implementation improvement, but would not provide the same level of improvement as the “flatten into strings” approach.  The two are quite complementary.
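
As a rough illustration of the "all-immortal" check, here is a sketch over a toy model. ValueKind, Value, isImmortal, and hasAllImmortalOperands are hypothetical stand-ins and not LLVM's real classes or API. Treating a null operand as needing no tracking is also an assumption.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for the kinds of Value an MDNode operand might point to.
enum class ValueKind {
  MDNode,
  MDString,
  OtherConstant, // a Constant that is not a GlobalValue
  GlobalValue,
  Instruction,
  Argument
};

struct Value {
  ValueKind Kind;
};

// An operand counts as "immortal" here if it is an MDNode, an MDString, or
// a Constant other than a GlobalValue, so it would not need a CallbackVH.
inline bool isImmortal(const Value *V) {
  if (!V)
    return true; // assumption: a null operand needs no tracking
  switch (V->Kind) {
  case ValueKind::MDNode:
  case ValueKind::MDString:
  case ValueKind::OtherConstant:
    return true;
  case ValueKind::GlobalValue:
  case ValueKind::Instruction:
  case ValueKind::Argument:
    return false;
  }
  return false;
}

// A node whose operands are all immortal could store plain pointers; any
// other node would keep the tracked (CallbackVH-style) representation.
inline bool hasAllImmortalOperands(const std::vector<const Value *> &Ops) {
  for (const Value *V : Ops)
    if (!isImmortal(V))
      return false;
  return true;
}
```

In this model, a debug info node whose operands are only other metadata nodes, strings, and integer constants would take the plain-pointer path, while a node referring to a global variable would not.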

-Chris