[llvm-dev] Find uses of Metadata / DITypes

Thu Mar 26 09:24:37 PDT 2020

Thanks, that helps. At a very high level, I would be very careful about adding additional metadata. The DIType hierarchy in particular can get quite large and has in the past been a memory and performance bottleneck. If you are planning to upstream your work, you'll need to measure the impact on a full-LTO build of, e.g., clang itself and show that the memory usage doesn't explode. However, it sounds like what you are trying to do is more narrowly aimed at improving optimization remarks, so it might be feasible to just scan the information on demand, looking only at what is currently visible (like the IR Verifier is doing, for example), without blowing up the footprint for everything else.

>  Would it be unreasonable to save Metadata to Metadata uses, like what is done for Value to Value uses?

If you look at DICompositeType in C++, we don't even store the uses in the other direction as metadata pointers, but instead refer to types by their unique name, to support type uniquing in LTO.

Admittedly, I haven't thought about this deeply, but the way I would approach it would be to enumerate what types are visible from within each (inlined) lexical scope by walking only the llvm.dbg.* intrinsics in that scope, and build up a dictionary on the side to capture the reverse links.
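
To make the side-dictionary idea a bit more concrete, here is a rough sketch in plain C++. It deliberately does not use the LLVM API: ToyType, TypeTable, ReverseLinks, collectReverseLinks and buildReverseLinks are all made-up names for illustration, and the "visible variable types" list stands in for whatever you would collect from the llvm.dbg.* intrinsics in one scope. Types are referred to by name here only to keep the toy model small.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a debug-info type. This is NOT an LLVM class; it only
// models "a named type with forward links to its element types".
struct ToyType {
    std::string Name;                   // unique name of the type
    std::vector<std::string> Elements;  // names of member types (forward links)
};

using TypeTable = std::map<std::string, ToyType>;
// Maps a type name to the names of the types that contain it.
using ReverseLinks = std::map<std::string, std::set<std::string>>;

// Walk forward from Root and record, for every element type reached,
// which enclosing type refers to it.
void collectReverseLinks(const TypeTable &Types, const std::string &Root,
                         ReverseLinks &Users, std::set<std::string> &Visited) {
    if (!Visited.insert(Root).second)
        return;  // already walked; also guards against cyclic type graphs
    auto It = Types.find(Root);
    if (It == Types.end())
        return;  // leaf type with no recorded elements
    for (const std::string &Elem : It->second.Elements) {
        Users[Elem].insert(Root);
        collectReverseLinks(Types, Elem, Users, Visited);
    }
}

// Build the side table for one scope. VisibleVarTypes is assumed to hold the
// types of the variables described by debug intrinsics in that scope.
ReverseLinks buildReverseLinks(const TypeTable &Types,
                               const std::vector<std::string> &VisibleVarTypes) {
    ReverseLinks Users;
    std::set<std::string> Visited;
    for (const std::string &T : VisibleVarTypes)
        collectReverseLinks(Types, T, Users, Visited);
    return Users;
}
```

The point of starting only from the types visible in the scope is that the table stays proportional to what the remark actually needs; types that exist in the module but are not reachable from that scope never get an entry.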

On Mar 25, 2020, at 10:49 AM, Henrik Olsson <hnrklssn at gmail.com> wrote:
> 
> Ah yes, of course! For our thesis we're trying to reconstruct Value names from the IR to C syntax, to help with clarity in optimisation remarks. To do this for something like a GetElementPtr (GEP) we first have to find the name of the pointer operand, and then we try to name the offsets. Naming array offsets is relatively straightforward, but for structs we need the DICompositeType, which contains the struct field names. So we make the recursive call to name the pointer operand also return the operand's DIType, and from this we get the base type of the pointer. However, we're struggling with handling bitcasts properly at the moment. When the pointer operand of a GEP is a bitcast, say from a struct pointer type to a pointer to the struct's first field, we can figure that out by diffing the Value types and then traversing the DIType accordingly. When the cast is from a smaller to a wider type, on the other hand, we cannot just traverse the DIType structure to the wider type, as the link is only in one direction. We recognise that there may be several potential wider types matching the Value type, but this is a best-effort matching.

Would you mind posting a concrete IR example for this? It makes it easier for me to visualize the problem.

thanks,
adrian
