[PATCH] D40508: Replace long type names in IR with hashes

Wed Nov 29 22:47:14 PST 2017

sepavloff added a comment.

In https://reviews.llvm.org/D40508#939513, @rjmccall wrote:

> In https://reviews.llvm.org/D40508#938854, @sepavloff wrote:
>
> > In https://reviews.llvm.org/D40508#938686, @rjmccall wrote:
> >
> > > The LLVM linking model does not actually depend on struct type names matching.  My understanding is that, at best, it uses that as a heuristic for deciding whether to make an effort to unify two types, but it's not something that's ultimately supposed to impact IR semantics.
> >
> >
> > It is mainly true with an exception, when `llvm-link` resolves opaque types it relies on type name only. And this behavior creates the issue that https://reviews.llvm.org/D40567 tries to resolve.
>
>
> It is not clear from that report what the actual problem is.  Two incomplete types get merged by the linker?  So what?

`llvm-link` is expected to produce IR that is semantically consistent with the program originally represented by the set of TUs. In this case it does not. A function defined in source as `foo(ABC<int>&)` is converted by linking to `foo(ABC<int*>&)`, and this breaks the original semantics.
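To make the failure mode concrete, here is a toy model of name-only resolution. It is not `llvm-link`'s actual algorithm, only a dictionary lookup that mimics "an opaque type is completed by whatever definition carries the same name". The names `resolve_opaque`, `struct.ABC`, and the tuple used as a type body are all hypothetical, and the first case assumes that both specializations are printed under the same IR name.

```python
# Toy model only: NOT llvm-link's algorithm, just name-keyed lookup.
# A "module" is a dict mapping an IR type name to its body;
# a body of None stands for an opaque (incomplete) type.

def resolve_opaque(dst_types, src_types):
    """Complete opaque types in dst using src, matching on the name alone."""
    resolved = dict(dst_types)
    for name, body in dst_types.items():
        if body is None and src_types.get(name) is not None:
            resolved[name] = src_types[name]
    return resolved

# Hypothetical TU 1: only a forward declaration of ABC<int> is visible.
tu1 = {"struct.ABC": None}
# Hypothetical TU 2: defines ABC<int*>, assumed to print under the same name.
tu2 = {"struct.ABC": ("i32*",)}

# The opaque ABC<int> silently picks up the body of ABC<int*>.
print(resolve_opaque(tu1, tu2))

# With the template arguments in the name, the two types stay distinct.
tu1_named = {"struct.ABC<int>": None}
tu2_named = {"struct.ABC<int *>": ("i32*",)}
print(resolve_opaque(tu1_named, tu2_named))
```

In this sketch the first call completes the opaque type with an unrelated body, while the second leaves it opaque, which is the behavior one would want.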

> > > If we needed IR type names to match reliably, we would use a mangled name, not a pretty-print.
> >
> > There is no requirement for IR type name to be an identifier, so pretty-print fits the need of type identification.
>
> Not really; pretty-printing drops a lot of information that is pertinent in a stable identifier, like scopes and so on, and makes arbitrary decisions about other things, like where to insert spaces, namespace qualifiers, etc.

Type name mangling is indeed an attractive solution. It has at least these advantages:

- It is reversible (in theory).
- It can be more compact. For instance, there is no need to spell out a type name that has already been encountered; some kind of reference to it is sufficient.
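The compactness point can be illustrated with a small sketch. This is not a real mangling scheme; it only shows the idea of replacing a repeated type name with a short back-reference. The function name `compact` and the `S<n>` reference syntax are made up for illustration.

```python
# Illustration only: a made-up back-reference scheme, not a real mangling.

def compact(names):
    """Replace each repeated name with a reference to its first occurrence."""
    seen = {}   # name -> index assigned at first occurrence
    out = []
    for name in names:
        if name in seen:
            out.append(f"S{seen[name]}")   # short reference instead of full name
        else:
            seen[name] = len(seen)
            out.append(name)
    return out

print(compact(["ABC<int>", "ABC<int>", "XYZ", "ABC<int>"]))
# prints ['ABC<int>', 'S0', 'XYZ', 'S0']
```

A pretty-printed name has no such mechanism, so every repeated component is spelled out in full.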

And there are arguments against it:

- It makes working with IR harder for developers, as readability is lost.
- A type name in IR is mostly a decoration, with the exception of the rare case of opaque type resolution, so type name mangling may be considered overkill.

On the other hand, pretty-printing can be finely tuned so that all necessary information appears in its result. As there are no requirements on compatibility of type names across bitcode files, things like the number of spaces do not look so important; it is enough that the same version of clang was used to compile the bc files that are linked by `llvm-link`. And after all, the result is readable.

https://reviews.llvm.org/D40508