<div dir="ltr"><div><br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Oct 11, 2013 at 11:48 AM, Eric Christopher <span dir="ltr"><<a href="mailto:echristo@gmail.com" target="_blank">echristo@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">> With C++'s ODR, we are able to unique C++ types by using type identifiers to<br>
> refer to types.<br>
> Type identifiers are generated by C++ mangler. What about languages without<br>
> ODR? Should we unique C types as well?<br>
><br>
<br>
</div>We can, but the identifier will need to be constructed on, likely, a<br>
language dependent basis to ensure uniqueness.<br>
<div class="im"><br>
> One solution for C types is to generate a cross-CU unique identifier for C<br>
> types. And before linking, we update all type identifiers in a source module<br>
> with the corresponding hash of the C types, then linking can continue as<br>
> usual.<br>
><br>
<br>
</div>Yes.<br>
<div class="im"><br>
> This requires clang to generate a cross-CU unique identifier for C types<br>
> (one simple scheme is using an identifier that is unique within the CU and<br>
> concatenating the CU's file name). And it also requires hashing of C types<br>
> at DebugInfo IR level. We can add an API such as<br>
> updateTypeIdentifiers(Module *), linker can call it right before linking in<br>
> a source module.<br>
><br>
<br>
</div>I think the easiest design you'll get for uniquing C types that are<br>
named the same thing (i.e. type defined in a .h file) is to use the<br>
name of the struct combined with the file (and possibly line/column)<br>
as an identifier.</blockquote><div><br></div><div>Since C has no ODR, a struct in a .h file can expand differently when macros are defined differently,</div><div>so two CUs can end up with two different versions of the same struct. It seems that we can't assume</div>
<div>structs with the same name and defined in the same file/line/column are the same.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> If you want to unify by structure then you'll need<br>
to do something the equivalent to the type hashing that we're<br>
implementing in the back end, but that'll be more difficult to<br>
construct via the front end - it may be possible though.<br></blockquote><div><br></div><div>Hashing the types can happen either at the front end or at the IR level. That is our first design choice :)</div><div><br></div><div>
I think we should try not to hash the types for non-LTO builds at the front end or at the IR level, since it does not give us</div><div>any benefit given that we are hashing them at the back end.</div><div><br></div><div>One advantage of hashing at the IR level is that we only need to hash the MDNodes that affect the</div>
<div>type MDNode. At the front end, the AST contains more information and should be harder to hash.</div><div><br></div><div>Thanks,</div><div>Manman</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<span class="HOEnZb"><font color="#888888"><br>
-eric<br>
</font></span></blockquote></div><br></div></div>
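<div><br></div><div>The macro concern raised in the reply can be sketched roughly as follows. This is only an illustration under stated assumptions: the header packet.h, the macro WITH_CHECKSUM, and the names Member, StructDesc, locationId, structuralId and packetFromCU are all hypothetical and are not clang or LLVM APIs. A canonical string stands in for a real type hash.</div><div><br></div>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical header shared by two CUs (packet.h, struct starts on line 1):
//
//   struct packet {
//     int len;
//   #ifdef WITH_CHECKSUM
//     unsigned checksum;
//   #endif
//   };

// Simplified stand-in for the debug-info description of one struct member.
struct Member {
    std::string type;
    std::string name;
};

// Simplified stand-in for the debug-info description of a struct in one CU.
struct StructDesc {
    std::string name;
    std::string file;
    unsigned line = 0;
    std::vector<Member> members;
};

// Identifier built from name + file + line only.
std::string locationId(const StructDesc &s) {
    return s.name + "@" + s.file + ":" + std::to_string(s.line);
}

// Identifier that also folds in the members (a real scheme would hash this).
std::string structuralId(const StructDesc &s) {
    std::string key = "struct " + s.name + "{";
    for (const Member &m : s.members)
        key += m.type + " " + m.name + ";";
    return key + "}";
}

// What each CU would see, depending on whether it defines WITH_CHECKSUM.
StructDesc packetFromCU(bool withChecksum) {
    StructDesc s;
    s.name = "packet";
    s.file = "packet.h";
    s.line = 1;
    s.members.push_back(Member{"int", "len"});
    if (withChecksum)
        s.members.push_back(Member{"unsigned", "checksum"});
    return s;
}
```

<div><br></div><div>With these definitions, packetFromCU(false) and packetFromCU(true) get the same locationId but different structuralId values, which is why an identifier based only on name and file/line/column could merge two distinct C types.</div>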