<div dir="ltr"><div><br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Oct 11, 2013 at 11:48 AM, Eric Christopher <span dir="ltr"><<a href="mailto:echristo@gmail.com" target="_blank">echristo@gmail.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">> With C++'s ODR, we are able to unique C++ types by using type identifiers to<br>

> refer to types.<br>

> Type identifiers are generated by C++ mangler. What about languages without<br>

> ODR? Should we unique C types as well?<br>

><br>

<br>

</div>We can, but the identifier will likely need to be constructed on a<br>

language-dependent basis to ensure uniqueness.<br>

<div class="im"><br>

> One solution for C types is to generate a cross-CU unique identifier for C<br>

> types. And before linking, we update all type identifiers in a source module<br>

> with the corresponding hash of the C types, then linking can continue as<br>

> usual.<br>

><br>

<br>

</div>Yes.<br>

<div class="im"><br>

> This requires clang to generate a cross-CU unique identifier for C types<br>

> (one simple scheme is using an identifier that is unique within the CU and<br>

> concatenating the CU's file name). And it also requires hashing of C types<br>

> at DebugInfo IR level. We can add an API such as<br>

> updateTypeIdentifiers(Module *), linker can call it right before linking in<br>

> a source module.<br>

><br>

<br>

</div>I think the easiest design you'll get for uniquing C types that are<br>

named the same thing (i.e. type defined in a .h file) is to use the<br>

name of the struct combined with the file (and possibly line/column)<br>

as an identifier.</blockquote><div><br></div><div>Since C has no ODR, a struct in a .h file may be compiled with macros defined differently,</div><div>giving two versions of the struct in two different CUs. It seems that we can't assume</div>

<div>structs with the same name and defined in the same file/line/column are the same.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> If you want to unify by structure then you'll need<br>


to do something equivalent to the type hashing that we're<br>

implementing in the back end, but that'll be more difficult to<br>

construct via the front end - it may be possible though.<br></blockquote><div><br></div><div>Hashing the types can happen either at the front end or at IR level. That is our first design choice :)</div><div><br></div><div>
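<div>A minimal sketch of the macro concern above, assuming a hypothetical <code>TypeDesc</code> model of a C struct (this is not LLVM's debug-info metadata, and <code>locationId</code>, <code>structuralKey</code>, and the <code>WIDE</code> macro are names made up for illustration). It compares an identifier built from name/file/line with a key that also folds in the members:</div>

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of a C struct as seen by one compile unit (CU).
// Illustration only; not how LLVM represents debug-info types.
struct TypeDesc {
    std::string name;   // struct tag, e.g. "Config"
    std::string file;   // header that defines it
    unsigned line;      // line of the definition
    std::vector<std::pair<std::string, std::string>> members; // (type, name)
};

// Identifier built from name + file + line, as in the quoted suggestion.
std::string locationId(const TypeDesc &t) {
    return t.name + "@" + t.file + ":" + std::to_string(t.line);
}

// Structural key: additionally folds in every member, so definitions that
// expand differently get different keys. A real scheme would hash this.
std::string structuralKey(const TypeDesc &t) {
    std::string key = locationId(t) + "{";
    for (const auto &m : t.members)
        key += m.first + " " + m.second + ";";
    return key + "}";
}

// CU 1: config.h compiled without the (assumed) WIDE macro.
TypeDesc configWithoutWide() {
    TypeDesc t{"Config", "config.h", 3, {}};
    t.members.emplace_back("int", "id");
    return t;
}

// CU 2: same header, same line, but WIDE enables an extra member.
TypeDesc configWithWide() {
    TypeDesc t = configWithoutWide();
    t.members.emplace_back("long", "extra");
    return t;
}
```

<div>Here <code>locationId</code> returns the same string for both variants, while <code>structuralKey</code> differs, so in this model only a structure-aware key keeps the two versions of the struct apart.</div>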

I think we should avoid hashing the types at the front end or at the IR level for non-LTO builds, since it gives us</div><div>no benefit given that we already hash them at the back end.</div><div><br></div><div>One advantage of hashing at the IR level is that we only need to hash the MDNodes that affect the</div>

<div>type MDNode; at the front end, the AST contains more information and should be harder to hash.</div><div><br></div><div>Thanks,</div><div>Manman</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<span class="HOEnZb"><font color="#888888"><br>

-eric<br>

</font></span></blockquote></div><br></div></div>