<div dir="ltr"><div class="gmail_quote"><div dir="ltr">On Tue, Feb 23, 2016 at 3:07 PM Sanjoy Das &lt;<a href="mailto:sanjoy@playingwithpointers.com">sanjoy@playingwithpointers.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Tue, Feb 23, 2016 at 10:55 AM, Chandler Carruth &lt;<a href="mailto:chandlerc@gmail.com" target="_blank">chandlerc@gmail.com</a>&gt; wrote:<br>

>> Part of the challenge here is to specify the attribute in a way that<br>

>> allows inlining, but not IPA without inlining.  In fact, maybe it is<br>

>> best to not call it "interposable" at all?<br>

><br>

><br>

> Yea, this is something *very* different from interposable. GCC and other<br>

> compilers that work to support symbol interposition make specific efforts to<br>

> not inline them in specific ways (that frankly I don't fully understand, as<br>

> it doesn't seem to be always which is what the definition of interposable<br>

> indicates to me...).<br>

<br>

Sure, not calling it interposable is fine for me.  Credit where credit<br>

is due: Philip had warned me about this exact thing offline (that the<br>

term "interposable" is already taken).<br>

<br>

>> In other words, opt refined the semantics of @foo() (i.e. reduced the<br>

>> set of behaviors it may have) in ways that would make later<br>

>> optimizations invalid if we de-refine the implementation of @foo().<br>

>><br>

>> Given this, I'd say we don't need a new attribute / linkage type, and<br>

>> can add our restriction to the available_externally linkage.<br>

><br>

><br>

> Interesting example, I agree it seems quite broken. Even more interesting, I<br>

> can't see anything we do in LLVM that prevents this from breaking<br>

> essentially everywhere. =[[[[[[<br>

><br>

> link_once and link_once_odr at least seem equally broken because we don't<br>

> put the caller and callee into a single comdat or anything to ensure that<br>

> the optimized one is selected at link time.<br>

><br>

> But there are also multiple different kinds of overriding we should think<br>

> about:<br>

><br>

> 1) Can the definition get replaced at link time (or at runtime via an<br>

> interpreter) with a differently *optimized* variant stemming from the same<br>

> definition (thus it has the same behavior but not the same refinement). This<br>

> is the "ODR" guarantee in some linkages (and vaguely implied for<br>

> available_externally)<br>

><br>

> 2) Can the definition get replaced at link time (or at runtime via an<br>

> interpreter) with a function that has fundamentally different behavior<br>

><br>

> 3) To support replacing the definition, the call edge must be preserved.<br>

<br>

I'm working under the context of an optimizer that does not know if its<br>

input has been previously optimized or if its input is "raw" IR.<br>

Realistically, I'd say deviating LLVM from this will be painful.<br></blockquote><div><br></div><div>I'm not suggesting that either. I think there is a happy middle ground, but I'm probably not explaining it very effectively, sorry. Lemme just try again.</div><div><br></div><div>There are two conceptually separable aspects of IPO as it is commonly performed within LLVM. One is to use attributes on a function to optimize callers. The second is to use the definition of a function to deduce more refined attributes.</div><div><br></div><div>This separation is what I was trying to draw attention to between (1) and (2) above. My idea is that with (1) it remains fine to optimize callers based on a function's attributes, but not to deduce more refined attributes from its definition. But with (2) I don't think you can do either.</div><div><br></div><div>I think (3) differs from both (1) and (2) because in some cases the restrictions only remain *if* the call edge remains. If you nuke (or rename) the call edge, the restrictions go away completely. In other cases though (my (3) example), the compiler is required to leave that exact call edge in place.</div><div><br></div><div>Currently, we clearly don't actually separate these conceptual sides of IPO. We have a very all-or-nothing approach instead. So maybe this distinction isn't interesting. But hopefully it explains how I'm thinking of it. And because frontends can often directly specify *some* attributes that we know a priori, the distinction doesn't seem vacuous to me, at least in theory.</div><div><br></div><div>Does that explain things any better?</div></div></div>
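To make the two sides of IPO described above concrete, here is a minimal sketch of how the distinction between (1) and (2) might be modeled. It is only an illustration: the names `Replaceability`, `IPOPermissions`, and `permissionsFor` are hypothetical and do not correspond to any existing LLVM API, and the table simply restates the position taken in this message.

```cpp
// Hypothetical model; none of these names exist in LLVM.
// How a function's definition may be replaced after the optimizer sees it.
enum class Replaceability {
  Exact,         // the definition we see is the one that will run
  SameSourceODR, // kind (1): a differently optimized copy of the same source
  Arbitrary      // kind (2): a function with fundamentally different behavior
};

// The two conceptually separable aspects of IPO.
struct IPOPermissions {
  bool useAttributesInCallers;   // optimize callers from the callee's attributes
  bool deduceAttributesFromBody; // infer more refined attributes from the body
};

// Assumption: this encodes the proposal in the message, not current behavior.
inline IPOPermissions permissionsFor(Replaceability r) {
  switch (r) {
  case Replaceability::Exact:
    return {true, true};   // nothing can be swapped in, both are safe
  case Replaceability::SameSourceODR:
    return {true, false};  // attributes still hold, refinements may not
  case Replaceability::Arbitrary:
    return {false, false}; // neither holds for an unrelated replacement
  }
  return {false, false};   // conservative default for unknown values
}
```

Under this sketch, an attribute-deduction pass would check `deduceAttributesFromBody` before refining a function, while a caller-side optimization would check only `useAttributesInCallers`. The call-edge requirement from (3) is deliberately left out, since it constrains the call site rather than either of these two permissions.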