[Lldb-commits] [PATCH] D78972: Treat hasWeakODRLinkage symbols as external in REPL computations

Thu Apr 30 12:21:14 PDT 2020

jingham added a comment.

In D78972#2012796 <https://reviews.llvm.org/D78972#2012796>, @labath wrote:

> In D78972#2011571 <https://reviews.llvm.org/D78972#2011571>, @jingham wrote:
>
> > That one does work all the way, including calling the function.  That should be surprising.   It shouldn't with an unpatched lldb, right?
>
>
> Yep, I realized that is not the best example after posting that comment.
>
> > We do this whole dance again every time we run this expression.  It looks like we don't look for instances of a template if we know how to make it, we just make it again.  That actually makes sense.  When you are compiling, you don't know if you will find an instantiation, so you have to make it.  Then you leave it to the linker to remove duplicates.
>
> You're describing the "normal" `linkonce` semantics. Things get more complex once you start having explicit instantiations: `template void f<int>();` forces a compiler to emit an instantiation even though it is not needed (this results in the `weak_odr` linkage. `extern template void f<int>();` inhibits instantiation -- the compiler will assume you have an explicit instantiation elsewhere. The fact that the explicit instantiation is "weak" allows for some interesting effects, but it's tricky to demonstrate them, as they all involve UB. Once can see some of that with normal compilation like this:
>
>   $ head a.cc b.cc c.cc
>   ==> a.cc <==
>   #include <cstdio>
>  
>   template<typename T> void f() { printf("Hello T\n"); }
>  
>   template void f<int>(); // weak
>  
>   ==> b.cc <==
>   #include <cstdio>
>  
>   extern "C" void _Z1fIiEvv() { printf("Hello _Z1fIiEvv\n"); } // strong
>  
>   ==> c.cc <==
>   #include <cstdio>
>  
>   template<typename T> void f() { printf("Hello T\n"); }
>  
>   extern template void f<int>(); // inhibit instantiation
>  
>   int main() {
>     f<int>();
>   }
>   $ c++ a.cc c.cc
>   $ ./a.out 
>   Hello T
>   $ c++ a.cc c.cc b.cc
>   $ ./a.out 
>   Hello _Z1fIiEvv
>
>
> I don't know whether any of this is relevant for top-level expressions.

Thanks for the explanation!

I think if you are bothering to create a template in --top-level your intent was to create something new.  So the current setup, which will always emit an instantiation from the template that was injected into the Expression ASTContext, means subsequent experiments will use this version, which seems closest to that intent.  That is certainly what happens today.  If I build either version of the example above and do:

  (lldb) expr --top-level -- template<typename T> void f() { (void) printf("Hello EXPR\n"); }
  (lldb) expr f<int>()
  Hello EXPR

That behavior is unsurprisingly not affected by this patch.

I guess another use case might be "change all uses of this template function throughout the program to point to this new body".  But can we even do that reliably?  The linker has already resolved all the extant references to whichever of these symbols it decided should win.  We'd have to interpose on that choice after the fact, which we don't do anywhere else, and seems tricky to get right.

So I think the current behavior is the best we can do.

> Incidentally it looks like we have a problem with function template specializations in top-level expressions, though I don't think that is related to this patch in any way:
> 
>   (lldb) expr --top-level --
>   Enter expressions, then terminate with an empty line to evaluate:
>     1: template<typename T> void f() { (void) printf("Hello T\n"); } 
>     2: template<> void f<int>() { (void) printf("Hello int\n"); } 
>     3: void g() { f<int>(); } 
>   (lldb) p g()
>   Hello int
>   (lldb) p f<int>()
>   Hello T

The f<int>() specialization here emitted with external linkage, which makes sense.  So that does not seem related.  We need to be a smarter linker but it doesn't look like we remember the linkage of symbols made in expressions so we don't have the data to do this right as it stands.

>> Note that this is all about creating new code in a running program, and it almost seems like you really should use the new version you created if it is any different.  Or anyway, it's not entirely clear what the rule should be.  And anyway, the thing we have to do is not NOT export these symbols, it is really "figure out if they exist somewhere else and are equivalent" and use those, otherwise export these.  Since I don't yet have an example where there are two versions that would get used, I don't want to try to write code to do that blind.
> 
> Yep, at this point I am pretty lost about what should be the correct behavior, and whether the change is detectable with c++. However, I don't like the fact that the change has no test coverage.

Well, there will be a swift test.  I do have a case there where this is detectible.  That's not entirely satisfactory, but it doesn't look like we can come up with something in C++ which we can actually use to test this.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78972/new/

https://reviews.llvm.org/D78972