[PATCH] D64939: Add a proposal for a libc project under the LLVM umbrella.

Mon Aug 12 21:36:53 PDT 2019

jfb added inline comments.

================
Comment at: llvm/docs/Proposals/LLVMLibC.rst:37
+  testing like fuzz testing and sanitizer-supported testing.
+- ABI independent implementation as far as possible.
+- Use source based implementations as far possible rather than
----------------
chandlerc wrote:
> sivachandra wrote:
> > jfb wrote:
> > > dlj wrote:
> > > > jfb wrote:
> > > > > sivachandra wrote:
> > > > > > dlj wrote:
> > > > > > > jfb wrote:
> > > > > > > > sivachandra wrote:
> > > > > > > > > jfb wrote:
> > > > > > > > > > Will it be ABI-stable? Maybe it's worth expanding on how the ABI will evolve, and what will be stable.
> > > > > > > > > I am not sure how exactly to word it here. Do you have any suggestions on what the ABI promise should be and what exactly to say in a proposal like this?
> > > > > > > > Depends on what people want to do with it. I'm just saying it should be given thought. If you want to interop with other libc then you need to match their ABI, which can be a burden. IIUC musl matched glibc almost accidentally, and is moving away from doing so. Then you might consider whether your libc is ABI stable over time, and how you'd manage that. The answer might change between static and dynamic versions.
> > > > > > > (Sorry for chiming in so late... I chatted with Siva, and I volunteered to add my take.)
> > > > > > > 
> > > > > > > I don't think we would ever try to maintain ABI-level interop with other libc implementations on purpose. There is some subtlety here, though, so let me step back for a moment.
> > > > > > > 
> > > > > > > (The next few paragraphs are largely for the benefit of folks who may be following along the review thread, so I'm surely restating more than necessary. This is a bit long, so you can skip to the end if you want... I won't tell anyone. ;-) )
> > > > > > > 
> > > > > > > Fundamentally, our aim is to define a tightly-bounded interface (i.e., provides what it must; no more, no less), and retain as much flexibility as we can for everything behind the interface. The "implementation stuff" behind the interface then has to be structured in a rational way, so that bits and pieces can be replaced. (I'm going to come back to that point in a bit.) The "interface" in this case are effectively just the headers: struct and enum definitions, function declarations, maybe a couple of extern variable declarations, and... really, that's about it.
> > > > > > > 
> > > > > > > The "implementation flexibility" goal is where things like DSOs, interceptors, and delegation come into play. The point in the design space we need (for Google production workloads) is, arguably, one of(*) the simplest: a fully statically linked (*)PIE binary, wherein we are willing to build everything from sources using the same compiler.
> > > > > > > 
> > > > > > > The question of things like ABI stability seem to me, quite frankly, out of the scope that we would even //want// to define. There is a simple reason why I would like to stay somewhat purposefully blind to these questions: I do not believe we can predict why folks might need guarantees from the //library//, instead of relying on the guarantees that come from the combination of language and compiler implementation. (Such reasons surely exist, but I expect many of them to be novel, surprising, or both.)
> > > > > > > 
> > > > > > > In other words: if you can build the .a or .lib, and you have headers that match it, and that archive works like any other library archive, //why would one need still more guarantees?// FWIW, this question is only maybe 75% rhetorical... I find myself routinely ... let's say, "impressed" by new and interesting ways to ... let's say, "meaningfully change program behavior" using just the linker's command line options. (Obviously, I'm toning down my opinions here... but my strong bias is that these uses need to be brought to light before any fundamental design accommodations are made.)
> > > > > > > 
> > > > > > > A logical extension of the above is that the implementation will look as much as possible like any other library. We don't want to insert an "upward contract" (or would it be "downward?") that might get in the way of whatever else a packager might want to do. (Providing a DSO is, IMO, more akin to release or distribution packaging than simply building a library. I'm purposefully using the term "packaging" instead of anchoring on, say, the specifics of ELF DSOs.)
> > > > > > > 
> > > > > > > I think there is an interesting analog here to the LLVM project in general: similar to how the broader LLVM project's "compiler as a library" approach yields a toolbox of things which can be used to build a compiler (but also other things), the "libc as a library" aphorism(/pun) is meant to point out that the task of "building a libc" isn't monolithic, either (and maybe there are other things people want in this space, and we can make our work useful to them). This comes back to the question about using or replacing parts of the overall libc: there's really no reason that the whole thing //needs// to be monolithic. There are, of course, some internally-cohesive subsystems that would be hard to break apart; but at the macro level, a "libc" isn't terribly cohesive. The monolithic coupling is largely artificial.
> > > > > > > 
> > > > > > > ---
> > > > > > > 
> > > > > > > Alright, so that's the background.
> > > > > > > 
> > > > > > > There is, quite admittedly, some duplicity in the goals -- at least, as I've stated them.
> > > > > > > 
> > > > > > > One good example: we actually //would// like to make reasonable "narrowing" guarantees: if there is some design or implementation space we could leave empty, and doing so would drastically simplify a packaging use case, then that's something we should try to accommodate. (Especially if it doesn't add substantial cost to the implementation.)
> > > > > > > 
> > > > > > > However, a guarantee like "our ABI will match <other libc>" is not what I would call narrowing: it would require us to cover more of the design/implementation/ABI space than absolutely necessary, so it doesn't really seem like a guarantee we would want to make.
> > > > > > > 
> > > > > > > On a more technical note, I should also point out that there is an important detail that is easy to miss, and I think it's mostly buried somewhere within this bullet point:
> > > > > > > 
> > > > > > > > - Provide C symbols as specified by the standards, but take advantage
> > > > > > > >   and use C++ language facilities for the core implementation.
> > > > > > > 
> > > > > > > So, for example, the entire "implementation" ought to be available without squatting on symbol names used by libc. Our expectation is to use link-time aliasing to satisfy most libc symbols, but I think the important takeaway is that the libc symbols will notionally only exist in an independent layer on top of the "actual" implementation (i.e., separate symbol logic from program logic).
> > > > > > > 
> > > > > > > ---
> > > > > > > 
> > > > > > > One seemingly-equivocal goal in all of this is delegation to a separate libc, and this seems like it might raise questions about ABI interop. Personally, I would not characterize this as "interoperability;" rather, I would characterize it by saying that a "syscall" might be implemented by, say, a trap instruction; but it doesn't have to be. For instance, it could delegate -- probably through clever linking tricks -- to a vendor-supplied libc. More generally, I would say that this could apply to any libcall (not just syscalls). My expectation is that, since we'll need something like this as a bootstrapping mechanism anyhow, it would be unwise to try to make it an anti-goal. Inhibiting that use case just seems like more work to cover ground that we don't particularly care about. (Implementing such delegation is hard, but actually preventing it would be even harder... and to what gain? Bragging rights?)
> > > > > > > 
> > > > > > > I do **strongly** suspect that there are other users who might want to use this libc, but still have to use a vendor library to actually talk to their kernel. Or maybe folks would want to delegate almost everything to their existing libc, except for one or two routines. Or maybe folks want an easier way to rebuild existing programs to run inside their shiny, new sandbox. Or they want easier kernel-bypass networking, or they want to use an in-process virtual filesystem, or any other thing which, today, commonly requires using new, incompatible APIs.
> > > > > > > 
> > > > > > > ---
> > > > > > > 
> > > > > > > And just as a quick sanity check: I don't actually expect that this libc would be adopted primarily as a system libc... at least, not in the near term, by any existing platform. It should be complete and high enough quality to serve that purpose, though. It might make sense to use for a new platform that hasn't already cemented its ABI (c.f.: Alpine linux and musl), or for a vendor to adopt for an epoch release (I do think an ABI layer could be fashioned to make that feasible). But there is plenty of value in not relying on a system libc, and in being able to replace parts (work around bugs, provide a different implementation, etc.).
> > > > > > If I understand David correctly, he is essentially trying to say something which I have failed to convey so far: We will keep the ABI compatibility and stability questions open for now and let someone who cares for these issues come and provide/fill in these details in future.
> > > > > > 
> > > > > > Does that make sense? Considering we (the team at Google I represent) are not particularly interested in these questions, we do not want to say/guess/promise about them. That said, we are also not preventing anyone from formalizing answers to these questions in future.
> > > > > I made two points:
> > > > > 
> > > > > * Will this library have ABI compatibility with itself over time?
> > > > > * Since there's a desire to allow mixing this libc with another libc implementation, you'll likely need to be ABI compatible to do so. Alternatively, you can define boundaries which don't require this compatibility, and I'd like to hear more.
> > > > > 
> > > > > I'd like them answered separately. I think it's critical to the success of this projects to have other non-Google-production participants, and I expect that some will care about ABI stability.
> > > > It seems like this line of discussion is quickly starting to go in circles... but maybe I'm misunderstanding your questions.
> > > > 
> > > > > Will this library have ABI compatibility with itself over time?
> > > > 
> > > > Can you be more specific? "ABI" is a pretty broad topic...
> > > > 
> > > > For example, we have no control over whether a user of the library chooses an alternate calling convention, then attempts to link against a prebuilt archive. I strongly doubt this is the intent of your question, but it's an example of why I'm asking for narrowing... we need to be careful not to over-promise. "ABI" is a vague enough notion that simply saying "we will have a stable ABI" would be //vastly// overreaching.
> > > > 
> > > > > Since there's a desire to allow mixing this libc with another libc implementation, you'll likely need to be ABI compatible to do so.
> > > > 
> > > > (We don't believe this is entirely the case, at least for ELF.)
> > > > 
> > > > > Alternatively, you can define boundaries which don't require this compatibility, and I'd like to hear more.
> > > > 
> > > > This is the intent, and why the term "layer" is used in this bullet point:
> > > > 
> > > > > Ability to layer this libc over the system libc [...]
> > > > 
> > > > The specific mechanism, however, is something that ought to be addressed in a standalone design doc. We do have such a plan for ELF, but it is fairly intricate.
> > > > 
> > > > In any event, the discussion of exactly how (and why) it would be implemented is something that is ... "nuanced," to put it lightly. It seems unnecessary to try to include such a deep technical specification in this particular, high-level document. In fact, I could even go further: trying to define "layering" will require us to anchor to the mechanism for a specific platform, or at least family of platforms. That seems antithetical to the current goals of this document (and, frankly, feels more like a wedge than an actual question... so perhaps this would be better deferred until Siva sends the specific design, after this doc is committed).
> > > > Can you be more specific? "ABI" is a pretty broad topic...
> > > 
> > > libc++ has ABI guarantees, I'd expect you to figure out similar guarantees for a libc. Yes it's broad, and yes I'm asking that you figure out exactly what that means. Yes this includes LLVM libc version X -> Y as well as with other libc implementations (since interop with other libc is part of this proposal).
> > As you have said, there are two kinds of ABI compatibility that one could discuss here:
> > 
> > 1. LLVM libc X versus LLVM libc Y - To begin with at least, we are not interested in such a question. For our use case, we are OK if LLVM libc X is never compatible with LLVM libc Y. However, I do understand that there will be users and developers out there for whom ABI compatibility matters. In this proposal, I want to keep the compatibility question open as I cannot guess for somebody else requiring such compatibility guarantees
> > 
> > [You brought up the libc++ ABI guarantees. One cannot have a namespace based ABI management scheme for a libc as we cannot use namespaces in libc public headers.]
> > 
> > 2. LLVM libc versus another system libc - In general, we are not interested in this question as well at this point. Ideally, LLVM libc should not have to make any guarantees about compatibility with another libc. But I think there probably is a misunderstanding of the bullet point "ability to layer LLVM libc over another libc." We do not mean that we want to be able to "mix" LLVM libc with another libc in general. Users of LLVM libc cannot for example open a file using LLVM libc and close it using another libc. The users have to consistently use entry points from LLVM libc alone. However, LLVM libc might call into the system libc for the implementation (and hence act as a layer between the user code and the system-libc). The translation from LLVM libc data structures to the system libc data structures happens inside of LLVM libc, invisible to the LLVM libc user.
> I think there is some miscommunication happening here....
> 
> I think what Siva and David are trying to say is that we want people who have specific ABI compatibility needs to drive the ABI compatibility design. It's not any form of opposition to having such a proposal, it just should probably come from the people working in that space.
> 
> I'll try to explain this from the perspective JF asked the question: libc++ has ABI guarantees. But originally, libc++ *only* had a stable ABI. There was no "unstable" ABI because the libc++ authors didn't need one. Later on, folks were trying to start using libc++ (us) and happened to want an "unstable" ABI that could easily track updates and fixes that weren't ABI compatible. At that point we worked w/ Marshall and others to come up with an approach that would work for our use case but also wouldn't get in the way of the stable libc++ ABI. I think it was a good thing that libc++ didn't try to design this system up-front. It would have been a waste of time given that there weren't any users involved with libc++ at the time to even consume it. But I also think it was really important to figure out a way to support that once there was a concrete use case in mind.
> 
> I think we're basically seeing the reverse position here. Our use case is for an unstable ABI. We're totally open to having a stable ABI but as we don't have a concrete use case in mind, it seems like it would be better to wait for folks to have specific requirements for a stable ABI and then design a solution that works for them.
> 
> 
> Maybe "we are not interested in this question" is easily misinterpreted as "we are not interested in answering this question at all". Sorry if so, I think that's just a slight communication issue. I think another way to put it "we aren't the right people to figure out the core use cases that will drive any answer to this question, but we're happy for folks to propose or work toward that direction".
> 
> Hopefully this helps address some of the confusion.
> I think what Siva and David are trying to say is that we want people who have specific ABI compatibility needs to drive the ABI compatibility design. It's not any form of opposition to having such a proposal, it just should probably come from the people working in that space.

That's fine with me. What I'm asking is simple:

1. Try to leave a placeholder for ABI in the initial proposal.
2. Reach out to people who've expressed interest in the RFC, in a way that would require ABI compatibility of some form. Most folks from the RFC are clearly not reading this review. Once it's in good shape, send a ping to the RFC. It seems like there's folks within Google interested in usecases of that sort. Dynamic linking certainly would entail an ABI. They might not want to work on it *now*, but at a minimum they might have insights to improve the placeholder above.
3. Clarify that entire "interop with other libc" thing. It's pretty unclear to me, the way I've understood it sounds like there's ABI implications. If there isn't great!

I am asking a bunch of questions, but you don't need to have perfect answers or to sign up for the work. However, I do think you want to put some thought into it so either interested folks can jump in now, or when interested folks come along everything is at least set up with ABI in mind.

Simple examples:

* Will you force inline functions, and leave other non-C API functions as the non-inlined ODR functions? If someone were to link them then these functions would now be your ABI. You'd need to think about what parts of them can / can't change. You certainly can't remove them... can you change their namespace (if you implement in C++)? Their parameters? Return type, etc...
* How will you handle syscalls? On some platforms you're not supposed to call them directly because they're not part of the platform API.
* How will you manage structs exposed through the C / POSIX API?

Some of these can stay up in the air for a bit! But some would require moving your code around and setting stuff in stone to enforce into an ABI. What have other libc implementations done? If you think they did something silly, why?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64939/new/

https://reviews.llvm.org/D64939