<div dir="ltr"><div dir="ltr">On Mon, Jun 24, 2019 at 3:23 PM Siva Chandra via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><span id="gmail-m_-6752588611450016043gmail-docs-internal-guid-31026820-7fff-9c8b-7125-459f728330fd"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Hello LLVM Developers,</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Within Google, we have a growing range of needs that existing libc implementations don't quite address. This is pushing us to start working on a new libc implementation.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Informal conversations with others within the LLVM community has told us that a libc in LLVM is actually a broader need, and we are increasingly consolidating our toolchains around LLVM. Hence, we wanted to see if the LLVM project would be interested in us developing this upstream as part of the project. 
</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">To be very clear: we don't expect our needs to exactly match everyone else's -- part of our impetus is to simplify things wherever we can, and that may not quite match what others want in a libc. That said, we do believe that the effort will still be directly beneficial and usable for the broader LLVM community, and may serve as a starting point for others in the community to flesh out an increasingly complete set of libc functionality.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">We are still in the early stages, but we do have some high-level goals and guiding principles of the initial scope we are interested in pursuing:</span></p><br><ol style="margin-top:0pt;margin-bottom:0pt"><li dir="ltr" style="list-style-type:decimal;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">The project should mesh with the "as a library" philosophy of the LLVM project: even though "the C Standard Library" is nominally "a library," most implementations are, in practice, quite monolithic.</span></p></li><li dir="ltr" 
style="list-style-type:decimal;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">The libc should support static non-PIE and static-PIE linking. This means, providing the CRT (the C runtime) and a PIE loader for static non-PIE and static-PIE linked executables.</span></p></li><li dir="ltr" style="list-style-type:decimal;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">If there is a specification, we should follow it. The scope that we need includes most of the C Standard Library; POSIX additions; and some necessary, system-specific extensions. This does not mean we should (or can) follow the entire specification -- there will be some parts which simply aren't worth implementing, and some parts which cannot be safely used in modern coding practice.</span></p></li></ol></span></div></blockquote><div><br></div><div>I don’t think that POSIX additions should be part of the core library. Not all interesting targets are POSIX; Windows, for example, is not. I think that POSIX should be a separate, standalone library piece, just as you mention downthread that dynamic loading should be. 
I think that the only pieces available in the core should be those in the C11 core specification.</div><div><br></div><div>What parts of the C standard do you consider not worth implementing?</div><div><br></div><div>If you are looking to implement “extensions” which replace the routines that cannot be safely used in modern coding practice, does that mean that the surface should really be that of the MSVCRT implementation? It does deprecate the “unsafe” routines in favour of safe versions (suffixed with `_s`). Additionally, you could always just implement the bounds-checking interfaces of C11 Annex K and use those instead.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><span id="gmail-m_-6752588611450016043gmail-docs-internal-guid-31026820-7fff-9c8b-7125-459f728330fd"><ol style="margin-top:0pt;margin-bottom:0pt"><li dir="ltr" style="list-style-type:decimal;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Vendor extensions must be considered very carefully, and only admitted when necessary. Similar to Clang and libc++, it does seem inevitable that we will need to provide some level of compatibility with other vendors' extensions.</span></p></li></ol></span></div></blockquote><div><br></div><div>How would this work for reasonable bodies of code which are built on Linux? For example, 
Chrome does have Linux specific paths and I would be surprised if Chrome does not depend on any GNU behaviours.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><span id="gmail-m_-6752588611450016043gmail-docs-internal-guid-31026820-7fff-9c8b-7125-459f728330fd"><ol style="margin-top:0pt;margin-bottom:0pt"><li dir="ltr" style="list-style-type:decimal;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">The project should be an exemplar of developing with LLVM tooling. Two examples are fuzz testing from the start, and sanitizer-supported testing.</span></p></li></ol><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">There are also few areas which we do not intend to invest in at this point:</span></p><br><ol style="margin-top:0pt;margin-bottom:0pt"><li dir="ltr" style="list-style-type:decimal;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Implement dynamic loading and linking support.</span></p></li></ol></span></div></blockquote><div><br></div><div>If this is done as a “library” layer, then so should 
POSIX and the C99/C11 annexes.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><span id="gmail-m_-6752588611450016043gmail-docs-internal-guid-31026820-7fff-9c8b-7125-459f728330fd"><ol style="margin-top:0pt;margin-bottom:0pt"><li dir="ltr" style="list-style-type:decimal;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Support for more architectures (we'll start with just x86-64 for simplicity).</span></p></li></ol></span></div></blockquote><div><br></div><div>I think that AArch64 is pretty core these days and leaving that out is pretty restrictive. At this point Windows AArch64 is an interesting target. With Linux AArch64 and Windows AArch64 becoming more mainstream, it seems like a poor design tradeoff to limit the target to Linux x86_64.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><span id="gmail-m_-6752588611450016043gmail-docs-internal-guid-31026820-7fff-9c8b-7125-459f728330fd"><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">For these areas, the community is of course free to contribute. 
Our hope is that, preserving the "as a library" design philosophy will make such extensions easy, and allow retaining the simplicity when these features aren't needed.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">We intend to build the new libc in a gradual manner. To begin with, the new libc will be a layer sitting between the application and the system libc. Eventually, when the implementation is sufficiently complete, it will be able to replace the system libc at least for some use cases and contexts.</span></p></span></div></blockquote><div><br></div><div>This is really tricky and finicky to implement (I have done something like this in the past). On ELF you can interpose symbols, but on PE/COFF, with two-level namespace binding, this needs to be statically resolved. Would the approach mean that symbols are interposed at compile time to ensure that they are fully redirected? How will you manage cross-domain memory once a malloc implementation is included in the library? What happens with threading?</div><div><br></div><div>The general libc implementation would require that full threading is under its control - consider cases like the initial-exec (IE) model for TLS. This requires the loader to be aware of the modules and the full spacing. Another example where this starts to break down is with faulty - it was just a library layer that implemented compressed, memory-mapped library loading because a previous libc implementation - bionic - suffered from extensive issues, including the inability to load more than a handful of modules. 
This is far from the only limitation of the bionic libc implementation, but this doesn’t seem like the appropriate forum for discussing previous libc implementation attempts.</div><div><br></div><div>One other point of interest is how the loader integration would work. With glibc, the loader effectively embeds a copy of libc for itself, and has to dig through the kernel handoff (the auxiliary vector, auxv) to get the loader location. What happens with multiple object file formats? PE/COFF does not load the same way as ELF, and that difference may ripple through the rest of the library. The libc integration is needed for the resolution of symbols as well as for TLS.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><span id="gmail-m_-6752588611450016043gmail-docs-internal-guid-31026820-7fff-9c8b-7125-459f728330fd"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">So, what do you think about incorporating this new libc under the LLVM project?</span></p></span></div></blockquote><div><br></div><div>As stated, I really feel that this is far too specialised to certain use cases that are pertinent to Google. I think that this needs to be broadened to allow a general-purpose libc, much as libc++ is a general C++ implementation. I think that the project has a different set of requirements, and it seems like it would be extremely interesting to see how it would develop over time. 
This could really be an interesting choice for a certain type of project, but as described it feels like it is best explored outside of the umbrella of LLVM.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><span id="gmail-m_-6752588611450016043gmail-docs-internal-guid-31026820-7fff-9c8b-7125-459f728330fd"><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Thank you,</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Siva Chandra and the rest of the Google LLVM contributors</span></p></span><br class="gmail-m_-6752588611450016043gmail-Apple-interchange-newline"></div>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature">Saleem Abdulrasool<br>compnerd (at) compnerd (dot) org</div></div>