[llvm-dev] A libc in LLVM
David Chisnall via llvm-dev
llvm-dev at lists.llvm.org
Thu Jun 27 05:26:51 PDT 2019
[ I have worked on FreeBSD libc, so a few clarifications here: ]
On 26/06/2019 17:02, Andrew Kelley via llvm-dev wrote:
> Finally, I'm only aware of 2 operating systems where the libc is not an
> integral part of the system, which is Linux and Windows. For example on
> macOS, FreeBSD, OpenBSD, and DragonFlyBSD, the libc is guaranteed to be
> available, and must be dynamically linked, because this is the stable
> syscall ABI.
Solaris and macOS (kind-of) belong on this list, but FreeBSD does not
and I don't believe other BSDs do, though the situation is somewhat more
complex. On FreeBSD, the system call ABI is stable and there are compat
layers that allow foreign or legacy system call interfaces to be exposed
to userspace processes (e.g. a FreeBSD 7 system call table on FreeBSD
12, or a Linux system call table on any FreeBSD). The Capsicum sandbox
mode is also implemented in part by pivoting the system call layer: once
you call cap_enter, some system calls are simply not exposed to you at
all.
There is even CloudABI, which uses a mostly musl-derived libc and a
Capsicum-derived system call table. This is used for statically linked
applications with a custom launcher that gives strong security guarantees.
That said, the relationship between FreeBSD's libc, libthr (pthreads)
and rtld is quite complex, as are their interactions with the kernel.
Allowing libthr to be dlopened turned out to be incredibly hard to support
in practice, but even without that, there is some complexity from the
fact that libc must allow libthr to preempt a number of its symbols (and
must provide implementations of things like pthread_mutex for programs
that do not start threads). In the 5.x time frame, we did support two
different pthreads implementations. This was, in hindsight, an
absolutely terrible idea and not something that I'd ever recommend
anyone do ever again.
On macOS, libSystem is actually the public interface to the kernel, so
you can bring along your own libc if you want to; you just have to
dynamically link to libSystem to get access to system calls (or you do
what Go did, try to make them without going via libSystem, and watch
every single program written in your language die when the kernel's
gettimeofday interface changes...). This, however, makes it somewhere
between difficult and effectively impossible to bring your own dyld
replacement to macOS, because it must be able to load libSystem without
making any system calls...
> So it would only make sense for an LLVM libc to be for
> Linux and Windows. It seems reasonable to assume that Google is only
> interested in Linux. In this case I have to re-iterate my original
> question, what are the needs that are not being met by existing Linux
> libcs, such as musl?
I am also unconvinced that it is possible to design a clean platform
abstraction layer for libc that would work over even Linux and FreeBSD
without imposing significant penalties for one or the other. If you add
Windows into the mix, then it gets a lot harder. POSIX's decision to
use int, rather than a pointer type, for file descriptors and to make
specific guarantees about reuse order (rather than just providing dup2
as a moderately sane interface) means that userspace code will need to
implement the file descriptor table. Do we build higher-level layering
on top of file descriptors or do we support Windows HANDLEs natively for
internal usage and use fds only for public APIs?
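To make the file descriptor point concrete, here is a rough sketch of the
kind of table userspace would have to maintain on a platform whose kernel
hands out opaque handles. All of the names here (fd_table, fd_alloc,
fd_lookup, fd_close, native_handle) are hypothetical and not taken from any
real libc, the fixed table size is an assumption, and a real implementation
would also need locking, dup2, and close-on-exec handling:

```c
#include <stddef.h>

/* Hypothetical sketch: none of these names come from a real libc. */
#define FD_TABLE_SIZE 64

/* Stand-in for an opaque kernel handle (for example a Windows HANDLE). */
typedef void *native_handle;

struct fd_table {
    native_handle slots[FD_TABLE_SIZE]; /* NULL marks a free slot */
};

/* Bind a handle to the lowest-numbered free descriptor, which is the
 * reuse order POSIX guarantees. Returns -1 if the handle is NULL or the
 * table is full. */
static int fd_alloc(struct fd_table *t, native_handle h)
{
    if (h == NULL)
        return -1;
    for (int fd = 0; fd < FD_TABLE_SIZE; fd++) {
        if (t->slots[fd] == NULL) {
            t->slots[fd] = h;
            return fd;
        }
    }
    return -1;
}

/* Translate a public int descriptor back to the native handle, or NULL
 * if the descriptor is out of range or not open. */
static native_handle fd_lookup(const struct fd_table *t, int fd)
{
    if (fd < 0 || fd >= FD_TABLE_SIZE)
        return NULL;
    return t->slots[fd];
}

/* Free a descriptor so that the next fd_alloc can reuse its number.
 * Returns 0 on success, -1 if the descriptor was not open. */
static int fd_close(struct fd_table *t, int fd)
{
    if (fd < 0 || fd >= FD_TABLE_SIZE || t->slots[fd] == NULL)
        return -1;
    t->slots[fd] = NULL;
    return 0;
}
```

With a zero-initialised table, three allocations yield 0, 1 and 2; closing
1 and allocating again yields 1 rather than 3. Every public call that takes
an fd then pays for an fd_lookup before it can reach the kernel, which is
the sort of cost I mean above.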
The idea of an LLVM libc has been proposed a few times and generally the
pushback has been that it doesn't make sense because libc is so
intimately tied to the host kernel that it's very hard to consider it as
a portable component.
David