[libc-dev] API generation
David Chisnall via libc-dev
libc-dev at lists.llvm.org
Tue Nov 19 01:33:44 PST 2019
As I understand it, the WASI interface is now very close to CloudABI,
which is one of the use cases I was interested in. There are three
slightly conflated goals for the header generation that, I think, are
going to need deconflating in the future:
- Being able to support different sets of standards (e.g. pure C11,
POSIX, POSIX + GNU extensions + BSD extensions) so that a compilation
unit can opt into only a subset of the required things.
- Being able to support different sets of standards so that an
implementation can ship a useful standards-compliant subset (e.g. just
C11 on a non-POSIX platform).
- Being able to support different subsets for builds with different
sets of available target abstractions.
For legacy compatibility, the current WASI libc supports libpreopen, but
it's often easier to support a Capsicum environment with a more
CloudABI-like interface that disallows all of the explicit global
namespace operations. For WASI / Capsicum deployments, I would like to
be able to build a version of libc that exposes only CloudABI-like
symbols, so I get linker failures (that I can then fix) when I use
something that relies on access to the global namespace.
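
To make the third goal concrete, here is a minimal sketch of filtering the exported symbol set by the abstractions a target provides. The metadata table, the abstraction names and the `exported_symbols` helper are all hypothetical; in practice this information would live in the TableGen definitions rather than in Python.

```python
# Hypothetical metadata: each libc function lists the target
# abstractions it needs. The names are invented for illustration.
FUNCTIONS = {
    "open":   {"global_namespace"},   # resolves a path in the global namespace
    "openat": {"file_descriptors"},   # resolves relative to a directory descriptor
    "read":   {"file_descriptors"},
    "strlen": set(),                  # needs nothing from the target
}

def exported_symbols(functions, available):
    """Return the sorted names whose requirements are all available."""
    return sorted(name for name, needs in functions.items()
                  if needs <= available)

# A Capsicum / CloudABI-like target: descriptors, but no global namespace.
capsicum_like = {"file_descriptors"}
print(exported_symbols(FUNCTIONS, capsicum_like))
```

With this table, `open` is simply absent from the Capsicum-like build, so any code that uses it fails at link time instead of at run time.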
The first and second of these look very similar, but the second and
third are the ones that share useful tooling. Existing libc
implementations support the first with a load of macros to
conditionally expose things. These are annoying to maintain
(particularly if, for example, a BSD extension is later standardised in
POSIX: you then need to rework the logic in the headers for exposing
it). Ideally, we'd just add POSIX20 or whatever to the list of
standards and let the tool deal with it. For the first use case, I
think we will still end up needing conditional exposure via macros, but
that's easier to machine generate than to write by hand.
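
As a rough sketch of what machine-generating the conditional exposure could look like: the table below maps each declaration to the standards that specify it, and the generator derives the preprocessor guard. The `__LIBC_*_VISIBLE` macro names and the table layout are assumptions made for this example, not anything that exists in LLVM libc.

```python
# Hypothetical table: declaration -> standards that specify it.
DECLS = {
    "size_t strlen(const char *s);": {"C11", "POSIX"},
    "char *strdup(const char *s);": {"POSIX", "BSD"},
}

# Hypothetical internal visibility macros, one per standard.
VISIBILITY_MACRO = {
    "C11": "__LIBC_C11_VISIBLE",
    "POSIX": "__LIBC_POSIX_VISIBLE",
    "BSD": "__LIBC_BSD_VISIBLE",
}

def emit_guarded(decl, standards):
    """Wrap a declaration in an #if derived from its set of standards."""
    cond = " || ".join(VISIBILITY_MACRO[s] for s in sorted(standards))
    return f"#if {cond}\n{decl}\n#endif"

for decl, standards in DECLS.items():
    print(emit_guarded(decl, standards))
```

If a function later moves into another standard, only its set in the table changes; the guard is regenerated rather than edited by hand.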
For the second and third use cases, the goal in both cases is to make
subsetting easier. We could later extend this with some static analysis
plugins that check for isolation (e.g. C11 can't depend on POSIX,
Capsicum-safe functions can't depend on non-Capsicum-safe functions).
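
The isolation check could be sketched roughly as follows. The layer assignments, the call graph and the `violations` helper are invented for illustration; a real plugin would extract the call graph from the implementation rather than from a hand-written table.

```python
# Hypothetical layer assignment for a few functions.
LAYER = {
    "strlen": "C11",
    "malloc": "C11",
    "fopen": "C11",
    "strdup": "POSIX",
    "open": "POSIX",
}

# Hypothetical call graph: caller -> functions it calls.
CALLS = {
    "strdup": ["strlen", "malloc"],
    "fopen": ["open"],
}

# Which layers each layer is allowed to depend on.
ALLOWED = {
    "C11": {"C11"},
    "POSIX": {"C11", "POSIX"},
}

def violations(calls, layer, allowed):
    """Return (caller, callee) pairs that cross a forbidden layer boundary."""
    return [(caller, callee)
            for caller, callees in sorted(calls.items())
            for callee in callees
            if layer[callee] not in allowed[layer[caller]]]

print(violations(CALLS, LAYER, ALLOWED))
```

Here the C11 function `fopen` is flagged because, in this made-up graph, it calls the POSIX function `open`; the same table-driven check would work for Capsicum-safe versus non-Capsicum-safe functions.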
The final benefit that we haven't really explored yet for header
generation is supporting different compiler annotations for API
contracts that are not expressible in standard C. For example, the
Windows headers use SAL annotations to define in / out parameters, the
size of buffers, and so on. There are GNU extensions for some of these,
but they often go in different places (e.g. as function attributes with
parameters that index a specific function parameter versus parameter
attributes). If we encode the high-level contracts in the TableGen,
then we should be able to generate MS C and GNU C variants of the same
set of interfaces.
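
A small sketch of the idea, assuming a single high-level contract ("`buf` is an output buffer of `len` bytes") on a hypothetical function `fill_buffer`. The SAL variant uses the parameter annotation `_Out_writes_bytes_`; for the GNU variant I have picked GCC's index-based `access` function attribute (GCC 10 and later) as one possible mapping. The contract encoding itself is made up for this example.

```python
# Hypothetical high-level contract for one function.
CONTRACT = {
    "ret": "void",
    "name": "fill_buffer",  # invented function, not a real libc symbol
    "params": [
        {"name": "buf", "decl": "void *buf", "out_bytes": "len"},
        {"name": "len", "decl": "size_t len"},
    ],
}

def emit_sal(c):
    """MS C variant: the annotation is attached to the parameter itself."""
    params = []
    for p in c["params"]:
        sal = f"_Out_writes_bytes_({p['out_bytes']}) " if "out_bytes" in p else ""
        params.append(sal + p["decl"])
    return f"{c['ret']} {c['name']}({', '.join(params)});"

def emit_gnu(c):
    """GNU C variant: a function attribute refers to parameters by 1-based index."""
    names = [p["name"] for p in c["params"]]
    attrs = []
    for i, p in enumerate(c["params"], start=1):
        if "out_bytes" in p:
            size_index = names.index(p["out_bytes"]) + 1
            attrs.append(f"__attribute__((access(write_only, {i}, {size_index})))")
    params = ", ".join(p["decl"] for p in c["params"])
    return " ".join(attrs + [f"{c['ret']} {c['name']}({params});"])

print(emit_sal(CONTRACT))
print(emit_gnu(CONTRACT))
```

The two outputs place the same contract in different syntactic positions, which is the part that is tedious to keep in sync by hand.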
The TableGen format lets us put a lot more metadata on the functions and
definitions than we would necessarily want to end up in any given build
of the headers.
I agree that we are going to end up with TableGen files that are quite
complex, but I believe that we should end up with a cleaner separation
of concerns. I have worked on a libc that did this manually, and
refactoring any of the macro code is very painful because it is all very
order-dependent and changes have non-local effects. In the TableGen
world, the back end will parse all of the definitions, build the
dependency graph, and then generate the macros. A change that requires
reworking macros across half a dozen files is not a problem in this
world.
On 18/11/2019 23:57, Petr Penzin via libc-dev wrote:
> I work on WebAssembly, and I was hoping we would eventually use LLVM
> libc for end-to-end Wasm toolchain. I have some questions about "ground
> truth" approach to libc API. I am sorry if those have been asked, could
> not find the answers looking through mailing list messages and code.
> I was wondering what does API generation buy for the developers and
> users. Maybe the question is how did previous implementations of libc
> get away without generating headers, but also is API generation a
> reasonable and foolproof solution.
> Most importantly, the motivation seems to be that there are a few
> potential standards a libc implementation needs to comply with. But how
> many substantially different APIs are there realistically? If it is in
> lower single digits, does this really make it worth the effort?
> Secondly, libc API is not only types and function prototypes, it
> typically depends on "feature test macros". I am not sure it is
> possible to gracefully support those in a generated API. Encoding test
> macros in API "ground truth" rules would make API rules as complex as C
> macro code they are trying to replace. Leaving test macros up to the C
> header files would result in a mix of preprocessor and rule logic which
> would probably be more confusing than going all the way in either
> (preprocessor or generation) direction.
> Finally, somewhat rhetorical point on precedent and expertise. There is
> enough precedent for a portable libc API written directly; likewise
> C/C++ developers can understand and modify C headers without ramp-up -
> not sure that can be said about tablegen. Writing header files is a
> relatively simple part of the development process and there is a lot of
> it happening inside and outside of LLVM.