[libc-dev] API generation
Petr Penzin via libc-dev
libc-dev at lists.llvm.org
Tue Nov 19 18:20:08 PST 2019
Hi David,
Thanks for the answers. I am going to send a separate reply to Siva's
message about API generation. There was a conversation about upstreaming
WASI during a Wasm CG meeting, and I thought that with this effort there
would eventually be all the pieces for an end-to-end Wasm toolchain (not
only WASI-based, but JS as well) in LLVM.
+ Dan Gohman, in case he has any thoughts on this, as he maintains
the current WASI implementation.
Best,
Petr
On 11/19/19 1:33 AM, David Chisnall via libc-dev wrote:
> Hi Petr,
>
> As I understand it, the WASI interface is now very close to CloudABI,
> which is one of the use cases I was interested in. There are three
> slightly conflated goals for the header generation that, I think, are
> going to need deconflating in the future:
>
> - Being able to support different sets of standards (e.g. pure C11,
> POSIX, POSIX + GNU extensions + BSD extensions) so that a compilation
> unit can opt into only a subset of the required things.
>
> - Being able to support different sets of standards so that an
> implementation can ship a useful standards-compliant subset (e.g. just
> C11 on a non-POSIX platform).
>
> - Being able to support different subsets for builds with different
> sets of available target abstractions.
>
> For legacy compatibility, the current WASI libc supports libpreopen,
> but it's often easier to support a Capsicum environment with a more
> CloudABI-like interface that disallows all of the explicit global
> namespace operations. For WASI / Capsicum deployments, I would like
> to be able to build a version of libc that exposes only CloudABI-like
> symbols, so I get linker failures (that I can then fix) when I use
> something that relies on access to the global namespace.
>
> The first and second of these look very similar, but the second and
> third are the ones that share useful tooling. Existing libc
> implementations support the first with a load of macros to
> conditionally expose things. These are annoying to maintain
> (particularly if, for example, a BSD extension is later standardised
> in POSIX: you then need to rework the logic in the headers for
> exposing them). Ideally, we'd just add POSIX20 or whatever to the
> list of standards and let the tool deal with it. For the first use
> case, I think we will still end up needing conditional exposure via
> macros, but that's easier to machine generate than to write by hand.
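
As a rough illustration of the machine-generated conditional exposure described above, here is a minimal Python sketch, not the actual TableGen back end. The table layout, the `emit` helper, and the choice of guard macro for each standard are all assumptions made for the example.

```python
# Hypothetical mapping from a standard to the preprocessor condition
# that exposes it. None means "always visible". Illustrative only.
GUARDS = {
    "C11": None,
    "POSIX": "defined(_POSIX_C_SOURCE)",
    "BSD": "defined(_BSD_SOURCE)",
    "GNU": "defined(_GNU_SOURCE)",
}

# Hypothetical "ground truth" table: each prototype lists the standards
# that expose it.
DECLS = [
    ("char *strcpy(char *dest, const char *src);", ["C11"]),
    ("char *strdup(const char *s);", ["POSIX"]),
    ("size_t strlcpy(char *dst, const char *src, size_t size);", ["BSD"]),
]


def emit(decls, guards):
    """Return header text with each prototype wrapped in its guard."""
    lines = []
    for proto, standards in decls:
        conds = [guards[s] for s in standards]
        if None in conds:
            # At least one listed standard is unconditional.
            lines.append(proto)
            continue
        lines.append("#if " + " || ".join(conds))
        lines.append(proto)
        lines.append("#endif")
    return "\n".join(lines)


if __name__ == "__main__":
    print(emit(DECLS, GUARDS))
```

In this sketch, standardising an extension later amounts to appending another standard name to that entry's list; the guard expression is regenerated rather than edited by hand.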
>
> For the second and third use cases, the goal in both cases is to make
> subsetting easier. We could later extend this with some static
> analysis plugins that check for isolation (e.g. C11 can't depend on
> POSIX, Capsicum-safe functions can't depend on non-Capsicum-safe
> functions).
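
The isolation check mentioned above could look roughly like the following Python sketch. The subset names, the call table, and the `violations` helper are hypothetical; a real plugin would derive the call graph from the sources and would also follow transitive calls, which this example does not.

```python
# Hypothetical metadata: the subset each function belongs to.
SUBSET = {
    "strlen": "C11",
    "malloc": "C11",
    "fopen": "C11",
    "strdup": "POSIX",
    "open": "POSIX",
}

# Hypothetical direct-call table (caller -> callees).
CALLS = {
    "strdup": ["strlen", "malloc"],
    "fopen": ["open"],
}

# Which subsets a given subset is allowed to depend on.
MAY_USE = {
    "C11": {"C11"},
    "POSIX": {"C11", "POSIX"},
}


def violations(subset, calls, may_use):
    """Return (caller, callee) pairs that cross a forbidden boundary."""
    bad = []
    for caller, callees in sorted(calls.items()):
        for callee in callees:
            if subset[callee] not in may_use[subset[caller]]:
                bad.append((caller, callee))
    return bad


if __name__ == "__main__":
    for caller, callee in violations(SUBSET, CALLS, MAY_USE):
        print(f"{caller} ({SUBSET[caller]}) must not call "
              f"{callee} ({SUBSET[callee]})")
```

The same shape would apply to the Capsicum case by tagging functions as Capsicum-safe or not and listing the allowed direction in `MAY_USE`.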
>
> The final benefit that we haven't really explored yet for header
> generation is supporting different compiler annotations for API
> contracts that are not expressible in standard C. For example, the
> Windows headers use SAL annotations to define in / out parameters, the
> size of buffers, and so on. There are GNU extensions for some of
> these, but they often go in different places (e.g. as function
> attributes with parameters that index a specific function parameter
> versus parameter attributes). If we encode the high-level contracts
> in the TableGen, then we should be able to generate MS C and GNU C
> variants of the same set of interfaces.
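
To make the placement difference concrete, here is a small Python sketch that renders one high-level contract as either SAL parameter annotations or a GNU function attribute. The contract format and helper names are assumptions for illustration, and the mapping is deliberately simplified: it records only pointer direction and non-nullness (SAL `_In_` / `_Out_`, GNU `nonnull`), not buffer sizes.

```python
# Hypothetical contract: (C type, name, direction) per parameter.
# Direction is "in", "out", or None for non-pointer parameters.
MEMCPY = {
    "ret": "void *",
    "name": "memcpy",
    "params": [
        ("void *", "dest", "out"),
        ("const void *", "src", "in"),
        ("size_t", "n", None),
    ],
}

SAL = {"in": "_In_", "out": "_Out_"}


def decl(ctype, name):
    """Join a type and a name, with no space after a pointer star."""
    return ctype + name if ctype.endswith("*") else ctype + " " + name


def render_sal(fn):
    """SAL style: the annotation sits on each parameter."""
    params = []
    for ctype, name, direction in fn["params"]:
        prefix = SAL[direction] + " " if direction else ""
        params.append(prefix + decl(ctype, name))
    return f"{decl(fn['ret'], fn['name'])}({', '.join(params)});"


def render_gnu(fn):
    """GNU style: one function attribute that indexes parameters (1-based)."""
    plain = ", ".join(decl(t, n) for t, n, _ in fn["params"])
    idx = [str(i) for i, (_, _, d) in enumerate(fn["params"], start=1)
           if d is not None]
    attr = f" __attribute__((nonnull({', '.join(idx)})))" if idx else ""
    return f"{decl(fn['ret'], fn['name'])}({plain}){attr};"


if __name__ == "__main__":
    print(render_sal(MEMCPY))
    print(render_gnu(MEMCPY))
```

Both renderings come from the same table entry, so adding a richer contract (for example a buffer size) would mean extending the entry and the two renderers rather than editing two hand-written headers.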
>
> The TableGen format lets us put a lot more metadata on the functions
> and definitions than we would necessarily want to end up in any given
> build of the headers.
>
> I agree that we are going to end up with TableGen files that are quite
> complex, but I believe that we should end up with a cleaner separation
> of concerns. I have worked on a libc that did this manually, and
> refactoring any of the macro code is very painful because it is all
> very order-dependent and changes have non-local effects. In the
> TableGen world, the back end will parse all of the definitions, build
> the dependency graph, and then generate the macros. A change that
> requires reworking macros across half a dozen files is not a problem
> in this context.
>
> David
>
> On 18/11/2019 23:57, Petr Penzin via libc-dev wrote:
>> Hi,
>>
>> I work on WebAssembly, and I was hoping we would eventually use LLVM
>> libc for an end-to-end Wasm toolchain. I have some questions about the
>> "ground truth" approach to the libc API. I am sorry if these have
>> already been asked; I could not find the answers in the mailing list
>> messages and code reviews.
>>
>> http://lists.llvm.org/pipermail/libc-dev/2019-October/000003.html
>>
>> http://lists.llvm.org/pipermail/libc-dev/2019-October/000009.html
>>
>> I was wondering what API generation buys developers and users. Maybe
>> the question is how previous implementations of libc got away without
>> generating headers, but also whether API generation is a reasonable
>> and foolproof solution.
>>
>> Most importantly, the motivation seems to be that there are a few
>> potential standards a libc implementation needs to comply with. But
>> how many substantially different APIs are there realistically? If it
>> is in the low single digits, does this really make it worth the effort?
>>
>> Secondly, the libc API is not only types and function prototypes; it
>> typically also depends on "feature test macros". I am not sure it
>> is possible to gracefully support those in a generated API. Encoding
>> test macros in API "ground truth" rules would make API rules as
>> complex as C macro code they are trying to replace. Leaving test
>> macros up to the C header files would result in a mix of preprocessor
>> and rule logic which would probably be more confusing than going all
>> the way in either (preprocessor or generation) direction.
>>
>> Finally, a somewhat rhetorical point on precedent and expertise. There
>> is enough precedent for a portable libc API written directly;
>> likewise C/C++ developers can understand and modify C headers without
>> ramp-up - not sure that can be said about tablegen. Writing header
>> files is a relatively simple part of the development process and
>> there is a lot of it happening inside and outside of LLVM.
>>
>>
>> Best,
>>
>> Petr
>>
>>
>> _______________________________________________
>> libc-dev mailing list
>> libc-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/libc-dev
>>