[llvm-dev] Crefl - a Clang plug-in and C-type-reflection-API
Michael Clark via llvm-dev
llvm-dev at lists.llvm.org
Sun May 9 19:51:22 PDT 2021
Hi Folks,
Writing to let you know about a component I have been working on that
could be of interest to the LLVM and Clang community:
> "Crefl - a Clang plug-in and C-type-reflection-API"
The Crefl API and plugin provide access to runtime reflection metadata
for C interfaces supporting arbitrarily nested combinations of:
intrinsic, enum, struct, union, field, array, constant, and function.
Crefl focuses on addressing the following areas:
- a clang plug-in that outputs portable reflection metadata.
- a reflection database format for portable reflection metadata.
- an API that provides task-oriented access to reflection metadata.
I am aware that the C++ standards committee is focusing on compile-time
type reflection for C++, and of similar in-tree work on Clang AST
serialization; this work is intended to be complementary. Crefl is a
small runtime dependency and I am focusing on a portable format and
exposing reflection metadata to C. Crefl itself is written in C++.
The Crefl plugin is nearing beta level interface stability and
reliability. I have been ironing out bugs and API usability issues.
There is now a rudimentary reflection metadata linker that performs
de-duplication using recursive tree hash sums. Linking still needs work
with regard to incomplete types, modules, name indexing and namespaces.
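To illustrate the de-duplication idea, here is a minimal sketch of a recursive tree hash. It is not Crefl's linker code: the `node` type, the FNV-1a style `mix` helper and the function names are all assumptions made for the example. The hash of a subtree folds in the hashes of its children, so structurally identical declarations from different translation units hash alike and become candidates for merging.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node type for illustration; not Crefl's own node layout. */
typedef struct node {
    uint32_t tag;            /* kind of declaration, e.g. struct or field */
    uint32_t value;          /* payload, e.g. a width or a name id */
    struct node **children;  /* child subtrees */
    size_t nchildren;
} node;

/* Mix one 64-bit word, byte by byte, into an FNV-1a style running hash. */
static uint64_t mix(uint64_t h, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        h ^= (v >> (i * 8)) & 0xff;
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Hash of a subtree: the node's own fields plus the hashes of its children. */
uint64_t tree_hash(const node *n)
{
    uint64_t h = 0xcbf29ce484222325ULL;
    h = mix(h, n->tag);
    h = mix(h, n->value);
    h = mix(h, n->nchildren);
    for (size_t i = 0; i < n->nchildren; i++)
        h = mix(h, tree_hash(n->children[i]));
    return h;
}
```

A real linker would presumably cache the hash per node and confirm a match with a structural comparison, since equal hashes do not prove equal trees.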
The Crefl repo contains samples/example2_embed which shows reflection
metadata embedding into a binary. An ASN.1 implementation exists in the
tree and I am currently working on structure packing and alignment. The
intention is to write an ASN.1 serializer for C structures and then use
that to read and write the reflection metadata itself. The metadata is
currently stored using C native structure packing and alignment. The
reflection API is nearly complete and I am now starting on structure
serialization which is the major planned use of the API.
There is a CMake macro that handles invoking the Crefl plugin, merging
the metadata and embedding it into a linkable object file:
include(cmake/crefl_macro.cmake)
add_executable(example2_embed samples/example2_embed/main.c)
crefl_target_reflect(example2_embed example2_embed_refl)
target_link_libraries(example2_embed example2_embed_refl cmodel)
Here is a sample showing how to access the embedded metadata:
#include <assert.h>
#include <stdlib.h>
/* plus the Crefl headers that declare decl_db, decl_ref and crefl_* */

int main(int argc, const char **argv)
{
    decl_db *db = crefl_db_new();
    crefl_db_read_mem(db, __crefl_main_data, __crefl_main_size);

    /* first call with NULL queries the count, second call fills the array */
    size_t nsources = 0;
    crefl_archive_sources(crefl_root(db), NULL, &nsources);
    assert(nsources == 1);
    decl_ref *_sources = calloc(nsources, sizeof(decl_ref));
    assert(_sources);
    crefl_archive_sources(crefl_root(db), _sources, &nsources);

    /* same two-step pattern for the declarations in the first source */
    size_t ntypes = 0;
    crefl_source_decls(_sources[0], NULL, &ntypes);
    decl_ref *_types = calloc(ntypes, sizeof(decl_ref));
    assert(_types);
    crefl_source_decls(_sources[0], _types, &ntypes);

    for (size_t i = 0; i < ntypes; i++) {
        _print(_types[i], 0);
    }

    free(_types);
    free(_sources);
    crefl_db_destroy(db);
    return 0;
}
I would also like to implement a reference counted allocator wrapper
providing for serialization of arbitrary C object graphs. Handling
arrays would need some sort of `alloc(T)(n)` for typed array buffers
perhaps using negative array indices to find object metadata containing
count. That would be to support serialization of strings and arrays ...
((struct _alloc*)ptr)[-1].rc
... using some structures to support reference counting:
struct _alloc { size_t count; dtor_notify_t dtor_notify; rc_t rc; };
struct _ref { void* obj; size_t base; };
struct _weakref { void* obj; size_t base; dtor_notify_t dtor_notify; };
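A minimal sketch of how the negative-index header could work, under several assumptions: `rc_t` is taken to be a plain counter, `dtor_notify_t` a function taking the object pointer, and `alloc_hdr`, `HDR`, `array_alloc`, `array_ref` and `array_unref` are hypothetical names, not part of Crefl. The header mirrors `struct _alloc` and sits directly before the payload, so `[-1]` on the cast payload pointer reaches it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef size_t rc_t;                       /* assumed: plain counter */
typedef void (*dtor_notify_t)(void *obj);  /* assumed signature */

/* Mirrors struct _alloc; over-aligned so that the payload placed right
   after the header is suitably aligned for any object type. */
struct alloc_hdr {
    _Alignas(max_align_t) size_t count;
    dtor_notify_t dtor_notify;
    rc_t rc;
};

/* Reach the header from a payload pointer using a negative index. */
#define HDR(ptr) (((struct alloc_hdr *)(ptr))[-1])

/* Allocate count elements with a header in front; returns the payload. */
void *array_alloc(size_t count, size_t elem_size)
{
    struct alloc_hdr *h;
    if (elem_size != 0 && count > (SIZE_MAX - sizeof *h) / elem_size)
        return NULL;  /* size computation would overflow */
    h = calloc(1, sizeof *h + count * elem_size);
    if (h == NULL)
        return NULL;
    h->count = count;
    h->dtor_notify = NULL;
    h->rc = 1;
    return h + 1;
}

void array_ref(void *ptr)
{
    assert(ptr != NULL);
    HDR(ptr).rc++;
}

/* Drop one reference; returns 1 if the object was destroyed. */
int array_unref(void *ptr)
{
    struct alloc_hdr *h = &HDR(ptr);
    if (--h->rc != 0)
        return 0;
    if (h->dtor_notify)
        h->dtor_notify(ptr);
    free(h);
    return 1;
}
```

With this layout a serializer that is handed only the payload pointer can recover the element count as `HDR(ptr).count`. The padded header is larger than the 16 bytes discussed in this post, so it shows the mechanism rather than the target size.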
Support for arrays and references has not been implemented. To keep the
memory overhead down it may be necessary to compress array dimensions
using a scheme similar to ASN.1 object identifiers, a reverse LEB
encoding or the sds string library scheme for compressed array sizes.
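For reference, here is a sketch of standard unsigned LEB128, one of the variable-length schemes in this family. It is the ordinary forward encoding, not the reverse variant mentioned above, and the function names are hypothetical. Small counts take one byte, and each further 7 bits of magnitude cost one more byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode value as unsigned LEB128 into out (needs up to 10 bytes).
   Returns the number of bytes written. */
size_t uleb128_encode(uint64_t value, uint8_t *out)
{
    size_t n = 0;
    do {
        uint8_t byte = value & 0x7f;
        value >>= 7;
        if (value != 0)
            byte |= 0x80;  /* continuation bit: more bytes follow */
        out[n++] = byte;
    } while (value != 0);
    return n;
}

/* Decode an unsigned LEB128 value from in. Returns the bytes consumed. */
size_t uleb128_decode(const uint8_t *in, uint64_t *value)
{
    uint64_t result = 0;
    unsigned shift = 0;
    size_t n = 0;
    uint8_t byte;
    do {
        byte = in[n++];
        result |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
    } while ((byte & 0x80) && shift < 64);
    *value = result;
    return n;
}
```

A reverse variant would presumably lay the bytes out so they can be read backwards from the object pointer, which is what a header found via negative indices would need.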
The idea to use destructor notifiers is to support zeroing weak
references, and to avoid needing to maintain a secondary weak reference
count and separate allocations for the reference count. This assumes
code using this library to serialize object graphs would use a reference
counting object allocator consistently, i.e. a shared_from_this would
conceptually be a no-op and references could degrade to pointers. The
challenge would be cramming what we need into say 16 bytes and
minimizing alloc/unref overhead. Destructor notifiers might need 32-bit
relative addresses to sufficiently compress function pointers. One might
even need the support of some special relocations in the linker.
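A minimal sketch of a zeroing weak reference driven by a destructor notifier. Every name here (`object`, `weakref`, `object_new`, `weakref_init`, `object_unref`) is hypothetical, the notifier signature is an assumption, and the object carries a single notifier slot, so only one weak reference per object is supported. The point is that the strong count alone decides the lifetime, and the notifier clears the weak reference when the object dies, so no second count is kept.

```c
#include <stddef.h>
#include <stdlib.h>

struct weakref;
typedef void (*dtor_notify_t)(struct weakref *target);  /* assumed signature */

struct object {
    size_t rc;                  /* strong reference count only */
    dtor_notify_t dtor_notify;  /* called once, just before destruction */
    struct weakref *target;     /* argument passed to the notifier */
    int payload;
};

struct weakref { struct object *obj; };

/* Notifier that zeroes the weak reference it is registered for. */
static void zero_weakref(struct weakref *w)
{
    w->obj = NULL;
}

struct object *object_new(int payload)
{
    struct object *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->rc = 1;
    o->dtor_notify = NULL;
    o->target = NULL;
    o->payload = payload;
    return o;
}

/* Point w at o and register the zeroing notifier on o. */
void weakref_init(struct weakref *w, struct object *o)
{
    w->obj = o;
    o->dtor_notify = zero_weakref;
    o->target = w;
}

/* Drop a strong reference; on the last one, notify and then free. */
void object_unref(struct object *o)
{
    if (--o->rc != 0)
        return;
    if (o->dtor_notify)
        o->dtor_notify(o->target);
    free(o);
}
```

After the last `object_unref`, a holder of the weak reference sees `obj == NULL` instead of a dangling pointer.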
The ultimate goal is to expose something like the following to C:
obj_write(T)(stream, obj)
obj_read(T)(stream, obj)
and a reference counting interface with destructor notification:
obj_alloc(T)(obj)
obj_ref(T)(obj)
obj_unref(T)(obj)
obj_dtor_notify(T)(obj,target,func)
obj_dtor_denotify(T)(obj,target,func)
obj_weakref(T)(obj)
obj_weakunref(T)(obj)
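One way the `name(T)(args)` spelling could be realized in plain C is token pasting, sketched below. This is an assumption about the eventual interface, not existing Crefl code: the `point` type, the `FILE *` stream, the `int` return values and the hand-written text format are all placeholders for what a reflection-driven serializer would generate or interpret.

```c
#include <stdio.h>

typedef struct point { int x, y; } point;

/* obj_write(T) expands to the name of a per-type function, so
   obj_write(point)(stream, &p) becomes obj_write_point(stream, &p). */
#define obj_write(T) obj_write_##T
#define obj_read(T)  obj_read_##T

/* Write obj as text; returns 0 on success, -1 on failure. */
int obj_write_point(FILE *stream, const point *obj)
{
    return fprintf(stream, "%d %d\n", obj->x, obj->y) < 0 ? -1 : 0;
}

/* Read obj back from text; returns 0 on success, -1 on failure. */
int obj_read_point(FILE *stream, point *obj)
{
    return fscanf(stream, "%d %d", &obj->x, &obj->y) == 2 ? 0 : -1;
}
```

The type name has to be a single identifier for the paste to work, so struct types would need a typedef.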
The plan is to make something that can be used from C++ as a foundation
for simulation object state serialization, but also making sure that
components using this architecture can be written in C. It could be that
we model simplified classic inheritance and Rust or Zig style traits and
interfaces, the main rationale being that whatever we implement, we
expose a mapping to C. The ideas regarding arrays and references are still
somewhat sketchy. It might be that we need compiler support for references,
bounded arrays and closure scoped destructors in C. Pie in the sky?
In any case, this is mainly a heads up to see if folk are interested and
would care to give feedback or collaborate. The Git repository is here:
- https://github.com/michaeljclark/crefl/
Regards,
Michael