[PATCH] D124221: Add new builtin __builtin_reflect_struct.

Erich Keane via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Apr 22 09:40:09 PDT 2022


erichkeane added a comment.

>> I think you've misunderstood; that is not required. Though with the design as-is, it might make sense to restrict this to being C++-only, given that there's not really a way to effectively use this from C.

Ah, I see the example in SemaCXX.  its sad we can't do this from C, I would consider the use case of `__builtin_dump_struct` to be very handy in C.

I guess I think I would want to start off with, "what do we see the 'usecase' of these builtins" being.  `__builtin_dump_struct` to me seems to have a pretty clear and semantically enforced one: its useful to printf it for debugging purposes.

This seems to be more useful-enough that it leaves me feeling like users would very quickly fall into, "man, I wish this did more" because of it.  Its just a shame we can't get SG7 to work quicker :)



================
Comment at: clang/docs/LanguageExtensions.rst:2434
+
+     bool print(int arg1, int arg2, std::initializer_list<Field> fields) {
+       for (auto f : fields) {
----------------
rsmith wrote:
> erichkeane wrote:
> > I'm curious as to the need for the 'arg1' and 'arg2'?  Also, what is the type of the 'fields' array that works in C?
> This is aimed to fix a major deficiency in `__builtin_dump_struct`, that there's no way to pass information into its callback (other than stashing it in TLS).
> 
> As noted in the patch description, this builtin is not especially usable in C. I'm not sure whether that's fixable, given the very limited metaprogramming support in C.
Presumably one could also put it in the functor for the callback, or even as a capture in the lambda.  Though, I'm not positive I see the VALUE to passing this information, but was just curious as to the intent.


================
Comment at: clang/docs/LanguageExtensions.rst:2437
+         std::cout << f.name;
+         if (f.bitwidth)
+           std::cout << " : " << f.bitwidth;
----------------
rsmith wrote:
> erichkeane wrote:
> > Hmm... what is the type of f.bitwidth?  A zero-length bitwidth is a valid thing, so having it just be an 'int' doesn't seem workable.
> It's a `size_t`. There are no zero-size bitfield members. (Unnamed bitfield aren't members, so are excluded here.) In any case, you can tell the difference between a bitfield and a regular member by the length of the initialiser list; we don't pass a bit width for non-bit-field members.
>>There are no zero-size bitfield members. (Unnamed bitfield aren't members, so are excluded here.)
Hrm... Semantically YES, lexically I'm not sure folks make that differentiation.  One issue with skipping 'unnamed bitfields' is it makes something like:

```
struct Foo {
int : 31; // Take up some space so something else could have been put there.
int field : 1; 
int field2 : 2;
};
```
and comparing `sizeof` the type with the output to be... jarring at least.

>> you can tell the difference between a bitfield and a regular member by the length of the initialiser list;

I see now how you've done that with ctors in C++.  It feels that the way of doing this in C is unfortunately quite clunky (in addition, not being able to use arbitrary types).


================
Comment at: clang/docs/LanguageExtensions.rst:2462
+to the type to reflect. The second argument ``f`` should be some callable
+expression, and can be a function object or an overload set. The builtin calls
+``f``, passing any further arguments ``args...`` followed by an initializer
----------------
rsmith wrote:
> erichkeane wrote:
> > Is the purpose of the overload set/functor so you can support multiple types (seeing how the 'base' below works)?  How does this all work in C? (or do we just do fields in C, since they cannot have the rest?
> > 
> > I wonder if we would be better if we treated this function as a 'callback' or a 'callback set' that returned a discriminated union of a pre-defined type that describes what is being returned.  That way it could handle member functions, static variables, etc.
> This doesn't really work well in C, as noted in the patch description. A set of callbacks might work a bit better across both languages, but I don't think it solves the larger problem that there's no good way to pass type information into a callback in C.
Well, if the callback itself took the discriminated union for each thing, and we did 1 callback per base/field/etc, perhaps that would be useful? I just am having a hard time seeing this being all that useful in C, which is a shame.




================
Comment at: clang/docs/LanguageExtensions.rst:2471
+
+The initializer list contains the following components:
+
----------------
rsmith wrote:
> erichkeane wrote:
> > I find myself wishing this was a more complete list.
> What else do you want? My goal here was to be able to do what `__builtin_dump_struct` does but without its many arbitrary limitations. From C++, this is enough for that.
I think this is again falling down the "this is close enough to reflection, I want reflection" hole :)  I find myself wanting to be able to introspect member functions, static variables/functions/etc.


================
Comment at: clang/docs/LanguageExtensions.rst:2473
+
+* For each direct base class, the address of the base class, in declaration
+  order.
----------------
rsmith wrote:
> erichkeane wrote:
> > I wonder if bases should include their virtualness or access specifier.
> I mean, maybe, but I don't have a use case for either. For access in general my leaning is to say that the built-in doesn't get special access rights -- either it can access everything or you'll get an access error if you use it -- in which case including the access for bases and fields doesn't seem likely to be super useful.
I guess it depends on the use case for these builtins.  I saw the 'dump-struct' builtin's purpose was for quick & dirty printf-debugging.  At which point the format/features were less important than the action.  So from that perspective, I would see this having 'special access rights' as sort of a necessity.  

Though again, this is a significantly more general builtin, which comes with additional potential use cases.


================
Comment at: clang/lib/Sema/SemaChecking.cpp:420
+    auto *FD = IFD ? IFD->getAnonField() : dyn_cast<FieldDecl>(D);
+    if (!FD || (FD->isUnnamedBitfield() || FD->isAnonymousStructOrUnion()))
+      continue;
----------------
rsmith wrote:
> erichkeane wrote:
> > Losing the unnamed bitfield is unfortunate, and I the 'dump struct' handles.  As is losing the anonymous struct/union.
> > 
> > Also, how does this handle 'record type' members?  Does the user have to know the inner type already?  If they already know that much information about the type, they wouldn't really need this, right?
> If `__builtin_dump_struct` includes unnamed bitfields, that's a bug in that builtin.
> 
> In general, the user function gets given the bases and members with the right types, so they can use an overload set or a template to dispatch based on that type. See the SemaCXX test for a basic example of that.
I thought it did?  For the use case I see `__builtin_dump_struct` having, I would think printing the existence of unnamed bitfields to be quite useful.




================
Comment at: clang/test/CodeGenCXX/builtin-reflect-struct.cpp:24
+
+int consume(int, Expected);
+
----------------
rsmith wrote:
> erichkeane wrote:
> > OH! I see now.  This is unfortunately quite limiting by doing it this way.  At this point the user of this builtin is required to know the layout of the struct before calling the builtin, which seems unfortunate. Despite the downsides of being just a 'dump function', the other builtin had the benefit of working on an arbitrary type.
> Well, no, this is set up this way to test code generation. See the SemaCXX test for an example of using this to dump an arbitrary type.
Yep, thanks for that.  I see now how it can be used that way.


================
Comment at: clang/test/SemaCXX/builtin-reflect-struct.cpp:52
+
+struct FieldOrBase {
+  template <typename Base>
----------------
Ah, this is perhaps the example I was looking for.  


================
Comment at: clang/test/SemaCXX/builtin-reflect-struct.cpp:82
+    // Assume it's an integer.
+    T value = t;
+    ConstexprString s = "";
----------------
What would this look like if you needed to look into a union?  Perhaps it would be handled similarly to "__is_class"?  In both builtins, its a shame we don't have a way to properly handle the 'active member'.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124221/new/

https://reviews.llvm.org/D124221



More information about the cfe-commits mailing list