[PATCH] D124221: Add new builtin __builtin_reflect_struct.

Erich Keane via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Apr 22 10:17:03 PDT 2022


erichkeane added inline comments.


================
Comment at: clang/docs/LanguageExtensions.rst:2434
+
+     bool print(int arg1, int arg2, std::initializer_list<Field> fields) {
+       for (auto f : fields) {
----------------
rsmith wrote:
> erichkeane wrote:
> > rsmith wrote:
> > > erichkeane wrote:
> > > > I'm curious as to the need for the 'arg1' and 'arg2'?  Also, what is the type of the 'fields' array that works in C?
> > > This is aimed to fix a major deficiency in `__builtin_dump_struct`, that there's no way to pass information into its callback (other than stashing it in TLS).
> > > 
> > > As noted in the patch description, this builtin is not especially usable in C. I'm not sure whether that's fixable, given the very limited metaprogramming support in C.
> > Presumably one could also put it in the functor for the callback, or even as a capture in the lambda.  Though, I'm not positive I see the VALUE to passing this information, but was just curious as to the intent.
> Suppose you want to dump a struct to a log file, or you want the dump in a string. Or you want to carry along an indentation level. My first attempt at using `__builtin_dump_struct` essentially failed because it doesn't have anything like this; even if we don't want this patch, this is functionality we should add to that builtin.
Ah, I see!  Sure, I can see how those are useful, though again, mostly to the C implementation (as those should all be part of the functor's state), and this doesn't work particularly well in C.


================
Comment at: clang/docs/LanguageExtensions.rst:2462
+to the type to reflect. The second argument ``f`` should be some callable
+expression, and can be a function object or an overload set. The builtin calls
+``f``, passing any further arguments ``args...`` followed by an initializer
----------------
rsmith wrote:
> erichkeane wrote:
> > rsmith wrote:
> > > erichkeane wrote:
> > > > Is the purpose of the overload set/functor so you can support multiple types (seeing how the 'base' below works)?  How does this all work in C? (or do we just do fields in C, since they cannot have the rest?
> > > > 
> > > > I wonder if we would be better if we treated this function as a 'callback' or a 'callback set' that returned a discriminated union of a pre-defined type that describes what is being returned.  That way it could handle member functions, static variables, etc.
> > > This doesn't really work well in C, as noted in the patch description. A set of callbacks might work a bit better across both languages, but I don't think it solves the larger problem that there's no good way to pass type information into a callback in C.
> > Well, if the callback itself took the discriminated union for each thing, and we did 1 callback per base/field/etc, perhaps that would be useful? I just am having a hard time seeing this being all that useful in C, which is a shame.
> > 
> > 
> I think that would work. I'm not sure it would be better, though, given that it loses the ability to build a type whose shape depends on the number of bases and fields, because you're not given them all at once. (Eg, if you want to build a static array of field names.)
Yep, thats definitely true. There are ways to DO that (either a 2-pass at this to get the sizes, allocate something, then go through again filling in your struct OR a constant 'append' thing with a `vector` like type), but I see what you mean.  It seems that either case we are limiting a couple of useful use cases.


================
Comment at: clang/docs/LanguageExtensions.rst:2473
+
+* For each direct base class, the address of the base class, in declaration
+  order.
----------------
rsmith wrote:
> erichkeane wrote:
> > rsmith wrote:
> > > erichkeane wrote:
> > > > I wonder if bases should include their virtualness or access specifier.
> > > I mean, maybe, but I don't have a use case for either. For access in general my leaning is to say that the built-in doesn't get special access rights -- either it can access everything or you'll get an access error if you use it -- in which case including the access for bases and fields doesn't seem likely to be super useful.
> > I guess it depends on the use case for these builtins.  I saw the 'dump-struct' builtin's purpose was for quick & dirty printf-debugging.  At which point the format/features were less important than the action.  So from that perspective, I would see this having 'special access rights' as sort of a necessity.  
> > 
> > Though again, this is a significantly more general builtin, which comes with additional potential use cases.
> The use cases are the things that `__builtin_dump_struct` gets close to but fails at, such as: I have an `EXPECT_EQ` macro that I'd like to log the difference in input and output values where possible, or I want to build a simple print-to-string utility for structs. Or I want a dump to stout but I want it formatted differently from what `__builtin_dump_struct` thinks is best.
> 
> The fact that the dump built-in gets us into this area but can't solve these problems is frustrating.
>>The use cases are the things that __builtin_dump_struct gets close to but fails at, such as: I have an EXPECT_EQ macro that I'd like to log the difference in input and output values where possible

THAT is an interesting use case, I see value to that too.

>>The fact that the dump built-in gets us into this area but can't solve these problems is frustrating.

I definitely see that. FWIW, I don't think I was the one to approve/review the initial version of `__builtin_dump_struct` (and perhaps would have been equally frustrated if I thought that through), but was looking into the patches after the fact to help out.

I guess I find myself equally/additionally/etc frustrated at THIS patch because it gets us so close to everything SG7 promises they can invent.


================
Comment at: clang/lib/Sema/SemaChecking.cpp:420
+    auto *FD = IFD ? IFD->getAnonField() : dyn_cast<FieldDecl>(D);
+    if (!FD || (FD->isUnnamedBitfield() || FD->isAnonymousStructOrUnion()))
+      continue;
----------------
rsmith wrote:
> erichkeane wrote:
> > rsmith wrote:
> > > erichkeane wrote:
> > > > Losing the unnamed bitfield is unfortunate, and I the 'dump struct' handles.  As is losing the anonymous struct/union.
> > > > 
> > > > Also, how does this handle 'record type' members?  Does the user have to know the inner type already?  If they already know that much information about the type, they wouldn't really need this, right?
> > > If `__builtin_dump_struct` includes unnamed bitfields, that's a bug in that builtin.
> > > 
> > > In general, the user function gets given the bases and members with the right types, so they can use an overload set or a template to dispatch based on that type. See the SemaCXX test for a basic example of that.
> > I thought it did?  For the use case I see `__builtin_dump_struct` having, I would think printing the existence of unnamed bitfields to be quite useful.
> > 
> > 
> Why? They're not part of the value, they're just padding.
They are lexically part of the type and are part of what makes up the 'size' of the thing. I would expect something dumping the type to be as consistent lexically as possible.


================
Comment at: clang/lib/Sema/SemaChecking.cpp:420
+    auto *FD = IFD ? IFD->getAnonField() : dyn_cast<FieldDecl>(D);
+    if (!FD || (FD->isUnnamedBitfield() || FD->isAnonymousStructOrUnion()))
+      continue;
----------------
rsmith wrote:
> erichkeane wrote:
> > rsmith wrote:
> > > erichkeane wrote:
> > > > rsmith wrote:
> > > > > erichkeane wrote:
> > > > > > Losing the unnamed bitfield is unfortunate, and I the 'dump struct' handles.  As is losing the anonymous struct/union.
> > > > > > 
> > > > > > Also, how does this handle 'record type' members?  Does the user have to know the inner type already?  If they already know that much information about the type, they wouldn't really need this, right?
> > > > > If `__builtin_dump_struct` includes unnamed bitfields, that's a bug in that builtin.
> > > > > 
> > > > > In general, the user function gets given the bases and members with the right types, so they can use an overload set or a template to dispatch based on that type. See the SemaCXX test for a basic example of that.
> > > > I thought it did?  For the use case I see `__builtin_dump_struct` having, I would think printing the existence of unnamed bitfields to be quite useful.
> > > > 
> > > > 
> > > Why? They're not part of the value, they're just padding.
> > They are lexically part of the type and are part of what makes up the 'size' of the thing. I would expect something dumping the type to be as consistent lexically as possible.
> Looks like `__builtin_dump_struct` currently includes them *and* prints a value (whatever garbage happens to be in those padding bits). That's obviously a bug.
O.o! Yeah, printing the value is nonsense.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124221/new/

https://reviews.llvm.org/D124221



More information about the cfe-commits mailing list