[PATCH] D124221: Reimplement `__builtin_dump_struct` in Sema.
Erich Keane via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu May 5 08:05:13 PDT 2022
erichkeane added a comment.
In D124221#3494000 <https://reviews.llvm.org/D124221#3494000>, @aaron.ballman wrote:
> In D124221#3493792 <https://reviews.llvm.org/D124221#3493792>, @erichkeane wrote:
>
>> FWIW, I'm in favor of the patch as it sits.
>>
>> As a followup: So I was thinking about the "%s" specifier for string types. Assuming char-ptr types are all strings is a LITTLE dangerous, but more so the way we're doing it. Its a shame we don't have some way of setting a 'max' limit to the number of characters we have for 2 reasons:
>>
>> 1- For safety: If the char-ptr points to non-null-terminated memory, it'll stop us from just arbitrarily printing into space by limiting at least the NUMBER of characters we print into nonsense.
>> 2- For readability: printing a 'long' string likely makes this output look like nonsense and breaks everything up. Limiting us to only a few characters is likely a good idea.
>> 3- <Bonus #3 from @aaron.ballman >: It might discourage SOME level of attempts at using this for reflection, or at least make it a little harder.
>>
>> What I would love would be for something like a 10 char max:
>>
>> struct S {
>> char *C;
>> };
>> S s { "The Rest of this string is cut off"};
>> print as:
>> struct U20A a = {
>> .c = 0x1234 "The Rest o"
>> };
>>
>> Sadly, I don't see something like that in printf specifiers? Unless someone smarter than me can come up with some trickery. PERHAPS have the max-limit compile-time configurable, but I don't feel strongly.
>
> The C Standard has this in the specification of the %s format specifier:
>
> If no l length modifier is present, the argument shall be a pointer to storage of character
> type. Characters from the storage are written up to (but not including) the terminating
> null character. If the precision is specified, no more than that many bytes are written. If
> the precision is not specified or is greater than the size of the storage, the storage shall
> contain a null character.
>
> So you can use the precision modifier on %s to limit the length to a particular number of bytes. The only downside I can think of to picking a limit is, what happens when the user stores valid UTF-8 data in their string and prints it via `%.10s` (will we then potentially be splitting a codepoint in half and that does something bad?
Ah! TIL! That I think would be an excellent improvement to this builtin. I suspect we could just choose a fixed value to start (maybe even in this patch @rsmith ?), then add a compile-time option in the future if folks care to extend it. I would posit that 10-15 would be a good start? Any shorter would be prohibitive I think? I could be talked into a little longer...
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D124221/new/
https://reviews.llvm.org/D124221
More information about the cfe-commits
mailing list