[PATCH] D124221: Reimplement `__builtin_dump_struct` in Sema.
Aaron Ballman via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu May 5 07:59:25 PDT 2022
aaron.ballman added a comment.
In D124221#3493792 <https://reviews.llvm.org/D124221#3493792>, @erichkeane wrote:
> FWIW, I'm in favor of the patch as it sits.
>
> As a followup: So I was thinking about the "%s" specifier for string types. Assuming char-ptr types are all strings is a LITTLE dangerous, but more so the way we're doing it. Its a shame we don't have some way of setting a 'max' limit to the number of characters we have for 2 reasons:
>
> 1- For safety: If the char-ptr points to non-null-terminated memory, it'll stop us from just arbitrarily printing into space by limiting at least the NUMBER of characters we print into nonsense.
> 2- For readability: printing a 'long' string likely makes this output look like nonsense and breaks everything up. Limiting us to only a few characters is likely a good idea.
> 3- <Bonus #3 from @aaron.ballman >: It might discourage SOME level of attempts at using this for reflection, or at least make it a little harder.
>
> What I would love would be for something like a 10 char max:
>
> struct S {
> char *C;
> };
> S s { "The Rest of this string is cut off"};
> print as:
> struct U20A a = {
> .c = 0x1234 "The Rest o"
> };
>
> Sadly, I don't see something like that in printf specifiers? Unless someone smarter than me can come up with some trickery. PERHAPS have the max-limit compile-time configurable, but I don't feel strongly.
The C Standard has this in the specification of the %s format specifier:
If no l length modifier is present, the argument shall be a pointer to storage of character
type. Characters from the storage are written up to (but not including) the terminating
null character. If the precision is specified, no more than that many bytes are written. If
the precision is not specified or is greater than the size of the storage, the storage shall
contain a null character.
So you can use the precision modifier on %s to limit the length to a particular number of bytes. The only downside I can think of to picking a limit is, what happens when the user stores valid UTF-8 data in their string and prints it via `%.10s` (will we then potentially be splitting a codepoint in half and that does something bad?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D124221/new/
https://reviews.llvm.org/D124221
More information about the cfe-commits
mailing list