[libc-commits] [PATCH] D74021: Created uChar implementation for libc
Marcus Johnson via Phabricator via libc-commits
libc-commits at lists.llvm.org
Wed Mar 11 22:33:37 PDT 2020
MarcusJohnson91 added a comment.
Bump
================
Comment at: libc/src/uchar/c16rtomb.cpp:16
+
+size_t LLVM_LIBC_ENTRYPOINT(c16rtomb)(char *restrict s, char16_t c16,
+ mbstate_t *restrict ps) {
----------------
sivachandra wrote:
> It would be good if you can add comments explaining the different cases that being handled here.
0xD800 - 0xDFFF is a surrogate pair, the c16 parameter can not contain the 2 16 bit code units required to actually decode it to the proper UTF-32 codepoint.
so should I try to make the decoder stateful in order to make that happen (not sure if that's even possible), or should I just replace it with the invalid replacement character?
================
Comment at: libc/src/uchar/c16rtomb.cpp:25
+ StringSize = 1;
+ s = new char(StringSize);
+ s[0] = c16 & 0x7F;
----------------
abrachet wrote:
> For what its worth, this allocates one char and initializes it to StringSize. If you wanted to allocate StringSize chars you need `new char[StringSize]`. But `new` isn't safe to use for us, and I think it wouldn't even link because we have `-nostdlib` specified.
>
>
> ```lang=C
> void changeParam(int a) {
> a = 5;
> }
>
> int main() {
> int a = 0;
> changeParam(a);
> assert(!a); // passes, a not changed.
> }
> ```
> The same is true with pointers if I pass my `char *` to c16rtomb, if you reassign to it by malloc or new, it changes the local variable but not the callers variable. These functions (unless passed a nullptr) assume that `s` is already pointing to enough valid memory.
Ahh, I appreciate the tip on new, I'm new to C++.
As for if the string is allocated I'm trying to look up some examples of how these APIs are called on github, it'll be a lot harder to verify if a string can hold a codepoint/codeunit if it's pre-allocated, because each index would look like a null terminater to begin with, right?
================
Comment at: libc/src/uchar/mbrtoc32.cpp:17
+ size_t StringSize = 1ULL;
+ char32_t Decoded = calloc(1, sizeof(char32_t));
+
----------------
abrachet wrote:
> What's the point of this?
What's the point of what?
the char32_t variable being initialized as an array instead of a plain value?
The standard requires it, I agree it's dumb, but that's the API callers expect.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D74021/new/
https://reviews.llvm.org/D74021
More information about the libc-commits
mailing list