[libc-commits] [libc] [libc] utf8 to 32 CharacterConverter (PR #143973)

Brooks Moses via libc-commits libc-commits at lists.llvm.org
Fri Jun 13 15:51:22 PDT 2025


================
@@ -22,13 +25,65 @@ bool CharacterConverter::isComplete() {
   return state->bytes_processed == state->total_bytes;
 }
 
-int CharacterConverter::push(char8_t utf8_byte) {}
+int CharacterConverter::push(char8_t utf8_byte) {
+  // Checking the first byte if first push
+  if (state->bytes_processed == 0 && state->total_bytes == 0) {
+    state->partial = static_cast<char32_t>(0);
+    uint8_t numOnes = static_cast<uint8_t>(cpp::countl_one(utf8_byte));
+    // 1 byte total
----------------
brooksmoses wrote:

This comment should make clear what "1 byte total" refers to. Perhaps "The UTF-8 character has 1 byte total" or something like that?

https://github.com/llvm/llvm-project/pull/143973

