[libc-commits] [libc] [libc] Implemented CharacterConverter push/pop for utf32->utf8 conversions (PR #143971)

Michael Jones via libc-commits libc-commits at lists.llvm.org
Fri Jun 13 10:19:28 PDT 2025


================
@@ -47,24 +47,31 @@ ErrorOr<char8_t> CharacterConverter::pop_utf8() {
   if (state->bytes_processed >= state->total_bytes)
     return Error(-1);
 
-  const char8_t first_byte_headers[] = {0, 0xC0, 0xE0, 0xF0};
-  const char32_t utf32 = state->partial;
-  const char32_t tot_bytes = state->total_bytes;
-  const char32_t bytes_proc = state->bytes_processed;
+  const char8_t FIRST_BYTE_HEADERS[] = {0, 0xC0, 0xE0, 0xF0};
+  const char8_t CONTINUING_BYTE_HEADER = 0x80;
+
+  // the number of bits per utf-8 byte that actually encode character
+  // information not metadata (# of bits excluding the byte headers)
+  const int ENCODED_BITS_PER_BYTE = 6;
+  const int MASK_LOWER_SIX = 0x3f;
----------------
michaelrj-google wrote:

`MASK_LOWER_SIX` can be `mask_trailing_ones<int, ENCODED_BITS_PER_BYTE>()` instead of the hard-coded `0x3f`, so the mask is derived from `ENCODED_BITS_PER_BYTE`.
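
For illustration, a minimal standalone sketch of the idea: deriving the mask from the bit count instead of spelling out the literal. `trailing_ones_mask` and `continuation_byte` are hypothetical local stand-ins written for this example, not the libc helper or the PR's code. The sketch uses `std::uint32_t`; whether the real `mask_trailing_ones` accepts `int` as its type argument should be checked against its declaration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a "mask of Count trailing ones" helper.
// Restricted to std::uint32_t here to avoid integer-promotion surprises.
template <std::size_t Count>
constexpr std::uint32_t trailing_ones_mask() {
  static_assert(Count <= 32, "mask wider than the type");
  if constexpr (Count == 0)
    return 0;
  else
    return ~std::uint32_t{0} >> (32 - Count);
}

// Names mirror the constants in the diff above.
constexpr std::size_t ENCODED_BITS_PER_BYTE = 6;
constexpr std::uint32_t MASK_LOWER_SIX =
    trailing_ones_mask<ENCODED_BITS_PER_BYTE>();
constexpr std::uint8_t CONTINUING_BYTE_HEADER = 0x80;

// The derived mask equals the literal it replaces.
static_assert(MASK_LOWER_SIX == 0x3F, "six trailing ones");

// Hypothetical helper: builds the UTF-8 continuation byte that carries the
// 6-bit group `index_from_last` positions before the final byte.
constexpr std::uint8_t continuation_byte(std::uint32_t code_point,
                                         unsigned index_from_last) {
  const std::uint32_t shifted =
      code_point >> (ENCODED_BITS_PER_BYTE * index_from_last);
  return static_cast<std::uint8_t>(CONTINUING_BYTE_HEADER |
                                   (shifted & MASK_LOWER_SIX));
}

// U+20AC encodes as E2 82 AC; the last two bytes are continuation bytes.
static_assert(continuation_byte(0x20AC, 0) == 0xAC, "last byte");
static_assert(continuation_byte(0x20AC, 1) == 0x82, "middle byte");
```

With this shape, changing `ENCODED_BITS_PER_BYTE` would update the mask automatically, which is the point of the suggestion.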

https://github.com/llvm/llvm-project/pull/143971