[PATCH] D76291: [Support] Fix formatted_raw_ostream for UTF-8
Hubert Tong via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed Mar 18 08:42:09 PDT 2020
hubert.reinterpretcast added inline comments.
================
Comment at: llvm/lib/Support/FormattedStream.cpp:59
+ size_t NumBytes = getNumBytesForUTF8(PartialUTF8Char[0]);
+ if (NumBytes > (PartialUTF8Char.size() + Size)) {
+ // If we still don't have enough bytes for a complete code point, just
----------------
The overflow-free version should be preferred: `Size < NumBytes - PartialUTF8Char.size()`
Considering that `NumBytes - PartialUTF8Char.size()` is already used in the `else`, this might as well be named `NumBytesNeededFromBuffer` first (at which point it would be the only use of `NumBytes`, so we can get rid of `NumBytes`):
```
size_t NumBytesNeededFromBuffer =
getNumBytesForUTF8(PartialUTF8Char[0]) - PartialUTF8Char.size();
if (Size < NumBytesNeededFromBuffer) {
```
================
Comment at: llvm/lib/Support/FormattedStream.cpp:87
+ // even if the buffer is being flushed.
+ if ((Ptr + NumBytes) > End) {
+ PartialUTF8Char = StringRef(Ptr, End - Ptr);
----------------
Prefer the overflow-free version: `End - Ptr < NumBytes`. If inclined to name `End - Ptr` (it would occur twice), `BytesAvailable` makes sense.
================
Comment at: llvm/unittests/Support/formatted_raw_ostream_test.cpp:88
+
+TEST(formatted_raw_ostreamTest, Test_UTF8) {
+ SmallString<128> A;
----------------
ostannard wrote:
> hubert.reinterpretcast wrote:
> > Should there be a test for combining characters?
> This doesn't support combining characters, I don't think there's much point in adding a test for something which doesn't work.
Got it; thanks.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D76291/new/
https://reviews.llvm.org/D76291
More information about the cfe-commits
mailing list