[PATCH] D74731: [Clangd] Fixed assertion when processing extended ASCII characters.
Yancheng Zheng via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Feb 17 12:30:49 PST 2020
AnakinZheng marked an inline comment as done.
AnakinZheng added inline comments.
================
Comment at: clang-tools-extra/clangd/SourceCode.cpp:62
unsigned char C = static_cast<unsigned char>(U8[I]);
- if (LLVM_LIKELY(!(C & 0x80))) { // ASCII character.
+ if (LLVM_LIKELY(!(C & 0x100))) { // ASCII or extended ASCII character.
if (CB(1, 1))
----------------
kadircet wrote:
> AnakinZheng wrote:
> > kadircet wrote:
> > > AnakinZheng wrote:
> > > > kadircet wrote:
> > > > > `C` is one byte long, it is not possible for it to ever satisfy `C&0x100`.
> > > > Updated the patch, now it should be correct.
> > > sorry but this is still the same since `C` is an `unsigned char` it is always guaranteed to be less than or equal to 255.
> > >
> > > As @sammccall explained above, we should rather re-arrange the logic to get rid of the assertion below, e.g. when `UTF8Length` is less than 2 or more than 4 we can choose to interpret it as ascii, instead of asserting, while commenting the reason for this choice.
> > It's unsigned short in the new patch. Ok, I'll try @sammccall 's suggestion.
> > It's unsigned short in the new patch.
>
> Sorry I missed that change, but still `U8[I]` is a one byte element(it is of type `char`), you can't get away by just casting it to `unsigned short`.
Yup, just aware of that, @sammccall 's suggestion looks reasonable, implementing now.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D74731/new/
https://reviews.llvm.org/D74731
More information about the cfe-commits
mailing list