[PATCH] D97663: [llvm-objcopy] Fix crash for binary input files with non-ascii names
Fangrui Song via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 4 14:33:48 PST 2021
MaskRay accepted this revision.
MaskRay added a comment.
This revision is now accepted and ready to land.
In D97663#2602894 <https://reviews.llvm.org/D97663#2602894>, @jhenderson wrote:
> In D97663#2595340 <https://reviews.llvm.org/D97663#2595340>, @MaskRay wrote:
>
>> C11 says
>>
>>> The header <ctype.h> declares several functions useful for classifying and mapping characters.198) In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined.
>>
>> The crash problem you described may be similar to https://drewdevault.com/2020/09/25/A-story-of-two-libcs.html
>> With certain values (negative ones?) glibc may crash.
>
> Right, the the problem with the current code before the patch - by using a non-ascii character, some of the bytes of a multi-byte character invariably end up in the undefined range that causes problems with `isalnum`. It's basically the same as other issues I've been involved with in the past. Using `llvm::isAlnum` fixes the issue, since it doesn't have the same undefined behaviour.
OK, I did not check that `llvm::isAlnum` accepts `char` instead of `int`.
Still worth clarifying that this is fixing UB, which may be a crash on some implementations (glibc? Though I cannot reproduce it).
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D97663/new/
https://reviews.llvm.org/D97663
More information about the llvm-commits
mailing list