[PATCH] D97663: [llvm-objcopy] Fix crash for binary input files with non-ascii names

Fangrui Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 4 14:33:48 PST 2021


MaskRay accepted this revision.
MaskRay added a comment.
This revision is now accepted and ready to land.

In D97663#2602894 <https://reviews.llvm.org/D97663#2602894>, @jhenderson wrote:

> In D97663#2595340 <https://reviews.llvm.org/D97663#2595340>, @MaskRay wrote:
>
>> C11 says
>>
>>> The header <ctype.h> declares several functions useful for classifying and mapping characters.198) In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined.
>>
>> The crash problem you described may be similar to https://drewdevault.com/2020/09/25/A-story-of-two-libcs.html
>> With certain values (negative ones?) glibc may crash.
>
> Right, the the problem with the current code before the patch - by using a non-ascii character, some of the bytes of a multi-byte character invariably end up in the undefined range that causes problems with `isalnum`. It's basically the same as other issues I've been involved with in the past. Using `llvm::isAlnum` fixes the issue, since it doesn't have the same undefined behaviour.

OK, I did not check that `llvm::isAlnum` accepts `char` instead of `int`.

Still worth clarifying that this is fixing UB, which may be a crash on some implementations (glibc? Though I cannot reproduce it).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97663/new/

https://reviews.llvm.org/D97663



More information about the llvm-commits mailing list