[PATCH] D97663: [llvm-objcopy] Fix crash for binary input files with non-ascii names

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 4 02:49:15 PST 2021


jhenderson added a comment.

In D97663#2595340 <https://reviews.llvm.org/D97663#2595340>, @MaskRay wrote:

> C11 says
>
>> The header <ctype.h> declares several functions useful for classifying and mapping characters.198) In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined.
>
> The crash problem you described may be similar to https://drewdevault.com/2020/09/25/A-story-of-two-libcs.html
> With certain values (negative ones?) glibc may crash.

Right, the the problem with the current code before the patch - by using a non-ascii character, some of the bytes of a multi-byte character invariably end up in the undefined range that causes problems with `isalnum`. It's basically the same as other issues I've been involved with in the past. Using `llvm::isAlnum` fixes the issue, since it doesn't have the same undefined behaviour.



================
Comment at: llvm/tools/llvm-objcopy/ELF/Object.cpp:1294
   std::replace_if(std::begin(SanitizedFilename), std::end(SanitizedFilename),
-                  [](char C) { return !isalnum(C); }, '_');
+                  [](char C) { return !isAlnum(C); }, '_');
   Twine Prefix = Twine("_binary_") + SanitizedFilename;
----------------
MaskRay wrote:
> may need an `unsigned char` cast.
Why? The signature for `isAlnum` is `bool isAlnum(char C)`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97663/new/

https://reviews.llvm.org/D97663



More information about the llvm-commits mailing list