[PATCH] D37331: [ELF] Prevent crash with binary inputs with non-ascii file names

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 6 04:44:15 PDT 2017


jhenderson added inline comments.


================
Comment at: ELF/InputFiles.cpp:934
   for (size_t I = 0; I < S.size(); ++I)
-    if (!isalnum(S[I]))
+    if (!isalnum(static_cast<unsigned char>(S[I])))
       S[I] = '_';
----------------
ruiu wrote:
> jhenderson wrote:
> > ruiu wrote:
> > > jhenderson wrote:
> > > > ruiu wrote:
> > > > > isalnum might be locale-aware, so it is probably not safe to use here in the first place. Doing it manually (i.e. `'a' <= S[I] <= 'z' || 'A' <= S[I] <= 'Z' || '0' <= S[I] <= '0'` ) seems better.
> > > > isalnum is locale aware, but as far as I can see, we never change the locale away from the default C locale, so do we need to worry about it?
> > > Not all people are using their OSes in English/US UI, and I think if system's default locale is not C, isalpha could behave differently depending on the definition of "alphabet" in that system locale.
> > By default, the system locale is unimportant. According to the C99 standard (I don't have a copy of the later C standard), the locale on program start up is always "C". We do not call setlocale anywhere in LLVM or LLD, so we are always in the "C" locale at the time this is called. The "C" locale means that the characters that isalnum is true for are [A-Za-z0-9].
> > 
> > So, I think the question should be, are we ever going to set LLD to run in a different locale to the "C" locale?
> LLVM or lld may be used as a library, so if a main program sets a locale, it affects our code.
That's a fair point. I'll make that change.


https://reviews.llvm.org/D37331





More information about the llvm-commits mailing list