[PATCH] D74473: [analyzer] StdLibraryFunctionsChecker: Use platform dependent EOF and UCharMax

Gabor Marton via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Feb 18 08:53:23 PST 2020


martong marked an inline comment as done.
martong added inline comments.


================
Comment at: clang/lib/StaticAnalyzer/Core/CheckerHelpers.cpp:114-117
+  const auto MacroIt = llvm::find_if(
+      PP.macros(), [&](const auto &K) { return K.first->getName() == Macro; });
+  if (MacroIt == PP.macro_end())
+    return llvm::None;
----------------
balazske wrote:
> martong wrote:
> > martong wrote:
> > > Szelethus wrote:
> > > > NoQ wrote:
> > > > > martong wrote:
> > > > > > Szelethus wrote:
> > > > > > > This seems a bit clunky even for the `Preprocessor` -- how about
> > > > > > > 
> > > > > > > ```lang=c++
> > > > > > > const auto *MacroII = PP.getIdentifierInfo(Macro);
> > > > > > > if (!MacroII)
> > > > > > >   return;
> > > > > > > const MacroInfo *MI = PP.getMacroInfo(MacroII);
> > > > > > > assert(MI);
> > > > > > > ```
> > > > > > Ok, but we cannot assert on `MI`, because there may be cases when the macro is not defined in a TU. In that case we should just return with None.
> > > > > What exactly happens when the macro is `#undef`-ined and redefined? We get the last redefinition that's valid at the end of the translation unit, right? Can we check whether there are multiple definitions and guard against that?
> > > > Ugh, now that you say it that is a valid concern. I had to deal with that back in the day: https://reviews.llvm.org/D52794?id=171962#inline-476352
> > > Solving this does not seem easy in my opinion. To handle `#undef`s we should build an infrastructure where summaries can reference callable objects. This is necessary, because in `evalCall` the value of `EOF` would depend on the souce location of the call expression of the function with the summary. Not impossible to solve, but certainly introduces complexity. Do you think that the redefinition of EOF is so common? I mean maybe it is too much hassle for a very rare edge case (?).
> > The standard library (libc or libc++) should define EOF consitently in stdio.h.
> > Now, if the application redefines the value of EOF then the code could be broken, or at least it would not be compatible with libc. Consider the following code that is perfectly legit if we don't redefine EOF, but if we do:
> > ```
> > #include <stdio.h>
> > #include <stdlib.h>
> > #define EOF -2 // Here be dragons !!!
> > int main(void)
> > {
> >     FILE* fp = fopen("test.txt", "r");
> >     int c;
> >     while ((c = fgetc(fp)) != EOF) { // BOOM: Infinite loop !!!
> >        putchar(c);
> >     }
> >     fclose(fp);
> > }
> > ```
> > So, redefinition of EOF (or any standard macro) results in broken code.
> > And this is also a warning:
> > ```
> > ) clang eof-inf.c
> > eof-inf.c:3:9: warning: 'EOF' macro redefined [-Wmacro-redefined]
> > #define EOF -2 // Here be dragons !!!
> >         ^
> > /usr/include/x86_64-linux-gnu/bits/libio.h:66:10: note: previous definition is here
> > # define EOF (-1)
> >          ^
> > 1 warning generated.
> > ```
> It should be possible to get the list of all macro definitions (of a macro). We can select the first item, or one that is inside a system header (by the SourceLocation).
Actually, redefinition of a macro in a standard library is undefined behavior:

C++11:
```
17.6.4.3.1
[macro.names]
1) A translation unit that includes a standard library header shall not #define or #undef names declared in any standard library header.
```
and
```
[reserved.names]
1) The C ++ standard library reserves the following kinds of names:
— macros
— global names
— names with external linkage
2) If a program declares or defines a name in a context where it is reserved, other than as explicitly allowed by this Clause, its behavior is undefined.
```

Also, this is stated in C99 (7.1).
Plus[[ https://www.gnu.org/software/libc/manual/html_node/Reserved-Names.html |  from libc manual ]]:
```
The names of all library types, macros, variables and functions that come from the ISO C standard are reserved unconditionally; your program may not redefine these names. All other library names are reserved if your program explicitly includes the header file that defines or declares them. 
```

Thus, I don't think we should come up with solutions to handle already broken code (i.e. programs with undefined behavior).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74473/new/

https://reviews.llvm.org/D74473





More information about the cfe-commits mailing list