[clang] 71d83bb - Add -fkeep-system-includes modifier for -E

Paul Robinson via cfe-commits cfe-commits at lists.llvm.org
Fri Oct 6 13:00:11 PDT 2023


Author: Paul Robinson
Date: 2023-10-06T12:55:48-07:00
New Revision: 71d83bb426104fb77605382c61968aaf55b537b8

URL: https://github.com/llvm/llvm-project/commit/71d83bb426104fb77605382c61968aaf55b537b8
DIFF: https://github.com/llvm/llvm-project/commit/71d83bb426104fb77605382c61968aaf55b537b8.diff

LOG: Add -fkeep-system-includes modifier for -E

This option will cause -E to preserve the #include directives
for system headers, rather than expanding them into the output.
This can greatly reduce the volume of preprocessed source text
in a test case, making test case reduction simpler.

Note that -fkeep-system-includes is not always appropriate. For
example, if the problem you want to reproduce is induced by a
system header file, it's better to expand those headers fully.
If your source defines symbols that influence the content of a
system header (e.g., _POSIX_SOURCE) then -E will eliminate the
definition, potentially changing the meaning of the preprocessed
source. If you use -isystem to point to non-system headers, for
example to suppress warnings in third-party software, those will
not be expanded and might make the preprocessed source less useful
as a test case.

Added: 
    clang/test/Frontend/Inputs/dashE/dashE.h
    clang/test/Frontend/Inputs/dashE/sys/a.h
    clang/test/Frontend/Inputs/dashE/sys/b.h
    clang/test/Frontend/dashE-sysincludes.cpp

Modified: 
    clang/docs/CommandGuide/clang.rst
    clang/docs/ReleaseNotes.rst
    clang/include/clang/Basic/DiagnosticDriverKinds.td
    clang/include/clang/Driver/Options.td
    clang/include/clang/Frontend/PreprocessorOutputOptions.h
    clang/lib/Driver/ToolChains/Clang.cpp
    clang/lib/Frontend/PrintPreprocessedOutput.cpp
    clang/test/Preprocessor/minimize-whitespace-messages.c

Removed: 
    


################################################################################
diff  --git a/clang/docs/CommandGuide/clang.rst b/clang/docs/CommandGuide/clang.rst
index 139c8f25137d3fb..e1c872cdc55396a 100644
--- a/clang/docs/CommandGuide/clang.rst
+++ b/clang/docs/CommandGuide/clang.rst
@@ -684,6 +684,21 @@ Preprocessor Options
 
   Do not search clang's builtin directory for include files.
 
+.. option:: -fkeep-system-includes
+
+  Usable only with :option:`-E`. Do not copy the preprocessed content of
+  "system" headers to the output; instead, preserve the #include directive.
+  This can greatly reduce the volume of text produced by :option:`-E` which
+  can be helpful when trying to produce a "small" reproduceable test case.
+
+  This option does not guarantee reproduceability, however. If the including
+  source defines preprocessor symbols that influence the behavior of system
+  headers (for example, ``_XOPEN_SOURCE``) the operation of :option:`-E` will
+  remove that definition and thus can change the semantics of the included
+  header. Also, using a 
diff erent version of the system headers (especially a
+  
diff erent version of the STL) may result in 
diff erent behavior. Always verify
+  the preprocessed file by compiling it separately.
+
 
 ENVIRONMENT
 -----------

diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 02c07166f89a887..d2a435041a1542f 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -152,6 +152,7 @@ Non-comprehensive list of changes in this release
 
 New Compiler Flags
 ------------------
+
 * ``-fverify-intermediate-code`` and its complement ``-fno-verify-intermediate-code``.
   Enables or disables verification of the generated LLVM IR.
   Users can pass this to turn on extra verification to catch certain types of
@@ -159,6 +160,10 @@ New Compiler Flags
   Since enabling the verifier adds a non-trivial cost of a few percent impact on
   build times, it's disabled by default, unless your LLVM distribution itself is
   compiled with runtime checks enabled.
+* ``-fkeep-system-includes`` modifies the behavior of the ``-E`` option,
+  preserving ``#include`` directives for "system" headers instead of copying
+  the preprocessed text to the output. This can greatly reduce the size of the
+  preprocessed output, which can be helpful when trying to reduce a test case.
 
 Deprecated Compiler Flags
 -------------------------

diff  --git a/clang/include/clang/Basic/DiagnosticDriverKinds.td b/clang/include/clang/Basic/DiagnosticDriverKinds.td
index 91a95def4f80de4..9c00fa50bb95580 100644
--- a/clang/include/clang/Basic/DiagnosticDriverKinds.td
+++ b/clang/include/clang/Basic/DiagnosticDriverKinds.td
@@ -182,8 +182,8 @@ def err_drv_invalid_Xopenmp_target_with_args : Error<
   "invalid -Xopenmp-target argument: '%0', options requiring arguments are unsupported">;
 def err_drv_argument_only_allowed_with : Error<
   "invalid argument '%0' only allowed with '%1'">;
-def err_drv_minws_unsupported_input_type : Error<
-  "'-fminimize-whitespace' invalid for input of type %0">;
+def err_drv_opt_unsupported_input_type : Error<
+  "'%0' invalid for input of type %1">;
 def err_drv_amdgpu_ieee_without_no_honor_nans : Error<
   "invalid argument '-mno-amdgpu-ieee' only allowed with relaxed NaN handling">;
 def err_drv_argument_not_allowed_with : Error<

diff  --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 381126be176ac4f..5415b18d3f406df 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -2519,6 +2519,17 @@ defm minimize_whitespace : BoolFOption<"minimize-whitespace",
           "whitespace such that two files with only formatting changes are "
           "equal.\n\nOnly valid with -E on C-like inputs and incompatible "
           "with -traditional-cpp.">, NegFlag<SetFalse>>;
+defm keep_system_includes : BoolFOption<"keep-system-includes",
+  PreprocessorOutputOpts<"KeepSystemIncludes">, DefaultFalse,
+  PosFlag<SetTrue, [], [ClangOption, CC1Option],
+          "Instead of expanding system headers when emitting preprocessor "
+          "output, preserve the #include directive. Useful when producing "
+          "preprocessed output for test case reduction. May produce incorrect "
+          "output if preprocessor symbols that control the included content "
+          "(e.g. _XOPEN_SOURCE) are defined in the including source file. The "
+          "portability of the resulting source to other compilation environments "
+          "is not guaranteed.\n\nOnly valid with -E.">,
+  NegFlag<SetFalse>>;
 
 def ffreestanding : Flag<["-"], "ffreestanding">, Group<f_Group>,
   Visibility<[ClangOption, CC1Option]>,

diff  --git a/clang/include/clang/Frontend/PreprocessorOutputOptions.h b/clang/include/clang/Frontend/PreprocessorOutputOptions.h
index d542032431b14c8..db2ec9f2ae20698 100644
--- a/clang/include/clang/Frontend/PreprocessorOutputOptions.h
+++ b/clang/include/clang/Frontend/PreprocessorOutputOptions.h
@@ -26,6 +26,7 @@ class PreprocessorOutputOptions {
   unsigned RewriteImports  : 1;    ///< Include contents of transitively-imported modules.
   unsigned MinimizeWhitespace : 1; ///< Ignore whitespace from input.
   unsigned DirectivesOnly : 1; ///< Process directives but do not expand macros.
+  unsigned KeepSystemIncludes : 1; ///< Do not expand system headers.
 
 public:
   PreprocessorOutputOptions() {
@@ -40,6 +41,7 @@ class PreprocessorOutputOptions {
     RewriteImports = 0;
     MinimizeWhitespace = 0;
     DirectivesOnly = 0;
+    KeepSystemIncludes = 0;
   }
 };
 

diff  --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index c74f6ff447261dc..bfd6c5c2864abf7 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -68,7 +68,9 @@ using namespace llvm::opt;
 static void CheckPreprocessingOptions(const Driver &D, const ArgList &Args) {
   if (Arg *A = Args.getLastArg(clang::driver::options::OPT_C, options::OPT_CC,
                                options::OPT_fminimize_whitespace,
-                               options::OPT_fno_minimize_whitespace)) {
+                               options::OPT_fno_minimize_whitespace,
+                               options::OPT_fkeep_system_includes,
+                               options::OPT_fno_keep_system_includes)) {
     if (!Args.hasArg(options::OPT_E) && !Args.hasArg(options::OPT__SLASH_P) &&
         !Args.hasArg(options::OPT__SLASH_EP) && !D.CCCIsCPP()) {
       D.Diag(clang::diag::err_drv_argument_only_allowed_with)
@@ -6717,11 +6719,21 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
                    options::OPT_fno_minimize_whitespace, false)) {
     types::ID InputType = Inputs[0].getType();
     if (!isDerivedFromC(InputType))
-      D.Diag(diag::err_drv_minws_unsupported_input_type)
-          << types::getTypeName(InputType);
+      D.Diag(diag::err_drv_opt_unsupported_input_type)
+          << "-fminimize-whitespace" << types::getTypeName(InputType);
     CmdArgs.push_back("-fminimize-whitespace");
   }
 
+  // -fno-keep-system-includes is default.
+  if (Args.hasFlag(options::OPT_fkeep_system_includes,
+                   options::OPT_fno_keep_system_includes, false)) {
+    types::ID InputType = Inputs[0].getType();
+    if (!isDerivedFromC(InputType))
+      D.Diag(diag::err_drv_opt_unsupported_input_type)
+          << "-fkeep-system-includes" << types::getTypeName(InputType);
+    CmdArgs.push_back("-fkeep-system-includes");
+  }
+
   // -fms-extensions=0 is default.
   if (Args.hasFlag(options::OPT_fms_extensions, options::OPT_fno_ms_extensions,
                    IsWindowsMSVC))

diff  --git a/clang/lib/Frontend/PrintPreprocessedOutput.cpp b/clang/lib/Frontend/PrintPreprocessedOutput.cpp
index f86ba08d36223be..7f5f6690682300e 100644
--- a/clang/lib/Frontend/PrintPreprocessedOutput.cpp
+++ b/clang/lib/Frontend/PrintPreprocessedOutput.cpp
@@ -97,6 +97,9 @@ class PrintPPOutputPPCallbacks : public PPCallbacks {
   bool IsFirstFileEntered;
   bool MinimizeWhitespace;
   bool DirectivesOnly;
+  bool KeepSystemIncludes;
+  raw_ostream *OrigOS;
+  std::unique_ptr<llvm::raw_null_ostream> NullOS;
 
   Token PrevTok;
   Token PrevPrevTok;
@@ -105,12 +108,13 @@ class PrintPPOutputPPCallbacks : public PPCallbacks {
   PrintPPOutputPPCallbacks(Preprocessor &pp, raw_ostream *os, bool lineMarkers,
                            bool defines, bool DumpIncludeDirectives,
                            bool UseLineDirectives, bool MinimizeWhitespace,
-                           bool DirectivesOnly)
+                           bool DirectivesOnly, bool KeepSystemIncludes)
       : PP(pp), SM(PP.getSourceManager()), ConcatInfo(PP), OS(os),
         DisableLineMarkers(lineMarkers), DumpDefines(defines),
         DumpIncludeDirectives(DumpIncludeDirectives),
         UseLineDirectives(UseLineDirectives),
-        MinimizeWhitespace(MinimizeWhitespace), DirectivesOnly(DirectivesOnly) {
+        MinimizeWhitespace(MinimizeWhitespace), DirectivesOnly(DirectivesOnly),
+        KeepSystemIncludes(KeepSystemIncludes), OrigOS(os) {
     CurLine = 0;
     CurFilename += "<uninit>";
     EmittedTokensOnThisLine = false;
@@ -118,6 +122,8 @@ class PrintPPOutputPPCallbacks : public PPCallbacks {
     FileType = SrcMgr::C_User;
     Initialized = false;
     IsFirstFileEntered = false;
+    if (KeepSystemIncludes)
+      NullOS = std::make_unique<llvm::raw_null_ostream>();
 
     PrevTok.startToken();
     PrevPrevTok.startToken();
@@ -350,6 +356,10 @@ void PrintPPOutputPPCallbacks::FileChanged(SourceLocation Loc,
 
   CurLine = NewLine;
 
+  // In KeepSystemIncludes mode, redirect OS as needed.
+  if (KeepSystemIncludes && (isSystem(FileType) != isSystem(NewFileType)))
+    OS = isSystem(FileType) ? OrigOS : NullOS.get();
+
   CurFilename.clear();
   CurFilename += UserLoc.getFilename();
   FileType = NewFileType;
@@ -394,14 +404,16 @@ void PrintPPOutputPPCallbacks::InclusionDirective(
     StringRef SearchPath, StringRef RelativePath, const Module *Imported,
     SrcMgr::CharacteristicKind FileType) {
   // In -dI mode, dump #include directives prior to dumping their content or
-  // interpretation.
-  if (DumpIncludeDirectives) {
+  // interpretation. Similar for -fkeep-system-includes.
+  if (DumpIncludeDirectives || (KeepSystemIncludes && isSystem(FileType))) {
     MoveToLine(HashLoc, /*RequireStartOfLine=*/true);
     const std::string TokenText = PP.getSpelling(IncludeTok);
     assert(!TokenText.empty());
     *OS << "#" << TokenText << " "
         << (IsAngled ? '<' : '"') << FileName << (IsAngled ? '>' : '"')
-        << " /* clang -E -dI */";
+        << " /* clang -E "
+        << (DumpIncludeDirectives ? "-dI" : "-fkeep-system-includes")
+        << " */";
     setEmittedDirectiveOnThisLine();
   }
 
@@ -412,7 +424,8 @@ void PrintPPOutputPPCallbacks::InclusionDirective(
     case tok::pp_import:
     case tok::pp_include_next:
       MoveToLine(HashLoc, /*RequireStartOfLine=*/true);
-      *OS << "#pragma clang module import " << Imported->getFullModuleName(true)
+      *OS << "#pragma clang module import "
+          << Imported->getFullModuleName(true)
           << " /* clang -E: implicit import for "
           << "#" << PP.getSpelling(IncludeTok) << " "
           << (IsAngled ? '<' : '"') << FileName << (IsAngled ? '>' : '"')
@@ -794,8 +807,7 @@ struct UnknownPragmaHandler : public PragmaHandler {
 
 
 static void PrintPreprocessedTokens(Preprocessor &PP, Token &Tok,
-                                    PrintPPOutputPPCallbacks *Callbacks,
-                                    raw_ostream *OS) {
+                                    PrintPPOutputPPCallbacks *Callbacks) {
   bool DropComments = PP.getLangOpts().TraditionalCPP &&
                       !PP.getCommentRetentionState();
 
@@ -863,7 +875,7 @@ static void PrintPreprocessedTokens(Preprocessor &PP, Token &Tok,
       // components. We don't have a good way to round-trip those.
       Module *M = reinterpret_cast<Module *>(Tok.getAnnotationValue());
       std::string Name = M->getFullModuleName();
-      OS->write(Name.data(), Name.size());
+      Callbacks->OS->write(Name.data(), Name.size());
       Callbacks->HandleNewlinesInToken(Name.data(), Name.size());
     } else if (Tok.isAnnotation()) {
       // Ignore annotation tokens created by pragmas - the pragmas themselves
@@ -871,14 +883,14 @@ static void PrintPreprocessedTokens(Preprocessor &PP, Token &Tok,
       PP.Lex(Tok);
       continue;
     } else if (IdentifierInfo *II = Tok.getIdentifierInfo()) {
-      *OS << II->getName();
+      *Callbacks->OS << II->getName();
     } else if (Tok.isLiteral() && !Tok.needsCleaning() &&
                Tok.getLiteralData()) {
-      OS->write(Tok.getLiteralData(), Tok.getLength());
+      Callbacks->OS->write(Tok.getLiteralData(), Tok.getLength());
     } else if (Tok.getLength() < std::size(Buffer)) {
       const char *TokPtr = Buffer;
       unsigned Len = PP.getSpelling(Tok, TokPtr);
-      OS->write(TokPtr, Len);
+      Callbacks->OS->write(TokPtr, Len);
 
       // Tokens that can contain embedded newlines need to adjust our current
       // line number.
@@ -895,7 +907,7 @@ static void PrintPreprocessedTokens(Preprocessor &PP, Token &Tok,
       }
     } else {
       std::string S = PP.getSpelling(Tok);
-      OS->write(S.data(), S.size());
+      Callbacks->OS->write(S.data(), S.size());
 
       // Tokens that can contain embedded newlines need to adjust our current
       // line number.
@@ -970,7 +982,7 @@ void clang::DoPrintPreprocessedInput(Preprocessor &PP, raw_ostream *OS,
   PrintPPOutputPPCallbacks *Callbacks = new PrintPPOutputPPCallbacks(
       PP, OS, !Opts.ShowLineMarkers, Opts.ShowMacros,
       Opts.ShowIncludeDirectives, Opts.UseLineDirectives,
-      Opts.MinimizeWhitespace, Opts.DirectivesOnly);
+      Opts.MinimizeWhitespace, Opts.DirectivesOnly, Opts.KeepSystemIncludes);
 
   // Expand macros in pragmas with -fms-extensions.  The assumption is that
   // the majority of pragmas in such a file will be Microsoft pragmas.
@@ -1028,7 +1040,7 @@ void clang::DoPrintPreprocessedInput(Preprocessor &PP, raw_ostream *OS,
   } while (true);
 
   // Read all the preprocessed tokens, printing them out to the stream.
-  PrintPreprocessedTokens(PP, Tok, Callbacks, OS);
+  PrintPreprocessedTokens(PP, Tok, Callbacks);
   *OS << '\n';
 
   // Remove the handlers we just added to leave the preprocessor in a sane state

diff  --git a/clang/test/Frontend/Inputs/dashE/dashE.h b/clang/test/Frontend/Inputs/dashE/dashE.h
new file mode 100644
index 000000000000000..c1a9df40e34ddde
--- /dev/null
+++ b/clang/test/Frontend/Inputs/dashE/dashE.h
@@ -0,0 +1,3 @@
+int dashE_1;
+#include <a.h>
+int dashE_2;

diff  --git a/clang/test/Frontend/Inputs/dashE/sys/a.h b/clang/test/Frontend/Inputs/dashE/sys/a.h
new file mode 100644
index 000000000000000..55b2bcb4ccb4083
--- /dev/null
+++ b/clang/test/Frontend/Inputs/dashE/sys/a.h
@@ -0,0 +1,3 @@
+int a_1;
+#include <b.h>
+int a_2;

diff  --git a/clang/test/Frontend/Inputs/dashE/sys/b.h b/clang/test/Frontend/Inputs/dashE/sys/b.h
new file mode 100644
index 000000000000000..67bdce58dfb976e
--- /dev/null
+++ b/clang/test/Frontend/Inputs/dashE/sys/b.h
@@ -0,0 +1 @@
+int b_1;

diff  --git a/clang/test/Frontend/dashE-sysincludes.cpp b/clang/test/Frontend/dashE-sysincludes.cpp
new file mode 100644
index 000000000000000..db0c488d715775d
--- /dev/null
+++ b/clang/test/Frontend/dashE-sysincludes.cpp
@@ -0,0 +1,23 @@
+// RUN: mkdir %t.dir
+// RUN: %clang_cc1 -E -fkeep-system-includes -I %S/Inputs/dashE -isystem %S/Inputs/dashE/sys %s | FileCheck %s
+
+int main_1 = 1;
+#include <a.h>
+int main_2 = 1;
+#include "dashE.h"
+int main_3 = 1;
+
+// CHECK: main_1
+// CHECK: #include <a.h>
+// CHECK-NOT: a_1
+// CHECK-NOT: a_2
+// CHECK-NOT: b.h
+// CHECK: main_2
+// CHECK-NOT: #include "dashE.h"
+// CHECK: dashE_1
+// CHECK: #include <a.h>
+// CHECK-NOT: a_1
+// CHECK-NOT: a_2
+// CHECK-NOT: b.h
+// CHECK: dashE_2
+// CHECK: main_3

diff  --git a/clang/test/Preprocessor/minimize-whitespace-messages.c b/clang/test/Preprocessor/minimize-whitespace-messages.c
index a78ddb471fb7c08..f930bbe9c257f2c 100644
--- a/clang/test/Preprocessor/minimize-whitespace-messages.c
+++ b/clang/test/Preprocessor/minimize-whitespace-messages.c
@@ -1,8 +1,11 @@
-// RUN: not %clang -c -fminimize-whitespace %s 2>&1 | FileCheck %s --check-prefix=ON
-// ON: error: invalid argument '-fminimize-whitespace' only allowed with '-E'
+// RUN: not %clang -c -fminimize-whitespace %s 2>&1 | FileCheck %s --check-prefix=ON -DOPT=-fminimize-whitespace
+// RUN: not %clang -c -fkeep-system-includes %s 2>&1 | FileCheck %s --check-prefix=ON -DOPT=-fkeep-system-includes
+// ON: error: invalid argument '[[OPT]]' only allowed with '-E'
 
-// RUN: not %clang -c -fno-minimize-whitespace %s 2>&1 | FileCheck %s  --check-prefix=OFF
-// OFF: error: invalid argument '-fno-minimize-whitespace' only allowed with '-E'
+// RUN: not %clang -c -fno-minimize-whitespace %s 2>&1 | FileCheck %s  --check-prefix=OFF -DOPT=-fno-minimize-whitespace
+// RUN: not %clang -c -fno-keep-system-includes %s 2>&1 | FileCheck %s  --check-prefix=OFF -DOPT=-fno-keep-system-includes
+// OFF: error: invalid argument '[[OPT]]' only allowed with '-E'
 
-// RUN: not %clang -E -fminimize-whitespace -x assembler-with-cpp %s 2>&1 | FileCheck %s --check-prefix=ASM
-// ASM: error: '-fminimize-whitespace' invalid for input of type assembler-with-cpp
+// RUN: not %clang -E -fminimize-whitespace -x assembler-with-cpp %s 2>&1 | FileCheck %s --check-prefix=ASM -DOPT=-fminimize-whitespace
+// RUN: not %clang -E -fkeep-system-includes -x assembler-with-cpp %s 2>&1 | FileCheck %s --check-prefix=ASM -DOPT=-fkeep-system-includes
+// ASM: error: '[[OPT]]' invalid for input of type assembler-with-cpp


        


More information about the cfe-commits mailing list