[clang] [Clang] Match MSVC handling of duplicate header search paths in Microsoft compatibility modes. (PR #105738)

Tom Honermann via cfe-commits cfe-commits at lists.llvm.org
Thu Aug 22 14:45:31 PDT 2024


https://github.com/tahonermann created https://github.com/llvm/llvm-project/pull/105738

Clang has historically mimicked gcc behavior for header file searching in which user search paths are ordered before system search paths, user search paths that duplicate a (later) system search path are ignored, and search paths that duplicate an earlier search path of the same user/system kind are ignored.

MSVC behavior differs in that user search paths are ordered before system search paths (like gcc), and search paths that duplicate an earlier search path are ignored regardless of user/system kind (similar to gcc, but without the preference for system search paths over duplicate user search paths).

The gcc and MSVC differences are observable for driver invocations that pass, e.g., `-Idir1 -isystem dir2 -isystem dir1`. The gcc behavior will result in `dir2` being searched before `dir1` while the MSVC behavior will result in `dir1` being searched before `dir2`.

This patch modifies Clang to match the MSVC behavior for handling of duplicate header search paths when running in Microsoft compatibility mode (e.g., when invoked with `-fms-compatibility` or with the `clang-cl` driver).

>From b4b550f2dbf47dbce0acd9388fe3c8c8d1cb7ce9 Mon Sep 17 00:00:00 2001
From: Tom Honermann <tom.honermann at intel.com>
Date: Thu, 22 Aug 2024 09:44:56 -0700
Subject: [PATCH] [Clang] Match MSVC handling of duplicate header search paths
 in Microsoft compatibility modes.

Clang has historically mimicked gcc behavior for header file searching in
which user search paths are ordered before system search paths, user search
paths that duplicate a (later) system search path are ignored, and search
paths that duplicate an earlier search path of the same user/system kind are
ignored.

MSVC behavior differs in that user search paths are ordered before system
search paths (like gcc), and search paths that duplicate an earlier search
path are ignored regardless of user/system kind (similar to gcc, but without
the preference for system search paths over duplicate user search paths).

The gcc and MSVC differences are observable for driver invocations that pass,
e.g., `-Idir1 -isystem dir2 -isystem dir1`. The gcc behavior will result in
`dir2` being searched before `dir1` while the MSVC behavior will result in
`dir1` being searched before `dir2`.

This patch modifies Clang to match the MSVC behavior for handling of duplicate
header search paths when running in Microsoft compatibility mode (e.g., when
invoked with `-fms-compatibility` or with the clang-cl driver).
---
 clang/docs/ReleaseNotes.rst                   | 14 +++
 clang/lib/Lex/InitHeaderSearch.cpp            | 21 +++--
 clang/test/Driver/header-search-duplicates.c  | 74 +++++++++++++++
 .../microsoft-header-search-duplicates.c      | 90 +++++++++++++++++++
 4 files changed, 190 insertions(+), 9 deletions(-)
 create mode 100644 clang/test/Driver/header-search-duplicates.c
 create mode 100644 clang/test/Driver/microsoft-header-search-duplicates.c

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 5c156a9c073a9c..d9c96d8902efa9 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -363,6 +363,20 @@ Windows Support
   When `-fms-compatibility-version=18.00` or prior is set on the command line this Microsoft extension is still
   allowed as VS2013 and prior allow it.
 
+- Clang now matches MSVC behavior for handling of duplicate header search paths
+  when running in Microsoft compatibility mode. Historically, Clang has
+  mimicked gcc behavior in which user search paths are ordered before
+  system search paths, user search paths that duplicate a (later) system search
+  path are ignored, and search paths that duplicate an earlier search path of
+  the same user/system kind are ignored. The MSVC behavior is that user search
+  paths are ordered before system search paths (like gcc), and search paths that
+  duplicate an earlier search path are ignored regardless of user/system kind
+  (similar to gcc, but without the preference for system search paths over
+  duplicate user search paths). These differences are observable for driver
+  invocations that pass, e.g., `-Idir1 -isystem dir2 -isystem dir1`. The gcc
+  behavior will search `dir2` before `dir1` and the MSVC behavior will search
+  `dir1` before `dir2`.
+
 LoongArch Support
 ^^^^^^^^^^^^^^^^^
 
diff --git a/clang/lib/Lex/InitHeaderSearch.cpp b/clang/lib/Lex/InitHeaderSearch.cpp
index 2218db15013d92..a4cba469cd7e31 100644
--- a/clang/lib/Lex/InitHeaderSearch.cpp
+++ b/clang/lib/Lex/InitHeaderSearch.cpp
@@ -368,7 +368,8 @@ void InitHeaderSearch::AddDefaultIncludePaths(
 /// If there are duplicate directory entries in the specified search list,
 /// remove the later (dead) ones.  Returns the number of non-system headers
 /// removed, which is used to update NumAngled.
-static unsigned RemoveDuplicates(std::vector<DirectoryLookupInfo> &SearchList,
+static unsigned RemoveDuplicates(const LangOptions &Lang,
+                                 std::vector<DirectoryLookupInfo> &SearchList,
                                  unsigned First, bool Verbose) {
   llvm::SmallPtrSet<const DirectoryEntry *, 8> SeenDirs;
   llvm::SmallPtrSet<const DirectoryEntry *, 8> SeenFrameworkDirs;
@@ -394,14 +395,15 @@ static unsigned RemoveDuplicates(std::vector<DirectoryLookupInfo> &SearchList,
         continue;
     }
 
-    // If we have a normal #include dir/framework/headermap that is shadowed
-    // later in the chain by a system include location, we actually want to
-    // ignore the user's request and drop the user dir... keeping the system
-    // dir.  This is weird, but required to emulate GCC's search path correctly.
+    // When not in MSVC compatibility mode, if we have a normal
+    // #include dir/framework/headermap that is shadowed later in the chain by
+    // a system include location, we actually want to ignore the user's request
+    // and drop the user dir... keeping the system dir.  This is weird, but
+    // required to emulate GCC's search path correctly.
     //
     // Since dupes of system dirs are rare, just rescan to find the original
     // that we're nuking instead of using a DenseMap.
-    if (CurEntry.getDirCharacteristic() != SrcMgr::C_User) {
+    if (!Lang.MSVCCompat && CurEntry.getDirCharacteristic() != SrcMgr::C_User) {
       // Find the dir that this is the same of.
       unsigned FirstDir;
       for (FirstDir = First;; ++FirstDir) {
@@ -484,14 +486,14 @@ void InitHeaderSearch::Realize(const LangOptions &Lang) {
       SearchList.push_back(Include);
 
   // Deduplicate and remember index.
-  RemoveDuplicates(SearchList, 0, Verbose);
+  RemoveDuplicates(Lang, SearchList, 0, Verbose);
   unsigned NumQuoted = SearchList.size();
 
   for (auto &Include : IncludePath)
     if (Include.Group == Angled || Include.Group == IndexHeaderMap)
       SearchList.push_back(Include);
 
-  RemoveDuplicates(SearchList, NumQuoted, Verbose);
+  RemoveDuplicates(Lang, SearchList, NumQuoted, Verbose);
   unsigned NumAngled = SearchList.size();
 
   for (auto &Include : IncludePath)
@@ -510,7 +512,8 @@ void InitHeaderSearch::Realize(const LangOptions &Lang) {
   // Remove duplicates across both the Angled and System directories.  GCC does
   // this and failing to remove duplicates across these two groups breaks
   // #include_next.
-  unsigned NonSystemRemoved = RemoveDuplicates(SearchList, NumQuoted, Verbose);
+  unsigned NonSystemRemoved = RemoveDuplicates(Lang, SearchList, NumQuoted,
+                                               Verbose);
   NumAngled -= NonSystemRemoved;
 
   Headers.SetSearchPaths(extractLookups(SearchList), NumQuoted, NumAngled,
diff --git a/clang/test/Driver/header-search-duplicates.c b/clang/test/Driver/header-search-duplicates.c
new file mode 100644
index 00000000000000..2480d9f24d8200
--- /dev/null
+++ b/clang/test/Driver/header-search-duplicates.c
@@ -0,0 +1,74 @@
+// Test that the gcc-like driver, when not in Microsoft compatibility mode,
+// emulates the gcc behavior of eliding a user header search path when the
+// same path is present as a system header search path.
+// See microsoft-header-search-duplicates.c for Microsoft compatible behavior.
+
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+// Test the clang driver to validate that the gcc compatible behavior is enabled
+// by default. The -nostdinc option is used to suppress default search paths to
+// ease testing.
+// RUN: %clang \
+// RUN:     -v -fsyntax-only -nostdinc \
+// RUN:     -I%t/include/w \
+// RUN:     -isystem %t/include/z \
+// RUN:     -I%t/include/x \
+// RUN:     -isystem %t/include/y \
+// RUN:     -isystem %t/include/x \
+// RUN:     -I%t/include/w \
+// RUN:     -isystem %t/include/y \
+// RUN:     -isystem %t/include/z \
+// RUN:     %t/test.c 2>&1 | FileCheck -DPWD=%t %t/test.c
+
+#--- test.c
+#include <a.h>
+#include <b.h>
+#include <c.h>
+
+// The expected behavior is that user search paths are ordered before system
+// search paths, that user search paths that duplicate a (later) system search
+// path are ignored, and that search paths that duplicate an earlier search
+// path of the same user/system kind are ignored.
+// CHECK:      ignoring duplicate directory "[[PWD]]/include/w"
+// CHECK-NEXT: ignoring duplicate directory "[[PWD]]/include/x"
+// CHECK-NEXT:  as it is a non-system directory that duplicates a system directory
+// CHECK-NEXT: ignoring duplicate directory "[[PWD]]/include/y"
+// CHECK-NEXT: ignoring duplicate directory "[[PWD]]/include/z"
+// CHECK:      #include <...> search starts here:
+// CHECK-NEXT: [[PWD]]/include/w
+// CHECK-NEXT: [[PWD]]/include/z
+// CHECK-NEXT: [[PWD]]/include/y
+// CHECK-NEXT: [[PWD]]/include/x
+// CHECK-NEXT: End of search list.
+
+#--- include/w/b.h
+#define INCLUDE_W_B_H
+#include_next <b.h>
+
+#--- include/w/c.h
+#define INCLUDE_W_C_H
+#include_next <c.h>
+
+#--- include/x/a.h
+#if !defined(INCLUDE_Y_A_H)
+#error 'include/y/a.h' should have been included before 'include/x/a.h'!
+#endif
+
+#--- include/x/b.h
+#if !defined(INCLUDE_W_B_H)
+#error 'include/w/b.h' should have been included before 'include/x/b.h'!
+#endif
+
+#--- include/x/c.h
+#if !defined(INCLUDE_Z_C_H)
+#error 'include/z/c.h' should have been included before 'include/x/c.h'!
+#endif
+
+#--- include/y/a.h
+#define INCLUDE_Y_A_H
+#include_next <a.h>
+
+#--- include/z/c.h
+#define INCLUDE_Z_C_H
+#include_next <c.h>
diff --git a/clang/test/Driver/microsoft-header-search-duplicates.c b/clang/test/Driver/microsoft-header-search-duplicates.c
new file mode 100644
index 00000000000000..1b73eeceb37eba
--- /dev/null
+++ b/clang/test/Driver/microsoft-header-search-duplicates.c
@@ -0,0 +1,90 @@
+// Test that the cl-like driver and the gcc-like driver, when in Microsoft
+// compatibility mode, retain user header search paths that are duplicated in
+// system header search paths.
+// See header-search-duplicates.c for gcc compatible behavior.
+
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+// Test the clang driver to validate that the Microsoft compatible behavior is
+// enabled when the -fms-compatibility option is used. The -nostdinc option
+// is used to suppress default search paths to ease testing.
+// RUN: %clang \
+// RUN:     -v -fsyntax-only -fms-compatibility -nostdinc \
+// RUN:     -I%t/include/w \
+// RUN:     -isystem %t/include/z \
+// RUN:     -I%t/include/x \
+// RUN:     -isystem %t/include/y \
+// RUN:     -isystem %t/include/x \
+// RUN:     -I%t/include/w \
+// RUN:     -isystem %t/include/y \
+// RUN:     -isystem %t/include/z \
+// RUN:     %t/test.c 2>&1 | FileCheck -DPWD=%t %t/test.c
+
+// Test the clang-cl driver to validate that the Microsoft compatible behavior
+// is enabled by default. The -nobuiltininc option is used instead of -nostdinc
+// or /X because the latter two suppress all system include paths (including
+// those specified by -isystem and -imsvc). The -imsvc option is used because
+// the -nobuiltininc option suppresses search paths specified by -isystem.
+// RUN: %clang_cl \
+// RUN:     -v -fsyntax-only -nobuiltininc \
+// RUN:     -I%t/include/w \
+// RUN:     -imsvc %t/include/z \
+// RUN:     -I%t/include/x \
+// RUN:     -imsvc %t/include/y \
+// RUN:     -imsvc %t/include/x \
+// RUN:     -I%t/include/w \
+// RUN:     -imsvc %t/include/y \
+// RUN:     -imsvc %t/include/z \
+// RUN:     %t/test.c 2>&1 | FileCheck -DPWD=%t %t/test.c
+
+#--- test.c
+#include <a.h>
+#include <b.h>
+#include <c.h>
+
+// The expected behavior is that user search paths are ordered before system
+// search paths and that search paths that duplicate an earlier search path
+// (regardless of user/system concerns) are ignored.
+// CHECK:      ignoring duplicate directory "[[PWD]]/include/w"
+// CHECK-NEXT: ignoring duplicate directory "[[PWD]]/include/x"
+// CHECK-NEXT: ignoring duplicate directory "[[PWD]]/include/y"
+// CHECK-NEXT: ignoring duplicate directory "[[PWD]]/include/z"
+// CHECK-NOT:   as it is a non-system directory that duplicates a system directory
+// CHECK:      #include <...> search starts here:
+// CHECK-NEXT: [[PWD]]/include/w
+// CHECK-NEXT: [[PWD]]/include/x
+// CHECK-NEXT: [[PWD]]/include/z
+// CHECK-NEXT: [[PWD]]/include/y
+// CHECK-NEXT: End of search list.
+
+#--- include/w/b.h
+#define INCLUDE_W_B_H
+#include_next <b.h>
+
+#--- include/w/c.h
+#define INCLUDE_W_C_H
+#include_next <c.h>
+
+#--- include/x/a.h
+#define INCLUDE_X_A_H
+#include_next <a.h>
+
+#--- include/x/b.h
+#if !defined(INCLUDE_W_B_H)
+#error 'include/w/b.h' should have been included before 'include/x/b.h'!
+#endif
+
+#--- include/x/c.h
+#define INCLUDE_X_C_H
+
+#--- include/y/a.h
+#if !defined(INCLUDE_X_A_H)
+#error 'include/x/a.h' should have been included before 'include/y/a.h'!
+#endif
+
+#--- include/z/c.h
+#include_next <c.h>
+#if !defined(INCLUDE_X_C_H)
+#error 'include/x/c.h' should have been included before 'include/z/c.h'!
+#endif



More information about the cfe-commits mailing list