[clang] [clang][deps] Fix `__has_include` behavior with umbrella headers (PR #70144)

Jan Svoboda via cfe-commits cfe-commits at lists.llvm.org
Wed Oct 25 14:22:37 PDT 2023


jansvoboda11 wrote:

> Can you clarify how this bug occurs in terms of what information about the header is stored that causes us to get the wrong module?

I updated the test case to be more in line with how the bug manifested in the wild. Let me annotate the test case here to hopefully clarify things:

```c
//--- tu.c
#include "poison.h" // This imports Poison.pcm. In its HEADER_SEARCH_TABLE record, this PCM stores
                    // information that "FW/B.h" is textual. This happens because the compiler never
                    // looked for the module map when it encountered `__has_include(<FW/B.h>)`, and
                    // therefore set `HeaderFileInfo::{isModuleHeader,isCompilingModuleHeader}` to
                    // `false` for that header. `ASTWriter` then decided to serialize info on that header:
                    // https://github.com/llvm/llvm-project/blob/740582fa4c9512b34128dc97b2eff56820984421/clang/lib/Serialization/ASTWriter.cpp#L2023

#if __has_include(<FW/B.h>) // This looks into Poison.pcm, finds out that "FW/B.h" is textual,
                            // and caches that in HeaderSearch::FileInfo.
#endif

#include "import.h" // This transitively imports FW_Private.pcm.
                    // This does not parse FW_Private module map, that would actually tell us that "FW/B.h"
                    // is most likely part of FW_Private (via the PrivateHeaders umbrella header).
                    // FW_Private.pcm does contain the information that "FW/B.h" is part of FW_Private, but...

#include <FW/B.h> // ... this does not deserialize the info on "FW/B.h" from FW_Private.pcm
                  // due to the unimplemented FIXME here:
                  // https://github.com/llvm/llvm-project/blob/740582fa4c9512b34128dc97b2eff56820984421/clang/lib/Lex/HeaderSearch.cpp#L1320
                  // This header is therefore considered textual.
```

So an alternative solution would be to implement that fixme and ensure we do deserialize `HeaderFileInfo` from newly loaded PCM files. That would tell us the FW_Private.pcm considers "FW/B.h" modular. I'd rather fix what we actually serialize into PCM files first.

> It seems bad that passing `nullptr` to `LookupFile` could cause incorrect behavior and makes me wonder if we need to always trigger module lookup or if there is another way to fix this that would make it safe to not do module lookup here.

Yeah. I did try to fix up all calls to `LookupFile` to perform module map lookup, but a bunch of tests started failing (mostly standard C++ modules tests IIRC), so there's probably more nuance required there.

https://github.com/llvm/llvm-project/pull/70144


More information about the cfe-commits mailing list