<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/54411>54411</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[clang-scan-deps] Changing Current Working Directory causes HeaderSearch to fail
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
cmcgirr-amd
</td>
</tr>
</table>
<pre>
Version: LLVM 13.0.1
OS: Ubuntu 18.04
### Problem
Given a specific `compile_commands.json` with two compile commands in a certain order, `clang-scan-deps` fails with the following error:
```
clang-scan-deps --compilation-database=compile_commands.json -j 1
/example/src/main.o: \
/example/src/main.cpp \
/example/include/header1.h
Error while scanning dependencies for src/function.cpp:
/example/src/function.cpp:1:10: fatal error: 'header2.h' file not found
#include "header2.h"
```
The full example is attached to this issue (see "Running the example"), but to summarize, we have a compilation database that specifies:
```
[
{
"directory": "DIR/src",
"file": "main.cpp",
"output": "main.o",
"arguments": [
"clang++",
"-I",
"../include",
"-o",
"main.o",
"main.cpp"
]
},
{
"directory": "DIR",
"file": "src/function.cpp",
"output": "src/function.o",
"arguments": [
"clang++",
"-I",
"src",
"-o",
"src/function.o",
"src/function.cpp"
]
}
]
```
and a directory structure that is:
```
├── <Call clang-scan-deps here at this level>
├── compile_commands.json
├── include
│ └── header1.h
└── src
├── function.cpp
├── header2.h
└── main.cpp
```
This setup compiles normally, since `header2.h` sits next to the source file (`function.cpp`) and the include path is given explicitly. However, when the commands are replayed from this `compile_commands.json` database, `clang-scan-deps` fails to find the header.
### Running the example
[example.zip](https://github.com/llvm/llvm-project/files/8282380/example.zip)
Please see the file `run.sh` for details, or simply run `sh run.sh`.
### Workaround
For now, a quick workaround is to disable re-use of the FileManager within a worker by passing the command-line argument `--reuse-filemanager=false`.
However, the underlying problem seems to lie deeper, in how FileManager stores and caches the paths it has seen.
### Investigation
I investigated why reusing a FileManager causes problems. This is what I have found so far, with no clear solution. Hopefully it helps anyone looking at the issue. If my assumptions turn out to be completely wrong, we can close this issue.
The problem appears to be at `clang/lib/Basic/FileManager.cpp:175`, in the function `llvm::Expected<DirectoryEntryRef> FileManager::getDirectoryRef(StringRef DirName, bool CacheFailure)`, at this lookup:
```
DirectoryEntry &UDE = UniqueRealDirs[Status.getUniqueID()];
```
The lookup does not account for the change of working directory between the first compile command and the second. We queried the same directory through two different relative paths: once as "." when the CWD was `DIR/src`, and again as `src/` when the CWD was `DIR`. From the FileSystem's perspective these are the same directory, which is correct, so we receive the same UniqueID both times, as expected.
However, a few lines later, `UDE` is initialized if it is still empty:
```
if (UDE.getName().empty()) {
// We don't have this directory yet, add it. We use the string
// key from the SeenDirEntries map as the string.
UDE.Name = InterndDirName;
}
```
The first time we query for `DIR/src`, the `DirectoryEntry` for that ID is empty, so its name is initialized to the string that was requested, in this case the relative path ".". As part of `HeaderSearch`, the current directory "." is checked, and at this point it refers to `DIR/src`.
When we move to the next compile command, the CWD has changed to `DIR`, one level up. If we now ask the FileManager for the `DirectoryEntry` of `DIR/src`, the FileSystem produces the same UniqueID as before. The difference is that the returned `DirectoryEntry` object `UDE` has already been initialized with the previous name ".", so it is returned without its path being updated relative to the new CWD.
As a result, when `HeaderSearch` runs for `function.cpp`, the directory `DIR/src` is reported as ".", the name cached in the previously retrieved `DirectoryEntry`. Relative to the new CWD, "." is `DIR` rather than `DIR/src`, so the header is not found.
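To make the sequence above concrete, here is a minimal, self-contained sketch of the caching behaviour as I understand it. It is not LLVM code: `ToyDirCache`, `getDirName`, `resolveHeader` and the integer IDs are hypothetical stand-ins for `FileManager`, `getDirectoryRef`, the header lookup and `Status.getUniqueID()`, and the path handling is purely lexical (nothing touches the disk).

```cpp
#include <cassert>
#include <filesystem>
#include <map>
#include <string>

namespace fs = std::filesystem;

// Toy stand-in for the FileManager directory cache (hypothetical, not LLVM code).
struct ToyDirCache {
  // Maps a filesystem unique ID to the directory name recorded on first lookup.
  std::map<int, std::string> UniqueRealDirs;

  // Returns the name stored for `uniqueID`. Only the spelling used by the
  // first caller is kept; later spellings for the same ID are ignored.
  const std::string &getDirName(int uniqueID, const std::string &requestedName) {
    assert(!requestedName.empty() && "directory name must not be empty");
    std::string &name = UniqueRealDirs[uniqueID];
    if (name.empty())
      name = requestedName; // first lookup wins, mirroring `UDE.Name = InterndDirName`
    return name;
  }
};

// Resolves `header` inside `dirName`, interpreting `dirName` relative to the
// working directory that is current at the time of use.
fs::path resolveHeader(const fs::path &cwd, const std::string &dirName,
                       const std::string &header) {
  return (cwd / dirName / header).lexically_normal();
}
```

In this model, calling `getDirName(id, ".")` while the CWD is `DIR/src` and then `getDirName(id, "src")` while the CWD is `DIR` returns "." both times. Resolving `header2.h` through the second result against `DIR` gives `DIR/header2.h` instead of `DIR/src/header2.h`, which matches the failure shown in the error output.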
I tried simply initializing the `DirectoryEntry` objects with absolute paths, but this breaks other features and tests that rely on relative paths being produced. I am not sure what the proper solution is, or whether FileManager re-use should be allowed at all.
</pre>