[PATCH] D105328: [Frontend] Only compile modules if not already finalized

Ben Barham via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Jul 13 17:35:13 PDT 2021


bnbarham marked 5 inline comments as done.
bnbarham added inline comments.


================
Comment at: clang/lib/Frontend/CompilerInstance.cpp:1063
+        << ModuleName;
+    return ImportingInstance.getFrontendOpts().AllowPCMWithCompilerErrors;
+  }
----------------
bnbarham wrote:
> vsapsai wrote:
> > Can we get in infinite loop with `AllowPCMWithCompilerErrors = true`? Specifically, I'm thinking about the scenario
> > 
> > 1. `compileModuleAndReadAST` obtains a file lock and calls `compileModule`
> > 2. `compileModule` calls `compileModuleImpl`
> > 3. Module is finalized but `AllowPCMWithCompilerErrors` is true, so `compileModuleImpl` returns true
> > 4. `compileModule` returns true
> > 5. `compileModuleAndReadAST` tries to read AST because compilation was successful
> > 6. AST is out of date, so `compileModuleAndReadAST` decides to try again, goto 1
> > 
> > Haven't tried to reproduce it locally but even if this scenario is impossible, a corresponding test case can be useful.
> Nice catch, that does seem likely - I'll see if I can add a test for this.
It doesn't end up causing an infinite recursion as `false` will end up being returned from `compileModuleAndReadAST`, but in that case there's no point returning `true` from `compileModuleImpl` in the first place so I've changed it anyway. Have also added that as a test case, just to make sure.


================
Comment at: clang/lib/Serialization/ASTReader.cpp:2923
           if (!BuildDir || *BuildDir != M->Directory) {
-            if ((ClientLoadCapabilities & ARR_OutOfDate) == 0)
+            if (diagnoseOutOfDate(F.FileName, ClientLoadCapabilities))
               Diag(diag::err_imported_module_relocated)
----------------
bnbarham wrote:
> vsapsai wrote:
> > I'm thinking if in case of finalized modules diagnostic messages are good enough. One concern is it won't be clear why a module wasn't rebuilt. It can be already confusing for precompiled headers and I'm afraid we won't be able to detect `isPCMFinal` code path without a debugger. Though I don't know how bad that would be in practice.
> > 
> > Another concern is that retrying a compilation should succeed as for a new process we have a new InMemoryModuleCache and `isPCMFinal` should return false. So we might have non-deterministic behavior and some of the valid error messages can seem to be non-deterministic and not reliable. I was thinking about adding a note in case we are dealing with `isPCMFinal` to distinguish these cases but not sure it is a good idea.
> > I'm thinking if in case of finalized modules diagnostic messages are good enough. One concern is it won't be clear why a module wasn't rebuilt. It can be already confusing for precompiled headers and I'm afraid we won't be able to detect `isPCMFinal` code path without a debugger. Though I don't know how bad that would be in practice.
> 
> The error messages will mention a module in the module cache, which would be the main way to tell. We could add a note here as you suggest below, but I'm not quite sure what it would say... something about there being two modules with the same name?
> 
> > Another concern is that retrying a compilation should succeed as for a new process we have a new InMemoryModuleCache and `isPCMFinal` should return false. So we might have non-deterministic behavior and some of the valid error messages can seem to be non-deterministic and not reliable. I was thinking about adding a note in case we are dealing with `isPCMFinal` to distinguish these cases but not sure it is a good idea.
> 
> The errors should be deterministic I believe. If one process has the issue then a new one will have the issue as well. For what it's worth, I don't think these crashes are possible from the clang frontend. They require messing around with search paths such that between two compilations in the same process, different modules are found.
I've added an extra note "this is generally caused by modules with the same name found in multiple paths". So the diagnostics would now be:
```
/.../test.m:2:10: fatal error: module 'M' was built in directory '/.../frameworks/M.framework' but now resides in directory 'frameworks2/M.framework'
        @import Top;
         ^
/.../test.m:2:10: note: imported by module 'Top' in '/path/to/module/cache/LGSY9KSYAKN1/Top-1MZE9QJ1AHENQ.pcm
/.../test.m:2:10: note: this is generally caused by modules with the same name found in multiple paths
```

I was thinking of mentioning the finalized module, but that's really just a side effect of the error that should have already been output (ie. `module 'M' was built in...` in this case). So the existence of this note should help point users to the problem + make which path caused the error obvious when looking at the code.


================
Comment at: clang/lib/Serialization/ASTReader.cpp:5929
 
+bool ASTReader::diagnoseOutOfDate(StringRef ModuleFileName,
+                                  unsigned int ClientLoadCapabilities) {
----------------
bnbarham wrote:
> vsapsai wrote:
> > Based on the rest of the code in clang, the expectation for `diagnose...` methods is to emit diagnostic in some cases. Personally, I'm often confused what true/false means for these methods, so I'm thinking about renaming the method to something like isRecoverableOutOfDateModule, canRecoverOutOfDateModule or some such. Feel free to pick a name you believe is appropriate, mine are just examples.
> Fair enough, `canRecoverOutOfDateModule` sounds reasonable to me. Or maybe `canRecoverFromOutOfDate`?
I went with the latter.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105328/new/

https://reviews.llvm.org/D105328



More information about the cfe-commits mailing list