[clang] Pass LangOpts from CompilerInstance to DependencyScanningWorker (PR #93753)

Jan Svoboda via cfe-commits cfe-commits at lists.llvm.org
Mon Jun 3 11:13:30 PDT 2024


jansvoboda11 wrote:

> Thanks for the comments @jansvoboda11 . I am new to all these different moving parts and want to understand better. I have a few questions.
> 
> > If you concurrently scan the same file under two language standards with the same scanning service, it becomes non-deterministic which one gets cached in the filesystem cache.
> 
> That is true. But until your comment, I did not even know that it is possible (and supported) to invoke the same scanning service on the same file under two language options (say c++14 and c++11). How would someone do that? Asking so I can test this out locally and try to come up with a better solution. (Also, why would someone do that?)

You can have a project that has both C and C++ implementation files that end up including the same header files from the C standard library. One can be compiled under C11 (without digit separator support), the other under C++14 (with digit separator support).
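
To make the difference concrete, here is a minimal sketch of how the same characters split into tokens differently depending on whether `'` is accepted inside a number. `numericTokenLength` is a hypothetical helper written for this illustration, not clang's lexer, and it deliberately ignores details such as exponent signs.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of clang): returns how many leading
// characters of `text` belong to a single numeric token.
// Simplified: exponent signs such as "1e+5" are not handled.
std::size_t numericTokenLength(std::string_view text, bool digitSeparators) {
  std::size_t i = 0;
  while (i < text.size()) {
    unsigned char c = static_cast<unsigned char>(text[i]);
    if (std::isalnum(c) || c == '.' || c == '_') {
      ++i;
    } else if (digitSeparators && c == '\'' && i > 0 &&
               i + 1 < text.size() &&
               std::isalnum(static_cast<unsigned char>(text[i + 1]))) {
      ++i; // With separator support, ' stays inside the number.
    } else {
      break; // Without it, ' ends the number and begins a new token.
    }
  }
  return i;
}
```

With `digitSeparators` off (the C11 case), the number in `1'000` ends after `1` and the `'` would start a character literal. With it on (the C++14 case), all five characters form one token. A header scanned once and cached therefore reflects only one of these two interpretations.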

> > You need to make the language standard a part of the cache key.
> 
> This was kind of one of my concerns that I had called out here: https://discourse.llvm.org/t/looking-for-help-with-accessing-langopts-from-the-actual-compiler-invocation/79228/3. Specifically:
> 
> > would it look a bit off to someone if they were to look at the header for DependencyScanningWorkerFilesystem and see that the ensureDirectiveTokensArePopulated API took a LangOpts specifically
> 
> Given that `LangOpts` is kind of becoming a feature within `DependencyScanningWorkerFilesystem`'s APIs, I am kind of inclined towards having `LangOpts` as part of the cache key for disambiguation - but again, I am very very new to this.

Scanning is often the choke point in builds, so any change that slows it down needs to make up for it in correctness. Adding language options (or just the standard) to the cache key could trigger multiple scans for a single file, which could degrade performance quite a bit. Since we have an alternative solution that's still correct, I'd prefer that.
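
The cost can be illustrated with a toy cache. This is only a sketch: `ToyScanCache` and its members are hypothetical names made up for this example and do not mirror the real shared filesystem cache in the dependency scanner.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a shared scanning cache (not the real API).
class ToyScanCache {
public:
  explicit ToyScanCache(bool keyOnStandard) : KeyOnStandard(keyOnStandard) {}

  // Returns the cached result for `file`, scanning only on a cache miss.
  const std::string &scan(const std::string &file,
                          const std::string &standard) {
    // When the standard is not part of the key, all standards share one entry.
    auto key = std::make_pair(file, KeyOnStandard ? standard : std::string());
    auto it = Entries.find(key);
    if (it == Entries.end()) {
      ++ScanCount; // Placeholder for the expensive directive scan.
      it = Entries.emplace(key, "directives(" + file + ")").first;
    }
    return it->second;
  }

  std::size_t scanCount() const { return ScanCount; }

private:
  bool KeyOnStandard;
  std::size_t ScanCount = 0;
  std::map<std::pair<std::string, std::string>, std::string> Entries;
};
```

Requesting the same header under `c11` and `c++14` costs one scan when the key is the file alone and two scans when the standard is part of the key. In a project that mixes standards, that extra work would repeat for every shared header.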

> > An alternative solution (that I prefer) is to set up the scanner in a way that always accepts ' in integer constants.
> 
> Would this be considered "hacky"? Because if we go this way, the Scanner would technically be operating in a different language mode for integers, potentially overriding the language mode arg that was passed in during invocation. I am not opposed to it - just trying to understand the implications better. We do turn on specific `LangOpts` (like `ObjC`) for the lexer during the Scanning phase as can be seen here -
> 
> https://github.com/llvm/llvm-project/blob/83646590afe222cfdd792514854549077e17b005/clang/lib/Lex/DependencyDirectivesScanner.cpp#L71-L79
> 
> I guess the general question is - is it acceptable to have the Scanner operating in a language standard different than the passed in language mode and different than the compiler language standard?

I think that is acceptable. It is kinda hacky, but the lexer and preprocessor are largely independent of the language and the standard. When they do depend on those settings, taking the union of the features and letting the compiler trim it down is still a perfectly sound thing to do.
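
As a rough sketch of what "taking the union" means: the scanner lexes with every feature that any supported mode might need, so a single cached result is usable for all of them. The struct and its field names below are hypothetical and chosen for illustration; they are not the fields of clang's `LangOptions`.

```cpp
// Hypothetical lexer feature set; field names are illustrative only.
struct ToyLexFeatures {
  bool ObjCKeywords = false;
  bool LineComments = false;
  bool RawStringLiterals = false;
  bool DigitSeparators = false;
};

// A feature is on in the union if either input mode enables it.
ToyLexFeatures unionOf(const ToyLexFeatures &a, const ToyLexFeatures &b) {
  ToyLexFeatures result;
  result.ObjCKeywords = a.ObjCKeywords || b.ObjCKeywords;
  result.LineComments = a.LineComments || b.LineComments;
  result.RawStringLiterals = a.RawStringLiterals || b.RawStringLiterals;
  result.DigitSeparators = a.DigitSeparators || b.DigitSeparators;
  return result;
}
```

The scanner would use the union for every file, and the compiler proper, running with the real language options, would still reject anything the actual standard does not allow.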

https://github.com/llvm/llvm-project/pull/93753

