[PATCH] D148088: [RFC][clangd] Move preamble index out of document open critical path

Kadir Cetinkaya via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri May 19 08:39:04 PDT 2023


kadircet added inline comments.


================
Comment at: clang-tools-extra/clangd/ClangdServer.cpp:90
+    if (Tasks) {
+      std::unique_lock<Semaphore> Lock(*Barrier, std::try_to_lock);
+      Tasks->runAsync("task:" + Path + Version, std::move(Task));
----------------
DmitryPolukhin wrote:
> kadircet wrote:
> > what's the point of this semaphore, if we're only going to `try_to_lock`? this should be done in the `Task` that we're passing to the `Tasks`.
> > 
> > but more importantly, this is going to be a change in behavior, before this patch we had one preamble thread per open file and they'll run indexing (in addition to preamble builds) without any throttling.
> > 
> > if we limit number of indexig tasks to 1, we'll need a queue to ensure progress for every file. semaphore won't be fair especially in the face of contention, we'll also likely handle requests out-of-order (which is redundant but OK correctness-wise as our file-index already accounts for that).
> > 
> > hence I don't think there's any reason to change behaviour here until we see some issues with contention (which I believe is unlikely as we can still only issue preamble indexing requests as quick as we can build preambles, and the latter is slower than the former).
> > we can still only issue preamble indexing requests as quick as we can build preambles, and the latter is slower than the former
> 
> Preamble compliation is not always slower than indexing. For example, if modules are actively used, compilation itself can be very fast but indexing is slow. Said that, I think we can eliminate the semaphore if there is no good reason to keep it.
> For example, if modules are actively used, compilation itself can be very fast but indexing is slow. 

You're right, but that's a mode of operation that incidentally works in clangd as we discussed before.

> I think we can eliminate the semaphore if there is no good reason to keep it.

As mentioned above I don't think we have good enough reasons to have some throttling here, and even if we did, I don't think we can achieve it with a semaphore cheaply, we probably need a queue and multiple consumers to at least ensure forward progress on every file.


================
Comment at: clang-tools-extra/clangd/Preamble.cpp:698-700
+      // While extending the life of FileMgr and VFS, StatCache should also be
+      // extended.
+      Ctx.setStatCache(Result->StatCache);
----------------
kuganv wrote:
> kadircet wrote:
> > we use a "producingfs" when building the preamble, hence there shouldn't be any references to statcache inside the FM&VFS stored in the compiler instance here. have you noticed something to the contrary ?
> > we use a "producingfs" when building the preamble, hence there shouldn't be any references to statcache inside the FM&VFS stored in the compiler instance here. have you noticed something to the contrary ?
> 
> 
> In PrecompiledPreamble::Build, we use StatCacheFS for FileMgr creation. Shouldn’t we keep this Alive? Without this, I am seeing:
> 
> ```
> ==1509807==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000464c00 at pc 0x55f558fd7de2 bp 0x7f18a73ff010 sp 0x7f18a73ff008
> READ of size 8 at 0x606000464c00 thread T48 (NodeServer.cpp1)
> V[23:14:41.667] Ignored diagnostic. /data/users/kugan/fbsource/fbcode/synapse_dev/prod/NodeServer.cpp:2:2:Mandatory header <vector> not found in standard library!
> I[23:14:41.669] Indexed c++20 standard library (incomplete due to errors): 0 symbols, 0 filtered
>     #0 0x55f558fd7de1 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_data() const /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/basic_string.h:234:28
>     #1 0x55f558fd7de1 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::data() const /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/basic_string.h:2567:16
>     #2 0x55f558fd7de1 in llvm::StringRef::StringRef(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) /home/kugan/local/llvm-project/llvm/include/llvm/ADT/StringRef.h:101:18
>     #3 0x55f558fd7de1 in clang::clangd::PreambleFileStatusCache::update(llvm::vfs::FileSystem const&, llvm::vfs::Status) /home/kugan/local/llvm-project/clang-tools-extra/clangd/FS.cpp:33:20
>     #4 0x55f558fd925f in clang::clangd::PreambleFileStatusCache::getProducingFS(llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>)::CollectFS::status(llvm::Twine const&) /home/kugan/local/llvm-project/clang-tools-extra/clangd/FS.cpp:82:19
>     #5 0x55f55917ddee in clang::clangd::(anonymous namespace)::TimerFS::status(llvm::Twine const&) /home/kugan/local/llvm-project/clang-tools-extra/clangd/Preamble.cpp:492:30
>     #6 0x55f55670dcff in clang::FileSystemStatCache::get(llvm::StringRef, llvm::vfs::Status&, bool, std::unique_ptr<llvm::vfs::File, std::default_delete<llvm::vfs::File>>*, clang::FileSystemStatCache*, llvm::vfs::FileSystem&) /home/kugan/local/llvm-project/clang/lib/Basic/FileSystemStatCache.cpp:47:55
>     #7 0x55f556701ab4 in clang::FileManager::getStatValue(llvm::StringRef, llvm::vfs::Status&, bool, std::unique_ptr<llvm::vfs::File, std::default_delete<llvm::vfs::File>>*) /home/kugan/local/llvm-project/clang/lib/Basic/FileManager.cpp:590:12
>     #8 0x55f556700eed in clang::FileManager::getDirectoryRef(llvm::StringRef, bool) /home/kugan/local/llvm-project/clang/lib/Basic/FileManager.cpp:161:20
>     #9 0x55f556701d3e in clang::FileManager::getDirectory(llvm::StringRef, bool) /home/kugan/local/llvm-project/clang/lib/Basic/FileManager.cpp:191:17
>     #10 0x55f55929a800 in clang::clangd::getCanonicalPath[abi:cxx11](clang::FileEntryRef, clang::SourceManager const&) /home/kugan/local/llvm-project/clang-tools-extra/clangd/SourceCode.cpp:540:45
>     #11 0x55f559627681 in clang::clangd::SymbolCollector::HeaderFileURICache::toURI[abi:cxx11](clang::FileEntryRef) /home/kugan/local/llvm-project/clang-tools-extra/clangd/index/SymbolCollector.cpp:211:24
>     #12 0x55f55961926e in clang::clangd::SymbolCollector::getTokenLocation(clang::SourceLocation) /home/kugan/local/llvm-project/clang-tools-extra/clangd/index/SymbolCollector.cpp:425:36
>     #13 0x55f5596233a9 in clang::clangd::SymbolCollector::handleMacroOccurrence(clang::IdentifierInfo const*, clang::MacroInfo const*, unsigned int, clang::SourceLocation) /home/kugan/local/llvm-project/clang-tools-extra/clangd/index/SymbolCollector.cpp:753:22
>     #14 0x55f55c1c7e3d in indexPreprocessorMacro(clang::IdentifierInfo const*, clang::MacroInfo const*, clang::MacroDirective::Kind, clang::SourceLocation, clang::index::IndexDataConsumer&) /home/kugan/local/llvm-project/clang/lib/Index/IndexingAction.cpp:229:16
>     #15 0x55f55c1c7e3d in indexPreprocessorMacros(clang::Preprocessor&, clang::index::IndexDataConsumer&) /home/kugan/local/llvm-project/clang/lib/Index/IndexingAction.cpp:236:7
>     #16 0x55f55c1c8290 in clang::index::indexTopLevelDecls(clang::ASTContext&, clang::Preprocessor&, llvm::ArrayRef<clang::Decl const*>, clang::index::IndexDataConsumer&, clang::index::IndexingOptions) /home/kugan/local/llvm-project/clang/lib/Index/IndexingAction.cpp:283:5
>     #17 0x55f55959d350 in clang::clangd::(anonymous namespace)::indexSymbols(clang::ASTContext&, clang::Preprocessor&, llvm::ArrayRef<clang::Decl*>, clang::clangd::MainFileMacros const*, clang::clangd::CanonicalIncludes const&, bool, llvm::StringRef, bool) /home/kugan/local/llvm-project/clang-tools-extra/clangd/index/FileIndex.cpp:82:3
>     #18 0x55f55959f040 in clang::clangd::indexHeaderSymbols(llvm::StringRef, clang::ASTContext&, clang::Preprocessor&, clang::clangd::CanonicalIncludes const&) /home/kugan/local/llvm-project/clang-tools-extra/clangd/index/FileIndex.cpp:235:10
>     #19 0x55f5595a8119 in clang::clangd::FileIndex::updatePreamble(llvm::StringRef, llvm::StringRef, clang::ASTContext&, clang::Preprocessor&, clang::clangd::CanonicalIncludes const&) /home/kugan/local/llvm-project/clang-tools-extra/clangd/index/FileIndex.cpp:464:7
> ```
> 
oh wow, i completely missed that. this actually points to a different data-race issue unfortunately, and we were suppressing it by keeping it alive here.

`PreambleStatCache` is thread-safe only after writes to it completely finishes (i.e. there are no more producingfs instances lying around). Previously, because we were not announcing statcache until indexing was over, we didn't notice this. but now we're actually sending statcache to consumers while indexing still has a producingfs alive.

Hence, in addition to the life extension of the statcache, we actually need to change FS inside filemanagers to be a cosumingfs.


================
Comment at: clang-tools-extra/clangd/Preamble.cpp:106-113
+    CI.setSema(nullptr);
+    CI.setASTConsumer(nullptr);
+    if (CI.getASTReader()) {
+      CI.getASTReader()->setDeserializationListener(nullptr);
+      /* This just sets consumer to null when DeserializationListener is null */
+      CI.getASTReader()->StartTranslationUnit(nullptr);
     }
----------------
kuganv wrote:
> kadircet wrote:
> > kuganv wrote:
> > > kadircet wrote:
> > > > why are we triggering destructors for all of these objects eagerly? if this is deliberate to "fix" some issue, could you mention that in comments?
> > > > why are we triggering destructors for all of these objects eagerly? if this is deliberate to "fix" some issue, could you mention that in comments?
> > > 
> > > Thanks a lot for the review.
> > > If we don't destruct and set it to null, CapturedASTCtx will also have to keep instances such as ASTConsumer including other related callbacks such PreambleCallbacks. This was making the CapturedASTCtx interface and implementation complex.
> > > If we don't destruct and set it to null, CapturedASTCtx will also have to keep instances such as ASTConsumer 
> > 
> > Sorry if I am being dense but I can't actually see any references to those objects in the structs we preseve. AFAICT, Sema, ASTConsumer and ASTReader are members of CompilerInstance, which we don't keep around. Can you point me towards any references to the rest of the objects you're setting to null in case I am missing any?
> > 
> > `ASTMutationListener` seems to be the only one that's relevant. It's indeed exposed through `ASTContext` and our indexing operations can trigger various callbacks to fire. Can you set only that one to nullptr and leave a comment explaining the situation?
> > > If we don't destruct and set it to null, CapturedASTCtx will also have to keep instances such as ASTConsumer 
> > 
> > Sorry if I am being dense but I can't actually see any references to those objects in the structs we preseve. AFAICT, Sema, ASTConsumer and ASTReader are members of CompilerInstance, which we don't keep around. Can you point me towards any references to the rest of the objects you're setting to null in case I am missing any?
> > 
> > `ASTMutationListener` seems to be the only one that's relevant. It's indeed exposed through `ASTContext` and our indexing operations can trigger various callbacks to fire. Can you set only that one to nullptr and leave a comment explaining the situation?
> 
> Thanks for the review.  I will revise based on the feedback. 
> 
> In buildPreamble, we create CppFilePreambleCallbacks in stack to be used in PrecompiledPreamble::Build which becomes part of PrecompilePreambleConsumer. I think this this need to be reset. Probably this has to be done in a different way. If I dont reset, I am seeing crash in some cases. See:
> 
> ```
> 85c0c79f in clang::ASTReader::DecodeIdentifierInfo(unsigned int) /home/kugan/local/llvm-project/clang/lib/Serialization/ASTReader.cpp:8695:32
>     #1 0x558285c75266 in clang::ASTReader::readIdentifier(clang::serialization::ModuleFile&, llvm::SmallVector<unsigned long, 64u> const&, unsigned int&) /home/kugan/local/llvm-project/clang/include/clang/Seriali
> zation/ASTReader.h:2126:12
>     #2 0x558285c75266 in clang::ASTRecordReader::readIdentifier() /home/kugan/local/llvm-project/clang/include/clang/Serialization/ASTRecordReader.h:211:20
>     #3 0x558285c75266 in clang::serialization::BasicReaderBase<clang::ASTRecordReader>::readDeclarationName() /home/kugan/local/llvm-project/build/tools/clang/include/clang/AST/AbstractBasicReader.inc:780:58
>     #4 0x558285ce10bc in clang::ASTDeclReader::VisitNamedDecl(clang::NamedDecl*) /home/kugan/local/llvm-project/clang/lib/Serialization/ASTReaderDecl.cpp:694:26
>     #5 0x558285ce10bc in clang::ASTDeclReader::VisitNamespaceDecl(clang::NamespaceDecl*) /home/kugan/local/llvm-project/clang/lib/Serialization/ASTReaderDecl.cpp:1785:3
>     #6 0x558285cbc6e8 in clang::ASTDeclReader::Visit(clang::Decl*) /home/kugan/local/llvm-project/clang/lib/Serialization/ASTReaderDecl.cpp:543:37
>     #7 0x558285d5a133 in clang::ASTReader::ReadDeclRecord(unsigned int) /home/kugan/local/llvm-project/clang/lib/Serialization/ASTReaderDecl.cpp:4052:10
>     #8 0x558285c73677 in clang::ASTReader::GetDecl(unsigned int) /home/kugan/local/llvm-project/clang/lib/Serialization/ASTReader.cpp:7607:5
>     #9 0x558285c73677 in clang::ASTReader::GetLocalDecl(clang::serialization::ModuleFile&, unsigned int) /home/kugan/local/llvm-project/clang/include/clang/Serialization/ASTReader.h:1913:12
>     #10 0x558285bfd741 in clang::ASTReader::FindExternalLexicalDecls(clang::DeclContext const*, llvm::function_ref<bool (clang::Decl::Kind)>, llvm::SmallVectorImpl<clang::Decl*>&)::$_9::operator()(clang::serializ
> ation::ModuleFile*, llvm::ArrayRef<llvm::support::detail::packed_endian_specific_integral<unsigned int, (llvm::support::endianness)2, 1ul, 1ul>>) const /home/kugan/local/llvm-project/clang/lib/Serialization/ASTRe
> ader.cpp:7687:21
>     #11 0x558285bfd741 in clang::ASTReader::FindExternalLexicalDecls(clang::DeclContext const*, llvm::function_ref<bool (clang::Decl::Kind)>, llvm::SmallVectorImpl<clang::Decl*>&) /home/kugan/local/llvm-project/c
> lang/lib/Serialization/ASTReader.cpp:7697:7
>     #12 0x55827e942de7 in clang::ExternalASTSource::FindExternalLexicalDecls(clang::DeclContext const*, llvm::SmallVectorImpl<clang::Decl*>&) /home/kugan/local/llvm-project/clang/include/clang/AST/ExternalASTSour
> ce.h:185:5
>     #13 0x55827e942de7 in clang::DeclContext::LoadLexicalDeclsFromExternalStorage() const /home/kugan/local/llvm-project/clang/lib/AST/DeclBase.cpp:1421:11
>     #14 0x55827e94436b in clang::DeclContext::decls_begin() const /home/kugan/local/llvm-project/clang/lib/AST/DeclBase.cpp:1477:5
>     #15 0x5582822787ab in clang::DeclContext::decls() const /home/kugan/local/llvm-project/clang/include/clang/AST/DeclBase.h:2188:48
>     #16 0x5582822787ab in clang::clangd::indexHeaderSymbols(llvm::StringRef, clang::ASTContext&, clang::Preprocessor&, clang::clangd::CanonicalIncludes const&) /home/kugan/local/llvm-project/clang-tools-extra/cla
> ngd/index/FileIndex.cpp:233:37
>     #17 0x558282281be9 in clang::clangd::FileIndex::updatePreamble(llvm::StringRef, llvm::StringRef, clang::ASTContext&, clang::Preprocessor&, clang::clangd::CanonicalIncludes const&) /home/kugan/local/llvm-proje
> ct/clang-tools-extra/clangd/index/FileIndex.cpp:464:7
> ```
> 
> 
> 
ah, this one is unfortunately a little bit more involved, and triggers only in the cases where we have modules enabled (which is incidentally working in clangd today, but effect is wider as -std=c++20 implicitly turns on -fmodules).

tl;dr; astcontext is keeping references to pchgenerator in module-enabled builds.

TBH the safest option here seems like clearing external source in astcontext completely. As we can't really risk callbacks to PCHGenerator, even if we fixed all the life-time concerns (ast consumer keeps references to frontendaction alive, so both the callbacks we pass here and the action itself needs to outlive the consumer).

But that's likely going to break preamble-indexing completely for codebases that use modules (we'll likely only index top-level decls from the current modules, but I am not sure). The alternative suggested here strikes some middle ground (there's still going to be behavior change, previously all the modules accessed during indexing would effect the final serialized preamble, they won't anymore) but I am afraid it's too fragile. The dependency between these clang-internals were so subtle that it took ~hours for me to analyze, despite me specifically looking for them, hence I don't think anyone that's unaware can preserve these invariants (i.e. make sure we don't have references to astconsumer in other places.).

I am not sure how these work for you today, but can you check if just setting external source on the astcontext to null is good enough?


================
Comment at: clang-tools-extra/clangd/Preamble.cpp:694
     Result->MainIsIncludeGuarded = CapturedInfo.isMainFileIncludeGuarded();
-    return Result;
+    CapturedCtx.emplace(CapturedInfo.takeLife());
+    return std::make_pair(Result, CapturedCtx);
----------------
DmitryPolukhin wrote:
> kadircet wrote:
> > DmitryPolukhin wrote:
> > > kuganv wrote:
> > > > kadircet wrote:
> > > > > kuganv wrote:
> > > > > > kadircet wrote:
> > > > > > > what about just keeping the callback (with a different signature) and calling it here? e.g.:
> > > > > > > ```
> > > > > > > PreambleCallback(CapturedInfo.takeLife());
> > > > > > > ```
> > > > > > > 
> > > > > > > that way we don't need to change the return type and also control if we received a valid object on every call site (since callback will only be invoked on success)
> > > > > > > what about just keeping the callback (with a different signature) and calling it here? e.g.:
> > > > > > > ```
> > > > > > > PreambleCallback(CapturedInfo.takeLife());
> > > > > > > ```
> > > > > > > 
> > > > > > > that way we don't need to change the return type and also control if we received a valid object on every call site (since callback will only be invoked on success)
> > > > > > 
> > > > > > Apologies for the misunderstanding.  Just to be clear, you prefer indexing in UpdateIndexCallbacks using the thread Tasks that also indexes in indexStdlib? In this implementation, I am calling the index action in preamble thread. I will revise it.
> > > > > > Apologies for the misunderstanding. Just to be clear, you prefer indexing in UpdateIndexCallbacks using the thread Tasks that also indexes in indexStdlib? In this implementation, I am calling the index action in preamble thread. I will revise it.
> > > > > 
> > > > > Yes, i guess that's the reason why I was confused while looking at the code. Sorry if I give the impression that suggests doing the indexing on `PreambleThread`, but I think both in my initial comment:
> > > > > 
> > > > > >> As for AsyncTaskRunner to use, since this indexing task only depends on the file-index, which is owned by ClangdServer, I don't think there's any need to introduce a new taskrunner into TUScheduler and block its destruction. We can just re-use the existing TaskRunner inside parsingcallbacks, in which we run stdlib indexing tasks.
> > > > > 
> > > > > and in the follow up;
> > > > > 
> > > > > >> I think we can just change the signature for PreambleParsedCallback to pass along refcounted objects. forgot to mention in the first comment, but we should also change the CanonicalIncludes to be a shared_ptr so that it can outlive the PreambleData. We should invoke the callback inside buildPreamble after a successful build. Afterwards we should also change the signature for onPreambleAST to take AST, PP and CanonicalIncludes as ref-counted objects again and PreambleThread::build should just forward objects received from PreambleParsedCallback. Afterwards inside the UpdateIndexCallbacks::onPreambleAST we can just invoke indexing on Tasks if it's present or synchronously in the absence of it.
> > > > > 
> > > > > I was pointing towards running this inside the `Tasks` in `UpdateIndexCallbacks`.
> > > > > 
> > > > > ---
> > > > > 
> > > > > There's definitely some upsides to running that indexing on the preamble thread as well (which is what we do today) but I think the extra sequencing requirements (make sure to first notify the ASTPeer and then issue preamble callbacks) we put into TUScheduler (which is already quite complex) is probably not worth it.
> > > > > > Apologies for the misunderstanding. Just to be clear, you prefer indexing in UpdateIndexCallbacks using the thread Tasks that also indexes in indexStdlib? In this implementation, I am calling the index action in preamble thread. I will revise it.
> > > > > 
> > > > > Yes, i guess that's the reason why I was confused while looking at the code. Sorry if I give the impression that suggests doing the indexing on `PreambleThread`, but I think both in my initial comment:
> > > > > 
> > > > > >> As for AsyncTaskRunner to use, since this indexing task only depends on the file-index, which is owned by ClangdServer, I don't think there's any need to introduce a new taskrunner into TUScheduler and block its destruction. We can just re-use the existing TaskRunner inside parsingcallbacks, in which we run stdlib indexing tasks.
> > > > > 
> > > > > and in the follow up;
> > > > > 
> > > > > >> I think we can just change the signature for PreambleParsedCallback to pass along refcounted objects. forgot to mention in the first comment, but we should also change the CanonicalIncludes to be a shared_ptr so that it can outlive the PreambleData. We should invoke the callback inside buildPreamble after a successful build. Afterwards we should also change the signature for onPreambleAST to take AST, PP and CanonicalIncludes as ref-counted objects again and PreambleThread::build should just forward objects received from PreambleParsedCallback. Afterwards inside the UpdateIndexCallbacks::onPreambleAST we can just invoke indexing on Tasks if it's present or synchronously in the absence of it.
> > > > > 
> > > > > I was pointing towards running this inside the `Tasks` in `UpdateIndexCallbacks`.
> > > > > 
> > > > > ---
> > > > > 
> > > > > There's definitely some upsides to running that indexing on the preamble thread as well (which is what we do today) but I think the extra sequencing requirements (make sure to first notify the ASTPeer and then issue preamble callbacks) we put into TUScheduler (which is already quite complex) is probably not worth it.
> > > > 
> > > > Thanks again for the review. Updated the diff such that indexing is now in IndexTasks. AFIK, there will be a single thread that will now manage indexing of preamble for all the opened TUs.  Please let me know if you see any issues there.
> > > > There's definitely some upsides to running that indexing on the preamble thread as well (which is what we do today) but I think the extra sequencing requirements (make sure to first notify the ASTPeer and then issue preamble callbacks) we put into TUScheduler (which is already quite complex) is probably not worth it.
> > > 
> > > @kadircet I'm sorry for interfering with the review but could you please elaborate what are the downsides of current approach when preamble indexing happening off critical path but on the same thread? The benefits of such approach that preamble marked available earlier and compilation of the main file starts earlier in parallel with indexing preamble. Also such approach eliminates potential race condition between preamble thread and indexing thread. Code in [PrecompiledPreamble::Build](https://github.com/llvm/llvm-project/blob/main/clang/lib/Frontend/PrecompiledPreamble.cpp#L551-L577) accesses AST for preamble and if indexing is running in separate thread it will be race condition. If we schedule indexing after PrecompiledPreamble::Build e still will need to be very careful to don't access anything in preamble AST from PreambleThread after the task is scheduled. The only downside of running indexing on PreambleThread is that we cannot start building new preamble while previous one is indexing but I think it might be positive thing to throttle preamble building.
> > > but could you please elaborate what are the downsides of current approach when preamble indexing happening off critical path but on the same thread?
> > 
> > As I mentioned above this makes TUScheduler more complicated now. We need to make sure we first send the preamble to astpeer and then invoke parsed callbacks, the sequencing is subtle and ties our hands if we want to refactor this code in the future as it needs to be preserved.
> > 
> > > the benefits of such approach that preamble marked available earlier and compilation of the main file starts earlier in parallel with indexing preamble. 
> > 
> > that's the case with both approaches, right? surely directly sending the preamble to ASTPeer might be couple milliseconds faster, but we're still indexing preamble in parallel. We might loose couple milliseconds while putting the new task into the queue, but considering the preamble/ast build times i don't think that really matters.
> > 
> > 
> > > Also such approach eliminates potential race condition between preamble thread and indexing thread. Code in PrecompiledPreamble::Build accesses AST for preamble and if indexing is running in separate thread it will be race condition. If we schedule indexing after PrecompiledPreamble::Build e still will need to be very careful to don't access anything in preamble AST from PreambleThread after the task is scheduled. 
> > 
> > sorry I don't fully understand the argument here. in both alternatives we're running indexing after `PrecompiledPreamble::Build` finishes hence they cannot access "live" ast concurrently.
> > Maybe you mean `ParsedAST::build`, i.e. the main ast build? If that's the case, it uses the serialized preamble, not "live" astcontext (and serialized preamble is already thread-safe to read concurrently).
> > We need to make sure we first send the preamble to astpeer and then invoke parsed callbacks, the sequencing is subtle and ties our hands if we want to refactor this code in the future as it needs to be preserved.
> 
> I completely agree that clangd threading has lots of subtle things that makes it easy to violate expected sequence of events. My comment about potential race condition was about very similar thing, it is very easy to keep pointer to ASTContext/FileManager/etc. in a callback and call it after the task was scheduled and it will be race condition. That is why I thought that indexing on the same thread that was used for building preamble AST is safer. I'm not aware of any existing race conditions with this diff. My example was hypothetical what would happen if indexing task was scheduled from AfterExecute, for example, as it was from one of the version of this diff. Just to make it clear, I'm fine with current approach too.
> it is very easy to keep pointer to ASTContext/FileManager/etc. in a callback and call it after the task was scheduled and it will be race condition.

You're right, but I don't think we're making it worse here with either of the approaches. Every time we create new references to those objects, we already have that risk, but IMO it's easier to raise some eyebrows when a new reference is created to a non-threadsafe object vs just re-ordering of some calls. Hence I was more inclined towards this alternative that at least won't make sequencing of operations more constrained.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148088/new/

https://reviews.llvm.org/D148088



More information about the cfe-commits mailing list