[PATCH] D149733: [clang][USR] Prevent crashes on incomplete FunctionDecls

Mon May 15 09:05:02 PDT 2023

kadircet added a comment.

In D149733#4342300 <https://reviews.llvm.org/D149733#4342300>, @aaron.ballman wrote:

> But at the same time, it's become more obvious (at least to me) that clangd has features that don't work with all of the invariants in Clang and I don't know that we ever really stopped to figure out whether that's reasonable or not.

Just to be clear, this behaviour we observe is not clangd specific. it's triggered through code completion callbacks that's implemented through Sema, which is used by clangd and libclang but are not part of clangd.
You can basically trigger the same behaviour using `-Xclang -code-completion-at=xxx`, if you trigger it while parsing the lambda you receive an incomplete object `COMPLETION: a : [#auto#]a` (which I believe is the part that breaks the invariants of clang AST, but I am happy to go with the perspective that sema code completion doesn't get to rely on AST nodes being complete because it can trigger in the middle of parsing, which I believe implies all of the code completion users needs to deal with consequences) and if you trigger it after parsing the lambda, you get the proper object `COMPLETION: a : [#void#]a(<#Foo<int>#>)[# const#]`.

Inside clangd we don't do anything special with regards to feeding code to clang. The only difference is we feed it incomplete code most of the time. The expectation from our side is clang generating a ~okay shaped AST & diagnostics rather than crashes. Any crashes that we trigger in our functionality while traversing that AST, we mostly try to handle inside clangd (when we seem to rely on some incidental contracts). But there are also cases like this, which are not really AST handling on the clangd side but some clang pieces triggering a crash and the only reason why they seem related to clangd is we're just more likely to hit them as regular use-case of clang doesn't involve being triggered at every keystroke :D
I feel like this approach is quite minimally intrusive but if you disagree, I am happy to follow up and change our approach going forward.

> That's why I think we need a broader discussion. The problem is not ideological, it's one of maintainability of the primary product. For example, the community could decide "it is not Clang's job to be resilient to this sort of thing". Or we could decide "we need to be resilient to this but only if it doesn't introduce more than X% overhead". And so on. Each time we land another one of these whack-a-mole changes, we potentially make it harder to get to a more principled approach.

Again, this change is only introducing this overhead to USRGeneration which is only used by libclang and clangd, and both of these use cases are suppose to not crash on incomplete code when trying to code-complete. Hence I don't think the maintenance and overhead arguments do actually apply in this case (I know you're talking about the general case, but I believe we should also be aware of the distinction going forward. As despite not affecting rest of the callers neither code-complexity nor runtime overhead-wise, we still held up this particular patch for ~2 weeks).

> (My personal feelings are that invariants about internal object state are going to be hard for us to change or introduce unwarranted overhead in at least some circumstances. In those circumstances, I think the onus is on clangd to determine how to work within those invariants and not on clang to change unless there isn't another reasonable option. Refactoring or adding smoke tests can introduce overhead that typical compilation scenarios should never have to pay the cost for, and we should avoid that as best we can. But I also realize this adds burden to the clangd folks to have compiler performance statistics for changes or refactoring that relate to invariants which may or may not be reasonable.)

We're actually interested in overall performance of the compiler as well, hence I think we'd be fine with tracking implications of such changes whenever we feel like clang is the right layer to have them. But to do best of my knowledge, we don't really have a great way of testing out effects of such changes on compile times. Only useful tool that I know about is https://llvm-compile-time-tracker.com/, are there any other sources that explain how to run such benchmarks?

> The fact that this is happening 100s of times a day for you suggests we should land the changes in this patch so there's less pressure when having the broader discussion about where the division of labor is between clangd and clang. So LGTM!

Thanks a lot!

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149733/new/

https://reviews.llvm.org/D149733