[cfe-dev] [clangd-dev] Delayed typo correction is fragile

Thu Jul 18 14:45:21 PDT 2019

I want to share some of my experience with non-empty DelayedTypos in ~Sema; hopefully it is useful. Lingering delayed typos are often caused by multiple typos in a single expression: when we try to fix one typo and fail, we discard the entire expression, which can contain more TypoExprs. The correct fix seems to be to check discarded expressions and clean up the corresponding TypoExprs. But with the current design there is no simple way to do that, and it has limited value when assertions are disabled.
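
To make the failure mode concrete, here is a minimal sketch of it as a toy model. ToyExpr, ToySema and every function below are hypothetical stand-ins that I made up for illustration, not clang's real classes. The model only assumes that each typo placeholder is registered in a set when it is created and that the set is expected to be empty at the end.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <vector>

// Toy stand-ins for illustration only; none of these are real clang classes.
struct ToyExpr {
  int TypoId = -1;  // >= 0 marks a placeholder for a typo that is still unresolved
  std::vector<std::unique_ptr<ToyExpr>> Children;
};

struct ToySema {
  std::set<int> DelayedTypos;  // typos still waiting for a correction or a diagnostic
  int NextId = 0;

  // Creates a typo placeholder and registers it as pending.
  std::unique_ptr<ToyExpr> makeTypo() {
    auto E = std::make_unique<ToyExpr>();
    E->TypoId = NextId++;
    DelayedTypos.insert(E->TypoId);
    return E;
  }

  // Models the bug: the tree is dropped, but its typos stay registered.
  void discardNaively(std::unique_ptr<ToyExpr> E) { E.reset(); }

  // Models the cleanup: walk the discarded tree and forget every typo in it.
  void discardAndCleanUp(std::unique_ptr<ToyExpr> E) {
    forgetTypos(*E);
    E.reset();
  }

private:
  void forgetTypos(const ToyExpr &E) {
    if (E.TypoId >= 0)
      DelayedTypos.erase(E.TypoId);
    for (const auto &C : E.Children)
      forgetTypos(*C);
  }
};

// Builds one expression that contains two unresolved typos.
inline std::unique_ptr<ToyExpr> makeCallWithTwoTypos(ToySema &S) {
  auto Call = std::make_unique<ToyExpr>();
  Call->Children.push_back(S.makeTypo());
  Call->Children.push_back(S.makeTypo());
  return Call;
}
```

With discardNaively the two typos stay in DelayedTypos after the expression is gone, which is the state that trips the assertion. With discardAndCleanUp the set ends up empty. The hard part in the real code is that there is no single place like discardAndCleanUp through which all discarded expressions pass.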

I think the change suggested in https://reviews.llvm.org/D64799 is reasonable and should help with low-value assertion failures.

Thanks,
Volodymyr

> On Jul 16, 2019, at 10:46, Ilya Biryukov via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> 
> Thanks for pointing this out, the codegen does run on every top level declaration.
> However, it does nothing if any errors were reported. 
> 
> That means we could prevent codegen by:
> 1. emitting the diagnostics for uncorrected typos on each top-level declaration, before the codegen kicks in,
> 2. checking if there are any "pending" typos in addition to checking for errors before doing the codegen.
> 
> Either should be doable. (1) has the advantage of reporting the errors earlier, making them easier to fix/diagnose.
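
A rough sketch of how the two options differ, again as a toy model rather than clang code. ToyDiagState and the three functions are hypothetical names; the only behaviour taken from the thread is that codegen is skipped once an error has been reported.

```cpp
#include <cassert>
#include <cstddef>

// Toy state for illustration only; not a real clang type.
struct ToyDiagState {
  bool ErrorOccurred = false;
  std::size_t PendingTypos = 0;
};

// Behaviour described in the thread: codegen is skipped only after an error.
inline bool shouldEmitCodeErrorsOnly(const ToyDiagState &S) {
  return !S.ErrorOccurred;
}

// Option (2): also refuse to run codegen while typos are still pending.
inline bool shouldEmitCodeWithTypoCheck(const ToyDiagState &S) {
  return !S.ErrorOccurred && S.PendingTypos == 0;
}

// Option (1): turn pending typos into reported errors before codegen runs,
// so that the existing errors-only check is enough.
inline void flushPendingTypos(ToyDiagState &S) {
  if (S.PendingTypos > 0) {
    S.ErrorOccurred = true;
    S.PendingTypos = 0;
  }
}
```

In this model, a state with one pending typo and no error passes the errors-only check but is rejected by the option (2) check. After flushPendingTypos, the errors-only check rejects it as well, which is the effect option (1) is after.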
> 
> However, (1) might be a little involved. At least I got the impression from talking to various people that some typos are only fixed at template instantiation time.
> The code to figure out at which point the uncorrected typos should be emitted for template instantiations might be a little involved because of this.
> 
> I would be surprised if the proposed assertion (HasErrors || Typos.empty()) ever fired in practice. It's rare to see only a single compiler error coming from clang, so I would expect almost any typo to induce at least another error right away. That's actually why I'd expect the "broken codegen" to be hardly possible in practice.
> 
> Out of the options we have, I'll probably add checks for (2) to codegen and emit the delayed typos at the end of TU. That seems to be the simplest option, at least.
> Happy to go with (1) or the alternative assertion if people think the proposed approach would lead to too many diagnostics.
> 
> On Tue, Jul 16, 2019 at 7:29 PM Reid Kleckner <rnk at google.com> wrote:
> On Tue, Jul 16, 2019 at 10:18 AM David Blaikie via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> End of the TU sounds too late to me - IR generation is done incrementally (at the end of functions, for instance - though I'm not sure that's the only point), so leaving typos in until the end of the TU could lead to the "IR generation getting weird because of pending typo corrections" issue, no?
> 
> Well, we currently assert in ~Sema, which is after the end of the TU, so if we hit today's assert, we've already done incremental codegen without crashing. Diagnosing at end of TU doesn't make the situation worse. The way I understand clang's incremental codegen strategy, we generate code after every top level decl. We sometimes skip incremental codegen if errors have been reported or if a Decl is invalid. I think it's a bug if an error hasn't been reported but a delayed typo expr gets sent to codegen, and we should add asserts to defend against that.
>  
> On Mon, Jul 15, 2019 at 8:19 PM Ilya Biryukov via clangd-dev <clangd-dev at lists.llvm.org> wrote:
> We would like to avoid assertion failures for those, which leads me to the following questions:
> - Is there a way to quickly track down the place that misses the CorrectDelayedTypos* call?
> A common pattern is that an error causes an Expr subtree to be discarded, and the code that does so "forgets" to call CorrectDelayedTypos.
> e.g. https://reviews.llvm.org/rL366200
> There's usually a diagnostic emitted before the Expr is discarded, so in these cases poking around the diag emit location often sheds light. But my fear is there are tens or hundreds of these bugs, and it's hard to enumerate them.
> 
> At some level, this seems silly - if the Expr doesn't survive, its typos don't need to be corrected to protect CodeGen from them. The diagnostics are probably important though.
> If we could ensure the diagnostics are emitted as Reid says, and reduce the requirement to be that Exprs that survive parsing get typo-corrected, then this might be tractable.
> 
> Another idea is to weaken the assert to validate that either errors have been reported, or there are no delayed typos. That would mean it's OK to forget to diagnose typos if parts of the AST are invalid, since we expect the user to fix their code and recompile, potentially discovering the typo on the next compile.
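
The difference between the two checks is easiest to see side by side. This is only a sketch with hypothetical helper names; it models the conditions as described in this thread and is not the code in ~Sema.

```cpp
#include <cassert>
#include <cstddef>

// Strict check as described in the thread: no delayed typo may remain,
// whether or not errors were reported.
inline bool strictCheckHolds(bool HasErrors, std::size_t NumDelayedTypos) {
  (void)HasErrors;  // deliberately ignored by the strict form
  return NumDelayedTypos == 0;
}

// Weakened check: leftover typos are tolerated once an error has been reported.
inline bool weakenedCheckHolds(bool HasErrors, std::size_t NumDelayedTypos) {
  return HasErrors || NumDelayedTypos == 0;
}
```

The two forms disagree in exactly one case: errors were reported and typos are still pending. The strict form fails there and the weakened form passes. Both still fail when typos are pending and no error was reported, which is the case that could let a typo reach codegen.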
> 
> 
> -- 
> Regards,
> Ilya Biryukov
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev