[PATCH v2 6/6] Wire up delayed typo correction to DiagnoseEmptyLookup and set up Sema::ActOnIdExpression to use the new functionality.

Wed Nov 19 15:37:40 PST 2014

On Wed, Nov 19, 2014 at 3:36 PM, David Blaikie <dblaikie at gmail.com> wrote:

>
>
> On Wed, Nov 19, 2014 at 11:03 AM, Kaelyn Takata <rikka at google.com> wrote:
>
>>
>>
>> On Tue, Nov 18, 2014 at 10:50 AM, David Blaikie <dblaikie at gmail.com>
>> wrote:
>>
>>> +cfe-commits
>>>
>>> Though I've had trouble applying the summary patch you posted - could
>>> you rebase/repost? Sorry for the delays :/
>>>
>>
>> No worries.
>>
>>>
>>> On Tue, Nov 18, 2014 at 10:49 AM, David Blaikie <dblaikie at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Nov 5, 2014 at 1:10 PM, Kaelyn Takata <rikka at google.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Tue, Nov 4, 2014 at 4:54 PM, David Blaikie <dblaikie at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Could you post/send (on/off list, etc) an aggregate patch so I can
>>>>>> play around with the behavior of this patch?
>>>>>>
>>>>>> Some superficial patch review
>>>>>>
>>>>>> lambda passed to CorrectDelayedTyposInExpr has 'else-after-return' -
>>>>>> you could move the short block up to the top:
>>>>>>
>>>>>> if (!...CPlusPlus11)
>>>>>>   return this->VerifyIntegerConstantExpression(E);
>>>>>> /* unindented the longer part of the code */
>>>>>>
>>>>>
>>>>> Fixed. Thanks for catching that!
>>>>>
>>>>>>
>>>>>> I'm slightly surprised at the indentation there too - I think
>>>>>> clang-format has a special case for lambdas as the last function argument,
>>>>>> where it won't indent them extra (so the "});" will be at the same level as
>>>>>> the expression - maybe it only does that if you're not initializing a
>>>>>> variable? Dunno.
>>>>>>
>>>>>
>>>>> I think it was a quirk in the version of clang-format used to
>>>>> originally format this several months ago; clang-format now formats it a
>>>>> bit differently.
>>>>>
>>>>>>
>>>>>> Do you need the 'this->' qualification in that lambda? I assume not.
>>>>>> (& I usually favor just capture everything by reference [&] - but up to you
>>>>>> on that, we're all still certainly feeling out how best to write C++11)
>>>>>>
>>>>>
>>>>> I didn't think the implicit "this" could be used for accessing member
>>>>> functions... I just learned something new. :) (Though "this" still needs to
>>>>> be in the capture list--I didn't capture everything by reference because
>>>>> the only variable that needed to be captured was "this".)
>>>>>
>>>>
>>>> Sure - up to you. I usually don't even think about what's being
>>>> captured and just write [&] by default. (this applies really well to
>>>> functions that will only call the lambda within the function (not storing
>>>> it away to call later - at which point you get into interesting lifetime
>>>> situations, etc) - for those, it's trickier and depends on the lifetime of
>>>> the things being captured as to whether to capture by ref or value, etc).
>>>>
>>>>
>>>>>
>>>>>> EmptyLookupTypoDiags::operator() - some {} on single-statement blocks
>>>>>> (if (Ctx)/else)
>>>>>> Might be worth inverting the if/following block - the if block is
>>>>>> longer/nested than the tail block, so switching them around should reduce
>>>>>> indentation, etc.
>>>>>>
>>>>>
>>>>> Done and done.
>>>>>
>>>>>>
>>>>>> std::string CorrectedStr(TC.getAsString(SemaRef.getLangOpts()));
>>>>>>
>>>>>> Prefer copy/move init rather than direct init:
>>>>>>
>>>>>> std::string CorrectedStr = TC.getAsString(SemaRef.getLangOpts());
>>>>>>
>>>>>
>>>>> Changed.
>>>>>
>>>>>>
>>>>>> Hmm - it might be nice to write EmptyLookupTypoDiags just as a single
>>>>>> free function (taking all those members as parameters instead) then call
>>>>>> that in a lambda - that way you don't have to write all the member
>>>>>> state/ctor/etc logic?
>>>>>>
>>>>>
>>>>> I had to capture variables by value instead of by reference to avoid
>>>>> lifetime issues (and having 74 tests fail) and create a local variable so
>>>>> the LookupResult didn't have to be captured since it cannot be copied, but
>>>>> I did get it working as a static function and a lambda and the switch
>>>>> resulted in a net change of 17 fewer lines (37 added, 54 deleted).
>>>>>
>>>>
>>>> Cool cool.
>>>>
>>>>
>>>>>
>>>>>> lambda in ParseCaseStatement: I'd probably write "return
>>>>>> ExprResult(...)" rather than explicitly specifying the return type. (do you
>>>>>> even need to do either? If CorrectDelayedTyposInExpr is a template, then
>>>>>> its return result should just implicitly convert to ExprResult in the
>>>>>> caller, if it takes a std::function, that should still work too... - I can
>>>>>> explain this more in person, perhaps.
>>>>>>
>>>>>
>>>>> You're right, neither seem to be needed--though I thought I had added
>>>>> the return type because of clang complaining without it.
>>>>>
>>>>>>
>>>>>> Rather than goto in ParseCXXIdExpression, could that be a loop?
>>>>>>
>>>>>
>>>>> The cleanest way I can see to write it as a loop would be to wrap it
>>>>> in a "while (true) {...}" loop, and have an "if (Result.isInvalid() ||
>>>>> Result.get()) break;" right before the (now unconditional) call to
>>>>> UnconsumeToken. The only other way I can see to do the loop would require
>>>>> checking "!Result.isInvalid() && !Result.get()" twice in a row, once to
>>>>> know whether to call UnconsumeToken, and once in the do...while loop to
>>>>> know whether to continue looping.
>>>>>
>>>>
>>>> Hmm - fair point. I'd probably at least still write it using the
>>>> while(true) loop rather than goto, just to demonstrate to the reader what
>>>> kind of control flow is happening here more easily.
>>>>
>>>> As for other ways to write this... one alternative, which I wouldn't
>>>> necessarily espouse as better than your suggested loop:
>>>>
>>>>   bool first = true;
>>>>   do {
>>>>     if (!first)
>>>>       UnconsumeToken()
>>>>     first = false;
>>>>     ...
>>>>   } while (!Result.isInvalid() && !Result.get())
>>>>
>>>> Though calling '!Result.isInvalid() && !Result.get()' twice probably
>>>> isn't the end of the world either, though certainly feels a bit off. Your
>>>> suggestion of:
>>>>
>>>>   while (true) {
>>>>     ...
>>>>     if (...)
>>>>       break;
>>>>     UnconsumeToken();
>>>>   }
>>>>
>>>> seems good. It's probably about as obvious as its going to get, I'd
>>>> imagine. If the body of the loop is long, it might be more clear to pull it
>>>> into a function and that'd make the loop's scope and purpose clearer.
>>>>
>>>
>> I just remembered the other reason why I wrote this as a goto instead of
>> a loop (and why the goto label is "retry"): the body of the "loop" should
>> in most cases only be executed once, would only be executed a second time
>> if there is an error and typo correction suggests a keyword, and should
>> *never* be executed more than twice (i.e. the goto should only ever be
>> called at most once) unless something has gone horribly, impossibly wrong
>> ("wrong" being a typo was encountered, a keyword was suggested,
>> substituting the keyword causes typo correction to be triggered again in
>> the same spot and suggest a keyword again... a situation that shouldn't be
>> possible and for which I'd be willing to add an assertion to prevent it
>> from happening).
>>
>
> An assertion sounds good, in any case.
>
> If it's only a single retry - perhaps breaking out the body of the loop
> into a function that you call twice would make sense?
>
> func()
> if (failed)
>   func(other stuff)
> assert(!failed)
>
> ?
>
>
>>
>>>>
>>>>>>
>>>>>> Hmm - why is the correction applied to the LHS/RHS of the binary
>>>>>> expression, rather than waiting until the full expression is parsed?
>>>>>>
>>>>>
>>>>> If you are referring to within ParseRHSOfBinaryExpression, it's
>>>>> because any error in the RHS will cause the LHS to be replaced with an
>>>>> ExprError without any typos in the LHS being diagnosed, so waiting until
>>>>> the full expression is parsed causes typos to go undiagnosed if there are
>>>>> unrelated parse errors in the full expression. This had been a bit of a
>>>>> pain to figure out and track down why typo correction on certain full
>>>>> expressions wasn't happening for part of the expression...
>>>>>
>>>>
>>>> Hmm, that's interesting - I'd be curious to understand why that is, but
>>>> I guess it's got a good reason.
>>>>
>>>
>> I think it's because errors in the expression are communicated through
>> the same channel as the expression itself: the ExprResult. So for something
>> that combines multiple expressions into one (which it seems that
>> ParseRHSOfBinaryExpression handles both parsing the RHS and combining the
>> LHS and RHS), it can either return the subexpression tree or an
>> ExprError.... which is fine up until something more needs to be done with
>> the subexpression tree once an ExprError has been generated.
>>
>
> Hmm, OK - that makes sense. Could/should we then do the correction on the
> LHS/RHS only when a failure is about to cause the expression to be thrown
> out? (that way if there's no error... oh, but of course there'
>

(fat fingers...)

but of course there's an error, because we're correcting typos)

So, yeah... not sure there's a nicer way to handle that, then.... *wonders*

(still reviewing things)

>
>
>>
>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 29, 2014 at 12:49 PM, Kaelyn Takata <rikka at google.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> Among other things, this eliminates at least one redundant error case
>>>>>>> and allows recovery in several cases where it wasn't possible before
>>>>>>> (e.g. correcting a mistyped static_cast<>).
>>>>>>> ---
>>>>>>>  include/clang/Parse/Parser.h         |   9 +++
>>>>>>>  include/clang/Sema/Sema.h            |   4 +-
>>>>>>>  lib/Parse/ParseCXXInlineMethods.cpp  |   1 +
>>>>>>>  lib/Parse/ParseDecl.cpp              |   4 +-
>>>>>>>  lib/Parse/ParseDeclCXX.cpp           |   2 +-
>>>>>>>  lib/Parse/ParseExpr.cpp              |  30 +++++++---
>>>>>>>  lib/Parse/ParseExprCXX.cpp           |  13 ++++-
>>>>>>>  lib/Parse/ParseObjc.cpp              |   5 +-
>>>>>>>  lib/Parse/ParseOpenMP.cpp            |   3 +-
>>>>>>>  lib/Parse/ParseStmt.cpp              |   8 ++-
>>>>>>>  lib/Sema/SemaExpr.cpp                | 110
>>>>>>> ++++++++++++++++++++++++++++++++---
>>>>>>>  lib/Sema/SemaStmt.cpp                |  19 ++++++
>>>>>>>  test/FixIt/fixit-unrecoverable.cpp   |   4 +-
>>>>>>>  test/SemaCXX/typo-correction.cpp     |   7 ++-
>>>>>>>  test/SemaTemplate/crash-10438657.cpp |   2 +-
>>>>>>>  15 files changed, 189 insertions(+), 32 deletions(-)
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> cfe-commits mailing list
>>>>>>> cfe-commits at cs.uiuc.edu
>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20141119/928774d0/attachment.html>