r177218 - Remove -Wspellcheck and replace it with a diagnostic option.

Mon Mar 25 09:21:11 PDT 2013

On Mon, Mar 25, 2013 at 8:56 AM, Argyrios Kyrtzidis <akyrtzi at gmail.com> wrote:
> On Mar 21, 2013, at 2:49 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>> On Wed, Mar 20, 2013 at 5:57 PM, Argyrios Kyrtzidis <akyrtzi at gmail.com> wrote:
>>> On Mar 20, 2013, at 2:10 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>>
>>>> On Sat, Mar 16, 2013 at 5:17 PM, Argyrios Kyrtzidis <akyrtzi at gmail.com> wrote:
>>>>> On Mar 16, 2013, at 3:54 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>>>>>
>>>>> On Sat, Mar 16, 2013 at 10:32 AM, Argyrios Kyrtzidis <akyrtzi at gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> On Mar 16, 2013, at 10:18 AM, David Blaikie <dblaikie at gmail.com> wrote:
>>>>>>
>>>>>>> On Sat, Mar 16, 2013 at 10:13 AM, Dmitri Gribenko <gribozavr at gmail.com>
>>>>>>> wrote:
>>>>>>>> On Sat, Mar 16, 2013 at 3:40 AM, Argyrios Kyrtzidis <akyrtzi at gmail.com>
>>>>>>>> wrote:
>>>>>>>>> Author: akirtzidis
>>>>>>>>> Date: Fri Mar 15 20:40:35 2013
>>>>>>>>> New Revision: 177218
>>>>>>>>>
>>>>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=177218&view=rev
>>>>>>>>> Log:
>>>>>>>>> Remove -Wspellcheck and replace it with a diagnostic option.
>>>>>>>>>
>>>>>>>>> Thanks to Richard S. for pointing out that the warning would show up
>>>>>>>>> with -Weverything.
>>>>>>>>
>>>>>>>> If we are going to start testing clang this way, it would be better to
>>>>>>>> design this first, so that adding new 'testing' diagnostics is easy
>>>>>>>> *and* does not slow down the normal compilation.  I think the second
>>>>>>>> part is addressed already.
>>>>>>>>
>>>>>>>> For example, adding a command line option every time is excessive.
>>>>>>>> This option could be renamed to -fclang-debugging-diagnostics, and all
>>>>>>>> such diagnostics could be placed under a special flag
>>>>>>>> -Wclang-debugging.
>>>>>>>
>>>>>>> I still don't understand the need for this at all. At a glance it
>>>>>>> seems like we're adding a positive diagnostic so we can check for the
>>>>>>> absence of a diagnostic - but we've never had a need to do this in the
>>>>>>> past. "-verify" fails if a diagnostic is emitted where it isn't
>>>>>>> expected so the absence of expected-blah lines is sufficient to test
>>>>>>> that we don't emit a diagnostic.
>>>>>>>
>>>>>>> Am I missing something here? Why are we doing this?
>>>>>>
>>>>>> This code snippet of an objc method
>>>>>>
>>>>>> -(void)objc_method: {
>>>>>> super.x = 0;
>>>>>> }
>>>>>>
>>>>>> would trigger typo-correction for 'super' silently, without emitting any
>>>>>> diagnostic.
>>>>>> For the regression test I added I put:
>>>>>>
>>>>>> typedef int super1;
>>>>>>
>>>>>> so typo-correction "succeeds" in correcting 'super' to 'super1' and errors
>>>>>> are emitted.
>>>>>> For regression testing purposes this would be sufficient though I don't
>>>>>> like that we would be inferring that a typo-correction did not trigger
>>>>>> indirectly (it is possible, though unlikely, that typo-correction would
>>>>>> trigger without resolving to the intended identifier)
>>>>>
>>>>>
>>>>> The way we usually handle this is with both a positive and a negative test:
>>>>>
>>>>> struct X { int x; } super1;
>>>>> -(void)objc_method: {
>>>>> super.x = 0; // expected-no-error
>>>>> }
>>>>> void c_function() {
>>>>> super.x = 0; // expected-error {{did you mean 'super1'}}
>>>>> }
>>>>>
>>>>>>
>>>>>> Beyond regression testing I'd like to have a way to get notified when
>>>>>> typo-correction is silently triggered for general testing.
>>>>>
>>>>>
>>>>> I'm not convinced that this has sufficient value to justify adding a -cc1
>>>>> option for it. Can you elaborate on why this is important?
>>>>>
>>>>>
>>>>> I'm not sure what else I could say beyond repeating myself; typo-correction
>>>>> is expensive, triggering it needlessly is unacceptable, a -cc1 option allows
>>>>> making sure that it is not triggered across full project sources.
>>>>> I don't see much complexity or maintenance involved for justifying not
>>>>> having it.
>>>>
>>>> Part of the issue here is a somewhat philosophical one - this seems
>>>> like a bit of a significant change in the way Clang is tested &
>>>> written thus far. To make that step I'd like to make sure it's
>>>> appropriately considered (yes, this is somewhat a case of a "slippery
>>>> slope" fallacy - if the first step isn't a problem even if the trend
>>>> could be, we should object to the trend when it happens, not the first
>>>> step itself)
>>>>
>>>> Admittedly I'm still not entirely understanding the issue here - you
>>>> mentioned that we were doing typo correction for cases where we
>>>> weren't emitting diagnostics. Which part of your change
>>>> demonstrates/fixes that?
>>>
>>> AFAIK, the way parsing works for 'super' is Sema::ClassifyName is called on it, and if lookup fails to find something, the parser later on handles 'super' itself.
>>> Because lookup failed in Sema::ClassifyName typo-correction would kick in.
>>>
>>>> Your fix seems to be inside the CorrectTypo
>>>> function. Shouldn't there be a change to a caller so we don't call
>>>> this in some non-diagnostic-emitting case if that's happening?
>>>
>>> If you think the check is more appropriate to be in ClassifyName, that is fine by me.
>>
>> Looking at the code a bit (enough to get a bit of a better sense) I'm
>> not sure that's the right place either.
>>
>> Here's the issue: we have a codepath that does typo correction (&
>> emits diagnostics if it sees them, etc) & the immediate alternative
>> codepath doesn't produce a diagnostic. That's problematically
>> asymmetric & could lead to similar bugs in other call sites that don't
>> handle the non-corrected case in error later. It seems to me that's a
>> fundamentally bad API design & we should fix it. (by always emitting a
>> diagnostic down there in ClassifyName - either with a typo correction,
>> or without)
>>
>> Secondly: why aren't we able to classify "super"? Shouldn't we know
>> that's a keyword & resolving it as such in ClassifyName?
>>
>> I think we should probably fix both these issues - I haven't looked
>> enough to know how to fix them or I'd have done it, but that's my
>> feeling at this point.
>>
>>>> We don't generally add flags like this for other performance problems
>>>> - we have many diagnostics that conditionally test whether their flag
>>>> is enabled before doing expensive work to test but don't have compiler
>>>> flags to test whether or not this code is executed in non-diagnostic
>>>> cases, for example.
>>>
>>> Do you believe it is worth having a way to automatically check whether typo correction has been triggered at all on a "clean" project (no typos to correct, so typo-correction will just affect performance).
>>
>> Not necessarily, no. This seems like a perf issue as any other that we
>> have bots/profiles/etc for.
>>
>>> As I pointed out before, typo-correction could silently take up 25-30% of -fsyntax-time on ObjC projects. My viewpoint is that this is expensive enough that it is worth adding a flag to use for making sure we don't regress and catch it if typo-correction is triggered needlessly again.
>>>
>>> If you agree on the "expensive enough" part but disagree on the methodology, could you recommend some other way ?
>>
>> Something I probably wouldn't really object to would be a boolean flag
>> (not a command line argument or anything, just a boolean variable) in
>> Sema (possibly in Debug builds only). Raise the flag during typo
>> correction. Assert that the flag is not raised at the end of
>> compilation if we never emitted any warnings or errors.
>
> This seems like a good idea to me, but why "in Debug builds only" ?

Sorry, I misspoke - I meant assert builds. I tend to conflate the two
unnecessarily/incorrectly. (just that the assert will only be compiled
in in an assert build, so there's no need to have the flag & logic to
raise the flag being compiled in a non-assert build if it'll never be
used, but it's probably not a terribly 'hot' thing so the #ifdefing
around the variable declaration & flag raising is probably a minor
issue - might be worth doing to avoid any current or future "unused
member" warnings in non-assert builds, though)

- David

>
>>
>> This will catch more cases - it'll work on a per-TU basis even in
>> projects that are currently emitting warnings. It will catch cases
>> where we use typo correction in warnings (if there are any?) that are
>> disabled in a particular project.
>>
>> It's a little invasive to have to have something in such a broad scope
>> as the whole Sema object, but seems viable.
>>
>> - David
>