r177218 - Remove -Wspellcheck and replace it with a diagnostic option.

Mon Mar 25 15:01:30 PDT 2013

On Mar 25, 2013, at 9:21 AM, David Blaikie <dblaikie at gmail.com> wrote:

> On Mon, Mar 25, 2013 at 8:56 AM, Argyrios Kyrtzidis <akyrtzi at gmail.com> wrote:
>> On Mar 21, 2013, at 2:49 PM, David Blaikie <dblaikie at gmail.com> wrote:
>> 
>>> On Wed, Mar 20, 2013 at 5:57 PM, Argyrios Kyrtzidis <akyrtzi at gmail.com> wrote:
>>>> On Mar 20, 2013, at 2:10 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>>> 
>>>>> On Sat, Mar 16, 2013 at 5:17 PM, Argyrios Kyrtzidis <akyrtzi at gmail.com> wrote:
>>>>>> On Mar 16, 2013, at 3:54 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>>>>>> 
>>>>>> On Sat, Mar 16, 2013 at 10:32 AM, Argyrios Kyrtzidis <akyrtzi at gmail.com>
>>>>>> wrote:
>>>>>>> 
>>>>>>> On Mar 16, 2013, at 10:18 AM, David Blaikie <dblaikie at gmail.com> wrote:
>>>>>>> 
>>>>>>>> On Sat, Mar 16, 2013 at 10:13 AM, Dmitri Gribenko <gribozavr at gmail.com>
>>>>>>>> wrote:
>>>>>>>>> On Sat, Mar 16, 2013 at 3:40 AM, Argyrios Kyrtzidis <akyrtzi at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>> Author: akirtzidis
>>>>>>>>>> Date: Fri Mar 15 20:40:35 2013
>>>>>>>>>> New Revision: 177218
>>>>>>>>>> 
>>>>>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=177218&view=rev
>>>>>>>>>> Log:
>>>>>>>>>> Remove -Wspellcheck and replace it with a diagnostic option.
>>>>>>>>>> 
>>>>>>>>>> Thanks to Richard S. for pointing out that the warning would show up
>>>>>>>>>> with -Weverything.
>>>>>>>>> 
>>>>>>>>> If we are going to start testing clang this way, it would be better to
>>>>>>>>> design this first, so that adding new 'testing' diagnostics is easy
>>>>>>>>> *and* does not slow down the normal compilation.  I think the second
>>>>>>>>> part is addressed already.
>>>>>>>>> 
>>>>>>>>> For example, adding a command line option every time is excessive.
>>>>>>>>> This option could be renamed to -fclang-debugging-diagnostics, and all
>>>>>>>>> such diagnostics could be placed under a special flag
>>>>>>>>> -Wclang-debugging.
>>>>>>>> 
>>>>>>>> I still don't understand the need for this at all. At a glance it
>>>>>>>> seems like we're adding a positive diagnostic so we can check for the
>>>>>>>> absence of a diagnostic - but we've never had a need to do this in the
>>>>>>>> past. "-verify" fails if a diagnostic is emitted where it isn't
>>>>>>>> expected so the absence of expected-blah lines is sufficient to test
>>>>>>>> that we don't emit a diagnostic.
>>>>>>>> 
>>>>>>>> Am I missing something here? Why are we doing this?
>>>>>>> 
>>>>>>> This code snippet of an objc method
>>>>>>> 
>>>>>>> -(void)objc_method: {
>>>>>>> super.x = 0;
>>>>>>> }
>>>>>>> 
>>>>>>> would trigger typo-correction for 'super' silently, without emitting any
>>>>>>> diagnostic.
>>>>>>> For the regression test I added I put:
>>>>>>> 
>>>>>>> typedef int super1;
>>>>>>> 
>>>>>>> so typo-correction "succeeds" in correcting 'super' to 'super1' and errors
>>>>>>> are emitted.
>>>>>>> For regression testing purposes this would be sufficient though I don't
>>>>>>> like that we would be inferring that a typo-correction did not trigger
>>>>>>> indirectly (it is possible, though unlikely, that typo-correction would
>>>>>>> trigger without resolving to the intended identifier)
>>>>>> 
>>>>>> 
>>>>>> The way we usually handle this is with both a positive and a negative test:
>>>>>> 
>>>>>> struct X { int x; } super1;
>>>>>> -(void)objc_method: {
>>>>>> super.x = 0; // expected-no-error
>>>>>> }
>>>>>> void c_function() {
>>>>>> super.x = 0; // expected-error {{did you mean 'super1'}}
>>>>>> }
>>>>>> 
>>>>>>> 
>>>>>>> Beyond regression testing I'd like to have a way to get notified when
>>>>>>> typo-correction is silently triggered for general testing.
>>>>>> 
>>>>>> 
>>>>>> I'm not convinced that this has sufficient value to justify adding a -cc1
>>>>>> option for it. Can you elaborate on why this is important?
>>>>>> 
>>>>>> 
>>>>>> I'm not sure what else I could say beyond repeating myself; typo-correction
>>>>>> is expensive, triggering it needlessly is unacceptable, a -cc1 option allows
>>>>>> making sure that it is not triggered across full project sources.
>>>>>> I don't see much complexity or maintenance involved for justifying not
>>>>>> having it.
>>>>> 
>>>>> Part of the issue here is a somewhat philosophical one - this seems
>>>>> like a bit of a significant change in the way Clang is tested &
>>>>> written thus far. To make that step I'd like to make sure it's
>>>>> appropriately considered (yes, this is somewhat a case of a "slippery
>>>>> slope" fallacy - if the first step isn't a problem even if the trend
>>>>> could be, we should object to the trend when it happens, not the first
>>>>> step itself)
>>>>> 
>>>>> Admittedly I'm still not entirely understanding the issue here - you
>>>>> mentioned that we were doing typo correction for cases where we
>>>>> weren't emitting diagnostics. Which part of your change
>>>>> demonstrates/fixes that?
>>>> 
>>>> AFAIK, the way parsing works for 'super' is Sema::ClassifyName is called on it, and if lookup fails to find something, the parser later on handles 'super' itself.
>>>> Because lookup failed in Sema::ClassifyName typo-correction would kick in.
>>>> 
>>>>> Your fix seems to be inside the CorrectTypo
>>>>> function. Shouldn't there be a change to a caller so we don't call
>>>>> this in some non-diagnostic-emitting case if that's happening?
>>>> 
>>>> If you think the check is more appropriate to be in ClassifyName, that is fine by me.
>>> 
>>> Looking at the code a bit (enough to get a bit of a better sense) I'm
>>> not sure that's the right place either.
>>> 
>>> Here's the issue: we have a codepath that does typo correction (&
>>> emits diagnostics if it sees them, etc) & the immediate alternative
>>> codepath doesn't produce a diagnostic. That's problematically
>>> asymmetric & could lead to similar bugs in other call sites that don't
>>> handle the non-corrected case in error later. It seems to me that's a
>>> fundamentally bad API design & we should fix it. (by always emitting a
>>> diagnostic down there in ClassifyName - either with a typo correction,
>>> or without)
>>> 
>>> Secondly: why aren't we able to classify "super"? Shouldn't we know
>>> that's a keyword & resolving it as such in ClassifyName?
>>> 
>>> I think we should probably fix both these issues - I haven't looked
>>> enough to know how to fix them or I'd have done it, but that's my
>>> feeling at this point.
>>> 
>>>>> We don't generally add flags like this for other performance problems
>>>>> - we have many diagnostics that conditionally test whether their flag
>>>>> is enabled before doing expensive work to test but don't have compiler
>>>>> flags to test whether or not this code is executed in non-diagnostic
>>>>> cases, for example.
>>>> 
>>>> Do you believe it is worth having a way to automatically check whether typo correction has been triggered at all on a "clean" project (no typos to correct, so typo-correction will just affect performance).
>>> 
>>> Not necessarily, no. This seems like a perf issue as any other that we
>>> have bots/profiles/etc for.
>>> 
>>>> As I pointed out before, typo-correction could silently take up 25-30% of -fsyntax-time on ObjC projects. My viewpoint is that this is expensive enough that it is worth adding a flag to use for making sure we don't regress and catch it if typo-correction is triggered needlessly again.
>>>> 
>>>> If you agree on the "expensive enough" part but disagree on the methodology, could you recommend some other way ?
>>> 
>>> Something I probably wouldn't really object to would be a boolean flag
>>> (not a command line argument or anything, just a boolean variable) in
>>> Sema (possibly in Debug builds only). Raise the flag during typo
>>> correction. Assert that the flag is not raised at the end of
>>> compilation if we never emitted any warnings or errors.
>> 
>> This seems like a good idea to me, but why "in Debug builds only" ?
> 
> Sorry, I misspoke - I meant assert builds. I tend to conflate the two
> unnecessarily/incorrectly. (just that the assert will only be compiled
> in in an assert build, so there's no need to have the flag & logic to
> raise the flag being compiled in a non-assert build if it'll never be
> used, but it's probably not a terribly 'hot' thing so the #ifdefing
> around the variable declaration & flag raising is probably a minor
> issue - might be worth doing to avoid any current or future "unused
> member" warnings in non-assert builds, though)

I don't think the #ifdef noise is worth it currently. If a warning comes up in the future then we can evaluate if we should put #ifdefs or just an attribute.
If there are no further objections I'll probably implement your idea early next week.

Thanks for reviewing!

> 
> - David
> 
>> 
>>> 
>>> This will catch more cases - it'll work on a per-TU basis even in
>>> projects that are currently emitting warnings. It will catch cases
>>> where we use typo correction in warnings (if there are any?) that are
>>> disabled in a particular project.
>>> 
>>> It's a little invasive to have to have something in such a broad scope
>>> as the whole Sema object, but seems viable.
>>> 
>>> - David
>>