[PATCH] D45444: [clang-tidy] WIP: implement new check for const-correctness

Jonas Toth via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Apr 17 01:10:47 PDT 2018


JonasToth added a comment.

In https://reviews.llvm.org/D45444#1069488, @shuaiwang wrote:

> I've updated https://reviews.llvm.org/D45679 and I think the `isModified()` function there is now sufficiently covering the cases you've covered here and can be used as a good starting version if you plan to use it here.
>  I copied your const-values test cases and it now passes with the following differences:
>
> - All unused local variables removed, my check will issue a warning for them because technically they're not modified, but I understand why you don't want to cover them. I don't feel strongly about it and I removed it from my copy-pasted test cases because I just to demonstrate `isModified()`


Nice!

> - Better recall in `some_reference_taking`, both `np_local0` and `r1_np_local0` can be caught, similar for `np_local1` and `non_const_ref` in `handle_from_array`.

Yes. But i feel we need to gather more information. For potential checks, it might be nice to control to which scope modification is defined and/or emit `note`s for interesting events like taking a non-const handle which isnt modified, too.

It kinda feels like we are converging to a dependency graph for every variable that contains the information for dependencies and dependents + modification history. I dont think it is bad, but maybe complicated :)
What do you think about returning a `ModificationRecord` that contains the number of direct modification, the number of local non-const handles and escaping non-const handles. 
Including const handles would be nice in the future for more and different analysis.

> - `direct_class_access` `np_local5` triggers with my check (shouldn't it?)

Yes can be const: Godbolt <https://godbolt.org/#g:!((g:!((g:!((h:codeEditor,i:(fontScale:0.8957951999999999,j:1,lang:c%2B%2B,source:'struct+PtrMember+%7B%0A++++int+*ptr%3B%0A%7D%3B%0A%0Aint+i+%3D+42%3B%0A%0Aint+main()+%7B%0A++++const+PtrMember+p%7B.ptr+%3D+%26i%7D%3B%0A++++*p.ptr+%3D+43%3B%0A%7D'),l:'5',n:'0',o:'C%2B%2B+source+%231',t:'0')),k:51.5891551584078,l:'4',m:100,n:'0',o:'',s:0,t:'0'),(g:!((g:!((h:compiler,i:(compiler:clang_trunk,filters:(b:'0',binary:'1',commentOnly:'0',demangle:'0',directives:'0',execute:'1',intel:'0',trim:'0'),lang:c%2B%2B,libs:!(),options:'',source:1,wantOptInfo:'1'),l:'5',n:'0',o:'x86-64+clang+(trunk)+(Editor+%231,+Compiler+%231)+C%2B%2B',t:'0')),k:48.4108448415922,l:'4',m:33.333333333333336,n:'0',o:'',s:0,t:'0'),(g:!((h:ast,i:(compilerName:'x86-64+clang+(trunk)',editorid:1,j:1,source:'%23pragma+clang+diagnostic+warning+%22-Weverything%22%0A%23pragma+clang+diagnostic+ignored+%22-Wmissing-prototypes%22%0A%23pragma+clang+diagnostic+ignored+%22-Wlanguage-extension-token%22%0A%0Aint+foo(void)+%7B%0A++++typeof(1+%3D%3D+2)+x+%3D+1%3B%0A++++_Bool+y+%3D+1%3B%0A++++return+x+%3D%3D+-1+%7C%7C+y+%3D%3D+-1%3B%0A%7D%0A'),l:'5',n:'0',o:'x86-64+clang+(trunk)+Ast+Viewer+(Editor+%231,+Compiler+%231)',t:'0')),header:(),l:'4',m:33.333333333333336,n:'0',o:'',s:0,t:'0'),(g:!((h:output,i:(compiler:1,editor:1,wrap:'1'),l:'5',n:'0',o:'%231+with+x86-64+clang+(trunk)',t:'0')),l:'4',m:33.33333333333333,n:'0',o:'',s:0,t:'0')),k:48.4108448415922,l:'3',n:'0',o:'',t:'0')),l:'2',n:'0',o:'',t:'0')),version:4>

> - In `range_for` non-const loop variables triggers with my check (shouldn't they?)



- If the loop copies every value no modification occurs (its not a copy, its initialization. If the constructor would modify its value, the source cant be const either. Is it allowed?)
- if a handle is taken the usual rules apply.

> - In `range_for` local arrays doesn't trigger with my check, because I'm treating ArrayToPointerDecay the same as taking address of a variable, and looping over an array would involve ArrayToPointerDecay when the implicit `__begin`/`__end` are initialized. I added a note inside `isModified` for future improvements.

Given this is implemented here, can you use it?

> - My check over triggers for template (with my failed attempt to fix), but I think it's more of some mistake in the check itself rather than in `isModified`

I removed templates while matching for potential candidates. That can be moved out of the `isModified()`. `isModified()` can only reason about instantiated functions. For the actual template we need concepts.

> Sorry about the confusion.
>  Basically consider this example:
> 
>   class Foo {
>   public:
>     void a() { x = 10; }
>     void b() { // nothing }
>     void c() { a(); }
>     void d() { b(); }
>     
>   private:
>     int x;
>   };
> 
> 
> If we check whether `isModified(dereference(cxxThisExpr()))` for each `CompondStmt(hasParent(functionDecl()))`, we would conclude that:
> 
> - `a()` should stay non-const, because there's `this->x = 10` modifying `*this`
> - `b()` should be changed to const, because nothing modifies `*this`
> - `c()` should stay non-const, because we're invoking non-const member function on `this`
> - `d()` should also stay non-const, with the same reason for c(). (We can in theory be smarter about this one because the definition of b() happens to be inside the same TU but that's out of scope right now)
> 
>   However if we use a per-TU map to record whether `x` can be const, we'll conclude that `x` is modified thus can't be const, missing the one that `b()` is not modifying `x`. To fix that we'll need two-layered map recording that `x` is only modified in `a()` and potentially modified indirectly from `c()` and `d()`, so that in the end we can figure out that `b()` is safe to be marked const. Anyway, all I was trying to say is: let's use the `isModified()` approach as it's simpler & cleaner for future use cases like adding const to member functions. And it feels to me that we've already reached agreement there :)

Yes.

>> @shuaiwang Are you in IRC from time to time? I think it will be better to discuss changes in chat, when i start to introduce your approach here.
> 
> Not really, but I can get myself familiar with it.

We dont need to chat. The review discussion works well. :)

>>> isModified(Expr, Stmt), that checks whether Expr is (potentially) modified within Stmt, is something I believe to be quite useful:
>> 
>> What i understand from your comments and code:
>> 
>> - implement a utility taking `DeclRefExpr` and check if there is modification in some scope (`Stmt`, e.g. `CompoundStmt`) -> `isModified` or `cantBeModifiedWithin`
> 
> It doesn't have to be `DeclRefExpr`, any `Expr` should work, and it's useful to make it accept any `Expr`:
> 
> - It can be used to check whether `dereference(cxxThisExpr())` is modified or not.
> - For pointer types, it can be used to check both `declRefExpr(isPointerType())` and `dereference(declRefExpr(isPointerType()))`, and suggest adding const at different level
> 
>> - match all relevant non-const variable declarations as potential candidates
>> - track all uses of the candidates and check for modification in their scope
>> 
>>   One Note: We could include variables in unnamed namespaces and `static` declared variables. They have TU scope.
> 
> Great catch!
> 
>> My deviations:
>> 
>> - a variable should be considered modified if a non-const handle is create from it, even if that handle does not modify its referenced value. As first step, those handles should be marked const, then the original value can be marked as const. That is required to produce compiling code after potential code-transformation.
> 
> I believe we can issue warning + fixit within one pass:
> 
>   int f() {
>     int a = 10;
>     int& b = a;
>     return b;
>   }
> 
> 
> We should be able to issue warnings for both `a` and `b`, because `b` itself is a `varDecl` without modifying behavior, and when checking `a` we'll just repeat a bit more work that was already done when checking `b` by following the reference chain to be able to find that `a` can be const as well.

We can certainly try it. Speaking from previous experience: There is always some weird way to not catch everything and to get false positives in edge cases, thats why i tried the conservative way first. But i was stumbling around, too. I think the new function can give us more confidence, because its easier to test in isolation.

One interesting thing i found in code: `int &some_var = condition ? var1 : var2`. I am not sure if we already find that as reference taking :)


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D45444





More information about the cfe-commits mailing list