[PATCH] D49511: [Sema/Attribute] Check for noderef attribute

Mon Nov 5 18:03:48 PST 2018

leonardchan added inline comments.

================
Comment at: lib/Parse/ParseStmt.cpp:102-104
+  Actions.PushExpressionEvaluationContext(
+      Actions.ExprEvalContexts.back().Context);
   ParenBraceBracketBalancer BalancerRAIIObj(*this);
----------------
leonardchan wrote:
> rsmith wrote:
> > Hmm, we call `ParseStatementOrDeclaration` rather a lot. Can we pull this up into its ultimate callers instead:
> > 
> >   * `Parser::ParseFunctionDefinition`
> >   * `Parser::ParseLateTemplatedFuncDef`
> >   * `Parser::ParseLexedMethodDef`
> >   * `Parser::ParseLexedObjCMethodDefs`
> > 
> > ? Actually, we can do better than that: those functions all call either `ActOnStartOfFunctionDef` or `ActOnStartOfObjCMethodDef` first, and `ActOnFinishFunctionBody` when they're done. We should put the `PushExpressionEvaluationContext` and `PopExpressionEvaluationContext` calls in those functions instead.
> > 
> > We're missing at least one other place: when parsing the initializer for a global variable, in C++ we call `Sema::ActOnCXXEnterDeclInitializer`, which pushes a potentially-evaluated context. But in C, the `InitializerScopeRAII` object (in `Parser::ParseDeclaratoinAfterDeclaratorAndAttributes`) doesn't call into `Sema` and we don't ever push a suitable expression evaluation context.
> > 
> > You can find if any other places are missing by removing `Sema`'s global `ExpressionEvaluationContext` and adding an assert that fires if we try to parse an expression with no `ExpressionEvaluationContext`, and then running the unit tests. (And we should check in that change to ensure this doesn't regress.)
> @rsmith Would it be simpler to instead push and pop at the start and end of `Parser::ParseExternalDeclaration`? I'm currently working on your suggestion of removing the Sema global `ExpressionEvaluationContext` and find that a lot of accesses to the `ExprEvalContexts` stack stem from this method.
> 
> `ActOnStartOfFunctionDef` and `ActOnFinishFunctionBody` still work on anything in a function body, but I keep running into many cases for declarations declared globally that could be easily caught if instead I push and pop contexts at the start and end of `Parser::ParseExternalDeclaration`. Do you think this is a good idea or is there something that I may be missing from pushing and popping here?
> 
> This still accomplishes the goal of not reusing the global Sema context and I will still be able to check for `noderef` on every push and pop this way. 
Made a separate patch (https://reviews.llvm.org/D54014) that does the push and pop for `ActOnStartOfFunctionDef` and `ActOnFinishFunctionBody`.

I can continue doing this for the remaining places that depend on the global Sema context in other patches since I ran into a couple of issues running into fixing everything in one big patch and think fixing each individual place in smaller separate patches would be simpler.

================
Comment at: lib/Sema/SemaExpr.cpp:4289
+
+  if (TypeHasNoDeref(ResultTy)) {
+    LastRecord.PossibleDerefs.insert(E);
----------------
rsmith wrote:
> leonardchan wrote:
> > rsmith wrote:
> > > Do you ensure that the `noderef` attribute will be on the innermost level of the array type? I think you'll need to do so in order to warn on:
> > > 
> > > ```
> > > typedef int A[32];
> > > typedef A __attribute__((noderef)) *B;
> > > int f(B b) { return (*B)[1]; }
> > > ```
> > > 
> > > (Here, we have a pointer to a noderef annotated array of non-noderef-annotated int. So I think we will not emit a warning from the first dereference, because we have a pointer to an array, and we will not emit a warning from the second dereference in the array indexing, because the result type does not have the noderef attribute.)
> > Hmmmm, so normally in order to check what's correct, I usually run these examples through `sparse` since that's the tool that actually checks `noderef` for gcc, but it seems that sparse instead diagnoses a warning on the array indexing instead and nothing for the first dereference.
> > 
> > This shouldn't be though since, as you pointed out, the array does not have `noderef` types. For a simple array,
> > 
> > ```
> > int __attribute__((noderef)) x[10];
> > x[0];
> > ```
> > 
> > `sparse` diagnoses the appropriate warning for the array index. Personally though, I would chalk this up to an edge case that wasn't thought of before in sparse, since something like this isn't handled on their existing validation tests.
> > 
> > Currently, this diagnoses a warning on the first dereference, but I can also understand why it shouldn't warn because accessing `noderef` structs shouldn't warn if the member accessed is an array. The only case though this applies in sparse's tests are with structs and they don't provide a test for dereferencing a pointer to an array.
> > 
> > I think the reason for why the struct example still passes is that if the member is an array, none of the actual data in the struct is being accessed. Instead, the value returned is just a pointer to the start of the array within the struct and equivalent to just adding an offset to the struct pointer. I think that no warning should also be thrown for this example if it is guaranteed that array `A` can similarly be found from simple pointer arithmetic on pointer `B`. I don't think it can unless B were an array or struct also, but I could be wrong or overthinking this.
> Arrays-of-arrays and structs-containing-arrays are the same in this regard: in both cases, accessing the first level (array element or struct member) gives you an array lvalue, and only accessing that second array will actually dereference memory. So I think the treatment of the two cases should be consistent. For member access, you defer the check if the element type is an array, and I think you should do the same thing here.
> 
> There are two consequences of this:
> 
> 1) If the array element type is an array, you should bail out of here and not add `E` to `PossibleDerefs`.
> 2) If the array type (not the element type) has the `noderef` attribute, you should propagate it down to the array element type so that we can catch a later pointer dereference / array indexing operation.
> 
> For point 2, what this means in my example is that the expression `*B` should have type "array of 32 noderef ints" so that further noderef checks apply to it. The easiest way to accomplish this would be to check, when applying noderef to a type, whether the type is an array type, and if so also apply noderef to the array's element type.
There currently doesn't seem to be a simple way, that I could find, for editing the element type of an array type. I could alternatively just recreate the array type when wrapping one with the `noderef` `AttributedType`, but this would be difficult if the array type was surrounded by type sugar since the sugar would need to somehow be recreated on the new `AttributedType`. 

Is it possible to do something like a tree transformation where I can change the element type of an array type itself without having to recreate the whole type with sugar?

================
Comment at: test/Frontend/noderef_on_non_pointers.cpp:4
+/**
+ * Test 'noderef' attribute against other pointer-like types.
+ */
----------------
rsmith wrote:
> I'd like to see a test that we get a wraning for attempting to bind a reference to a dereferenced `noderef` pointer.
> 
> In C++, should it be valid to assign a `noderef` pointer to a dereferenceable pointer without a cast?
Ah, I forgot about this case. Sparse normally has a different warning for casting involving it's attributes (which includes a warning when casting from a noderef pointer to regular pointer) , but I didn't focus on that since it seemed like a whole other feature.

I can see though how it would be problematic if a user unknowingly did something like cast to a dereferenceable pointer. Added a warning and tests for this.

Repository:
  rC Clang

https://reviews.llvm.org/D49511