[PATCH] D70469: [attributes] [analyzer] Add handle related attributes
Gábor Horváth via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Dec 16 09:39:19 PST 2019
xazax.hun marked an inline comment as done.
xazax.hun added inline comments.
================
Comment at: clang/lib/Sema/SemaType.cpp:7424
+ ParsedAttr &Attr) {
+ if (CurType->isFunctionType()) {
+ State.getSema().Diag(Attr.getLoc(), diag::warn_type_attribute_wrong_type)
----------------
aaron.ballman wrote:
> aaron.ballman wrote:
> > NoQ wrote:
> > > xazax.hun wrote:
> > > > aaron.ballman wrote:
> > > > > xazax.hun wrote:
> > > > > > aaron.ballman wrote:
> > > > > > > aaron.ballman wrote:
> > > > > > > > Is this correct, or should it be `if (!CurType->isFunctionType())`?
> > > > > > > I see now that this is only incorrect for the acquire attribute, but not the other two.
> > > > > > I think we might consider it incorrect for the acquire as well. Because if we let the user put acquire on the function, it is ambiguous where to put the annotation. If we do not let them put on the function they are always forced to put on the return type to express this. But in case this kind of ambiguity is a feature and not a bug I can change the behavior.
> > > > > I guess I was considering the semantics as being acquire on a function type applies to the non-void return value being the handle and release on a function type applies to the only parameter being passed to the function.
> > > > >
> > > > > I don't think it makes sense to apply acquire or release semantics to a handle type itself because that's not really part of the type identity itself (it's not an "acquiring handle" or a "releasing handle" -- I would think it's *both* at the same time), so I think that's been throwing me for a loop with the latest round of patches here. What I was imagining was that you should mark any place a handle is created or released, which works almost-fine for parameters (because you can mark the parameter declarations with an attribute). It doesn't quite work for function pointers, however, because those don't have parameters. e.g., `void (*fp)([[clang::acquire_handle]] int &)` won't do what you expect, I believe. However, for functions which return the acquired handle, there is no good place to write the attribute except on the function itself (because, again, it's not a property of the return type, but of the specific function's return type). For those cases, I imagine you want the function *type* to be what codifies that a handle is returned, because then it will work with function pointers. For *using* a handle, I'm not certain what the benefit is to having an attribute on each use -- that seems likely for a user to forget to add an annotation somewhere and get incorrect behavior. I would normally suggest that the attribute be applied to the type declaration of the handle (e.g., through a typedef), but if you want this to work with handles coming from system headers (like file descriptors or sockets, etc) then will this approach work?
> > > > Hmm. The problem is that a function might do multiple things with multiple handles, for example in Fuchsia there are functions that replacing a handle with another one. So I am afraid naively putting the attributes on the function type might end up being ambiguous sometimes. I totally agree with your argument about the annotation not being part of the type identity. If we do want to put this on the function type maybe we could do something similar to `nonnull` and enumerate parameter indices? What do you think?
> > > >
> > > > The `use` annotation might be a bit misleading. The user does not necessarily need to annotate all uses.
> > > > Let's consider the following function and a call:
> > > > ```
> > > > void f(int *handle);
> > > > close_handle(handle);
> > > > f(&handle); // What will this do?
> > > > ```
> > > >
> > > > The function `f` would be free to do anything, like store the address of the handle in a global data structure, overwrite the closed handle and so on. The purpose of the `use` annotation is to tell the analyzer that none of this will happen, so we can safely diagnose a use after release. So the original idea was to annotate those functions that are guaranteed to never close, overwrite etc the handle.
> > > > For unannotated functions the analyzer will be conservative and assume that the code is OK.
> > > I heard that type attributes are relatively scary and require a lot of work to implement, and `nonnull`/`_Nonnull` is definitely one of the historical examples of why they're scary. Like, you don't want to mess with the type system and find yourself explaining whether it's still the same type or a different type simply because it wears an attribute. Declaration attributes are much more straightforward in this sense.
> > > I heard that type attributes are relatively scary and require a lot of work to implement
> >
> > They're harder to implement than declaration attributes because there's more to think about, but that's not a reason to not use a type attribute when that's the correct semantics we want.
> >
> >
> > The purpose of the use annotation is to tell the analyzer that none of this will happen, so we can safely diagnose a use after release. So the original idea was to annotate those functions that are guaranteed to never close, overwrite etc the handle.
>
> Thank you for the explanation, that helps me understand better.
>
> I think the default for "use" may be backward -- it seems to me that less functions will escape/close a handle than will operate on a handle. Do you have any data to back up my gut feeling? :-D I sort of wonder if we want the model to be: 1) mark a type as being a handle of a specific type (POSIX, fuschia, Win32, etc), 2a) mark a parameter declaration as creating a handle (the parameter type specifies the handle type), 2b) mark a function type as its return value creating a handle (the return type specifies the handle type), 3a) mark a parameter declaration as releasing a handle (the parameter type specifies the handle type), 3b) mark a function type as releasing a handle (all parameters of handle types are considered to be released), 4) mark a parameter declaration as possibly mutating the state of the handle (closing it, changing its identity, escaping it, etc).
>
> Usage would be something like:
> ```
> typedef int PosixHandle [[clang::handle_type("POSIX")]];
>
> PosixHandle open(const char *) [[clang::acquire_handle]];
> bool open2(const char *, [[clang::acquire_handle]] PosixHandle&);
>
> void close(PosixHandle) [[clang::release_handle]];
> void close2([[clang::release_handle]] PosixHandle);
>
> void typical_use(PosixHandle);
> void atypical_use([[clang::modifies_handle]] PosixHandle&);
>
> // Maybe we want to allow these too?
> int open3(const char *) [[clang::acquire_handle("POSIX")]];
> void close3([[clang::release_handle("POSIX") int);
> void atypical_use2([[clang::modifies_handle("POSIX") int&);
> ```
> However, when I think about this, this smells an awful lot like some existing attributes we already support: capabilities. We currently use them for thread safety analysis, but there's no reason they shouldn't work out of the box for what you want (https://clang.llvm.org/docs/ThreadSafetyAnalysis.html#basic-concepts-capabilities). The basic idea behind them is that you specify a capability like "handle" and then specify what operations grant that capability (acquire the handle) or revoke it (release the handle). It may be worth exploring whether we can reuse these annotations for what you're trying to accomplish.
While it is true that less function will escape/close handles, this is also a trade-off. If we assume all unknown functions will close/escape, the check will be less noisy with unannotated/patially annotated codebase. If we assume that unknown function will not escape/close, we will get more false positives unless the whole code base is annotated. I chose this to support gradual introduction of annotations.
Ok. So how the handle works on declarations it is pretty clear. If we put it on a parameter, it affects that parameter, if we put it on a function, it affects the return value. The question is, how to deal with types.
> 2b) mark a function type as its return value creating a handle (the return type specifies the handle type
> 3b) mark a function type as releasing a handle (all parameters of handle types are considered to be released)
This does not really work. In case of Fuchsia, none of the syscalls actually return handles. Handles are acquired using output parameters and there are syscalls that acquire multiple handles. There are also syscalls that replaces handles, i.e. whatever we do at the type level we really need the paramter granularity. Any simplification like if we annotate the function type as such and such will affect all the parameters will not work.
> However, when I think about this, this smells an awful lot like some existing attributes we already support: capabilities.
I did look into capabilities, and while I am not an expert in this area I found the following problems:
* As far as I understand capabilities also only work well with fully annotated code, and gradually adding annotations can result in lots of false positives.
* As far as I understand, they do not have paramter level granularity, which is crucial for Fuchsia's APIs.
I did not want to generalize capability attributes as I would probably need to update all the related analysis as well.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D70469/new/
https://reviews.llvm.org/D70469
More information about the cfe-commits
mailing list