[cfe-dev] Clang and Swift

Wed Dec 9 09:51:31 PST 2015

> On Dec 9, 2015, at 6:45 AM, Pierre Gousseau <pierregousseau14 at gmail.com> wrote:
> 
> As Anna has kindly pointed out, APINotes is a possible replacement for my patch proposal at http://reviews.llvm.org/D13731 <http://reviews.llvm.org/D13731>, i.e. to provide the static analyzer with "fake" function definitions with extra attributes that get merged into the AST at the beginning of static analysis (similar idea to that described earlier by Sean S and James YK).   
> 
> As APINotes sounds like a good solution, I would need to make sure this fits our requirements: 
> 1. We need to support new attributes that make sense for static analysis, not just the ones currently used by the static analyzer. 
> 2. We need allow 3rd parties to easily create their own APINotes files. 
It should be quite easy to have 3d parties implement tools to generate the YAML files.

> 3. We need to support C++. 
> 4. We need to support an unlimited number of functions. 
> 5. To have the flexibility of organizing one/many libraries into one or many APINotes files. 
> 6. To have a corresponding attribute to each APINotes feature. 
> 7. To have the capacity to support conditional attributes (as Sean S described earlier). 
I am not sure what exactly “conditional attributes” refers to. However, APINotes do not support using macros inside attributes.

>   
> Do those requirement fit with the APINotes design? 
>   

All of the above (except for # 7) are in agreement with the design of APINotes. However, the APINotes (nor the ModelInjector) support all of these right now.

The major pieces that are missing from APINotes are support for new attributes and C++ support. It is possible to implement a generic support for any new attributes, for example, by adding an attribute field that could be checked and parsed as code by the compiler. For a simplified and much less generic solution, we could design an attribute for static analysis (as I’ve described in Phabricator) and only extend the APINotes format to support that. APINotes currently supports ObjC and C and would need to be extended to handle C++, which is not trivial. (ModelInjector only supports C and will face the same issue.)

As I’ve mentioned in the previous email, there are many reasons to prefer APINotes. Also, we’ve used them in production on a very large scale. 

> Regards, 
>   
> Pierre Gousseau 
> SN-Systems - Sony Computer Entertainment 
> 
> On 7 December 2015 at 21:42, Anna Zaks <ganna at apple.com <mailto:ganna at apple.com>> wrote:
> 
>> On Dec 4, 2015, at 4:31 PM, Sean Silva <chisophugis at gmail.com <mailto:chisophugis at gmail.com>> wrote:
>> 
>> 
>> 
>> On Fri, Dec 4, 2015 at 2:46 PM, James Y Knight <jyknight at google.com <mailto:jyknight at google.com>> wrote:
>> Without having looked into it...
>> 
>> If people need a feature like this, I'm wondering if it would make sense to parse a special file containing normalish C/C++/ObjC declarations, and use those to replace or augment attributes from matching declarations in the normal parse.
>> 
>> That's actually a really good idea.
> 
> Adding a file containing C/C++/ObjC declarations with additional attributes would be more flexible in some respects. However, this approach has many potential downsides.
> 
> Since there was no specific proposal on how this approach should be implemented, suppose we wanted to extend the experimental ModelInjector (http://reviews.llvm.org/D13731 <http://reviews.llvm.org/D13731>). ModelInjector was added to the analyzer so that we could feed it summaries of common functions, which are not a part of a TU otherwise. The models are injected very late, during static analysis, so other clients would not be able to use that approach as it stands right now.
> 
> Here are some of the problems with adding augmented C/C++/ObjC declarations:
>  - Compiling the extra declarations might lead to unpredictable errors depending on the code the compiled project contains. 
>  - Common APIs often have slightly different declarations on different platforms. We could address those with ifdefs; that would be messy. Naming the “thing” in the APINotes file is much simpler.
>  - In order to make this usable by other clients, the injection has to be moved up the compilation pipeline. Most likely, we would need to be able to tell which header the declaration comes from as well as associate it with the corresponding model file. This approach would muddle the compilation process much more and will be more difficult to maintain.
>  - What about the distribution model? Suppose we need to use this method to suppress false positives in user code by annotating 3d party APIs. Using the annotations in code approach, we would be redistributing those APIs.
>  - The binary YAML format is more succinct, which might matter more or less depending on how many supplementary annotations one has.
> 
> Cheers,
> Anna.
> 
>> 
>> Yesterday we were talking at the social briefly about this and actually it became pretty clear that a YAML-type format wouldn't scale. A compelling example from the internal patch I was reviewing is something like "if this argument has this bit set, then treat this other argument as having this meaning to the static analyzer"*; the bit itself is of course specified in the source by `#define FOO_MAGIC_OPTION (1 << 7)` or `enum { ..., FOO_MAGIC_OPTION = (1 << 7), ...}` or whatever, and to maintain sanity the YAML file would of course want to use that symbolic constant. Realistically C/C++ source code is the only fully general way to get the right name lookup, macro expansion, constexpr evaluation, ...
>> 
>> So instead of a separate file, it could be something like, your side-table of attributes is really just a C/C++ source file that contains "replacement declarations" (marked with a pragma or attribute or something), that the analyzer would parse first then substitute as appropriate (or something). That would allow you to use the natural API `#include`'s to pick up the symbolic constants.
>> 
>> 
>> * (
>> For reference, what would ultimately be desired is something like:
>> 
>> enum {
>>   ...
>>   FOO_MAGIC_BIT = (1 << 7),
>>   ...
>> };
>> 
>> // If (flags & FOO_MAGIC_BIT), then the analyzer should treat `p` as having some special meaning.
>> void foo(Bar *p, unsigned flags) __attribute__((arg_is_magical_when_bit_is_set(0/*arg `p`*/, 1/*arg`flags`*/, FOO_MAGIC_BIT)));
>> )
>> 
>> 
>> -- Sean Silva
>>  
>> Requiring a special YAML and binary format reader/writer seems unfortunate, especially as it looks nowhere close to being generically useful, beyond exactly what Swift needed.
>> 
>> I must also admit to being a bit surprised that Apple needed to ship a feature like this at all, since I'd have thought you/they would have had full control over their own platform's header files, and could've just committed the changes to that directly. :)
>> 
>> On Fri, Dec 4, 2015 at 5:06 PM, Anna Zaks via cfe-dev <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote:
>> 
>>> On Dec 3, 2015, at 5:18 PM, Sean Silva via cfe-dev <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote:
>>> 
>>> 
>>> 
>>> On Thu, Dec 3, 2015 at 2:45 PM, Douglas Gregor via cfe-dev <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote:
>>> Hi all,
>>> 
>>> As some of you may have heard, Swift has gone open-source over at swift.org <http://swift.org/>. Swift makes heavy use of Clang for its (Objective-)C interoperability, including loading Clang modules to map (Objective-)C APIs into Swift via Swift’s “Clang importer” and using Clang’s CodeGen to handle C ABI issues (record layout, calling conventions) and use C inline functions directly from Swift [*]. 
>>> 
>>> As an out-of-tree language front-end dependent on Clang, we have a clone of the llvm.org <http://llvm.org/> Clang repository over on GitHub at github.com/apple/swift-clang <http://github.com/apple/swift-clang>. We merge regularly and try to minimize our differences with llvm.org <http://llvm.org/>'s Clang—for more information on how we’re handling this, see swift.org/contributing/#llvm-and-swift <http://swift.org/contributing/#llvm-and-swift>.
>>> 
>>> That said, Swift’s clone of the Clang repository does have some content that isn’t in the llvm.org <http://llvm.org/> Clang repository. Here’s a quick summary of what that content is:
>>> 
>>> * There are several new attributes. We plan to propose these for inclusion into mainline Clang. They’re fairly small additions, some of which have wider applicability than Swift support:
>>> 
>>> 	* ‘noescape’ attribute: indicates that the address provided by a particular function parameter of pointer/reference type won’t escape the function. At present, this is only used to map to Swift’s ‘noescape’ attribute, although we think it makes sense to use this for the LLVM IR “nocapture” parameter attribute as well.
>>> 
>>> 	* ‘objc_subclassing_restricted’ attribute: indicates that a particular Objective-C class cannot be subclassed. Swift uses it in its generated Objective-C headers, but we are interested in making this a first-class Objective-C feature.
>>> 
>>> 	* Swift-specific attributes (‘swift_error', ‘swift_name’, ’swift_private'): these attributes affect the mapping of (Objective-)C declarations into Swift.
>>> 
>>> 	* ‘swift’ unavailability: the existing ‘availability’ attribute is extended with a ‘swift’ platform, so that one can mark something as unavailable in Swift.
>>> 
>>> * API Notes: This represents the bulk of the changes in the repository. API notes solve a not-uncommon problem: we invent some new Clang attribute that would be beneficial to add to some declarations in system headers (e.g., adding a ‘noreturn’ attribute to the C ‘exit’ function), but we can’t go around and fix all of the system headers everywhere. With API notes, we can write a separate YAML file that states that we want to add ‘noreturn’ to the ‘exit’ function: when we feed that YAML file into Clang as part of normal compilation (via a command-line option), Clang will add ‘noreturn’ to the ‘exit’ function when it parses the declaration of ‘exit’. Personally, I don’t like API notes—even with our optimizations, it’s inefficient in compile time and it takes the “truth” out of the headers—but I can see the wider use cases. If the Clang community wants this feature, I can prepare a proper proposal; if not, we’ll keep this code in the Swift clone of Clang and delete it if Swift ever stops needing it.
>>> 
>>> Internally I recently saw a situation that would have benefitted from this sort of thing. Essentially Sean Eveson (CC'd) and his coworkers were prototyping some static analyzer checks that required certain functions in the SDK to be marked up with some info (really not much more than the moral equivalent of an "printf" attribute). Obviously there's a catch-22 with proving the checks are valuable and getting the corresponding API's officially marked up with such attributes (and such updated headers making to all clients, etc.).
>>> 
>>> This sort of feature would help break that catch-22 and avoid the need for ad-hoc hardcoded tables. Having all that PS4-specific data hardcoded was actually the primary barrier to upstreaming the check (or at least getting a proper upstream review of the idea), so having a way to decouple those private annotations would have really been nice!
>>> 
>> 
>> +1 
>> 
>> API Notes is great for adding annotations to declarations when/before changes to header files can be made. I can see how this could be used by the future clang static analyzer checks.
>> 
>> Anna.
>> 
>> 
>>> -- Sean Silva
>>>  
>>> 
>>> * SourceMgrAdapter: An adapter that translates diagnostics from an llvm::SourceMgr to clang::SourceManager. This is used by the API notes YAML compiler to translate its diagnostics into something that goes our through Clang’s SourceManager, but might be useful for other clients that are making use of llvm::SourceMgr for simple handling of source files. Unless API notes gets pulled into llvm.org <http://llvm.org/> Clang or someone else asks for it, I don’t feel like this is important to pull into llvm.org <http://llvm.org/> Clang by itself.
>>> 
>>> 
>>> Any questions? Feel free to contact me!
>>> 
>>> 	Cheers,
>>> 	Doug
>>> 
>>> [*] The actual ideas were discussed at the 2014 Developer Meeting in the “Skip the FFI” talk by Jordan Rose and John McCall (http://llvm.org/devmtg/2014-10/#talk18 <http://llvm.org/devmtg/2014-10/#talk18>)
>>> 
>>> _______________________________________________
>>> cfe-dev mailing list
>>> cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>>> 
>>> 
>>> _______________________________________________
>>> cfe-dev mailing list
>>> cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>> 
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>> 
>> 
>> 
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20151209/daafa730/attachment.html>