[llvm-dev] [RFC] Analysis and runtime check of objc_direct/objc_non_runtime_protocol

Manman Ren via llvm-dev llvm-dev at lists.llvm.org
Mon Oct 12 08:52:20 PDT 2020


Currently diagnostics related to objc_direct/objc_non_runtime_protocol are
done at each compilation unit. Even with these diagnostics, we will see
issues at runtime

   - Due to lack of global scope for our diagnostics. We can catch these
   issues with a global analysis.
   - Due to usages of dynamic APIs that require the metadata. We can catch
   these issues with runtime checks. Builds with runtime check enabled will
   execute extra code at runtime to catch issues.

Global Analysis #1 We will get a runtime error if a method is marked as
direct but some callsites see the method as not direct. One example is with
private and public headers:

// public.h:
@interface A
@end

// private.h:
@interface A (Private)
- (void)foo;
@end

// impl.mm:
#include “public.h”
@implementation A
- (void)foo __attribute__((objc_direct)) {}
@end


// test.mm:
#include “private.h”
void test(A *a) { [a foo]; }

→ When compiling test.mm, foo is not treated as direct, and IR codegen will
generate a call to bjc_msgSend. When compiling impl.mm, foo is direct and
is removed from the ObjC metadata.

We should emit an error message from the global analysis if there exists an
objc_msgSend callsite that targets a direct method. With this global
analysis, we can claim the invariant that within the link unit, all
callsites for a direct method are converted to direct calls.
What information do we need for the global analysis? List of direct methods
(class-name selector-name)
List of objc_msgSend callsites from IRGen (static-receiver-type
selector-name)
Global Analysis #2 We want to make sure a direct method is the single
definition along the class hierarchy globally. This is checked for each TU
via clang diagnostics.

Within a single TU, we have the following diagnostics:

   - can't override a method that is declared direct by a superclass


   - methods that override superclass methods cannot be direct

These diagnostics guarantee that if a method is direct, there is no
override along the class hierarchy when compiling each source file.

If we have a private header and a public header for a class, and we extend
the class using the public header, current clang diagnostics will not
report an issue.

With this global analysis, we can claim the invariant that a direct method
is the single definition along the class hierarchy globally. This will
reduce runtime issues caused by overrides.
What information do we need for the global analysis? List of direct methods
(class-name selector-name)
Class hierarchy information (class-name base-class-name)
Global Analysis #3 A class that declares conformance to a protocol does not
need to see the protocol definition in order to compile successful. Thus we
can end up with a situation as shown below:

// proto.h
__attribute__((objc_non_runtime_protocol))
@protocol Static
- (void) doThing;
@end

// source.m
@protocol Static;
@interface Something : Root<Static>
- (void) doThing;
@end
@implementation Something
- (void) doThing { ... }
@end

Compiling source.m will generate a reference to the protocol metadata for
Static even though it was defined as non-runtime. This will emit a warning
at compile time but a developer might choose to ignore it and face link
errors. This will end up with a linker error for a missing symbol.

A potentially better warning can be given to developers via global analysis
saying protocol "Static" was defined to be non-runtime in file
"SomeFileThatIncludedProtoDotH.m" to and dynamic in "source.m".

This example problem can also expand to protocol inheritance hierarchies.
Given an arbitrary hierarchy of runtime and non-runtime protocols and
forward declarations of protocols, the compiler could potentially emit the
wrong references given misleading forward decls. This can also be much more
accurately diagnosed via global analysis.
Runtime Check The goal of runtime checks is to catch the usages of dynamic
APIs targeting direct methods at runtime, e.g, respondsToSelector,
performSelector, etc. The idea is to intercept Objc runtime’s invocation of
resolveInstanceMethod and resolveClassMethod , which is responsible for
resolving methods that aren’t recorded in the metadata, and check if the
unresolved method is a direct method. It requires we collect the list of
direct method names, and the names of the classes where they’re defined.
There is another category of runtime issues which runtime check can
potentially catch: the direct call calls the incorrect direct method due to

   1. usage of class_addMethod
   2. random type casting at the call site

This can be an incremental improvement outside of this RFC.
Implementation Options We can potentially implement the analysis/runtime
check with and without LTO. It is better to implement runtime check without
LTO as we are expecting developers to use runtime-check build locally. LTO
will slow down developer’s working cycle. For analysis, it is not clear
which option is better.
noLTO Implementation We create a special section containing those
information for each TU. This section for the link unit will have the
combined information. We then post-process this information to emit error
messages.
For each direct method, we create a struct containing the method name and
the name of the class where it’s defined, e.g.

// struct _objc_direct_method {
//   char *class_name;
//   char *method_name
// }

Then each class will have a list of _objc_direct_method structs, which can
be put in a special section, e.g

constant [4 x %struct._objc_direct_method][//the list of the
structs],section "__DATA,__direct_method"

This special section for the link unit will have the combined information
for all TUs in this link unit. At runtime, the intercepted
resolveInstanceMethod and resolveClassMethod invocation can then read and
check against this section. If it finds the unresolved method is a direct
method, it emits an error message of this direct method cannot be used
dynamically at runtime.
ThinLTO Implementation We add information in ModuleSummaryIndex and a
ThinLTO analysis pass to analyze the combined Summary and emit error
messages.


We will upload patches for noLTO/runtime-check if there is no negative
feedback about the RFC! Feedback/suggestions are welcome!

Thanks,
Manman
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201012/cb40382c/attachment-0001.html>


More information about the llvm-dev mailing list