[llvm-dev] RFC: Safe Whole Program Devirtualization Enablement

Thu Dec 26 11:55:38 PST 2019

FYI I mailed 3 patches this morning that together implement the RFC. PTAL:

D71907: [WPD/VFE] Always emit vcall_visibility metadata for -fwhole-program-vtables
D71911: [ThinLTO] Summarize vcall_visibility metadata
D71913: [LTO/WPD] Enable aggressive WPD under LTO option

Teresa

On Wed, Dec 11, 2019 at 6:21 AM Teresa Johnson <tejohnson at google.com> wrote:

> Please send any comments. As mentioned at the end I will follow up with
> some patches as soon as they are cleaned up and I create some test cases.
>
> RFC: Safe Whole Program Devirtualization Enablement
> ===================================================
>
> High Level Summary
> ------------------
>
> The goal of the changes described in this RFC is to support aggressive
> Whole Program Devirtualization without requiring -fvisibility=hidden at
> compile time, by pre-enabling bitcode for whole program devirtualization,
> but delaying the decision on whether to apply devirtualization until LTO
> link time. This is needed both because we may not know whether the link
> mode is safe for hidden LTO visibility until link time, and also to allow
> bitcode objects to be shared between links of targets with differing valid
> LTO visibility. This utilizes the !vcall_visibility metadata added for Dead
> Virtual Function Elimination.
>
> The required changes, summarized here and described in more detail later,
> are:
>
> 1) When -fwhole-program-vtables is specified, always insert type test
> assumes for virtual calls, and additionally add !vcall_visibility metadata
> to vtable definitions (which will be summarized in the ThinLTO index).
>
> 2) At LTO link time, apply hidden LTO visibility to vtable definition
> vcall_visibility metadata (or summary) when specified by a new link option
> (-lto-whole-program-visibility).
>
> 3) During the LTO link time Whole Program Devirtualization analysis, only
> allow devirtualization when the associated vtable definitions have hidden
> LTO visibility, as derived from the !vcall_visibility metadata (summarized
> in the index for index-only WPD).
>
> 4) Modify the Virtual Function Elimination application in GlobalDCE to
> ignore vtables with !vcall_visibility when they are associated with type
> tests (and not just type checked loads).
>
> Background
> ----------
>
> Whole Program Devirtualization is supported for LTO (both regular and
> Thin) via the -fwhole-program-vtables option. However, it can only be
> safely applied to classes for which LTO can analyze the entire class
> hierarchy, and therefore is restricted to those classes with hidden LTO
> visibility. See https://clang.llvm.org/docs/LTOVisibility.html for more
> information.
>
> The LTO visibility of a class is derived at compile time from the class’s
> symbol visibility. Generally, only classes that are internal at the source
> level (e.g. declared in an anonymous namespace) receive hidden LTO
> visibility. Compiling with -fvisibility=hidden tells the compiler that,
> unless otherwise marked, symbols are assumed to have hidden visibility,
> which also implies that all classes have hidden LTO visibility (unless
> decorated with a public visibility attribute). This results in much more
> aggressive devirtualization.
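
To make the distinction concrete, here is a small C++ sketch (not taken from the patches; all class and function names are made up for illustration). The classes in the anonymous namespace cannot be derived from outside this translation unit, so they get hidden LTO visibility with no extra flags. The class carrying an explicit default visibility attribute keeps public LTO visibility even under -fvisibility=hidden.

```cpp
#include <cassert>

namespace {
// Internal to this translation unit: no other TU can derive from these
// classes, so they receive hidden LTO visibility without any flags.
struct LocalShape {
  virtual ~LocalShape() = default;
  virtual int sides() const { return 0; }
};
struct LocalSquare final : LocalShape {
  int sides() const override { return 4; }
};
} // namespace

// Explicit default visibility: public LTO visibility even when compiled with
// -fvisibility=hidden, since code outside the LTO unit may derive from it.
struct __attribute__((visibility("default"))) PluginBase {
  virtual ~PluginBase() = default;
  virtual int version() const { return 1; }
};

// Virtual call sites of the kind whole program devirtualization targets.
int countSides(const LocalShape &S) { return S.sides(); }
int pluginVersion(const PluginBase &P) { return P.version(); }

// Hypothetical usage: both calls dispatch through the vtable at runtime.
void visibilityExample() {
  LocalSquare Sq;
  PluginBase Base;
  assert(countSides(Sq) == 4);
  assert(pluginVersion(Base) == 1);
}
```

Only the call in countSides is a candidate for devirtualization by default; the call in pluginVersion would need the whole hierarchy of PluginBase to be visible to LTO.
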
>
> However, compiling with -fvisibility=hidden is only safe when we know we
> are LTO linking with a full view of the class hierarchy. Specifically, this
> is true when a binary is being LTO linked either with all sources being
> bitcode (so that the LTO unit is the same as the linkage unit), or with the
> only translation units linked as native code known not to derive from any
> classes defined in the LTO unit (e.g. system libraries). Additionally,
> the binary may not dlopen any libraries at runtime that contain classes
> derived from those defined in the main binary.
>
> Assuming we are building and linking a binary that satisfies the above
> constraints (we are LTO linking all translation units as bitcode, except
> certain (e.g. system) libraries or other native objects known to be safe by
> the user or build system, and the binary will not dlopen any libraries
> deriving from the binary’s classes), then it should be safe to compile with
> -fvisibility=hidden, along with -fwhole-program-vtables.
>
> However, there are cases where it is unknown until link time whether we
> are building a target that meets the above constraints. Additionally, we
> may want to build additional targets that do not meet the criteria for safe
> application of -fvisibility=hidden during the same build invocation
> (specifically, because subsets of the code will be linked into shared
> libraries instead of linking all code directly into the binary). Even if
> it is possible to build two sets of bitcode object files (one with default
> visibility for the unsafely linked targets and one with hidden visibility
> for the safely linked targets), doing so duplicates both build time and
> storage, which is prohibitive in an environment where it is common to
> build targets with tens of thousands of sources, and multiple targets with
> different link modes simultaneously.
>
> The goals of the changes described in this RFC are to essentially delay
> the application of -fvisibility=hidden until LTO link time, and allow
> bitcode objects to be shared between links of targets with differing link
> modes and therefore differing valid LTO visibility.
>
> Type Information for Devirtualization
> -------------------------------------
>
> LTO whole program devirtualization is driven by type information in
> the IR. This includes type metadata (on vtable definitions), as well as
> type test intrinsics before virtual calls. The former is safe to emit into
> the IR in all cases, but the latter is currently not. The virtual call
> sites are decorated with an llvm.assume(llvm.type.test(ptr, typeid))
> sequence, which drives the LTO analysis of virtual calls. This sequence is
> an assertion that the given pointer is associated with the given type
> identifier (https://llvm.org/docs/LangRef.html#llvm-type-test-intrinsic).
> It is currently inserted only for classes with hidden LTO visibility as the
> implication of this sequence is that we have full visibility of that type’s
> class hierarchy, and may devirtualize the call based on that knowledge.
> This assumption is not valid if the class does not have hidden LTO
> visibility.
>
> In order to drive later devirtualization, we still need the type
> compatibility information provided by the llvm.type.test, but want to delay
> a decision on whether it is valid to assume that we have full class
> hierarchy visibility, and thus whether devirtualization of that target can
> be safely applied.
>
> Specifically, what we want to know at LTO time is whether the vtable has
> hidden LTO visibility or not, and use that to guide the application of
> devirtualization to the type tested virtual call sites. By default, only
> those with statically guaranteed hidden LTO visibility should be marked as
> such. And as described later, at LTO link time we can optionally decide to
> convert vtables to hidden LTO visibility for more aggressive
> devirtualization when appropriate.
>
> There is already a mechanism in the compiler to describe the vtable
> visibility, which was recently added for Dead Virtual Function Elimination
> (D63932): !vcall_visibility metadata, documented at
> https://llvm.org/docs/TypeMetadata.html#vcall-visibility-metadata. This
> metadata is attached to vtable definitions, currently only when VFE is
> enabled. As described in the documentation, because this is currently only
> used for VFE, it also requires that the corresponding function pointer
> loads use the llvm.type.checked.load intrinsic. This would not be required
> for devirtualization (although the VFE support in GlobalDCE will need
> modification to ignore the metadata when type checked loads are not used;
> more on that later).
>
> This RFC proposes adding the !vcall_visibility metadata to vtable
> definitions when -fwhole-program-vtables is specified. Unlike for VFE, the
> function pointer loads can still use normal loads with corresponding type
> test assume sequences (better for optimization).
>
> Additional changes to the LTO compilation steps are detailed below.
>
> Pre-Link LTO Compile
> --------------------
>
> First, type test assume sequences will be inserted when
> -fwhole-program-vtables is specified, and not just for classes with hidden
> LTO visibility.
>
> Second, as mentioned earlier, the !vcall_visibility metadata will be
> inserted under -fwhole-program-vtables. For the purposes of index-only WPD,
> a single-bit flag indicating whether or not the vtable def has hidden LTO
> visibility is added to the GVarFlags on the GlobalVarSummary. Note that we
> can collapse the 3 enum values of the metadata down to a single bit,
> because for the purposes of devirtualization, both
> VCallVisibilityLinkageUnit and VCallVisibilityTranslationUnit can be
> treated the same (we only need to have at least VCallVisibilityLinkageUnit
> to devirtualize). The ModuleSummaryIndex builder will set this new flag
> from the !vcall_visibility metadata on vtable definitions.
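
The collapse from three metadata values to one summary bit can be sketched as below. The enum values follow the !vcall_visibility documentation; VTableSummaryFlags and summarizeVTable are hypothetical stand-ins for the GVarFlags bit and the index builder, not the code in the patches.

```cpp
#include <cassert>

// The three values documented for !vcall_visibility metadata.
enum VCallVisibility : unsigned {
  VCallVisibilityPublic = 0,
  VCallVisibilityLinkageUnit = 1,
  VCallVisibilityTranslationUnit = 2,
};

// Hypothetical stand-in for the new single-bit flag in the vtable summary.
struct VTableSummaryFlags {
  bool HiddenLTOVisibility;
};

// Linkage-unit and translation-unit visibility are treated the same for
// devirtualization: anything other than public counts as hidden.
constexpr bool hasHiddenLTOVisibility(VCallVisibility V) {
  return V != VCallVisibilityPublic;
}

// Hypothetical summary-building step for a single vtable definition.
VTableSummaryFlags summarizeVTable(VCallVisibility V) {
  assert(V <= VCallVisibilityTranslationUnit && "unexpected metadata value");
  return VTableSummaryFlags{hasHiddenLTOVisibility(V)};
}
```

The distinction between the two hidden levels still matters for VFE, which is why this sketch only drops it in the summary flag and leaves the enum intact.
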
>
> Finally, the VFE support in GlobalDCE (which is enabled by default and
> currently triggers automatically in the presence of this metadata) will
> need to be modified to ignore !vcall_visibility metadata inserted for
> devirtualization only, i.e. when there are any type test assume sequences
> for that Type ID. This should be straightforward, as we can scan the type
> tests and remove any vtables decorated with compatible type ids from
> VFESafeVTables. Note that this change will affect the invocation of
> GlobalDCE both here in the pre-link LTO compile as well as later in the LTO
> Backend (where it is applied to a broader set of vtables).
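
A rough model of that pruning step, using plain standard containers rather than LLVM data structures. Only the name VFESafeVTables comes from the text above; the function, the map layout, and the vtable and type id strings are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

using TypeIdSet = std::set<std::string>;

// VTableTypeIds maps each vtable to the type ids attached to it.
// TypeTestedIds holds the type ids that appear in type test assume sequences.
// Any vtable compatible with a type-tested id is dropped from the VFE set.
std::set<std::string> pruneVFESafeVTables(
    std::set<std::string> VFESafeVTables,
    const std::map<std::string, TypeIdSet> &VTableTypeIds,
    const TypeIdSet &TypeTestedIds) {
  for (const auto &Entry : VTableTypeIds) {
    for (const std::string &Id : Entry.second) {
      if (TypeTestedIds.count(Id) != 0) {
        VFESafeVTables.erase(Entry.first);
        break;
      }
    }
  }
  return VFESafeVTables;
}

// Hypothetical usage: B derives from A, so both vtables carry A's type id.
void pruneExample() {
  std::map<std::string, TypeIdSet> VTableTypeIds = {
      {"_ZTV1A", TypeIdSet{"_ZTS1A"}},
      {"_ZTV1B", TypeIdSet{"_ZTS1A", "_ZTS1B"}},
      {"_ZTV1C", TypeIdSet{"_ZTS1C"}},
  };
  std::set<std::string> Safe = {"_ZTV1A", "_ZTV1B", "_ZTV1C"};
  Safe = pruneVFESafeVTables(Safe, VTableTypeIds, TypeIdSet{"_ZTS1A"});
  // Only the vtable with no type-tested id remains eligible for VFE.
  assert(Safe.size() == 1 && Safe.count("_ZTV1C") == 1);
}
```
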
>
> LTO Link Handling
> -----------------
>
> During Whole Program Devirtualization analysis, when looking at the
> vtables corresponding to the summarized virtual calls during
> tryFindVirtualCallTargets, we must consult the vcall_visibility
> information. For hybrid (regular+thin) LTO, the vtable definitions are in
> the regular LTO partition and so the IR can be consulted directly. For
> index-only WPD, we instead consult the flag on the vtable’s
> GlobalVarSummary.
>
> If any of the vtable definitions compatible with a given virtual call have
> public LTO visibility, the devirtualization must be skipped.
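
The rule above amounts to an all-of check over the compatible vtables. A minimal sketch, with a hypothetical VTableCandidate type standing in for either the IR metadata or the summary flag, and with the empty case rejected as a conservative assumption of this sketch:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical view of one vtable definition compatible with a virtual call.
struct VTableCandidate {
  bool HiddenLTOVisibility;
};

// Devirtualization is allowed only if every compatible vtable definition has
// hidden LTO visibility; a single public one forces the call to be skipped.
bool mayDevirtualize(const std::vector<VTableCandidate> &Compatible) {
  if (Compatible.empty())
    return false; // Assumption of this sketch: no known targets, no devirt.
  return std::all_of(
      Compatible.begin(), Compatible.end(),
      [](const VTableCandidate &VT) { return VT.HiddenLTOVisibility; });
}

// Hypothetical usage.
void devirtExample() {
  std::vector<VTableCandidate> AllHidden = {{true}, {true}};
  std::vector<VTableCandidate> OnePublic = {{true}, {false}};
  assert(mayDevirtualize(AllHidden));
  assert(!mayDevirtualize(OnePublic));
}
```
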
>
> By default, only classes that have statically determined hidden LTO
> visibility would be allowed to devirtualize. However, as noted earlier, we
> want to enable more aggressive devirtualization at LTO link time when we
> know that the linking mode guarantees full LTO visibility of any code that
> may derive classes from the bitcode being linked. To do so, we will add a
> new linker option:
>
> For lld, the proposed option is: -lto-whole-program-visibility.
> For gold, the corresponding plugin option would be
> “whole-program-visibility”.
>
> When this option is set, LTO will convert all vtable definitions to have
> hidden LTO visibility before invoking Whole Program Devirtualization. In
> the hybrid LTO case this would mean changing the metadata on the IR. In the
> index-only case this would be done in the summaries.
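
In rough terms, the option triggers a pass over the vtable definitions before WPD runs. The sketch below models that with hypothetical types; upgrading public vtables to linkage-unit visibility (rather than some other hidden level) is an assumption of the sketch, chosen because linkage-unit visibility is the minimum needed to devirtualize.

```cpp
#include <cassert>
#include <vector>

// The three values documented for !vcall_visibility metadata.
enum VCallVisibility : unsigned {
  VCallVisibilityPublic = 0,
  VCallVisibilityLinkageUnit = 1,
  VCallVisibilityTranslationUnit = 2,
};

// Hypothetical view of a vtable definition (IR metadata or summary entry).
struct VTableDef {
  VCallVisibility Visibility;
};

// When the link option is set, treat every public vtable as hidden within
// the linkage unit. Vtables that are already hidden are left untouched.
void applyWholeProgramVisibility(std::vector<VTableDef> &VTables,
                                 bool WholeProgramVisibility) {
  if (!WholeProgramVisibility)
    return;
  for (VTableDef &VT : VTables)
    if (VT.Visibility == VCallVisibilityPublic)
      VT.Visibility = VCallVisibilityLinkageUnit;
}

// Hypothetical usage: the same vtables linked without and with the option.
void optionExample() {
  std::vector<VTableDef> VTables = {{VCallVisibilityPublic},
                                    {VCallVisibilityTranslationUnit}};
  applyWholeProgramVisibility(VTables, false);
  assert(VTables[0].Visibility == VCallVisibilityPublic);
  applyWholeProgramVisibility(VTables, true);
  assert(VTables[0].Visibility == VCallVisibilityLinkageUnit);
  assert(VTables[1].Visibility == VCallVisibilityTranslationUnit);
}
```

Because the bitcode itself is unchanged, the same objects can be linked into another target without the option and keep their conservative default.
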
>
> LTO Backend Handling
> --------------------
>
> No changes are required in the LTO backend’s invocation of Whole Program
> Devirtualization, since any visibility constraints are enforced at LTO link
> time, and the loosening of visibility under the new link option only needs
> to affect the LTO WPD invocation.
>
> As mentioned earlier when describing the pre-link LTO compile changes,
> GlobalDCE will be changed to ignore vtables with !vcall_visibility metadata
> corresponding to type tests (and not just type checked loads).
>
> Status
> ------
>
> These changes have been prototyped and tested with index-only WPD, with
> the exception of the proposed changes to GlobalDCE (at the moment I have
> been testing with -enable-vfe=false). I will be cleaning up the changes and
> sending patches for review in the coming days.
>
> --
> Teresa Johnson | Software Engineer | tejohnson at google.com |
>
