[llvm-dev] [RFC] CFI for indirect calls with ThinLTO

Evgenii Stepanov via llvm-dev llvm-dev at lists.llvm.org
Tue May 16 16:33:00 PDT 2017


On Mon, May 15, 2017 at 6:44 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> Thanks for sending this out. A few comments below.
>
> On Mon, May 15, 2017 at 5:17 PM, Evgenii Stepanov via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi,
>>
>> this is a proposal for the implementation of CFI-icall [1] with ThinLTO.
>>
>> Jumptables are generated in the merged module. To generate a
>> jumptable, we need a list of functions with !type annotations,
>> including (in non-cross-dso mode) external functions. Unfortunately,
>> LLVM IR does not preserve unused function declarations, and we don’t
>> want to copy the actual function bodies to the merged module.
>>
>> Indirect call targets can be represented in the following way using
>> named metadata:
>>
>> void foo() {}
>> int bar() { return 0; }
>>
>> # Merged module
>> !cfi.functions = !{!1, !3}
>> !1 = !{!"bar", i8 0, !2}
>> !2 = !{i64 0, !"_ZTSFiE"}
>> !3 = !{!"foo", i8 0, !4}
>> !4 = !{i64 0, !"_ZTSFvE"}
>
>
> Presumably there would be no entries in !cfi.functions for functions defined
> in the merged module, as the type metadata would come from the module
> itself.

Right. The same as with vtable CFI, LowerTypeTests will use
!cfi.functions in addition to the regular logic.

>>
>>
>> Each function is described by a tuple of
>> * Promoted name as a string
>
>
> I imagine that we would only promote a function if it is address-taken.
> Otherwise we could be inhibiting optimization significantly.

Yes. !cfi.functions would include all external functions plus
address-taken internal functions. We could also do global analysis
(i.e. skip the jumptable for hidden non-address-taken functions), but that
needs more information passed to the combined module (or summary).
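
To make the membership rule concrete, here is a minimal sketch in Python
(a model of the rule, not LLVM code). The Func record and its fields are
hypothetical stand-ins for what the pass would read from the IR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Func:
    name: str
    external: bool       # hypothetical flag: function has external linkage
    address_taken: bool  # hypothetical flag: address escapes somewhere


def in_cfi_functions(f: Func) -> bool:
    """External functions always qualify; internal ones only if address-taken."""
    return f.external or f.address_taken


funcs = [
    Func("foo", external=True, address_taken=False),
    Func("helper", external=False, address_taken=True),
    Func("local_only", external=False, address_taken=False),
]
selected = [f.name for f in funcs if in_cfi_functions(f)]
```

The global analysis mentioned above would tighten this predicate further,
which is why it needs extra information in the combined module or summary.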

>
>> * Linkage (see below)
>> * Type(s)
>>
>>
>> A function can have multiple types. In the Cross-DSO mode each
>> function has a second “external” numeric type, and we might want to
>> allow “relaxed” type checking in the future where a function could
>> conform to multiple types. In that case the metadata would look like
>> this:
>>
>> !4 = !{!"bar", i8 0, !5, !6}
>> !5 = !{i64 0, !"_ZTSFiE"}
>> !6 = !{i64 0, i64 751454132325070187}
>>
>> “Linkage” is one of: definition, external declaration, external weak
>> declaration.
>>
>> In the merged module, !cfi.functions may contain multiple
>> entries for each function. We pick one with the strongest linkage
>> (i.e. the definition, if it is available) in LowerTypeTests.
>
>
> It's unfortunate that this design effectively requires that the
> LowerTypeTests pass recompute the linkage for each symbol, as the linker
> already knows this information (and could, in principle, provide it to the
> pass). But I'm not sure if there's a better way to do it.
>
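
The "pick the strongest linkage" step quoted above could look roughly like
the following Python sketch. The numeric values of the Linkage enum and the
ordering of a strong declaration above a weak one are assumptions; the RFC
only says that a definition wins when one is available.

```python
from enum import IntEnum


class Linkage(IntEnum):
    # Numeric values are an assumption; the RFC does not fix the encoding.
    EXTERNAL_WEAK_DECLARATION = 0
    EXTERNAL_DECLARATION = 1
    DEFINITION = 2


def pick_strongest(entries):
    """Map each function name to its strongest (name, linkage, types) entry."""
    best = {}
    for name, linkage, types in entries:
        if name not in best or linkage > best[name][1]:
            best[name] = (name, linkage, types)
    return best


entries = [
    ("bar", Linkage.EXTERNAL_DECLARATION, ("_ZTSFiE",)),
    ("bar", Linkage.DEFINITION, ("_ZTSFiE",)),
    ("foo", Linkage.EXTERNAL_WEAK_DECLARATION, ("_ZTSFvE",)),
]
resolved = pick_strongest(entries)
```
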
>>
>>
>> The LTO step emits, for a defined function named “f”:
>> declare void f.cfi()
>> .jumptable:
>>     call f.cfi
>>     ...
>> f.cfi-jt = alias .jumptable + offset
>> f = alias f.cfi-jt
>>
>> The same for an external (either weak or strong) declaration of a
>> function named “f”:
>> .jumptable:
>>     call f
>>     ...
>> f.cfi-jt = alias .jumptable + offset
>>
>
> One thing to be careful about is summary-based dead stripping: the pass
> needs to be able to query whether any specific function is still live in
> order to avoid introducing undefined symbol references. I think we can do
> that by adding a Live flag to GlobalValueSummaryInfo (which I think should
> also let us fix a number of FIXMEs elsewhere, e.g.
> http://llvm-cs.pcc.me.uk/lib/Transforms/IPO/LowerTypeTests.cpp#1447
> http://llvm-cs.pcc.me.uk/lib/Transforms/IPO/WholeProgramDevirt.cpp#1329),
> and have the pass check the flag for each function.

Sounds good.
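
A minimal sketch of that check, in Python rather than LLVM C++. The `live`
mapping is a hypothetical stand-in for the proposed Live flag in the
summary; the real data structure is not specified here.

```python
def live_jumptable_entries(candidates, live):
    """Keep only the candidate functions that the summary marks as live.

    `live` maps a function name to the (hypothetical) Live flag; dead
    functions are dropped so that no jumptable entry references an
    undefined symbol.
    """
    return [name for name in candidates if live[name]]


candidates = ["f", "g", "h"]
live = {"f": True, "g": False, "h": True}
kept = live_jumptable_entries(candidates, live)
```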

>> Weak external linkage is used in the lowering of uses of @f. This is
>> done both in the merged module and in ThinLTO backends. Uses of strong
>> definitions and declarations are replaced with f.cfi-jt. Uses of weak
>> external declarations are replaced with (f ? f.cfi-jt : 0) instead.
>>
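
The replacement rule quoted above can be modelled as a small Python
function that returns the expression a use of f is rewritten to. The
function name and the boolean parameter are hypothetical; only the two
output forms come from the proposal.

```python
def lower_use(name: str, weak_external_decl: bool) -> str:
    """Return the expression that replaces a use of `name`.

    Strong definitions and declarations go straight to the jumptable
    entry; a weak external declaration keeps its null check, because the
    symbol may resolve to null at link time.
    """
    jt = f"{name}.cfi-jt"
    if weak_external_decl:
        return f"({name} ? {jt} : 0)"
    return jt


strong_use = lower_use("f", weak_external_decl=False)
weak_use = lower_use("f", weak_external_decl=True)
```
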
>>
>> ThinLTO backends need to know which functions have jumptable entries
>> created for them (they will need to be RAUWed with f.cfi-jt). In the
>> Cross-DSO mode, external functions don’t get jumptable entries. This
>> information is passed back from the LTO step through combined summary.
>> The current idea is to add a new record, FunctionTypeResolution, which
>> would contain a set of function names in the jumptable.
>
>
> It occurred to me that this design could prevent inlining of indirect calls
> via constant propagation. For example, suppose that we have a module that
> looks like this:
>
> define void @f() {
>   ret void
> }
>
> define void @g() {
>   %fp = call i8* @identity(i8* @f)
>   call void %fp()
> }
>
> and a second module:
>
> define i8* @identity(i8* %ptr) {
>   ret i8* %ptr
> }
>
> and @identity is imported into the first module. Now I think the first
> module would look like this after optimization:
>
> define void @f.cfi() {
>   ret void
> }
>
> declare void @f.cfi-jt()
>
> define void @g.cfi() {
>   call void @f.cfi-jt()
> }
>
> So we cannot inline f.cfi into g.cfi, as the optimizer does not know that
> f.cfi-jt can be replaced with f.cfi. I'm not sure how likely this would be
> in practice, but something to keep in mind.
>
> Peter
>
>>
>> == Alternatives
>>
>> Function type information can be passed in the summary, as a list of
>> records (name, linkage, type(, type)*).
>> * Type can be either a string or a number. This complicates the encoding.
>> * The code in LowerTypeTests works with !type metadata in the same
>> format as described above. It would need to either recreate the
>> metadata from the summary, or deal with different input formats.
>> I don’t see any advantages to this encoding. Could it be more compact
>> than the metadata approach?
>>
>> [1]
>> https://clang.llvm.org/docs/ControlFlowIntegrity.html#indirect-function-call-checking