[llvm-dev] ThinLTO + CFI

Vlad Tsyrklevich via llvm-dev llvm-dev at lists.llvm.org
Mon Apr 30 17:43:03 PDT 2018


Hi Dmitry, the direct call patch seems like a good start; however, by
falling through to the F->replaceUsesExceptBlockAddr(FDecl) at the bottom
of LowerTypeTestsModule::importFunction() I believe it will replace all
uses of a function (for a definition outside the TU it's defined in) with
the direct function reference instead of just direct calls. Also, I believe
the logic for replaceDirectCalls() should be more restrictive, currently it
will also rewrite calls where the function use is as a function pointer
argument instead of the call target.

On Mon, Apr 30, 2018 at 1:05 PM <dmitry.mikulin at sony.com> wrote:

> Replacing direct calls to jump table entries with calls to real targets
> recovers most of the performance loss in my benchmark. Dealing with link
> order files is a bit cumbersome though: I can’t keep both, say “foo” and
> “foo.cfi” next to each other in link order, because if both exist, the
> linker reorders the jump table next to the real function. This is not what
> we want when the goal is to get rid of calls through jump tables. I had to
> manually pick the functions which were renamed to “foo.cfi” and replace
> them in the link order file. The process can be automated, but it’s a bit
> flaky if we have to do it to make CFI work.
>
> I attached 2 patches here: one is the direct call replacement and the
> other is moving type test lowering pass to run later in the pipeline.
> Interestingly, running the pass later seems to help with performance as
> well. Though the comment in PassManagerBuilder implies that this pass needs
> to run early. Why is it different from full LTO?
>
>
>
>
>
> > On Apr 27, 2018, at 11:04 AM, via llvm-dev <llvm-dev at lists.llvm.org>
> wrote:
> >
> > For the test case below, I get the following IR for main() on entry to
> ThinLTO backend invocation of LowerTypeTestsModule::lower(). This is after
> I moved this pass down in the pipeline so it’s invoked after inlining.
> >
> > The declarations for foo() and bar() are read in at the time of module
> import, Importer.importFunctions() in lto::thinBackend(). They do not have
> type metadata attached to them.
> > In lowerTypeTestCall() we check if the pointer in the type test is of a
> known type, so we look at bitcast and then select operands. foo and bar in
> select are global objects with no type metadata, so the type check cannot
> be guaranteed to be true and it can’t be eliminated. In full LTO case this
> works as expected: both foo and bar are the same known type and type check
> is gone.
> >
> > Maybe the problem is not with renaming but with the missing type
> metadata in this particular case.
> > Though having so many direct calls routed through the jump table still
> seems problematic. Is there a feasible solution?
> >
> > --------------------------------
> > define hidden i32 @main(i32, i8** nocapture readnone) local_unnamed_addr
> #0 !type !3 !type !4 {
> >  %3 = add nsw i32 %0, -2
> >  store i32 %3, i32* @i, align 4, !tbaa !5
> >  %4 = icmp sgt i32 %0, 1
> >  %5 = select i1 %4, i32 ()* @foo, i32 ()* @bar
> >  %6 = bitcast i32 ()* %5 to i8*, !nosanitize !9
> >  %7 = tail call i1 @llvm.type.test(i8* nonnull %6, metadata
> !"_ZTSFivE"), !nosanitize !9
> >  br i1 %7, label %10, label %8, !prof !10, !nosanitize !9
> >
> > ; <label>:8:                                      ; preds = %2
> >  %9 = ptrtoint i32 ()* %5 to i64
> >  tail call void @__ubsan_handle_cfi_check_fail_abort(i8* getelementptr
> inbounds ({ i8, { [4 x i8]*, i32, i32 }, { i16, i16, [13 x i8] }* }, { i8,
> { [4 x i8]*, i32, i32 }, { i16, i16, [13 x i8] }* }*
> @anon.fad58de7366495db4650cfefac2fcd61.1, i64 0, i32 0), i64 %9, i64 undef)
> #4, !nosanitize !9
> >  unreachable, !nosanitize !9
> >
> > ; <label>:10:                                     ; preds = %2
> >  %11 = tail call i32 %5() #5
> >  ret i32 %11
> > }
> >
> > . . .
> > declare hidden i32 @foo() #3
> > declare hidden i32 @bar() #3
> >
> >
> > —————————test case---------------
> > b.c
> > =================
> > typedef int (*fptr_t) (void);
> > fptr_t get_fptr();
> > extern int i;
> >
> > int main(int argc, char *argv[])
> > {
> >  i = argc - 2;
> >  fptr_t fp = get_fptr();
> >  return fp();
> > }
> >
> > v.c
> > ================
> > int i;
> > typedef int (*fptr_t) (void);
> > int foo(void) {  return 11; }
> > int bar(void) {  return 22; }
> > fptr_t get_fptr(void) {  return (i >= 0) ? foo : bar; }
> >
> >
> >> On Apr 26, 2018, at 5:29 PM, Peter Collingbourne <peter at pcc.me.uk>
> wrote:
> >>
> >>> We could probably tolerate a certain amount of unused jump table
> entries. However, I just realized that all non-inline imported calls end up
> going through a jump table entry. Is that correct?
> >>
> >> In fact it is all calls that go through a function pointer type that is
> used anywhere in the program for an indirect call, but depending on your
> program that could be very close to "yes".
> >>
> >>> Moving the type check lowering pass further down the pipeline (after
> inlining) still does not solve the problem because CFI renaming happens
> early and symbols attached to the jump table do not have a matching type.
> >>
> >> As far as I know, renaming happens during the LowerTypeTests pass,
> after the type checks are lowered.
> >> Lowering:
> http://llvm-cs.pcc.me.uk/lib/Transforms/IPO/LowerTypeTests.cpp#1620
> >> Renaming:
> http://llvm-cs.pcc.me.uk/lib/Transforms/IPO/LowerTypeTests.cpp#1642
> >> Do you have an example of what you are seeing?
> >>
> >> Peter
> >>
> >> On Thu, Apr 26, 2018 at 4:54 PM,  <dmitry.mikulin at sony.com> wrote:
> >> Hi Peter,
> >>
> >> We could probably tolerate a certain amount of unused jump table
> entries. However, I just realized that all non-inline imported calls end up
> going through a jump table entry. Is that correct? Initially I thought you
> meant calls promoted from indirect. While this can be fixed by replacing
> direct calls to jump tables with direct calls to real targets, I found
> other cases where ThinLTO+CFI has issues.
> >>
> >> In ThinLTO backend, type test lowering happens very early in the
> pipeline, before inlining. When the type check after the call to get_fptr()
> is lowered (in my original example, below), the compiler cannot see that
> both targets belong to the same type and that the type check will always
> return ‘true’ and can be eliminated. Moving the type check lowering pass
> further down the pipeline (after inlining) still does not solve the problem
> because CFI renaming happens early and symbols attached to the jump table
> do not have a matching type.
> >>
> >> I’m trying to think if there’s a way to delay renaming until ThinLTO
> backend type check lowering pass. It would help with solving both problems.
> >>
> >> Thanks.
> >> Dmitry.
> >>
> >>
> >>
> >>>>>>
> >>>>>> a.c
> >>>>>> =============================
> >>>>>> typedef int (*fptr_t) (void);
> >>>>>> fptr_t get_fptr();
> >>>>>> int main(int argc, char *argv[])
> >>>>>> {
> >>>>>> fptr_t fp = get_fptr();
> >>>>>> return fp();
> >>>>>> }
> >>>>>>
> >>>>>>
> >>>>>> b.c
> >>>>>> =============================
> >>>>>> typedef int (*fptr_t) (void);
> >>>>>> int foo(void) { return 11; }
> >>>>>> int bar(void) { return 22; }
> >>>>>>
> >>>>>> static fptr_t fptr = bar;
> >>>>>> static int i = 53;
> >>>>>>
> >>>>>> fptr_t get_fptr(void)
> >>>>>> {
> >>>>>> if (i >= 0)
> >>>>>>   fptr = foo;
> >>>>>> else
> >>>>>>   fptr = bar;
> >>>>>>
> >>>>>> return fptr;
> >>>>>> }
> >>>>>>
> >>
> >>
> >>
> >>
> >>> On Apr 19, 2018, at 6:18 PM, Peter Collingbourne <peter at pcc.me.uk>
> wrote:
> >>>
> >>> Regarding the orderfile, yes, I was thinking more about ordering the
> real functions.
> >>>
> >>> In that case it sounds like your best option may be to implement the
> optimization pass to make direct calls go directly to the real function.
> From a performance perspective I don't think it would make much difference
> if there are unused jump table entries.
> >>>
> >>> Peter
> >>>
> >>> On Thu, Apr 19, 2018 at 6:09 PM, via llvm-dev <llvm-dev at lists.llvm.org>
> wrote:
> >>> Teresa, Peter,
> >>>
> >>> Thanks for your help!
> >>> I need to re-run my experiments as the compiler I used did not have
> the latest changes like r327254.
> >>> The fact that the decision about routing calls through jump table
> entries is made early may be problematic. In my experiments with FreeBSD
> kernel, ThinLTO produced thousands jump table entries compared to only
> dozens with full LTO. As for re-ordering jump table entries, I don’t think
> it’s going to work as they are placed in the same section. Including *.cfi
> names into a link order file will take care of re-ordering real functions
> routed through jump table entries, but in our case we need to force some
> functions to be on the same page. So not having jump table entries for the
> functions that don't really need them would be ideal.
> >>>
> >>> Thanks.
> >>> Dmitry.
> >>>
> >>>
> >>>> On Apr 18, 2018, at 6:11 PM, Teresa Johnson <tejohnson at google.com>
> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On Wed, Apr 18, 2018 at 4:49 PM,  <dmitry.mikulin at sony.com> wrote:
> >>>> Hi Teresa,
> >>>>
> >>>> Thanks for the info!
> >>>> This example is my attempt to reduce FreeBSD kernel to something more
> manageable :)
> >>>>
> >>>> I will take a look at why globals are not being imported in this
> case. What’s the best tool to look into ThinLTO objects and their
> summaries? Most dumping tools don’t seem to like ThinLTO bitcode files…
> >>>>
> >>>> Sadly there isn't a really great way to dump the summaries. =( There
> was a patch awhile back by a GSOC student to dump in YAML format, but there
> was resistance from some who preferred dumping to llvm assembly via
> llvm-dis and support reading in the summary from llvm assembly. It's been
> on my list of things to do, hasn't yet risen high enough in priority to
> work on that. For now, you have to use llvm-bcanalyzer -dump and look at
> the raw format.
> >>>>
> >>>> Teresa
> >>>>
> >>>>
> >>>> Hopefully Peter can chime in regarding CFI related issues.
> >>>>
> >>>> Thanks.
> >>>> Dmitry.
> >>>>
> >>>>
> >>>>> On Apr 17, 2018, at 9:37 AM, Teresa Johnson <tejohnson at google.com>
> wrote:
> >>>>>
> >>>>> Hi Dmitry,
> >>>>>
> >>>>> Sorry for the late reply. For CFI specific code generation, pcc is a
> better person to answer. But on the issue of global variables being
> optimized, that hasn't happened yet. That would be great if you wanted to
> pick that up!
> >>>>>
> >>>>> In your original email example, it seems like the file static i=53
> could be constant propagated since there are no other defs, and the code in
> get_fptr simplified during the compile step, but I assume this is part of a
> more complex example where it is not possible to do this? Also note that
> with r327254 we started importing global variables. Do you know why we
> don't import in your case? I wonder if it has to do with it being CFI
> inserted code?
> >>>>>
> >>>>> Teresa
> >>>>>
> >>>>> On Tue, Apr 17, 2018 at 9:17 AM <dmitry.mikulin at sony.com> wrote:
> >>>>> I watched  Teresa’s talk on ThinLTO from last year’s CppCon, and it
> sounded like adding global variable information to the summaries was in the
> works, or at least in planning. Can someone (Teresa?) please share the
> current status? If it’s part of future plans, are there any specific
> proposals that can be picked up and worked on?
> >>>>>
> >>>>> Thanks!
> >>>>>
> >>>>>
> >>>>>> On Apr 9, 2018, at 6:51 PM, via llvm-dev <llvm-dev at lists.llvm.org>
> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I’m working on setting up ThinLTO+CFI for a C application which
> uses a lot of function pointers. While functionally it appears stable, it’s
> performance is significantly degraded, to the tune of double digit
> percentage points compared to regular LTO+CFI.
> >>>>>>
> >>>>>> Looking into possible causes I see that under ThinLTO+CFI iCall
> type checks almost always generate jump table entries for indirect calls,
> which creates another level of indirection for every such call. On top of
> that it breaks the link order layout because real function names point to
> jump table entries. It appears that I’m hitting a limitation in ThinLTO on
> how much information it can propagate across modules, particularly
> information about constants. In the example below, the fact that “i” is
> effectively a constant, is lost under ThinLTO, and the inlined copy of
> b.c:get_fptr() in a.c does not eliminate the conditional, which, for CFI
> purposes requires to generate a type check/jump table.
> >>>>>>
> >>>>>> I was wondering if there was a way to mitigate this limitation.
> >>>>>>
> >>>>>> a.c
> >>>>>> =============================
> >>>>>> typedef int (*fptr_t) (void);
> >>>>>> fptr_t get_fptr();
> >>>>>> int main(int argc, char *argv[])
> >>>>>> {
> >>>>>> fptr_t fp = get_fptr();
> >>>>>> return fp();
> >>>>>> }
> >>>>>>
> >>>>>>
> >>>>>> b.c
> >>>>>> =============================
> >>>>>> typedef int (*fptr_t) (void);
> >>>>>> int foo(void) { return 11; }
> >>>>>> int bar(void) { return 22; }
> >>>>>>
> >>>>>> static fptr_t fptr = bar;
> >>>>>> static int i = 53;
> >>>>>>
> >>>>>> fptr_t get_fptr(void)
> >>>>>> {
> >>>>>> if (i >= 0)
> >>>>>>   fptr = foo;
> >>>>>> else
> >>>>>>   fptr = bar;
> >>>>>>
> >>>>>> return fptr;
> >>>>>> }
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> LLVM Developers mailing list
> >>>>>> llvm-dev at lists.llvm.org
> >>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Teresa Johnson |   Software Engineer |      tejohnson at google.com
> |   408-460-2413 <(408)%20460-2413>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Teresa Johnson |    Software Engineer |      tejohnson at google.com
> |   408-460-2413 <(408)%20460-2413>
> >>>
> >>>
> >>> _______________________________________________
> >>> LLVM Developers mailing list
> >>> llvm-dev at lists.llvm.org
> >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> --
> >>> Peter
> >>
> >>
> >>
> >>
> >> --
> >> --
> >> Peter
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180501/3157d69e/attachment.html>


More information about the llvm-dev mailing list