[llvm-dev] [RFC] inlining and mismatched function attributes

Martin Liška via llvm-dev llvm-dev at lists.llvm.org
Mon Aug 16 06:26:19 PDT 2021


On 6/24/21 8:52 PM, Nick Desaulniers wrote:
> Hey folks,
> It was suggested that I bring up a peculiarity from an individual code
> review to the list.

Hello.

You raise bunch of very interesting question I must say.

> 
> We have a couple function attributes (in C) where inlining can produce
> unexpected or surprising results.
> 
> Let's take for example, stack protectors, which help me prevent stack
> buffer overruns from rewriting a return address pushed on the stack
> (sometimes). Now let's say I'm an operating system kernel and I need
> to restore register state after suspend (ie. during resume).  That
> register state would conflict with the generated stack protector code,
> so I use a function attribute like __attribute__((no_stack_protector)
> to prevent my compiler from inserting such code in my resume handler
> which has to initialize a new stack canary.  But I wasn't careful to
> check that all of the functions I call from said handler have a
> matching function attribute, and due to inlining, I wind up with a
> stack protector, even though I requested a function not have a stack
> protector!  Suddenly I load the incorrect stack canary value from
> garbage register state, fail the stack check guard, and now I fail to
> resume due to a kernel panic. (True story)

Just adding a note, where Jakub disagrees with blocking of the inlining
for the stact protector attribute:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94722#c9

> 
> Another stack protector issue that came up recently has to do with
> forking "zygote-like" processes, and having unique stack protector
> values per process.  Point being, there are a few cases where we want
> stack protectors generally, but need the fine grain control provided
> by function attributes.  Moving code into separate translation units
> in order to use different flags, is tedious and commonly runs into
> issues with LTO in LLVM when flags that aren't encoded in IR are
> dropped by LTO.
> 
> Similar problems crop up again with coverage, and profiling
> instrumentation in "delicate" code.  Sanitizers, shadow call stack,
> and a few others run into similar situations.
> 
> As a developer without knowledge of my toolchain, how do I go about
> debugging this myself?  (Debugging the suspend/resume issue was not
> fun).
> 
> What are some ways developers can fix this?  Well, if I know my call
> chain, I can go through and start marking my callees
> __attribute__((no_stack_protector)) or __attribute__((noinline))
> recursively.  Eventually, I should get to the point where none of my
> callees have stack protectors or are inlined. But that strips off more
> stack protectors perhaps than I'd like, and leaves a lot of code
> unprotected.  This begins to feel like "what color is your function?"
> "Red" functions may only call other "red" functions; "blue" functions
> may only call other "blue" functions.
> https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/
> 
> Maybe my callees have different callers with different constraints.
> Aliases can be used for callees.  Example:
> 
> // has a stack protector
> void callee_bad(void) { ... }
> 
> __attribute__((alias("callee_bad"),noinline))
> void callee_good(void);
> 
> // requires no stack protector
> __attribute__((no_stack_protector))
> void caller(void) {
>    callee_good();
> }
> 
> However, caller() could be inlined into yet another (not defined
> above) call site that does have/need a stack protector.
> 
> And other call sites (that don't care about stack protectors) can just
> call callee_bad like normal.  If I don't have too many callees, it's
> not too bad.  But very quickly it can become difficult as a developer
> to tell which callee was problematic.
> 
> Another possibility is that the compiler could have knowledge of
> certain conflicting function attributes, and skip inline substitution
> of callee into caller upon mismatch.  We provide "remarks" to help
> developers understand why inline substitution failed, if they care to
> know why.  ie.
> $ clang -O1 -O1 -Rpass-missed=inline foo.c
> foo.c:6:10: remark: foo not inlined into bar because it should never
> be inlined (cost=never): <really good reason why here>
> 
> Indeed, that's exactly the tact I took with no_stack_protector in
> commit bc044a88ee3c ("[Inline] prevent inlining on stack protector mismatch").
> 
> It's what GCC is proposing for
> __attribute__((no_profile_instrument_function)) (coverage and
> profiling) in:
> https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573511.html
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80223
> and I'm looking to match in
> https://reviews.llvm.org/D104810 .
> 
> We already have precedence for this in LLVM; checkout `CompatRule` in
> llvm/include/llvm/IR/Attributes.td and
> `llvm::getAttributeBasedInliningDecision` in
> llvm/lib/Analysis/InlineCost.cpp.
> 
> I'd argue it's also less surprising for a function not to be inlined
> than for a stack protector or profile/coverage instrumentation to show
> up when a function is explicitly attributed to not have those things.
> These attributes describe a desired property of the function. If
> inlined, we wish the new copies have the same property. However,
> function attribute applies to the whole function, not a subset of
> instructions.
> 
> But, now we're making IR function attributes that could have been
> orthogonal (nossp, noinline) entangled.  That hurts the composition of
> such attributes.  (I will note, nossp does not unconditionally imply
> noinline; it implies "noinline for mismatched callers, and noinline on
> callees that are mismatched, on a per call site basis."
> __attribute__((always_inline)) will override this exception, and we
> don't provide helpful diagnostics in such case; good luck! :-/ )  We
> likely will have such conflicts with additional function attributes in
> the future. Should the inliner scan the caller and callee at every
> call site for such attribute lists? Should the LangRef document such
> inlining behavior?
> 
> Another concern is diverging from GCC here; while we're both
> discussing no_profile_instrument_function it would be good to gather
> more feedback.

About this attribute, it's definitely something different as no_stack_protector
is a release feature, while no_profile_instrument_function is used for PGO.

My understanding of the attribute is that it's used for super-busy functions
which would slow down a training run significantly. So GCC approach would
be to allow inlining of a function after PGO instrumentation.

Martin

> 
> Sometimes, I wish in C we had the ability to express at a given
> callsite that inlining the callee should not occur.
> 
> You can see more discussion
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80223 .  Surely there are
> additional solutions we have not yet conceived of. At this point, I
> don't particularly care what color we paint this bikeshed; I just need
> a bikeshed built so that my kernel boots.  We all want all software to
> be better, and can have a gentle-person's disagreement. Thoughts?
> 



More information about the llvm-dev mailing list