# [llvm-dev] [RFC][InlineCost] Modeling JumpThreading (or similar) in inline cost model

Chad Rosier via llvm-dev llvm-dev at lists.llvm.org
Fri Aug 4 08:56:56 PDT 2017

```All,

I'm working on an improvement to the inline cost model, but I'm unsure
how to proceed.  Let me begin by first describing the problem I'm trying
to solve.  Consider the following pseudo C code:

*typedef struct element {
unsigned idx;
} element_t;
*

*static inline
unsigned char fn2 (element_t *dst_ptr, const element_t *a_ptr,
const element_t *b_ptr, unsigned char changed) {
if (a_ptr && b_ptr && a_ptr->idx == b_ptr->idx) {
if (!changed && dst_ptr && dst_ptr->idx == a_ptr->idx) {
/* Do something */
} else {
changed = 1;
if (!dst_ptr)
dst_ptr = fn3();
else
dst_ptr->idx = a_ptr->idx;
/* Do something. */
}
} else {
changed = fn4();
}
return changed;
}

unsigned char fn1 (element_t *a_ptr, element_t *b_ptr) {
unsigned char changed = 0;
while (b_ptr) {
if (!a_ptr || a_ptr->idx == b_ptr->idx) {
changed = fn2 (a_ptr, a_ptr, b_ptr, changed);
b_ptr = b_ptr->next;
}
}
return changed;
}*

When the inline cost model computes the inline cost of fn2 it ends up
being much higher than the inline threshold.  A fair amount of the cost
is due to the inlining of fn3 into fn2. However, if fn2 had been inlined
into fn1 the code from fn3 would have been removed as dead/unreachable.

At the fn2 call site notice the first two arguments are equal.  Thus, in
the context of fn2 dst_ptr and a_ptr are equal.  The call site of fn3 is
predicated on dst_ptr being null (i.e., if (!dst_ptr) dst_ptr = fn3()),
but that code is predicated on a_ptr being non-null.  Therefore, we know
the condition !dst_ptr is false (because a_ptr == dst_ptr and a_ptr is
non-null) and the call to fn3 is dead.  I suspect one of JumpThreading,
EarlyCSE, or GVN does the elimination after inlining, so that's what I'd
like to try and model in the inline cost model.  (Note fn2 has multiple
call sides and the property that the first and second arguments are
equal isn't true for each call site, so something like IPSCCP doesn't
actually help, AFAICT).

My first attempt at solving this problem did something similar to what
it tried to prove that dst_ptr was non-null based on a dominating
condition.  The only tricky parts were to deal with hammocks/diamonds
when walking up the CFG (See: https://reviews.llvm.org/D36287 as a
concrete example of how I proposed to get an immediate dominator without
the domtree) and to account for the fact that dst_ptr and a_ptr are equal.

I'm pretty sure I can get this approach to work, however, I'm not
convinced it's really extensible or general.  Should we consider using
the full dominator tree in the inline cost model to capture this?

If you have any thoughts on how to tackle the problem, I would love to