[llvm-dev] An issue with new PM's requirements on call graph changes

Mon Jul 3 20:41:09 PDT 2017

On Mon, Jul 3, 2017 at 4:43 PM, Hal Finkel via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

>
> On 06/30/2017 03:02 AM, Chandler Carruth wrote:
>
>> I have hit a fairly isolated practical issue deploying the new PM, but it
>> does point to a latent theoretical issues as well. I see various ways to
>> address it, but want feedback from others before moving forward.
>>
>> The issue is that we can introduce out-of-thin-air calls to known library
>> functions (`SimplifyLibCalls`, etc). These can be introduced in function
>> passes (`InstCombine` in particular) and that seems highly desirable.
>>
>> These all look like one of these cases:
>> 1a) Introducing a new call to an LLVM intrinsic
>> 1b) Replacing an existing call with a call to an LLVM intrinsic
>> 2a) Introducing a new call to a declared library function (but not
>> defined)
>> 2b) Replacing an existing call with a call to a declared library function
>> 3a) Introducing a new call to a defined library function
>> 3b) Replacing an existing call with a call to a defined library function
>>
>> Both #1 and #2 are easy to handle in reality. Intrinsics and declared
>> functions don't impact the PM's call graph because there is no need to
>> order the walk over them. But #3 is a real issue.
>>
>> The only case I have found that actually hits #3 at all hits #3b when
>> building FORTIFY code with the new pass manager because after inlining we
>> do a lot of (really nice) optimizations on library calls to remove
>> unnecessary FORTIFY checks. But this is in *theory* a problem when LTO-ing
>> with libc. More likely it could be a problem when LTO-ing with a vector
>> math library.
>>
>
> This latter case concerns me most. When the vectorizer creates the
> vectorized version of a loop, that's new code (the original code for the
> loop stays in place as a fall back). Further, the vectorizer can (today)
> create calls to vector math library functions, and a setup where we LTO
> with the definitions of those functions is certainly possible (and
> desirable). As a result, this issue does not seem all that theoretical to
> me.

> Moreover, once we have support for OpenMP simd functions, we'll end up in
> exactly this situation on a regular basis, and we can't have intrinsics for
> all of the possible user-defined functions

Can you clarify how OpenMP simd would require introducing out-of-thin-air
references to an open-ended set of user-defined functions?

> (unless the intrinsic just takes a function pointer and we clean it up
> afterwards somehow).

Note that simply referencing a function pointer out-of-thin-air would still
run afoul of the same issue. It would constitute a ref edge and have
similar implications as a direct call, at least as far as the fundamental
problem here is concerned (guaranteeing bottom-up iteration order). So an
intrinsic taking a function pointer wouldn't really circumvent the issue
(if I understand correctly what you're saying).

-- Sean Silva

> In short, I think we do need to correctly handle this situation.
>
> FWIW, I can also see this situation come up in other instrumentation cases
> as well. There are plenty of cases where it is useful to LTO with a runtime
> library.
>
> Thanks again,
> Hal
>
>
>> So what do we do?
>>
>> My initial idea: find all *defined* library functions in the module, and
>> every time we create a ref edge to one of them, synthesize a ref edge to
>> all of them. This should completely solve #3b above. But it doesn't really
>> address #3a at all.
>>
>> Is that OK? It would be very convenient to say that if we want to
>> introduce truly novel and new calls to library functions, we should have an
>> LLVM intrinsic to model those routines.
>>
>> But we actually have an example (I think) of #3a, introducing a call to a
>> library function out of the blue: memset_pattern. =/
>>
>> The only way I see to reasonably handle #3a is to have *every* function
>> implicitly contain a reference edge to every defined library function in
>> the module. This is, needless to say, amazingly wasteful. Hence my email.
>> How important is this?
>>
>> If we need to correctly handle this, I think I would probably implement
>> this by actually changing the *iteration* of reference edges in the graph
>> to just implicitly walk the list of defined library functions so that we
>> didn't burn any space on this. But it will make iteration of reference
>> edges slower and add a reasonable amount of complexity. So I'd like to hear
>> some other opinions before going down either of these roads.
>>
>>
>> Thanks,
>> -Chandler
>>
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170703/0ff500f9/attachment-0001.html>