[llvm-dev] RFC: We need to explicitly state that some functions are reserved by LLVM

Hal Finkel via llvm-dev llvm-dev at lists.llvm.org
Sat Oct 28 18:31:43 PDT 2017


On 10/28/2017 04:45 AM, Michael Kruse wrote:
> 2017-10-28 3:30 GMT+02:00 Hal Finkel <hfinkel at anl.gov>:
>> On 10/27/2017 07:51 PM, Michael Kruse wrote:
>>> 2017-10-27 20:31 GMT+02:00 Hal Finkel via llvm-dev
>>> <llvm-dev at lists.llvm.org>:
>>>> I agree. Marking external functions from system headers seems like a
>>>> reasonable heuristic. We'd need some heuristic because it's not
>>>> reasonable for the frontend to know about every function the optimizer
>>>> knows about. Over-marking seems okay, however.
>>> Sorry for the naive question, why is it unreasonable for the frontend
>>> to know about special functions? It is the frontend who defines a
>>> source language function's semantics. Clang also has access (or can be
>>> made to get access) to TargetLibraryInfo, no?
>>
>> I think this depends on how we want to define the separation of concerns.
>> The optimizer has knowledge about many special functions. This list is
>> non-trivial in size and also varies by target/environment. It is not
>> reasonable to duplicate this list both in the optimizer and in all relevant
>> frontends (which include not only things like Clang but also a whole host of
>> other code generators that produce code directly calling system-library
>> functions). Note that the optimizer sometimes likes to create calls to these
>> functions, based only on its knowledge of the target/environment, without
>> them ever having been declared by the frontend.
>>
>> Now, can the list exist in the optimizer and be queried by the frontend?
>> Sure. (*) It's not clear that this is necessary or useful, however. Clang,
>> for example, would need to distinguish between functions declared in system
>> headers and those that aren't. This, strictly speaking, does not apply to
>> functions that come from the C standard (because those names are always
>> reserved), but names that come from POSIX or other miscellaneous system
>> functions can be used by well-formed programs (so long as, in general, they
>> don't include the associated system headers). As a result, Clang might as
>> well mark functions from system headers in a uniform way and let the
>> optimizer do with them what it will. It could further filter that marking
>> process using some callback to TLI, but I see no added value there.
>> Similarly, a custom code generator can mark functions it believes will be
>> resolved to system functions.
>>
>> (*) Although we need to be a bit careful to make sure that all
>> optimizations, including custom ones, plugins, etc., register all of their
>> relevant functions with TLI, and TLI isn't really set up for this (yet).
> Thank you for the answer.
>
>>> The most straightforward solution seems to have an intrinsic for every
>>> function that has compiler magic, meaning every other function is
>>> ordinary without worrying about hitting a special case (e.g. when
>>> concatenating strings to create new function names when outlining).
>>> Recognizing function names and assuming they represent the semantics
>>> from libs seems "unclean", tying LLVM IR more closely to C and a
>>> specific platform's libc/libm than necessary.
>>>
>>> "malloc" once had an intrinsic. Why was it removed, and recognized by
>>> name instead?
>>
>> You want to have intrinsics for printf, getenv, and all the rest? TLI
>> currently recognizes nearly 400 functions (see
>> include/llvm/Analysis/TargetLibraryInfo.def).
> intrinsics.gen currently already has 6243 intrinsics (most of them
> target-dependent). Would 400 additional ones be that significant?

Yes. Each of those intrinsics needs documentation, code to form the 
intrinsics, code for validation and lowering, etc. You'll end up with 
phase-ordering effects in the face of indirect-to-direct call promotion 
combined with CSE, etc. Plus, if we LTO in libc, then the optimizer 
loses the ability to inline the function implementations without 
additional logic.
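
For concreteness, the marking approach discussed earlier in this thread (the
frontend marks external functions from system headers, and the optimizer
decides what to do with them) could be sketched roughly as follows. This is
only an illustration, not LLVM code: FunctionDecl, from_system_header,
kKnownLibFuncs, and mayAssumeLibrarySemantics are hypothetical names, and the
four-entry table stands in for the much larger list in TargetLibraryInfo.def.

```cpp
#include <array>
#include <string_view>

// Hypothetical stand-in for a function as the optimizer sees it.
// None of these fields correspond to a real LLVM API.
struct FunctionDecl {
    std::string_view name;
    bool is_external;         // declared in this module, defined elsewhere
    bool from_system_header;  // assumed marker set by the frontend
};

// Tiny illustrative subset of the names the optimizer knows about.
constexpr std::array<std::string_view, 4> kKnownLibFuncs = {
    "getenv", "malloc", "printf", "strlen"};

// Name-based lookup, analogous in spirit to asking TLI about a name.
bool isKnownLibFuncName(std::string_view name) {
    for (std::string_view known : kKnownLibFuncs) {
        if (known == name) {
            return true;
        }
    }
    return false;
}

// The optimizer assumes library semantics only when the frontend marked
// the declaration AND the name is one it recognizes. Over-marking is
// harmless here: a marked function with an unknown name stays ordinary.
bool mayAssumeLibrarySemantics(const FunctionDecl &f) {
    return f.is_external && f.from_system_header &&
           isKnownLibFuncName(f.name);
}
```

With this split, a user-defined or unmarked "malloc" is treated as an
ordinary function, and the frontend never needs its own copy of the list.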

  -Hal

>
> Michael

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


