[llvm-dev] RFC: We need to explicitly state that some functions are reserved by LLVM

Sat Oct 28 02:45:47 PDT 2017

2017-10-28 3:30 GMT+02:00 Hal Finkel <hfinkel at anl.gov>:
>
> On 10/27/2017 07:51 PM, Michael Kruse wrote:
>>
>> 2017-10-27 20:31 GMT+02:00 Hal Finkel via llvm-dev
>> <llvm-dev at lists.llvm.org>:
>>>
>>> I agree. Marking external functions from system headers seems like a
>>> reasonable heuristic. We'd need some heuristic because it's not reasonable
>>> for the frontend to know about every function the optimizer knows about.
>>> Over-marking seems okay, however.
>>
>> Sorry for the naive question: why is it unreasonable for the frontend
>> to know about special functions? It is the frontend that defines a
>> source language function's semantics. Clang also has access (or can be
>> made to get access) to TargetLibraryInfo, no?
>
>
> I think this depends on how we want to define the separation of concerns.
> The optimizer has knowledge about many special functions. This list is
> non-trivial in size and also varies by target/environment. It is not
> reasonable to duplicate this list both in the optimizer and in all relevant
> frontends (which include not only things like Clang but also a whole host of
> other code generators that produce code directly calling system-library
> functions). Note that the optimizer sometimes likes to create calls to these
> functions, based only on its knowledge of the target/environment, without
> them ever having been declared by the frontend.
>
> Now, can the list exist in the optimizer and be queried by the frontend?
> Sure. (*) It's not clear that this is necessary or useful, however. Clang,
> for example, would need to distinguish between functions declared in system
> headers and those that aren't. This, strictly speaking, does not apply to
> functions that come from the C standard (because those names are always
> reserved), but names that come from POSIX or other miscellaneous system
> functions can be used by well-formed programs (so long as, in general, they
> don't include the associated system headers). As a result, Clang might as
> well mark functions from system headers in a uniform way and let the
> optimizer do with them what it will. It could further filter that marking
> process using some callback to TLI, but I see no added value there.
> Similarly, a custom code generator can mark functions it believes will be
> resolved to system functions.
>
> (*) Although we need to be a bit careful to make sure that all
> optimizations, including custom ones, plugins, etc. register all of their
> relevant functions with TLI, and TLI isn't really set up for this (yet).

Thank you for the answer.
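
To make sure I read the proposal correctly, here is a minimal sketch of
the two-step check as I understand it: the frontend marks declarations
that come from system headers, and the optimizer assumes library
semantics only for marked names that its own table also recognizes.
Every name below (KnownLibFuncs, Decl, fromSystemHeader,
mayAssumeLibrarySemantics) is hypothetical and invented for this
illustration. None of it is the actual TargetLibraryInfo or Clang API,
and the three-entry table merely stands in for the real
target-dependent list.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for the optimizer's table of known library
// functions. This is NOT LLVM's TargetLibraryInfo; the real list is much
// larger and varies by target/environment.
struct KnownLibFuncs {
  std::set<std::string> names{"malloc", "printf", "getenv"};

  bool recognizes(const std::string &name) const {
    return names.count(name) != 0;
  }
};

// Hypothetical record of a function declaration as emitted by a frontend.
struct Decl {
  std::string name;
  bool fromSystemHeader; // the frontend's uniform marking (assumed flag)
};

// Optimizer-side policy sketched from the discussion: assume library
// semantics only when the frontend marked the declaration AND the
// optimizer's own table recognizes the name.
bool mayAssumeLibrarySemantics(const Decl &d, const KnownLibFuncs &known) {
  assert(!d.name.empty() && "declarations must be named");
  return d.fromSystemHeader && known.recognizes(d.name);
}
```

Under this sketch a user-defined "getenv" that was not declared in a
system header stays an ordinary function, and over-marking is harmless
because an unrecognized name is never given special treatment.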

>>
>> The most straightforward solution seems to be to have an intrinsic for every
>> function that has compiler magic, meaning every other function is
>> ordinary without worrying about hitting a special case (e.g. when
>> concatenating strings to create new function names when outlining).
>> Recognizing function names and assuming they represent the semantics
>> from libs seems "unclean", tying LLVM IR more closely to C and a
>> specific platform's libc/libm than necessary.
>>
>> "malloc" once had an intrinsic. Why was it removed, and recognized by
>> name instead?
>
>
> You want to have intrinsics for printf, getenv, and all the rest? TLI
> currently recognizes nearly 400 functions (see
> include/llvm/Analysis/TargetLibraryInfo.def).

intrinsics.gen currently already has 6243 intrinsics (most of them
target-dependent). Would 400 additional ones be that significant?

Michael