[llvm-dev] RFC: We need to explicitly state that some functions are reserved by LLVM

Hal Finkel via llvm-dev llvm-dev at lists.llvm.org
Fri Oct 27 11:31:26 PDT 2017


On 10/27/2017 01:10 PM, Xinliang David Li via llvm-dev wrote:
>
>
> On Fri, Oct 27, 2017 at 1:50 AM, David Chisnall via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
>     This seems slightly inverted.  As I understand it, the root of the
>     problem is that some standards, such as C, C++, and POSIX, define
>     some functions as special and we rely on their specialness when
>     optimising.  Unfortunately, the specialness is a property of the
>     source language and, possibly, environment and not necessarily of
>     the target. The knowledge of which functions are special seems
>     like it ought to belong in the front end, so a C++ compiler might
>     tag a function called _Znwm as special, but to a C or Fortran
>     front end this is just another function and shouldn’t be treated
>     specially.
>
>     Would it not be cleaner to have the front end (and any
>     optimisations that are aware of special behaviour of functions)
>     add metadata indicating that these functions are special? 
>
>
>
> Ideally many of these functions should be annotated as builtin in the
> system headers.  A hacky solution is for the frontend to check whether
> the declarations come from system headers to decide whether metadata
> needs to be applied.

I agree. Marking external functions from system headers seems like a 
reasonable heuristic. We'd need some heuristic because it's not 
reasonable for the frontend to know about every function the optimizer 
knows about. Over-marking seems okay, however.
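
A minimal sketch of that division of labour, in Python rather than
compiler code. Everything here is hypothetical: the `Decl` type, the two
function names, and the small name set standing in for what
`TargetLibraryInfo` knows. It only illustrates why over-marking is
harmless: the frontend marks by origin, and the optimizer still filters
by the names it recognizes.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the names the optimizer recognizes; the real
# set comes from TargetLibraryInfo and is platform-specific.
OPTIMIZER_KNOWN_NAMES = frozenset({"access", "free", "_Znwm"})


@dataclass(frozen=True)
class Decl:
    """Hypothetical frontend view of a function declaration."""
    name: str
    is_external: bool          # has external linkage
    from_system_header: bool   # declared in a system header


def frontend_marks_special(decl: Decl) -> bool:
    # The frontend consults no list of names: it over-marks every
    # external function that comes from a system header.
    return decl.is_external and decl.from_system_header


def optimizer_treats_as_library_call(decl: Decl) -> bool:
    # Only a marked function whose name the optimizer knows is special.
    # Anything unmarked or unknown is just an ordinary function.
    return frontend_marks_special(decl) and decl.name in OPTIMIZER_KNOWN_NAMES


posix_access = Decl("access", is_external=True, from_system_header=True)
user_access = Decl("access", is_external=False, from_system_header=False)
vendor_ext = Decl("vendor_ext", is_external=True, from_system_header=True)
```

Under these assumptions `posix_access` is treated as a library call,
`user_access` is not (it was never marked), and `vendor_ext` is marked
but ignored because the optimizer has no knowledge of that name.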

  -Hal

>
> David
>
>     If the metadata is lost, then this inhibits later optimisations
>     but shouldn’t affect the semantics of the code (it’s always valid
>     to treat the special functions as non-special functions) and
>     optimisations then don’t need to mark them.  This would also give
>     us a free mechanism of specifying functions that are semantically
>     equivalent but have different spellings.
>
>
>
>     David
>
>     > On 27 Oct 2017, at 04:14, Chandler Carruth via llvm-dev
>     > <llvm-dev at lists.llvm.org> wrote:
>     >
>     > I've gotten a fantastic bug report. Consider the LLVM IR:
>     >
>     > target triple = "x86_64-unknown-linux-gnu"
>     >
>     > define internal i8* @access({ i8* }* %arg, i64) {
>     >   ret i8* undef
>     > }
>     >
>     > define i8* @g({ i8* }* %arg) {
>     > bb:
>     >   %tmp = alloca { i8* }*, align 8
>     >   store { i8* }* %arg, { i8* }** %tmp, align 8
>     >   br i1 undef, label %bb4, label %bb1
>     >
>     > bb1:
>     >   %tmp2 = load { i8* }*, { i8* }** %tmp, align 8
>     >   %tmp3 = call i8* @access({ i8* }* %tmp2, i64 undef)
>     >   br label %bb4
>     >
>     > bb4:
>     >   ret i8* undef
>     > }
>     >
>     > This IR, if compiled with
>     > `opt -passes='cgscc(inline,argpromotion)' -disable-output`,
>     > hits a bunch of asserts in the LazyCallGraph.
>     >
>     > The problem here is that `argpromotion` takes a normal-looking
>     > function `i8* @access({ i8* }* %arg, i64)` and turns it into a
>     > magical function `i8* @access(i8* %arg, i64)`. This latter
>     > signature is the POSIX `access` function that LLVM's
>     > `TargetLibraryInfo` knows magical things about.
>     >
>     > Because *some* library functions known to `TargetLibraryInfo`
>     > can have *calls* to them introduced at arbitrary points of
>     > optimization (consider vectorized variants of math functions),
>     > the new pass manager and the graph it uses to provide ordering
>     > over the module get Very Unhappy when you *introduce* a
>     > definition of a library function in the middle of the
>     > compilation pipeline.
>     >
>     > And really, we do *not* want `argpromotion` to do this. We
>     > don't want it to turn some random function by the name of
>     > `@free` into the actual `@free` function and suddenly change
>     > how LLVM handles it.
>     >
>     > So what do we do?
>     >
>     > One option is to make `argpromotion` and every other pass that
>     > mutates a function's signature rename the function (or add a
>     > `nobuiltin` attribute to it). However, this seems brittle and
>     > somewhat complicated.
>     >
>     > My proposal is that we admit that certain names of functions
>     > are reserved in LLVM's IR. For these names, in some cases *any*
>     > function with that name will be treated specially by the
>     > optimizer. We can still check the signatures when transforming
>     > code based on LLVM's semantic understanding of that function,
>     > but this avoids any need to check things when mutating the
>     > signature of the function.
>     >
>     > This would require frontends to avoid emitting functions by
>     > these names unless they should have these special semantics.
>     > However, even if they do, everything should remain
>     > conservatively correct. But I'll send an email to cfe-dev
>     > suggesting that Clang start "mangling" internal functions that
>     > collide with target names. I think this is important as I've
>     > found a quite surprising number of cases where this happens in
>     > real code.
>     >
>     > There is no need to auto-upgrade here, because again, LLVM's
>     > handling will remain conservatively correct.
>     >
>     > Does this seem reasonable? If so, I'll send patches to update
>     > the LangRef with these restrictions. I'll also take a quick
>     > stab at generating some example tables of such names from the
>     > .td files used by `TargetLibraryInfo` already. These can't be
>     > authoritative because the names are platform-specific, but
>     > they should help people understand how this area works.
>     >
>     >
>     > One alternative that seems appealing but doesn't actually help
>     > would be to make `TargetLibraryInfo` ignore internal functions.
>     > That is how the C++ spec seems to handle this, for example (C
>     > library function names are reserved only when they have
>     > linkage). But this doesn't work well for LLVM because we want
>     > to be able to LTO an internalized C library. So I think we need
>     > the rule for LLVM function names to not rely on linkage here.
>     >
>     >
>     > Thanks,
>     > -Chandler
>     >
>     > _______________________________________________
>     > LLVM Developers mailing list
>     > llvm-dev at lists.llvm.org
>     > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
