Trying to sum up the approaches that have been discussed, numbered in the
order I saw them:

1) Mangle internal names to avoid collisions.

2) Only optimize library functions when they have external linkage.

3) Switch optimizations to do cloning rather than mutating functions.

4) Mark all library functions declared in system headers with some
attribute and key optimizations on this.

#1 doesn't seem to have much appeal.
#3 is interesting and likely a good thing to do but not really sufficient
to fix the root issue.
#4, especially in the mode where these attributes actually carry the
semantics allowing the name-based heuristics to be isolated in a more
appropriate layer, seems like a very interesting long term path, but
honestly not one I have the time to bring about right now. And I don't
think we can wait for this to fix things.

But I think we can combine some of #4 and some of #2 to get a good solution
here that is practical and achievable:

- Recognize library functions by name, much like we already do, but only
when they have external linkage.
- Recognize internal functions *with a builtin attribute* much like we do
external library functions.
- Teach internalize to add the builtin attribute as it changes linkage.
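
To make the three rules above concrete, here is a rough sketch as a toy
model. This is not LLVM's API: the Function struct, the name table, and
every helper below are hypothetical stand-ins, and the real thing would
live in the library-function recognition code and the internalize pass.

```cpp
#include <set>
#include <string>

// Toy model only; none of these names exist in LLVM.
enum class Linkage { External, Internal };

struct Function {
    std::string name;
    Linkage linkage;
    bool hasBuiltinAttr;  // hypothetical "builtin" attribute
};

// Stand-in for the existing name-based table of known library functions.
static bool isKnownLibraryName(const std::string &name) {
    static const std::set<std::string> known = {"floor", "printf", "puts"};
    return known.count(name) != 0;
}

// Rules 1 and 2: external functions are recognized by name as today;
// internal functions are recognized only if they carry the attribute.
static bool isRecognizedLibFunc(const Function &f) {
    if (!isKnownLibraryName(f.name))
        return false;
    if (f.linkage == Linkage::External)
        return true;
    return f.hasBuiltinAttr;
}

// Rule 3: internalize tags a function that was recognized while external
// before it changes the linkage. Functions that were already internal are
// left untagged.
static void internalize(Function &f) {
    if (f.linkage == Linkage::External && isKnownLibraryName(f.name))
        f.hasBuiltinAttr = true;
    f.linkage = Linkage::Internal;
}
```

In this model a user's static function that happens to be called floor is
never recognized, while a libc floor that gets internalized during LTO
keeps its library semantics through the attribute.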

One example of what I *really* want from this even in LTO, and which
motivates the change to internalize: things like 'readonly', where some
spec lets us optimize callers with this even if the implementation actually
writes to memory. Consider building with -fno-math-errno and LTOing a libc
that does actually set errno in its implementation.

We will also need to constrain optimizations like IPSCCP in the face of
internal builtin (and thus library) functions in order to avoid the printf
-> puts miscompile described by Eli. But we already have this problem in
theory today, and the above won't make it any worse and should even give us
new options to address it such as stripping the builtin attribute (in
addition to cloning, or other techniques).
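
As a rough illustration of the "strip the builtin attribute" option, here
is another toy model (again not LLVM's API; all names are hypothetical). It
assumes that an interprocedural pass drops the attribute before it rewrites
the body of an internal function, and that library-call rewrites key only
on the attribute.

```cpp
#include <string>

// Toy model only; none of these names exist in LLVM.
struct InternalFunction {
    std::string name;
    bool hasBuiltinAttr;   // "still behaves like the library function"
    bool bodySpecialized;  // set once an IPO pass has rewritten the body
};

// An IPO pass that is about to mutate the body (for example by propagating
// a constant argument into it) first strips the attribute, since the
// function no longer promises the library semantics afterwards.
static void specializeBody(InternalFunction &f) {
    f.hasBuiltinAttr = false;
    f.bodySpecialized = true;
}

// Library-call rewrites may target or assume this function only while the
// attribute is still present.
static bool mayApplyLibCallRewrite(const InternalFunction &f) {
    return f.hasBuiltinAttr;
}
```

Cloning would be the alternative: specialize a copy and leave the original,
still carrying the attribute, untouched.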


> I think this is the pragmatic way forwards. For a concise example of
> how broken/surprising the current behaviour is:
> <snip>
>
> ffloor is legal for AArch64, meaning frintm is produced rather than a
> call to floor. Deleting the 'readnone' attribute from the floor
> function will avoid lowering to ffloor. Compile with -mtriple=arm and
> the generated assembly has completely different semantics (calling
> floor and so aborting).
>
> I'm not sure if there's a tracking bug for this, but the earliest
> mention I could find with a quick search was
> <https://bugs.llvm.org/show_bug.cgi?id=2141>.
>
> As John Regehr clarified on Twitter - the potential issues when
> names+arguments clash with C99 standard library functions is
> documented in the LangRef, though it's (at the time of writing)
> stuffed awkwardly under the "Example" subheading for the call
> instruction <http://llvm.org/docs/LangRef.html#id306>.
>
> I suppose the point is: the issue described by Chandler in this RFC is
> a very strong motivation for changing _something_. The approach
> suggested by David would solve Chandler's bug, but also allow this
> function naming restriction to be lifted altogether which seems like
> an even bigger win.
>
> I think that the right thing to do is to make the compiler ignore
> well-known functions that have internal linkage.  Treating a symbol with
> internal linkage as “known” is unsafe and incorrect even if it was derived
> from a well-known function, because IPO can transform it (e.g. by constant
> propagating values into the arguments).
>
> If the use-case for statically linking in libc + internalizing it is
> important, then we need to find another solution to preserve those
> optimizations, it isn’t safe to just blindly assume an internal symbol with
> a well known name is the well known function.
> -Chris