[PATCH] Emit minnum / maxnum for __builtin_fmin/fmax

hfinkel at anl.gov hfinkel at anl.gov
Thu Oct 23 00:28:51 PDT 2014


> Well first, I would think the point of using __builtin is to specifically get the intrinsic, and not use the library function.

The point, at the C level, is to get a certain set of semantics. The __builtin_* calls are marked as readnone in the IR, in theory giving them the desired semantics. As you point out below, this is likely not strong enough....

> I don’t think they are treated the same as libcalls in most situations, but the current situation is not easy to follow. 

For optimizations in SimplifyLibCalls, we try to treat both the same way.

> First, I mostly care about targets that do not have calls and TargetLibraryInfo isn’t particularly helpful or meaningful as is. I think if there is an equivalent intrinsic for the library function, it should always be used instead for consistency.

Interestingly, the only adjustment that TargetLibraryInfo makes for your target is:

  // There are no library implementations of mempcy and memset for r600 and
  // these can be difficult to lower in the backend.
  if (T.getArch() == Triple::r600) {
    TLI.setUnavailable(LibFunc::memcpy);
    TLI.setUnavailable(LibFunc::memset);
    TLI.setUnavailable(LibFunc::memset_pattern16);
    return;
  }

and so this probably does "work" for you -- but I do see your point. A true freestanding implementation might really not have these calls, and TLI might actually reflect that fact, rendering this scheme of using the libc calls (even marked as readnone) as suboptimal.

> The current state where a call is sometimes treated as an intrinsic and sometimes isn't is pretty confusing (It’s also something else I probably need to add support for for minnum / maxnum). Some places end up trying to consider both forms, or some forget one or the other. 

Yes, in theory we try to locate all relevant optimizations inside of SimplifyLibCalls; perhaps in practice this does not quite capture everything.

> Maybe there should be a pass that translates all of the libcalls into equivalent intrinsics?

I think this is a reasonable idea. It could be done inside of SimplifyLibCalls.

> Intrinsics are considered many more places than query TargetLibraryInfo, e.g. isSafeToSpeculativelyExecute checks many intrinsics but no libcalls.

This is a good point. In theory we could do a better job at this for some known libcalls, but you're right, isSafeToSpeculativelyExecute does not return true for readnone functions, and noted in the function:

    return false; // The called function could have undefined behavior or
                  // side-effects, even if marked readnone nounwind.

as a side note, I've been thinking about adding some kind of "safe to speculate" metadata for loads/calls; this is another use case ;) Nevertheless, we could use TLI here too and specifically handle certain known libcalls.

> I’m also not sure how the call form interacts with -fno-builtin.

We nicely mark the functions as 'nobuiltin'. This is a bug. :(

> I don’t think a regular call can / is considered as a candidate for vectorization, but the intrinsics are.

They certainly can be (when marked readnone) and are. The vectorizers handle the translation.

In short, I understand what you're saying, and I support canonicalizing on the intrinsics for these builtin operations. However, this would be a change to an established engineering decision, and I'd not approve this here without some discussion on llvmdev. Can you please write an RFC there? I want to make sure that everyone is on the same page.

http://reviews.llvm.org/D5896






More information about the cfe-commits mailing list