[PATCH] D104801: [MemCpyOpt] Enable memcpy optimization for NVPTX back-end.

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 28 05:21:13 PDT 2021


fhahn added a comment.

In D104801#2839313 <https://reviews.llvm.org/D104801#2839313>, @tra wrote:

> In D104801#2838064 <https://reviews.llvm.org/D104801#2838064>, @fhahn wrote:
>
>> So IIUC the behavior you want to express is that certain lib functions actually are considered available up to a point?
>
> No, *intrinsics* are available. We can not currently lower any of them to libcalls and rely on alternative lowering mechanisms.

Sure, the intrinsics should *always* be available on any target regardless of whether they are lowered to lib calls or not.

> I think the pass' use of "is libcall available" is an imperfect proxy for either "should we bother looking for memcpy/memset ops" or for "can we materialize new memcpy/memset". It appears to assume that intrinsics and builtins are either both supported or both unsupported.

Agreed, the current check is not ideal. IIUC the current check acts as a proxy to check whether the backends can lower the intrinsic using a lib call. At the moment, I think this mainly guards against `MemCpyOpt` introducing `llvm.memcpy` calls that later get lowered to library calls in the backend, even if the library function is not marked as available.

Perhaps an alternative to the TLI check would be to provide a generic lowering for `llvm.memcpy` intrinsics when lib calls are not available? (One thing to note is that I think both Clang and GCC require memcpy and a few others to be available, even in freestanding environments.)
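To illustrate what a generic lowering would have to compute, here is a minimal sketch in plain C++ rather than LLVM IR. The function name `expandedMemCpy` is hypothetical and not part of LLVM; it only models the semantics of an `llvm.memcpy` expanded into a forward byte loop, assuming non-overlapping source and destination.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an expanded llvm.memcpy: a forward byte loop
// that does not call the C library's memcpy. Assumes the two ranges do
// not overlap, which is also what llvm.memcpy requires.
inline void expandedMemCpy(unsigned char *Dst, const unsigned char *Src,
                           std::size_t N) {
  // A zero-length copy may legitimately be passed null pointers.
  assert((N == 0 || (Dst != nullptr && Src != nullptr)) &&
         "null pointer with non-zero length");
  for (std::size_t I = 0; I < N; ++I)
    Dst[I] = Src[I];
}
```

A real lowering would presumably copy in wider units where alignment allows; the byte loop is just the simplest correct form.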

>> I'm not sure about having a dedicated TTI hook to specifically enable/disable the pass.
>
> IMO TTI is the standard mechanism to make target-specific information available to otherwise generic passes. There are ~50 passes under lib/Transforms that utilize it.

Yes, but at the moment, I don't think there's much precedent for backends specifically disabling certain passes. This could get messy very quickly, even if we only add hooks for some of the available passes. Those hooks are also not composable. IMO it would be preferable to model this in a way so other passes that may want to introduce `llvm.memcpy` calls also benefit.

Generalizing the hook to whether the intrinsics can be lowered without lib calls seems a step in the right direction to me and this could be helpful for other passes as well (as you suggested in a later comment). If backends could easily lower any `llvm.memcpy` call without needing to fall back to library calls, that would also be compelling.
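The difference between the two checks can be sketched with a toy model. Everything here is hypothetical: `TargetModel`, `libCallCheckAllows` and `generalizedCheckAllows` are illustration-only names, not LLVM's `TargetLibraryInfo` or `TargetTransformInfo` interfaces, and the model looks at memcpy only.

```cpp
#include <cassert>

// Hypothetical model of the two target properties under discussion.
struct TargetModel {
  bool HasMemCpyLibCall;           // library memcpy is marked available
  bool LowersMemCpyWithoutLibCall; // backend can expand llvm.memcpy itself
};

// Models the lib-call-availability proxy: a pass may introduce llvm.memcpy
// only if the library function is marked available.
inline bool libCallCheckAllows(const TargetModel &T) {
  return T.HasMemCpyLibCall;
}

// Models the generalized question: a pass may introduce llvm.memcpy if the
// backend has any way to lower it, via a lib call or without one.
inline bool generalizedCheckAllows(const TargetModel &T) {
  return T.HasMemCpyLibCall || T.LowersMemCpyWithoutLibCall;
}
```

Under this model, a target with no lib call but its own lowering, which is how the quoted comment describes NVPTX, fails the first check and passes the second. Any pass that wants to introduce `llvm.memcpy` could ask the same question.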


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104801/new/

https://reviews.llvm.org/D104801


