[llvm] [AMDGPU][CodeGen] Improve handling of memcpy for -Os/-Oz compilations (PR #87632)

Mon Apr 15 23:44:36 PDT 2024

================
@@ -59,6 +59,12 @@ unsigned AMDGPUTargetLowering::numBitsSigned(SDValue Op, SelectionDAG &DAG) {
 AMDGPUTargetLowering::AMDGPUTargetLowering(const TargetMachine &TM,
                                            const AMDGPUSubtarget &STI)
     : TargetLowering(TM), Subtarget(&STI) {
+  // Always lower memset, memcpy, and memmove intrinsics to load/store
+  // instructions, rather than generating calls to memset, memcpy or memmove.
+  MaxStoresPerMemset = MaxStoresPerMemsetOptSize = ~0U;
----------------
Pierre-vh wrote:

> This is not a magic number; it is effectively infinity. These numbers are thresholds that determine whether the three functions are lowered to loads/stores or to a libcall. Since we don't have libcalls anyway, we need to lower them in all cases.

I see, `~0U` works for me then. Maybe just add a small comment on top explaining that this is the libcall lowering threshold and it shouldn't be reachable?
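
To illustrate why `~0U` acts as "no limit", here is a minimal sketch of a store-count threshold check. It is not the actual SelectionDAG code: `MemOpLimits`, `storesNeeded` and `expandsInline` are hypothetical names made up for this example, and only the field name `MaxStoresPerMemset` comes from the diff. It assumes the decision reduces to comparing the number of stores needed against the limit.

```cpp
#include <cstdint>

// Hypothetical stand-in for the per-target limits; only the field name
// mirrors the diff above.
struct MemOpLimits {
  unsigned MaxStoresPerMemset;
};

// Number of stores needed to cover SizeInBytes with stores of StoreWidth
// bytes, rounding up (assumes StoreWidth > 0).
inline uint64_t storesNeeded(uint64_t SizeInBytes, uint64_t StoreWidth) {
  return (SizeInBytes + StoreWidth - 1) / StoreWidth;
}

// Returns true when the operation would be expanded into inline stores
// under this simplified model, false when it would fall back to a libcall.
inline bool expandsInline(uint64_t SizeInBytes, uint64_t StoreWidth,
                          const MemOpLimits &Limits) {
  return storesNeeded(SizeInBytes, StoreWidth) <= Limits.MaxStoresPerMemset;
}
```

Under this model, a limit of 8 rejects a 1024-byte memset done with 16-byte stores (64 stores), while a limit of `~0U` accepts it, so the libcall path would only be hit for sizes needing more than `UINT_MAX` stores.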

https://github.com/llvm/llvm-project/pull/87632