[PATCH] D79279: Add overloaded versions of builtin mem* functions

Richard Smith - zygoloid via Phabricator via cfe-commits <cfe-commits at lists.llvm.org>
Fri Jul 31 16:51:46 PDT 2020


rsmith added inline comments.


================
Comment at: clang/docs/LanguageExtensions.rst:2430
+
+These overloads support destinations and sources which are a mix of the
+following qualifiers:
----------------



================
Comment at: clang/docs/LanguageExtensions.rst:2454
+and might be non-uniform throughout the operation.
+
+The builtins can be used as building blocks for different facilities:
----------------
From the above description, I think the documentation is unclear about what the types `T` and `U` are used for. I think the answer is something like:

"""
The types `T` and `U` are required to be trivially-copyable types, and `byte_element_size` (if specified) must be a multiple of the size of both types. `dst` and `src` are expected to be suitably aligned for `T` and `U` objects, respectively.
"""

But... we actually get the alignment information by analyzing the pointer argument rather than from the types, just like we do for `memcpy` and `memmove`, so maybe the latter part is not right. (What did you intend regarding alignment for the non-atomic case?) The trivial-copyability and divisibility checks don't seem fundamentally important to the goal of the builtin, so I wonder if we could actually just use `void` here and remove the extra checks. (I don't have strong views one way or the other on this, except that we should either document what `T` and `U` are used for or make the builtins not care about the pointee type beyond its qualifiers.)
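For reference, the trivial-copyability and divisibility checks described above can be sketched as ordinary compile-time predicates. This is a hypothetical illustration, not the actual Sema implementation; the helper name `overload_args_ok` is invented:

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical mirror of the proposed checks: T and U must be trivially
// copyable, and byte_element_size (if specified) must be a multiple of
// the size of both types.
template <typename T, typename U>
constexpr bool overload_args_ok(std::size_t byte_element_size) {
  return std::is_trivially_copyable_v<T> &&
         std::is_trivially_copyable_v<U> &&
         byte_element_size % sizeof(T) == 0 &&
         byte_element_size % sizeof(U) == 0;
}

static_assert(overload_args_ok<int, unsigned>(4));  // 4 is a multiple of both sizes
static_assert(!overload_args_ok<int, char>(2));     // 2 is not a multiple of sizeof(int)
```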


================
Comment at: clang/lib/AST/ExprConstant.cpp:8851
+      // about atomicity, but needs to check runtime constraints on size. We
+      // can't check the alignment runtime constraints.
+      APSInt ElSz;
----------------
We could use the same logic we use in `__builtin_is_aligned` here. For any object whose value the constant evaluator can reason about, we should be able to compute at least a minimal alignment (though the actual runtime alignment might of course be greater).
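As a sketch of the "minimal alignment" idea (not the actual `ExprConstant.cpp` logic): for an object whose declaration the evaluator can see, the alignment of its declared type is a provable lower bound on its runtime alignment, even though the object may happen to be more aligned at runtime.

```cpp
// A type whose declaration guarantees at least 16-byte alignment.
struct alignas(16) Vec4 { float v[4]; };

// The evaluator can prove these bounds from the type alone; the actual
// runtime alignment of a particular Vec4 object might be greater.
static_assert(alignof(Vec4) == 16);
static_assert(alignof(Vec4) % alignof(float) == 0);
```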


================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:2639-2640
   case Builtin::BI__builtin_mempcpy: {
+    QualType DestTy = getPtrArgType(CGM, E, 0);
+    QualType SrcTy = getPtrArgType(CGM, E, 1);
     Address Dest = EmitPointerWithAlignment(E->getArg(0));
----------------
Looking through implicit conversions in `getPtrArgType` here will change the code we generate for cases like:

```
void f(volatile void *p, volatile void *q) {
  memcpy(p, q, 4);
}
```

... (in C, where we permit such implicit conversions) to use a volatile memcpy intrinsic. Is that an intentional change?
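For context on why this scenario is C-specific: the qualifier-dropping conversion from `volatile void *` to `void *` is permitted (with a warning) as an implicit conversion in C but is ill-formed in C++, which can be checked directly:

```cpp
#include <type_traits>

// In C++, dropping volatile in an implicit pointer conversion is not
// permitted; adding it is. In C, both directions are accepted (the
// former with a diagnostic), which is where getPtrArgType would look
// through the conversion.
static_assert(!std::is_convertible_v<volatile void *, void *>);
static_assert(std::is_convertible_v<void *, volatile void *>);
```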


================
Comment at: clang/lib/CodeGen/CGExpr.cpp:1167-1168
 
+  if (E->getType()->isArrayType())
+    return EmitArrayToPointerDecay(E, BaseInfo, TBAAInfo);
+
----------------
Do we still need this? We should be doing the decay in `Sema`.
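The decay in question is the standard array-to-pointer conversion: an array expression used as a call argument is converted to a pointer to its first element before the callee sees it, so once Sema inserts the conversion, CodeGen should only encounter pointer-typed arguments. A minimal sketch (the helper `demo` is illustrative only):

```cpp
#include <cstring>

// Sketch of the decay: `dst` and `src` are arrays, but they decay to
// char* at the call site before std::memcpy sees them.
inline bool demo() {
  char src[4] = {'a', 'b', 'c', 'd'};
  char dst[4] = {};
  std::memcpy(dst, src, sizeof dst);
  return dst[2] == 'c' && static_cast<const void *>(src) == &src[0];
}
```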


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79279/new/

https://reviews.llvm.org/D79279
