[PATCH] D79279: Add overloaded versions of builtin mem* functions

JF Bastien via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Aug 4 21:21:32 PDT 2020


jfb added a comment.

In D79279#2195299 <https://reviews.llvm.org/D79279#2195299>, @rjmccall wrote:

> Patch looks basically okay to me, although I'll second Richard's concern that we shouldn't absent-mindedly start producing overloaded `memcpy`s for ordinary `__builtin_memcpy`.

Yeah I think that's a leftover from the first patch. Should I drop it? At the same time, address spaces are currently accidentally "working", should I drop that in a patch before this change? Or leave as-is?



================
Comment at: clang/docs/LanguageExtensions.rst:2446
+order in which they occur (and in which they are observable) can only be
+guaranteed using appropriate fences around the function call. Element size must
+therefore be a lock-free size for the target architecture. It is a runtime
----------------
rjmccall wrote:
> "*The* element size must..."  But I would suggest using "access size" consistently rather than "element size".
I'm being consistent with the naming for IR, which uses "element" as well. I'm not enamored with the naming, but wanted to point out the purposeful consistency to make sure you preferred "access size". Without precedent I would indeed prefer "access size", but have a slight preference for consistency here. This is extremely weakly held preference.

(I fixed "the").


================
Comment at: clang/docs/LanguageExtensions.rst:2454
+and might be non-uniform throughout the operation.
+
+The builtins can be used as building blocks for different facilities:
----------------
rsmith wrote:
> jfb wrote:
> > rsmith wrote:
> > > From the above description, I think the documentation is unclear what the types `T` and `U` are used for. I think the answer is something like:
> > > 
> > > """
> > > The types `T` and `U` are required to be trivially-copyable types, and `byte_element_size` (if specified) must be a multiple of the size of both types. `dst` and `src` are expected to be suitably aligned for `T` and `U` objects, respectively.
> > > """
> > > 
> > > But... we actually get the alignment information by analyzing pointer argument rather than from the types, just like we do for `memcpy` and `memmove`, so maybe the latter part is not right. (What did you intend regarding alignment for the non-atomic case?) The trivial-copyability and divisibility checks don't seem fundamentally important to the goal of the builtin, so I wonder if we could actually just use `void` here and remove the extra checks. (I don't really have strong views one way or the other on this, except that we should either document what `T` and `U` are used for or make the builtins not care about the pointee type beyond its qualifiers.)
> > You're right. I've removed most treatment of `T` / `U`, and updated the documentation.
> > 
> > I left the trivial copy check, but `void*` is a usual escape hatch.
> > 
> > Divisibility is now only checked for `size` / `element_size`.
> Please document the trivial copy check.
Should I bubble this to the rest of the builtin in a follow-up patch? I know there are cases where that'll cause issues, but I worry that it would be a pretty noisy diagnostic (especially if we instead bubble it to C's `memcpy` instead).


================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:2639-2640
   case Builtin::BI__builtin_mempcpy: {
+    QualType DestTy = getPtrArgType(CGM, E, 0);
+    QualType SrcTy = getPtrArgType(CGM, E, 1);
     Address Dest = EmitPointerWithAlignment(E->getArg(0));
----------------
rsmith wrote:
> jfb wrote:
> > rsmith wrote:
> > > Looking through implicit conversions in `getPtrArgType` here will change the code we generate for cases like:
> > > 
> > > ```
> > > void f(volatile void *p, volatile void *q) {
> > >   memcpy(p, q, 4);
> > > }
> > > ```
> > > 
> > > ... (in C, where we permit such implicit conversions) to use a volatile memcpy intrinsic. Is that an intentional change?
> > I'm confused... what's the difference that this makes for the pre-existing builtins? My intent was to get the `QualType` unconditionally, but I can conditionalize it if needed... However this ought to make no difference:
> > ```
> > static QualType getPtrArgType(CodeGenModule &CGM, const CallExpr *E,
> >                               unsigned ArgNo) {
> >   QualType ArgTy = E->getArg(ArgNo)->IgnoreImpCasts()->getType();
> >   if (ArgTy->isArrayType())
> >     return CGM.getContext().getAsArrayType(ArgTy)->getElementType();
> >   if (ArgTy->isObjCObjectPointerType())
> >     return ArgTy->castAs<clang::ObjCObjectPointerType>()->getPointeeType();
> >   return ArgTy->castAs<clang::PointerType>()->getPointeeType();
> > }
> > ```
> > and indeed I can't see the example you provided change in IR from one to the other. The issue I'm working around is that getting it unconditionally would make ObjC code sad when `id` is passed in as I outlined above.
> The example I gave should produce a non-volatile memcpy, and used to do so (we passed `false` as the fourth parameter to `CreateMemCpy`). With this patch, `getPtrArgType`will strip off the implicit conversion from `volatile void*` to `void*` in the argument type, so `isVolatile` below will be `true`, so I think it will now create a volatile memcpy for the same testcase. If that's not what's happening, then I'd like to understand why not :)
> 
> I'm not saying that's necessarily a bad change, but it is a change, and it's one we should make intentionally if we make it at all.
Oh yes, sorry I thought you were talking about something that `getPtrArgType` did implicitly! Indeed the C code behaves differently in that it doesn't just strip `volatile` anymore.

I'm not super thrilled by the default C behavior, and I think this new behavior removes a gotcha, and is in fact what I was going for in the first iteration of the patch. Now that I've separated the builtin I agree that it's a bit odd... but it seems like the right thing to do anyways? But it no longer matches the C library function to do so.

FWIW, this currently "works as you'd expect":
```
void f(__attribute__((address_space(32))) void *dst, const void *src, int sz) {
    __builtin_memcpy(dst, src, sz);
}
```
https://godbolt.org/z/dcWxcK

and I think that's completely accidental because the C library function doesn't (and, as John pointed out earlier, the builtin is meant to act like the C function).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79279/new/

https://reviews.llvm.org/D79279



More information about the cfe-commits mailing list