[cfe-dev] [RFC] volatile mem* builtins
JF Bastien via cfe-dev
cfe-dev at lists.llvm.org
Wed May 6 16:27:53 PDT 2020
> On May 6, 2020, at 4:23 PM, John McCall <rjmccall at apple.com> wrote:
> On 6 May 2020, at 18:40, JF Bastien wrote:
> Hi fans of volatility!
> I’d like to add volatile overloads to mem* builtins, and authored a patch: https://reviews.llvm.org/D79279
> The mem* builtins are often used (or should be used) in places where time-of-check time-of-use security issues are important (e.g. copying from untrusted buffers), because they prevent multiple reads or multiple writes from occurring at the untrusted memory location. The current builtins don't accept volatile pointee parameters in C++, and merely warn about such parameters in C, which leads to confusion. In these settings, it's useful to overload the builtin and permit volatile pointee parameters. Code generation then directly emits the existing volatile variant of the mem* builtin function call, which ensures that the affected memory location is accessed only once (thereby preventing double reads under an adversarial memory mapping).
> Side-note: yes, ToCToU avoidance is a valid use for volatile <https://wg21.link/p1152r0#uses>.
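[Editor's illustration] The ToCToU pattern described above can be sketched in portable C++ without any new builtin. This is only a minimal sketch of the idea, not the patch's implementation: the helper names copy_from_volatile and checked_length, and the 16-byte message layout, are hypothetical and exist only for this example. The point is that the untrusted buffer is read once into a local snapshot, and every later check and use looks at the snapshot only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the patch): copy n bytes out of a
// volatile source. Each source byte is read by exactly one volatile load.
inline void copy_from_volatile(unsigned char *dst,
                               const volatile unsigned char *src,
                               std::size_t n) {
  assert(dst != nullptr && src != nullptr);
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = src[i];  // one volatile load per source byte
}

// Assumed layout for illustration: byte 0 is a length field, followed by
// up to 15 payload bytes.
constexpr std::size_t kMessageSize = 16;
constexpr unsigned kMaxPayload = 15;

// Snapshot the untrusted message, then validate the snapshot. Returns the
// payload length, or -1 if the length field is out of range.
inline int checked_length(const volatile unsigned char *untrusted) {
  unsigned char local[kMessageSize];
  copy_from_volatile(local, untrusted, sizeof local);
  unsigned len = local[0];  // check and use both see the same local value
  return len <= kMaxPayload ? static_cast<int>(len) : -1;
}
```

A byte loop like this gives up the wide copies a real memcpy lowering could use, which is part of why a builtin that maps onto the existing volatile intrinsic is attractive.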
> My patch currently only affects:
> There’s a bunch more “mem-like” functions such as bzero, but those 3 are the ones I expect to be used the most at the moment. We can add others later.
> John brought up the following: __builtin_memcpy is a library builtin, which means its primary use pattern is #define tricks in the C standard library headers that redirect calls to the memcpy library function. So doing what you're suggesting to __builtin_memcpy is also changing the semantics of memcpy, which is not something we should do lightly. If we were talking about changing a non-library builtin function, or introducing a new builtin, the considerations would be very different.
> I can instead add __builtin_volatile_* functions which are overloaded on at least one pointee parameter being volatile.
> So, to be clear, you would like there to be some way to request a volatile memcpy (etc.). You don’t need it to specifically be __builtin_memcpy (etc.) — i.e. you’re not relying on this automatically triggering when users call memcpy — you just need some way to spell it.
> A few thoughts:
> A memcpy/memmove is conceptually a load from one address and a store to another. It is potentially valuable to know that e.g. only the store is volatile. We can’t express that in today’s LLVM intrinsics, but it’s certainly imaginable that we could express it in the future, the same way that we added the ability to record different alignments for both sides. So I think it would be nice if whatever we do here allows us to pick up on the difference, e.g. by triggering based on the qualification of the source/dest pointers.
Indeed, that’s why I went with overloading.
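[Editor's illustration] A rough sketch of what "triggering based on the qualification of the source/dest pointers" could look like, written as an ordinary C++ template rather than a compiler builtin. The names qualified_copy and Access are hypothetical and do not come from the patch; the sketch only shows that the volatility of each side can be picked up independently from the pointee types, so a volatile store with a plain load is distinguishable from the reverse.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Which side(s) of the copy were treated as volatile (illustrative only).
enum class Access { Plain, VolatileStore, VolatileLoad, VolatileBoth };

// Hypothetical type-directed copy: the qualifiers on the pointee types
// decide, separately for each side, whether accesses are volatile.
template <typename D, typename S>
Access qualified_copy(D *dst, const S *src, std::size_t n) {
  static_assert(!std::is_const<D>::value, "destination must be writable");
  assert(dst != nullptr && src != nullptr);

  constexpr bool vdst = std::is_volatile<D>::value;
  constexpr bool vsrc = std::is_volatile<S>::value;

  // Byte types that carry volatile only on the side that asked for it.
  using DstByte = typename std::conditional<vdst, volatile unsigned char,
                                            unsigned char>::type;
  using SrcByte = typename std::conditional<vsrc, const volatile unsigned char,
                                            const unsigned char>::type;

  DstByte *d = reinterpret_cast<DstByte *>(dst);
  SrcByte *s = reinterpret_cast<SrcByte *>(src);
  for (std::size_t i = 0; i < n; ++i)
    d[i] = s[i];  // volatile load and/or store, depending on the types

  return vdst ? (vsrc ? Access::VolatileBoth : Access::VolatileStore)
              : (vsrc ? Access::VolatileLoad : Access::Plain);
}
```

For example, calling qualified_copy with a volatile unsigned char destination and a plain unsigned char source selects Access::VolatileStore. A builtin could record the same per-side distinction in the frontend even though, as noted above, today's LLVM intrinsics cannot yet express it.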
> There are other qualifiers that can meaningfully contribute to the operation here besides volatile, such as restrict and (more importantly) address spaces. And again, for the copy operations these might differ between the two pointer types.
> In both cases, I’d say that the logical design is to allow the pointers to be to arbitrarily-qualified types. We can then propagate that information from the builtin into the LLVM intrinsic call as best as we’re allowed. So I think you should make builtins called something like __builtin_overloaded_memcpy (name to be decided) and just have their semantics be type-directed.
Ah yes, I’d like to hear what others think of this. I hadn’t thought about it before you brought it up, and it sounds like a good idea.
> I do think it would be treacherous to actually apply these semantics to memcpy via __builtin_memcpy, though.
Yeah I think you’ve convinced me.