[llvm-dev] RFC: Add atomic versions of the llvm.memcpy, llvm.memmove and llvm.memset intrinsics

Fri Nov 11 06:13:26 PST 2016

Hi,

LLVM's memory intrinsics are quite useful for optimizing frequently used
memory operations. Unfortunately, these intrinsics are not applicable to
languages that guarantee atomicity for their memory accesses (Java, for example).

To overcome this limitation, I'm thinking about adding a set of intrinsics
that execute as a series of unordered atomic memory accesses.
To be more specific, here is the definition I have in mind:

  declare void @llvm.memcpy_atomic.p0i8.p0i8.i32(i8* <dest>, i8* <src>, i32 <num_elements>,
                                                 i32 <element_size>, i32 <align>, i1 <isvolatile>)

It closely mimics the original memcpy intrinsic. The only difference is that we now explicitly
specify element_size. Semantically, memcpy_atomic is equivalent to an explicit IR loop
in which each load and store is marked as unordered atomic. This definition should give
sufficient freedom to the optimizer while allowing us to transform pre-existing IR loops
into these intrinsics. 'memcpy_atomic' will be lowered into a '__memcpy_atomic' library call
(I'm not really certain about the function name). 'memset_atomic' and 'memmove_atomic'
can both be defined in a similar way.
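
As a rough sketch of the intended semantics (not a proposed lowering), a call with
element_size = 4 and isvolatile = false should behave like the loop below. The function
name and value names are made up for illustration, and the i8* arguments are assumed
to have already been bitcast to i32*:

  ; Hypothetical expansion for element_size = 4; all names are illustrative only.
  define void @memcpy_atomic_i32_loop(i32* %dest, i32* %src, i32 %num_elements) {
  entry:
    ; Nothing to copy when num_elements is zero.
    %empty = icmp eq i32 %num_elements, 0
    br i1 %empty, label %exit, label %loop

  loop:
    %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
    %src.addr = getelementptr inbounds i32, i32* %src, i32 %i
    %dest.addr = getelementptr inbounds i32, i32* %dest, i32 %i
    ; Each element is copied with one unordered atomic load and one
    ; unordered atomic store, so no element can be observed torn.
    %val = load atomic i32, i32* %src.addr unordered, align 4
    store atomic i32 %val, i32* %dest.addr unordered, align 4
    %i.next = add i32 %i, 1
    %done = icmp eq i32 %i.next, %num_elements
    br i1 %done, label %exit, label %loop

  exit:
    ret void
  }

Going the other way, a loop of this shape is the kind of pre-existing IR that could be
recognized and rewritten into a single memcpy_atomic call.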

It's tempting to model atomic behaviour by adding an extra argument to the existing
intrinsics. However, doing so would require teaching all relevant optimizations and
every backend to respect this new argument. This would not only be a considerable
amount of work, but would also be quite error-prone.

What do folks think? Does this design make sense? Would it be useful for anyone
else developing for languages with constraints similar to Java's?

 — Igor