[llvm-dev] GC-parseable element atomic memcpy/memmove

Mon Sep 28 10:56:00 PDT 2020

In general, I am supportive of this direction.  It seems like an 
entirely reasonable solution.  I do have some comments below, but 
they're mostly of the "how do we generalize this?" variety.

First, let's touch on the attribute.

My first concern is naming; I think the use of "statepoint" here is
problematic, as the attribute doesn't describe the lowering strategy
needed (e.g. statepoints), but the conceptual support required (e.g. a
safepoint).  This could be resolved by simply renaming it to
require-safepoint.

But that brings us to a broader point.  We've chosen to build in the
assumption that intrinsics don't require safepoints.  If all we want is
for some intrinsics *to* require safepoints, why isn't this simply a
tweak to the existing code?  callsGCLeafFunction already has a small
list of intrinsics which can have safepoints.

I think you can completely remove the need for this attribute by a)
adding the atomic memcpy variants to the exclude list in
callsGCLeafFunction, and b) using the existing "gc-leaf-function"
attribute on most of the calls the frontend generates.
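
To make the shape of that change concrete, here is a minimal
self-contained C++ sketch.  It is a simplified model, not the actual
LLVM code: CallSite, callsGCLeaf and intrinsicsThatMaySafepoint are
hypothetical names, intrinsics are matched by base name without type
suffixes, and the contents of the list are illustrative.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical, simplified stand-in for a call site (not llvm::CallBase).
struct CallSite {
  std::string callee;  // base name of the callee, without type suffixes
  bool isIntrinsic;    // true for llvm.* intrinsics
  bool markedGCLeaf;   // models the "gc-leaf-function" attribute
};

// Illustrative "exclude list": intrinsics that may contain a safepoint
// even though intrinsics are treated as GC leaves by default.
inline const std::set<std::string> &intrinsicsThatMaySafepoint() {
  static const std::set<std::string> names = {
      "llvm.experimental.gc.statepoint",
      "llvm.experimental.deoptimize",
      // Entries added under the suggestion above:
      "llvm.memcpy.element.unordered.atomic",
      "llvm.memmove.element.unordered.atomic",
  };
  return names;
}

// Returns true if the call can be assumed not to reach a safepoint.
inline bool callsGCLeaf(const CallSite &cs) {
  assert(!cs.callee.empty() && "call site must name a callee");
  // An explicit attribute always wins, so the frontend can opt out.
  if (cs.markedGCLeaf)
    return true;
  // Intrinsics are leaves unless they appear on the exclude list.
  if (cs.isIntrinsic)
    return intrinsicsThatMaySafepoint().count(cs.callee) == 0;
  // Ordinary calls are conservatively assumed to reach a safepoint.
  return false;
}
```

With this shape, a frontend that wants the old behaviour for a given
copy marks that call "gc-leaf-function", and only unmarked calls get
the safepoint lowering.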

Second, let's discuss the signature for the runtime function.

I think you should use a signature for the runtime call which takes base 
pointers and offsets, not base pointers and derived pointers.  Why?  
Because passing derived pointers in registers for arguments presumes 
that the runtime knows how to map a record in the stackmap to where a 
callee might have shuffled the argument to.  Some runtimes may support 
this, others may not.  Given that the offset scheme is just as simple
to implement, being considerate and minimizing the runtime support
required seems worthwhile.

On x86, the cost of a subtract (to produce the offset in the worst
case) and an LEA (to produce the derived pointer again inside the
runtime routine) is pretty minimal, particularly since the former is
likely to be optimized away and the latter folded into the addressing
mode.
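
As a rough sketch of what a base + offset runtime routine could look
like, consider the C++ below.  Everything here is an assumption made
for illustration: the names offsetFromBase, copyWithSafepoints and
SafepointPoll are hypothetical, the poll callback stands in for
whatever safepoint mechanism a runtime actually uses, and the base
pointers are passed as slots that a GC is allowed to update.  It uses
plain memcpy, so element-wise atomicity and overlapping (memmove)
ranges are not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical safepoint hook; a real runtime may relocate objects here
// and rewrite the base pointer slots it knows about.
using SafepointPoll = void (*)(void *ctx);

// The "subtract": turn a derived pointer into an offset from its base.
inline std::size_t offsetFromBase(const unsigned char *base,
                                  const unsigned char *derived) {
  assert(derived >= base && "derived pointer must not precede its base");
  return static_cast<std::size_t>(derived - base);
}

// Chunked copy that polls for a safepoint between chunks.  Only base
// pointers and offsets cross the call boundary.
inline void copyWithSafepoints(unsigned char **destBaseSlot,
                               std::size_t destOff,
                               unsigned char **srcBaseSlot,
                               std::size_t srcOff, std::size_t len,
                               std::size_t chunk, SafepointPoll poll,
                               void *ctx) {
  assert(chunk > 0 && "chunk size must be positive");
  std::size_t done = 0;
  while (done < len) {
    const std::size_t n = std::min(chunk, len - done);
    // The "LEA": re-derive both pointers from the (possibly updated)
    // bases on every iteration.
    unsigned char *d = *destBaseSlot + destOff + done;
    const unsigned char *s = *srcBaseSlot + srcOff + done;
    std::memcpy(d, s, n);
    done += n;
    // Objects may move here; no derived pointer is live across the poll.
    if (done < len && poll)
      poll(ctx);
  }
}
```

The point of the design is that no derived pointer is live across the
poll, so the runtime only ever has to relocate the two bases.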

Finally, it's also worth noting that some (but not all) GCs can convert
from an interior derived pointer to the base of the containing object.
With the memcpy family we know that either the pointers are all interior
derived, or the length must be zero.  However, since not all GCs support
this conversion, we don't want to rely on it.

Philip

On 9/18/20 4:51 PM, Artur Pilipenko via llvm-dev wrote:
> TLDR: a proposal to add GC-parseable lowering to element atomic
> memcpy/memmove intrinsics controlled by a new "requires-statepoint"
> call attribute.
>
> Currently llvm.{memcpy|memmove}.element.unordered.atomic calls are
> considered GC leaf functions (like most other intrinsics). As a
> result, GC cannot occur while a copy operation is in progress. This
> might have a negative effect on GC latencies when large amounts of
> data are copied. To avoid this problem, copying large amounts of data
> can be done in chunks with GC safepoints in between. We'd like to be
> able to represent such a copy using existing intrinsics [1].
>
> For that I'd like to propose a new attribute for
> llvm.{memcpy|memmove}.element.unordered.atomic calls
> "requires-statepoint". This attribute on a call will result in a
> different lowering, which makes it possible to have a GC safepoint
> during the copy operation.
>
> There are three parts to the new lowering:
>
> 1) The calls with the new attribute will be wrapped into a statepoint
> by RewriteStatepointsForGC (RS4GC). This way the stack at the calls
> will be GC parseable.
>
> 2) Currently these intrinsics are lowered to GC leaf calls to the symbols
> __llvm_{memcpy|memmove}_element_unordered_atomic_<element_size>.
> The calls with the new attribute will be lowered to calls to different
> symbols, let's say
> __llvm_{memcpy|memmove}_element_unordered_atomic_safepoint_<element_size>.
> This way the runtime can provide copy implementations with safepoints.
>
> 3) Currently memcpy/memmove calls take derived pointers as arguments.
> If we copy with safepoints we might need to relocate the underlying
> source/destination objects on a safepoint. In order to do this we need
> to know the base pointers as well. How do we make the base pointers
> available in the copy routine? I suggest we add them explicitly as
> arguments during lowering.
>
> For example:
> __llvm_memcpy_element_unordered_atomic_safepoint_1(
>   dest_base, dest_derived, src_base, src_derived, length)
>
> It will be up to RS4GC to do the new lowering and prepare the arguments.
> RS4GC knows how to compute base pointers for a given derived pointer.
> It also already does lowering for deoptimize intrinsics by replacing
> an intrinsic call with a symbol call. So there is a precedent here.
>
> Other alternatives:
> - Change llvm.{memcpy|memmove}.element.unordered.atomic API to accept
>   base pointers + offsets instead of derived pointers. This will
>   require autoupgrade of old representation. Changing API of a generic
>   intrinsic to facilitate GC-specific lowering doesn't look like the
>   best idea. This will not work if we want to do the same for non-atomic
>   intrinsics.
> - Teach GC infrastructure to record base pointers for all derived
>   pointer arguments. This looks like overkill for a single use case.
>
> Here is the proposed implementation in a single patch:
> https://reviews.llvm.org/D87954
> If there are no objections I will split it into individual reviews and
> add langref changes.
>
> Thoughts?
>
> Artur
>
> [1] An alternative approach would be to make the frontend generate a
> chunked copy loop with a safepoint inside. The downsides are:
> - It's harder for the optimizer to see that this loop is just a copy
>   of a range of bytes.
> - It forces one particular lowering with the chunked loop inlined in
>   compiled code. We can't outline the copy loop into the copy routine.
>   With the intrinsic representation of a chunked copy we can choose
>   different lowering strategies if we want.
> - In our system we have to outline the copy loop into the copy routine
>   due to interactions with deoptimization.
>