[llvm-dev] Should I add intrinsics to write my own automatic reference counting passes?

Wed Nov 18 16:04:14 PST 2020

On 18 Nov 2020, at 7:39, Florian Hahn wrote:
>> On Nov 18, 2020, at 12:08, Ola Fosheim Grøstad via llvm-dev 
>> <llvm-dev at lists.llvm.org> wrote:
>>
>> My experience with LLVM is limited, but I am trying to figure out how 
>> to add optimizations for automatic reference counting. The GC 
>> documentation mentions that patch-points could be useful, but it does 
>> not state how they would be useful. If this is a FAQ, please let me 
>> know...
>>
>> So this is my idea at this point:
>>
>> The context is a C++-like language with an aggregate type that is 
>> always reference counted. The type system differentiates between 
>> pointers to objects that are shared between threads and pointers to 
>> objects that are not. I also want a pass that turns a shared_ptr into 
>> a nonshared_ptr if that can be proven safe.
>>
>> So what I want to do is to wrap up all the "events" that are relevant 
>> as intrinsics and run some simplification passes, then use the 
>> pointer capture/escape analysis that LLVM has to turn shared_ptrs to 
>> nonshared_ptrs and to elide nonatomic/atomic acquire/release. So 
>> basically, the intrinsics will also serve as the type annotations.
>>
>> The compilation will then follow this pattern:
>> 1. generate LLVM IR
>> 2. simplification passes
>> 3. pass for turning shared_ptr to nonshared_ptr
>> 4. pass for eliding acquire/release
>> 5. pass that replaces the custom intrinsics with function calls
>> 6. full optimization passes
>>
>> I think about having the following intrinsics:
>>
>> ptr = cast_untyped_to_nonshared(ptr) // e.g. used after allocation
>> ptr = cast_to_shared_irreversible(ptr)  // basically a gateway to other threads
>> nonshared_acquire(ptr)
>> nonshared_release(ptr)
>> shared_acquire(ptr)
>> shared_release(ptr)
>>
>> I also want weak_ptr at a later stage, but leave it out for now to 
>> keep the complexity manageable.
>>
>> Is this idea completely unreasonable?

The main problem for this sort of optimization is that it is difficult 
to do on an IR like LLVM’s, where the semantic relationships between 
values that exist in the program have been lowered into a sequence of 
primitive operations with no remaining structural relationship.  What 
you really want is an IR that preserves those relationships of value 
definition and use, allowing you to say e.g. that one value is a copy of 
another, that ownership of a value is passed into something, that a 
value is used for a certain duration but then is no longer used, and so 
on.  With this sort of representation, the optimization turns into 
fairly straightforward value-forwarding and lifetime manipulations.  
Dealing with unrelated operations and retroactively attempting to infer 
relationships, as LLVM IR must, turns it into a fundamentally difficult 
analysis that often relies on semantic knowledge that isn’t expressed 
in the IR, so that you’re actually reasoning about what happens under 
a “well-behaved” frontend.
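
To make the contrast concrete, here is a minimal sketch in Python of a toy 
IR that keeps copy and destroy operations explicit. It is not LLVM IR and 
not the representation of any real compiler; the `Inst` type, the 
`copy`/`use`/`destroy` opcodes, and the `forward_copies` function are all 
hypothetical names invented for illustration, and the sketch assumes a 
single basic block in which every value is defined once. Under those 
assumptions, eliding a redundant acquire/release pair reduces to forwarding 
the original value to the users of its copy:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inst:
    """One instruction of the hypothetical toy IR.

    op is "copy" (args = (dst, src)), "use" (args = (value,)),
    or "destroy" (args = (value,)).
    """
    op: str
    args: tuple


def forward_copies(block):
    """Forward `dst = copy src` when src is only destroyed afterwards.

    If the only later mention of src is `destroy src`, the copy and that
    destroy cancel out: every later mention of dst is renamed to src.
    Assumes straight-line code with single-definition values.
    """
    block = list(block)
    changed = True
    while changed:
        changed = False
        for i, inst in enumerate(block):
            if inst.op != "copy":
                continue
            dst, src = inst.args
            # Positions of all later instructions that mention src.
            later = [j for j in range(i + 1, len(block)) if src in block[j].args]
            if len(later) != 1 or block[later[0]].op != "destroy":
                continue  # src is still needed in its own right; keep the copy
            dead = later[0]
            rewritten = block[:i]  # drop the copy itself
            for j in range(i + 1, len(block)):
                if j == dead:
                    continue  # drop the matching destroy of src
                x = block[j]
                rewritten.append(
                    Inst(x.op, tuple(src if a == dst else a for a in x.args))
                )
            block = rewritten
            changed = True
            break  # restart the scan on the rewritten block
    return block


example = [
    Inst("copy", ("b", "a")),   # b = copy a   (an acquire)
    Inst("destroy", ("a",)),    # destroy a    (a release)
    Inst("use", ("b",)),
    Inst("destroy", ("b",)),
]
optimized = forward_copies(example)
```

In this sketch the copy/destroy pair disappears and the remaining 
instructions refer to `a` directly. The rewrite is only this simple because 
the toy IR states outright that `b` is a copy of `a`; once the operations 
have been lowered to unrelated calls, that relationship has to be inferred 
again.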

John.