[cfe-dev] ARC + array literal patch feedback request

Tue Mar 19 22:44:10 PDT 2013

On Mar 14, 2013, at 6:44 PM, Jesse Rusak <me at jesserusak.com> wrote:
> As a way to get my feet wet with Clang, I'm interested in solving what I think is a bug in array literals under ARC. I have a start on a patch, but I'd love for someone to tell me if I'm going in a reasonable direction. The motivating code is:
> 
> @implementation AClass {
>    id ivar;
> }
> 
> - (void)test {
> 	id local = ivar;
> 	[self aMethod];
> 	@[local, ivar];
> }
> 
> //...
> 
> The bug is that the object stored in `local`, which is retained before the call to `aMethod`, is released before the call to build the array (in -O1). If `aMethod` modifies `ivar`, then this can cause the `local` value to be deallocated before it's passed to the array. It seems that the array literal code gen produces a local buffer that the objects are stored in, and that buffer is passed to +[NSArray arrayWithObjects:count:]. Between the store to that buffer and the call, there's a window where those objects could be released, since the ARC optimizer doesn't know about the value stored in the temporary buffer.
> 
> The solution I've put together is to mark that temporary buffer as strong and have a cleanup for it afterwards. (And ditto for dictionary literals and their key buffer.) This seems like it's the "right thing" to do, but will be extra work at runtime. The alternatives seem to be to make ARC understand that the value is still needed, or to require `local` in the above code be marked with objc_precise_lifetime, which seems surprising to me.
> 
> I'm attaching a (tiny) patch with my approach and the emitted LLVM before and after. I'd love to hear any feedback about the solution in general and any particulars about the code. If it seems reasonable, I'll work on adjusting the array literal tests so they pass again in the hopes of getting it committed.

This is very interesting.

If we look at this under the formal model (but break down the
array literal into lower-level language concepts), then it's clear that
we're violating the model.  The object is held in a variable with
imprecise lifetime, but the last use of that variable was to copy its
contents into a cell of an __unsafe_unretained buffer, at which point
dependence is cut and we're no longer required to keep it alive.
So something needs to change here.

However, we'd really like to avoid retaining and releasing every value
in the array — it'd be different if we could transfer retains into the
NSArray, but that's not how the API we're using works.  In general,
it's likely to be very difficult for the optimizer to remove most of those
retains and releases — and if it can, it'll probably turn it right back
into the code it's getting right now.

So instead I would suggest just creating a "barrier" to tell the ARC
optimizer that the function still formally depends on the values stored
in the array until after that call is complete.  This should be quite
easy:  just collect all the objects in the array and then emit a call
like this right after the message-send:
  call void @clang.arc.use_objects(i8* %value_from_local, i8* %value_from_ivar) nounwind

The ARC optimizer would then see that the value is still used and know
not to move the release before this call.

You would then remove all calls to that function in the late ARC pass.

John.