[PATCH] D97915: [Coroutines] Handle overaligned frame allocation
Yuanfang Chen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 29 16:48:07 PDT 2021
ychen added a comment.
> Thanks for clarifying. Let's solve the semantics problem first.
> With the introduction about 'raw frame', I think it's necessary to introduce this concept in the section 'Switched-Resume Lowering' or even the section 'Introduction' in the document. Add a section to tell the terminology is satisfied too.
Done.
> Then why we defined both 'llvm.coro.raw.frame.ptr.offset' and 'llvm.coro.raw.frame.ptr.addr' together? It looks like refer to the same value finally. It looks like 'llvm.coro.raw.frame.ptr.offset' are trying to solve the problem about memory leak. But I think we could use llvm.coro.raw.frame.ptr.addr directly instead of traversing the frame (Maybe we need to add an intrinsic `llvm.coro.raw.size`). Then we can omit a field in the frame to save space.
("llvm.coro.raw.frame.ptr.offset" is an offset from coroutine frame address instead of raw frame pointer)
Apologies for the confusion. I've briefly explained it here https://reviews.llvm.org/D102145#2752445 I think it is not clear. "llvm.coro.raw.frame.ptr.addr" is conceptually "the address of a coroutine frame field storing the `raw frame pointer`" only after `insertSpills` in CoroFrame.cpp. Before that, "llvm.coro.raw.frame.ptr.addr" is actually an alloca storing the `raw frame pointer` (try grepping "alloc.frame.ptr" in this review page). Using "llvm.coro.raw.frame.ptr.offset" instead of "llvm.coro.raw.frame.ptr.addr" is doable which looks like below, please check line 31. The downside is that the write to coroutine frame is not through an alloca but a direct write. It is unusual because all fields in the frame are stored as 1. special/header fields 2. alloca 3. splills. Doing the write indirectly as Alloca makes me comfortable. The tradeoff is one extra intrinsic "llvm.coro.raw.frame.ptr.addr". What do you think?
19 coro.alloc.align: ; preds = %coro.alloc.check.align
20 %3 = sub nsw i64 64, 16
21 %4 = add i64 128, %3
22 %call1 = call noalias nonnull i8* @_Znwm(i64 %4) #13
23 %mask = sub i64 64, 1
24 %intptr = ptrtoint i8* %call1 to i64
25 %over_boundary = add i64 %intptr, %mask
26 %inverted_mask = xor i64 %mask, -1
27 %aligned_intptr = and i64 %over_boundary, %inverted_mask
28 %diff = sub i64 %aligned_intptr, %intptr
29 %aligned_result = getelementptr inbounds i8, i8* %call1, i64 %diff
30 call void @llvm.assume(i1 true) [ "align"(i8* %aligned_result, i64 64) ]
31 store i8* %call1, i8** %alloc.frame.ptr, align 8
; Replace line 31 with below, and must makes sure line 46~line 48 is skipped.
; %poff = call i32 @llvm.coro.raw.frame.ptr.offset.i32()
; %addr = getelementptr inbounds i8, i8* %aligned_result, i32 %poff
; %addr1 = bitcast i8* %addr to i8**
; store i8* %call1, i8** %addr1, align 8
32 br label %coro.init.from.coro.alloc.align
33
34 coro.init.from.coro.alloc.align: ; preds = %coro.alloc.align
35 %aligned_result.coro.init = phi i8* [ %aligned_result, %coro.alloc.align ]
36 br label %coro.init
37
38 coro.init: ; preds = %coro.init.from.entry, %coro.init.from.coro.alloc.align, %cor
o.init.from.coro.alloc
39 %5 = phi i8* [ %.coro.init, %coro.init.from.entry ], [ %call.coro.init, %coro.init.from.coro.alloc ], [ %aligned_result
.coro.init, %coro.init.from.coro.alloc.align ]
40 %FramePtr = bitcast i8* %5 to %f0.Frame*
41 %resume.addr = getelementptr inbounds %f0.Frame, %f0.Frame* %FramePtr, i32 0, i32 0
42 store void (%f0.Frame*)* @f0.resume, void (%f0.Frame*)** %resume.addr, align 8
43 %6 = select i1 true, void (%f0.Frame*)* @f0.destroy, void (%f0.Frame*)* @f0.cleanup
44 %destroy.addr = getelementptr inbounds %f0.Frame, %f0.Frame* %FramePtr, i32 0, i32 1
45 store void (%f0.Frame*)* %6, void (%f0.Frame*)** %destroy.addr, align 8
46 %7 = getelementptr inbounds %f0.Frame, %f0.Frame* %FramePtr, i32 0, i32 2
47 %8 = load i8*, i8** %alloc.frame.ptr, align 8
48 store i8* %8, i8** %7, align 8
49 br label %AllocaSpillBB
50
51 AllocaSpillBB: ; preds = %coro.init
52 %.reload.addr = getelementptr inbounds %f0.Frame, %f0.Frame* %FramePtr, i32 0, i32 4
53 %ref.tmp.reload.addr = getelementptr inbounds %f0.Frame, %f0.Frame* %FramePtr, i32 0, i32 5
54 %agg.tmp.reload.addr = getelementptr inbounds %f0.Frame, %f0.Frame* %FramePtr, i32 0, i32 6
55 %ref.tmp5.reload.addr = getelementptr inbounds %f0.Frame, %f0.Frame* %FramePtr, i32 0, i32 7
56 %agg.tmp8.reload.addr = getelementptr inbounds %f0.Frame, %f0.Frame* %FramePtr, i32 0, i32 8
57 %__promise.reload.addr = getelementptr inbounds %f0.Frame, %f0.Frame* %FramePtr, i32 0, i32 10
58 br label %PostSpill
> Then I am a little confused for the design again, since we would treat the value for CoroBegin as the address of coroutine frame in the past and it looks like to be the raw frame now. Let me reconsider if it is OK.
The returned value of CoroBegin is still coroutine frame not a raw frame even if the frame is overaligned. You could check the above code.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D97915/new/
https://reviews.llvm.org/D97915
More information about the llvm-commits
mailing list