[PATCH] D126723: [pseudo] GC GSS nodes, reuse them with a freelist

Wed Jun 1 03:21:13 PDT 2022

hokein added inline comments.

================
Comment at: clang-tools-extra/pseudo/lib/GLR.cpp:87
     NewHeads.clear();
+    MaybeGC();
     glrReduce(PendingReduce, Params,
----------------
sammccall wrote:
> hokein wrote:
> > I would just call it before the `NewHeads.clear()`, and run the `gc` on the `NewHeads` directly.
> > 
> > It simplifies the implementation (no need to traverse three pending actions, and no duplicated elements in Roots). The downside is that some dead heads (heads with no available actions on `Terminal.symbol()`) will not be GCed in this run, but I think it is negligible, as those will be GCed in next run.
> Oops, I forgot that Pending* are all empty at the beginning of this loop. I should probably add an assertion for that.
> 
> We can call it right at the top of the loop, yes? It doesn't need to be between AddSsteps and NewHeads.clear().
> 
> > run the gc on the NewHeads directly
> Unfortunately not: the GC naturally consumes the argument (it becomes the stack for the DFS) so we need a copy.
> We could move the copy into gc() but:
>  - this way we can reuse the same vector every time and don't have to allocate
>  - soon we'll have error-recovery candidates as additional roots we want to pass in
> Oops, I forgot that Pending* are all empty at the beginning of this loop. I should probably add an assertion for that.

Yes. I would probably do it after glrReduce and glrShift calls.

> We can call it right at the top of the loop, yes? It doesn't need to be between AddSsteps and NewHeads.clear().

Yes, exactly.  We could also do that at the end of the loop.

> > run the gc on the NewHeads directly
> Unfortunately not: the GC naturally consumes the argument (it becomes the stack for the DFS) so we need a copy.
> We could move the copy into gc() but:
>  - this way we can reuse the same vector every time and don't have to allocate
>  - soon we'll have error-recovery candidates as additional roots we want to pass in

oh, I didn't notice that. I think making a copy is fine.

================
Comment at: clang-tools-extra/pseudo/lib/GLR.cpp:391
+                              llvm::ArrayRef<const Node *> Parents) {
+  ++NodeCount;
+  Node *Result = new (allocate(Parents.size()))
----------------
sammccall wrote:
> hokein wrote:
> > The NodeCount becomes vague now, we update in both addNode and allocate methods, I think we should have two separate counts.
> - Oops, that duplicated ++NodeCount is just a mistake.
> - names are vague now, both "NodeCount" and "Allocated" are dubious. Maybe I'll rename to `NodesCreated` and `Alive`, WDYT?
> - are there actually more metrics we want to record? we have nodes-created, currently-live-nodes (Allocated.size()), nodes-destroyed (`NodeCount - Allocated.size()`). We could track max-live-node but I think Arena.bytes() is fine for our purposes.
> - do we want to expose any of those numbers through APIs? I couldn't work out what I'd actually do with them.
> names are vague now, both "NodeCount" and "Allocated" are dubious. Maybe I'll rename to NodesCreated and Alive, WDYT?

Sounds good to me.

> are there actually more metrics we want to record? we have nodes-created, currently-live-nodes (Allocated.size()), nodes-destroyed (NodeCount - Allocated.size()). We could track max-live-node but I think Arena.bytes() is fine for our purposes.

I think these are sufficient, should provide enough details about GSS-memory-related information. The other things I will like to record is the max-active-heads, but that's a different bit.

> do we want to expose any of those numbers through APIs? I couldn't work out what I'd actually do with them.

I think we should avoid to expose an API per field, we probably just need a single API which returns a Stats struct containing all these things. We can do it later.

================
Comment at: clang-tools-extra/pseudo/lib/GLR.cpp:403
+  ++NodeCount;
+  if (FreeList.size() <= Parents)
+    FreeList.resize(Parents + 1);
----------------
sammccall wrote:
> hokein wrote:
> > What's the practical size of the `FreeList`? I think we can just find one, and use it as initial default size, to save some cost of resize. 
> It's whatever the max #parents is - maybe 10 or something?
> I don't think it's worth adding a hard-coded guess to save ~1 resize of a single vector per parse!
I would image there will be multiple resize calls, if the path like `allocate(1), allocate(3), ..., allocate(max)`, which I think it is practical? unless the first call is `allocate(max)` which is unlikely I think.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126723/new/

https://reviews.llvm.org/D126723