[PATCH] D123045: [clang][extract-api] Fix small issues with SymbolGraphSerializer
Daniel Grumberg via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Apr 5 03:18:02 PDT 2022
dang marked 2 inline comments as done.
dang added a comment.
In D123045#3428030 <https://reviews.llvm.org/D123045#3428030>, @QuietMisdreavus wrote:
> In D123045#3427992 <https://reviews.llvm.org/D123045#3427992>, @zixuw wrote:
>> In D123045#3427699 <https://reviews.llvm.org/D123045#3427699>, @QuietMisdreavus wrote:
>>> After a quick scan comparing the current output of these symbol graphs with the primary library used for reading them <https://github.com/apple/swift-docc-symbolkit>, the last thing i can spot that's "off" is that the "function signature" is currently being serialized under a `parameters` field instead of the required `functionSignature`.
>> Hmm, I was looking at the format specification at https://github.com/apple/swift-docc-symbolkit/blob/0a45209833f4a151212c1aa38e13cfc03b9462e4/openapi.yaml#L307, and I searched the term `functionSignature` in the spec but found no property with that name (except for the `FunctionSignature` schema that the `parameters` property is referring to). But anyway this should be a easy fix.
> It seems like the specification and implementation have diverged. The parser in swift-docc-symbolkit is looking for a `functionSignature` field by virtue of how it names its "coding key" <https://github.com/apple/swift-docc-symbolkit/blob/0a45209833f4a151212c1aa38e13cfc03b9462e4/Sources/SymbolKit/SymbolGraph/Symbol/Symbol.swift#L106>. By comparison, here is the functionSignature emitter in the Swift symbol-graph generator <https://github.com/apple/swift/blob/89176021f314ff30d265712048038016cf3a7c12/lib/SymbolGraphGen/Symbol.cpp#L241>.
Yeah we should fix that I will do it as part of this patch since it seems like a small change.
Comment at: clang/lib/ExtractAPI/Serialization/SymbolGraphSerializer.cpp:510
> zixuw wrote:
> > QuietMisdreavus wrote:
> > > zixuw wrote:
> > > > What's the cost/would it worth it to wrap the `emplace_back`s and `pop_back`s of `PathComponentContext` in meaningful APIs like `enterPathComponentContext` and `exitPathComponentContext`?
> > > > That way the code is more self-explanatory and easier to read. It took me some time and mental resources to figure out why the `pop_back` is placed here.
> > > What's the use of having the `emplace_back` call inside `serializeAPIRecord` but to pop it outside? It seems like it's easier to mess up for new kinds of records.
> > The reason to `emplace_back` the path component inside `serializeAPIRecord` is that the `pathComponent` field of the `Record` is serialized in there so you want to include the name of the symbol itself in the path component list.
> > The `pop_back` is delayed til all other additional serialization for a specific kind of record, for example `serializeEnumRecord` handles all the enum cases after processing the enum record itself using `serializeAPIRecord`. So in order to correctly include the enum name in the path components of all the enum cases, the enum name has to stay in `PathComponentContext` until all members are serialized.
> > This is exactly the reason why I wanted a clearer API to make it easier to see.
> Hmm now that I thought about this, it seems that it would be easier to understand and avoid bugs if we lift `PathComponentContext.emplace_back`/`enterPathComponentContext` out of `serializeAPIRecord`, because we have access to the record name before that anyway.
> So we establish a pre-condition of `serializeAPIRecord` that the correct path components would be set up in `PathComponentContext` before the call so we could still serialize the field inside that method. And in specific `serialize*Record` methods we push the record name, and pop out at the end.
> This way the push and pop operations would appear in pairs in a single block, saving the confusion and mental work of jumping across functions to see how `PathComponentContext` is evolving.
If you think we should have a specialized API I am happy to do this. I figured it was self-explanatory by the name of `PathComponentContext` but it clearly isn't so needs addressing. I put the `emplace_back` call in `serializeAPIRecord` since all the specific serialization methods call it. I thought it would make it impossible to forget to add them. @zixuw is correct in the reason why the pop is outside we don't want to pop before we have serialized the sub records by agree it is ugly and potentially error prone. I can see three ways forward for improving the ergonomics of this since this seems to be problematic:
- Provide `serializeAPIRecord` a continuation to serialize the sub records or additional fields, that way we can move the pop inside `serializeAPIRecord` I am not a fan personally because we get into JS style closure hell if we start doing this...
- Use a visitor pattern where the visitor would be responsible for managing `PathComponentContext` and do the appropriate push/pops in the `visit` methods. We would need to add a APIRecordVisitor type, and appropriate visit methods for each APIRecord. This would make sense because the records should really know how to visit themselves.
- Just add a specialized API although it seems it would really easy to forget to remove path components.
Let me know what you think is the way forward.
Comment at: clang/lib/ExtractAPI/Serialization/SymbolGraphSerializer.cpp:657
Root["symbols"] = std::move(Symbols);
- Root["relationhips"] = std::move(Relationships);
+ Root["relationships"] = std::move(Relationships);
> oops 😬
Yeah that's on us for not catching this earlier.
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
More information about the cfe-commits