[clangd-dev] LSP Cancellations - making it universal and measurable?

Wed Sep 12 06:53:21 PDT 2018

Sorry for the last long reply, didn't see your message...

On Wed, Sep 12, 2018, 15:23 Ilya Biryukov <ibiryukov at google.com> wrote:

> After thinking more about it, I actually find my own arguments
> unconvincing and fully support the suggested API changes.
> It occurred to me I was driven by the design that couples async task
> lifetimes to their cancellation mechanics.
> But looking at it more closely, it does not seem that's the right design.
> The scope of cancellable pieces of work can be arbitrary large and tying
> them to the lifetimes of actual async computations does not sound useful.
> The "implicit dataflow" concerns that I raised are also mostly
> circumvented by the fact that we can simply track were cancellation
> functions are created and stored.
>
Ack, I think that was the bit I didn't explain well previously.

I take my arguments back, sorry for the overly long and defensive
> discussion, I guess I was overly defensive of the solution that I proposed.
>
Not at all, this has helped me (us both?) clarify thinking. And I don't
think these changes are so dramatic - the same mechanisms exposed in
different ways.
(FWIW I didn't have any idea about how cancellation APIs should work until
we had a concrete design/implementation to examine, so I hope I didn't come
across too... stompy)

Happy to help with the proposed changes too.
>
> On Wed, Sep 12, 2018 at 1:33 PM Ilya Biryukov <ibiryukov at google.com>
> wrote:
>
>> It's a good thing we kept from doing intrusive changes to all of the
>> APIs, prototyping only for code completion. Hopefully the refactoring won't
>> be too painful.
>> It's up to you to decide which approach to take at the end. Here's my
>> take on trying to convince that the current API make sense
>>
>> I can see the benefits of the proposed API if we were designing
>> cancellation for distributed web services where everything is an RPC and
>> there is a clear need to propagate the cancellation information through RPC
>> call chains, possibly in different processes or even on different machines.
>> It feels clangd is not like that, it's a standalone tool (or a library)
>> which at the end that manages its own async tasks. Propagating cancellation
>> handles does not seem like a big problem there, everything is in a single
>> process and data-flow is mostly simple.
>>
>> E.g. if what we want is to move the cancellation handling to
>> JSONRPCDispatcher, we could easily do it by making the dispatched methods
>> return the TaskHandles that we need to store.
>> It's a simple refactoring, spanning only one layer.
>> I doubt the other use-case that we have (aforementioned internal editor
>> that embeds clangd) will be much work either, though I may be wrong.
>>
>> > I think the biggest difference is whether the cancelability of an
>> operation is represented in its static type (e.g. function return type). Is
>> this what you mean?
>> Yep, I meant encoding what we want in the type systems.
>>
>> > I guess I don't see that as a bad thing?
>> My preference for encoding in type systems is that I can easily trace the
>> created task back to whoever performs cancellations.
>> It feels like this is gonna be slightly trickier in the new approach,
>> since not only the consumers, but also the producers of cancellable tasks
>> can now be anywhere up the context call chain.
>>
>> > Cancellation is a cross-cutting concern that you mostly want to ignore,
>> except at layers where: a) you can meaningfully abort early to save some
>> work, or b) you want to start a chunk of work and maybe cancel it later.
>> Agree that the cancellation *checking* cuts though all the layers and
>> requiring passing objects through all of the layers is a pain.
>> Not so sure about the "producers" of tasks. We can't possibly keep
>> functions that start cancellable work unchanged when adding cancellation.
>> In both the new proposal and the current implementation, we'll have to:
>> - Allow the functions that can be cancelled to report they were cancelled
>> to the callers. I guess both approaches would use the CancelledError for
>> this.
>> - Mark the functions that support cancellation somehow. Either with a
>> comment (the new proposal) or with an explicit return type (the current
>> implementation).
>> The cancellation tasks would be allowed to be created further up the call
>> chain, but that adds some "magic" to the code and makes the clients
>> responsible for creating the tasks properly and writing all the boilerplate.
>>
>> > Examples of why this seems right:
>> > - there's no reason that a cancelable span should be coupled 1:1 with
>> an LSP method
>> Totally agree, why do you feel the cancellable span is coupled to an LSP
>> method in the current API?
>> > - whether a function will *respond* to cancellation is always a
>> *dynamic* property. This is a QoI issue, not a contract.
>> Totally agree.
>>
>> > - It allows us to support cancellation of all methods without adding a
>> bunch of bookkeeping to each one (or equivalent in templates)
>> I don't think bookkeeping is much of a pain, most methods will just
>> return TaskHandles and at the level where it's appropriate to cancel, where
>> one will need to store them.
>> I guess the arguments are about the same as with other things that could
>> potentially be passed through Context.
>> I would prefer to use Context as little as possible and in that case I
>> would prefer explicit data flow, but can see your point if you don't think
>> that way.
>> > - it allows embedders to support cancelling methods at the layer it
>> makes sense without plumbing cancellation objects around
>> Point taken, I view the fact that we can trace the data flow back to the
>> cancellation sites as a benefit.
>> > - it avoids divergences in APIs based on whether we've implemented
>> early-exit for this operation or not
>> In the current implementation, the APIs that we believe should be
>> cancellable should just always indicate they can be cancelled (i.e. create
>> a new task), even if they won't do early exit in practice.
>> > - putting values in the return type that are not the result of the
>> computation surprises the reader
>> Providing results in a callback also makes the code slightly tricky.
>> Having cancellation boilerplate does not make the client code any better
>> either. (At any layer really).
>> Granted that it's probably easier to just support cancelling everything
>> somewhere up the call chain with the proposed implementation, but I would
>> argue it's not much harder in the current approach, the only thing that's
>> required is plumbing the TaskHandle up to that layer. Does not feel like a
>> lot of work (but, again, we seem to disagree here).
>>
>>
>> Overall, if you feel strongly we should prefer the new approach over the
>> current implementation, happy to submit to your point of view.
>> I feel it's not much work to support the current implementation in the
>> callers that we have, but if you don't see the benefit in the current APIs
>> and strongly feel the new ones are better, let's just do the new ones.
>>
>>
>> On Wed, Sep 12, 2018 at 11:59 AM Sam McCall <sammccall at google.com> wrote:
>>
>>> Guess I was right about which piece would be controversial.
>>> (For those missing context, Cider is a non-public editor that embeds
>>> ClangdServer)
>>>
>>> On Wed, Sep 12, 2018 at 11:28 AM Ilya Biryukov <ibiryukov at google.com>
>>> wrote:
>>>
>>>> Hi Sam,
>>>>
>>>>
>>>> A few things that come to mind:
>>>> >  - other users of ClangdServer would set up cancellation in the same
>>>> way: by creating a task handle and calling setCurrentTask() before invoking
>>>> a request. Or they can not do so if they don't support cancellation
>>>> (isCancelled() returns false if there's no task).
>>>>
>>>> My stance is that explicit APIs for cancellation are better approach.
>>>>
>>> I'd like to clarify what you mean by "explicit", because the
>>> *cancellation* API is explicit in both cases, and the *coordination*
>>> between the cancellation and the cancel-check is (to a first approximation)
>>> magic in both cases.
>>>
>>> I think the biggest difference is whether the cancelability of an
>>> operation is represented in its static type (e.g. function return type). Is
>>> this what you mean?
>>>
>>>
>>>> They clearly state the contract of the API, it's impossible to
>>>> accidentally reuse the existing task handle (ClangdServer is creating a new
>>>> task every time) multiple times for different tasks from the client code
>>>> and they are (arguably) easier to use from Cider.
>>>> The "task in the context" approach, OTOH, means users will have to read
>>>> through the comments to even discover that the cancellation is there.
>>>>
>>> I guess I don't see that as a bad thing? Cancellation is a cross-cutting
>>> concern that you mostly want to ignore, except at layers where: a) you can
>>> meaningfully abort early to save some work, or b) you want to start a chunk
>>> of work and maybe cancel it later.
>>> Examples of why this seems right:
>>>  - there's no reason that a cancelable span should be coupled 1:1 with
>>> an LSP method
>>>  - whether a function will *respond* to cancellation is always a
>>> *dynamic* property. This is a QoI issue, not a contract.
>>>
>>> I may be missing the reasons on why the proposed approach is better. Any
>>>> suggestions I'm missing?
>>>>
>>> - It allows us to support cancellation of all methods without adding a
>>> bunch of bookkeeping to each one (or equivalent in templates)
>>> - it allows embedders to support cancelling methods at the layer it
>>> makes sense without plumbing cancellation objects around
>>> - it avoids divergences in APIs based on whether we've implemented
>>> early-exit for this operation or not
>>> - putting values in the return type that are not the result of the
>>> computation surprises the reader
>>>
>>> > 1) task should record the time of requested cancellation
>>>> +1
>>>> > 3) TUScheduler should be cancellation-aware
>>>> +1
>>>>
>>>> > 4) We should hide the Task object.
>>>> > (again, this borrows heavily from go)
>>>> Go was definitely designed with cancellation (I assume of both RPCs and
>>>> general computations?) in mind,
>>>>
>>> For what it's worth, it wasn't - Context and cancellation was very much
>>> bolted on in the same way it was in clangd.
>>> The only real difference is they decided not to use TLS for Context. But
>>> they decided not to signal cancelability explicitly in the API for the
>>> usual reasons (intermediate stack frames don't care).
>>>
>>>
>>>> but my view on this is that in C++ it's more idiomatic to make the
>>>> contracts like this explicit in the API. TaskHandle exists to make sure the
>>>> callers of async APIs get an explicit object they can poke with, see the
>>>> first comment about explicit vs implicit APIs.
>>>>
>>>> > 5) Random thought: we could support deadlines (automatic cancellation
>>>> after some time), this is useful in hosted scenarios though probably not
>>>> for standalone workstation use.
>>>> +1, unless it's trivial to do it on Cider side. In which case, maybe we
>>>> could only support this in Cider to avoid adding the code to OS version
>>>> that we won't use outside our environment.
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Sep 12, 2018 at 10:57 AM Sam McCall <sammccall at google.com>
>>>> wrote:
>>>>
>>>>> Hey Ilya and Kadir,
>>>>>
>>>>> Was trying to understand how much we (can) win from cancellation, and
>>>>> what's involved in instrumenting a LSP method for cancellation.
>>>>>
>>>>> Have a couple of proposals, wanted to discuss first them rather than
>>>>> sending a patch because they're separate but interact.
>>>>>
>>>>> 1) task should record the time of requested cancellation. For
>>>>> analysis, there are 3 interesting timestamps - task start, (optional)
>>>>> cancellation, and end. The creator of the task currently can get start and
>>>>> end (via context cleanup), but not cancellation time. This allows us to
>>>>> measure how much we can improve cancellation (by bailing out "lame duck"
>>>>> tasks earlier).
>>>>> Implementation is easy, just change the atomic<bool> to
>>>>> atomic<steady_clock::rep>
>>>>>
>>>>> 2) we should support cancellation of any method, even if early bailout
>>>>> isn't yet implemented.
>>>>> Main benefit: we can then measure the lame duck time for all methods
>>>>> (as above), and know where to implement early bailout. Side benefit: more
>>>>> uniform/less API churn.
>>>>> Implementation (I have a prototype):
>>>>>  - LSP $/cancelRequest would move to JSONRPCDispatcher so it can cut
>>>>> across all methods. Cleanup of TaskHandles would be handled with a context
>>>>> destructor.
>>>>>  - other users of ClangdServer would set up cancellation in the same
>>>>> way: by creating a task handle and calling setCurrentTask() before invoking
>>>>> a request. Or they can not do so if they don't support cancellation
>>>>> (isCancelled() returns false if there's no task).
>>>>> This is very similar to golang context cancellation.
>>>>>
>>>>> 3) TUScheduler should be cancellation-aware
>>>>> This seems like an easy, cross-cutting win, but we should measure.
>>>>>
>>>>> 4) We should hide the Task object - it adds API noise and it offers
>>>>> too many facilities to the wrong actors.
>>>>> (maybe this is controversial, and somewhat less related).
>>>>>  - Task::isCancelled() appears to be redundant, you never want to
>>>>> check another task's status and checking for cancellation in a tight loop
>>>>> is an antipattern that doesn't seem worth optimizing for
>>>>>  - cancel() is only for the creator of a task (current API enforces
>>>>> this)
>>>>>  - TaskHandle/ConstTaskHandle/getCurrentTask just exist to support
>>>>> exposing these details
>>>>>  - most actions have obvious extensions to cases where there is no
>>>>> active task, but the current API makes this awkward
>>>>> So this would leave us with something like (modulo names):
>>>>> bool isTaskCancelled();
>>>>> using Canceler = std::function<void()>;
>>>>> pair<Context, Canceler> startCancelableTask();
>>>>> (again, this borrows heavily from go
>>>>>
>>>>> 5) Random thought: we could support deadlines (automatic cancellation
>>>>> after some time), this is useful in hosted scenarios though probably not
>>>>> for standalone workstation use.
>>>>>
>>>>> WDYT?
>>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Ilya Biryukov
>>>>
>>>
>>
>> --
>> Regards,
>> Ilya Biryukov
>>
>
>
> --
> Regards,
> Ilya Biryukov
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/clangd-dev/attachments/20180912/75e214c4/attachment-0001.html>