[PATCH] D83814: [clangd] Add Random Forest runtime for code completion.

Thu Sep 10 11:08:59 PDT 2020

usaxena95 added a comment.

Hi @jkorous

> Do you guys intend to open-source also the training part of the model pipeline ?

Open-sourcing the training part (both dataset generation and training with an open-source DecisionForest-based framework) has been on our radar, although finding capacity for this task has been difficult lately.

> Publish a model trained on generic-enough training set so it could be reasonably used on "any" codebase?

The current model has not been trained on a generic codebase. However, since the features involved don't capture code style, conventions, or variable names, it is likely to perform well on generic codebases as well. This remains to be tested.

> Do you still intend to support the heuristic that is currently powering clangd in the future?

Currently we are planning to use this model behind a flag. Initially we will focus on comparing the two. Since maintaining and developing signals is easier for an ML model, we might end up deprecating the heuristics.

Thanks,
Utkarsh.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83814/new/

https://reviews.llvm.org/D83814