[all-commits] [llvm/llvm-project] a8b55b: [clangd] Use Decision Forest to score code complet...

UTKARSH SAXENA via All-commits all-commits at lists.llvm.org
Mon Sep 28 09:59:52 PDT 2020


  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: a8b55b6939a5962d5b2bf1a57980562d6f3045e5
      https://github.com/llvm/llvm-project/commit/a8b55b6939a5962d5b2bf1a57980562d6f3045e5
  Author: Utkarsh Saxena <usx at google.com>
  Date:   2020-09-28 (Mon, 28 Sep 2020)

  Changed paths:
    M clang-tools-extra/clangd/CodeComplete.cpp
    M clang-tools-extra/clangd/CodeComplete.h
    M clang-tools-extra/clangd/Quality.cpp
    M clang-tools-extra/clangd/Quality.h
    M clang-tools-extra/clangd/tool/ClangdMain.cpp
    M clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp

  Log Message:
  -----------
  [clangd] Use Decision Forest to score code completions.

By default clangd will score a code completion item using heuristics model.

Scoring can be done by Decision Forest model by passing `--ranking_model=decision_forest` to
clangd.

Features omitted from the model:
- `NameMatch` is excluded because the final score must be multiplicative in `NameMatch` to allow rescoring by the editor.
- `NeedsFixIts` is excluded because the generating dataset that needs 'fixits' is non-trivial.

There are multiple ways (heuristics) to combine the above two features with the prediction of the DF:
- `NeedsFixIts` is used as is with a penalty of `0.5`.

Various alternatives of combining NameMatch `N` and Decision forest Prediction `P`
- N * scale(P, 0, 1): Linearly scale the output of model to range [0, 1]
- N * a^P:
  - More natural: Prediction of each Decision Tree can be considered as a multiplicative boost (like NameMatch)
  - Ordering is independent of the absolute value of P. Order of two items is proportional to `a^{difference in model prediction score}`. Higher `a` gives higher weightage to model output as compared to NameMatch score.

Baseline MRR = 0.619
MRR for various combinations:
N * P = 0.6346, advantage%=2.5768
N * 1.1^P = 0.6600, advantage%=6.6853
N * **1.2**^P = 0.6669, advantage%=**7.8005**
N * **1.3**^P = 0.6668, advantage%=**7.7795**
N * **1.4**^P = 0.6659, advantage%=**7.6270**
N * 1.5^P = 0.6646, advantage%=7.4200
N * 1.6^P = 0.6636, advantage%=7.2671
N * 1.7^P = 0.6629, advantage%=7.1450
N * 2^P = 0.6612, advantage%=6.8673
N * 2.5^P = 0.6598, advantage%=6.6491
N * 3^P = 0.6590, advantage%=6.5242
N * scaled[0, 1] = 0.6465, advantage%=4.5054

Differential Revision: https://reviews.llvm.org/D88281




More information about the All-commits mailing list