[all-commits] [llvm/llvm-project] f3d9f0: [clang-doc] Improve complexity of Index construction

Paul Kirth via All-commits all-commits at lists.llvm.org
Mon Feb 23 15:56:08 PST 2026


  Branch: refs/heads/users/ilovepi/clang-doc-simplify-indexing
  Home:   https://github.com/llvm/llvm-project
  Commit: f3d9f0441f3b8abf5855ab4c0dc6e6b1f41aeb82
      https://github.com/llvm/llvm-project/commit/f3d9f0441f3b8abf5855ab4c0dc6e6b1f41aeb82
  Author: Paul Kirth <paulkirth at google.com>
  Date:   2026-02-23 (Mon, 23 Feb 2026)

  Changed paths:
    M clang-tools-extra/clang-doc/Generators.cpp
    M clang-tools-extra/clang-doc/JSONGenerator.cpp
    M clang-tools-extra/clang-doc/MDGenerator.cpp
    M clang-tools-extra/clang-doc/Representation.cpp
    M clang-tools-extra/clang-doc/Representation.h
    M clang-tools-extra/clang-doc/YAMLGenerator.cpp
    M clang-tools-extra/unittests/clang-doc/ClangDocTest.cpp
    M clang-tools-extra/unittests/clang-doc/GeneratorTest.cpp

  Log Message:
  -----------
  [clang-doc] Improve complexity of Index construction

The existing implementation is O(N^2) because every insertion performs
a linear scan over the vector of existing entries during index
construction. Switching to a StringMap reduces this to O(N) on average,
since each lookup is a hash lookup and we no longer need to search the
vector.
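
The difference between the two approaches can be sketched as follows.
This is a minimal illustration rather than the code in this patch:
`Entry`, `VectorIndex`, and `MapIndex` are hypothetical names, and
`std::unordered_map` stands in for `llvm::StringMap` so that the example
compiles without LLVM headers.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an index entry; not clang-doc's actual type.
struct Entry {
  std::string Name;
};

// Vector-backed index: every insertion scans all existing entries, so
// inserting N unique names costs O(N^2) comparisons in total.
struct VectorIndex {
  std::vector<Entry> Children;

  // Returns true if Name was newly inserted, false if already present.
  bool insert(const std::string &Name) {
    for (const Entry &E : Children)
      if (E.Name == Name)
        return false;
    Children.push_back(Entry{Name});
    return true;
  }
};

// Map-backed index: every insertion is an expected O(1) hash lookup, so
// inserting N unique names costs O(N) on average.
struct MapIndex {
  std::unordered_map<std::string, Entry> Children;

  // Returns true if Name was newly inserted, false if already present.
  bool insert(const std::string &Name) {
    return Children.try_emplace(Name, Entry{Name}).second;
  }
};
```

Both types accept and reject the same names; only the cost of finding
an existing entry differs. Note that a hash map does not preserve
insertion order, so any output that depends on ordering has to sort
explicitly.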

The `BM_Index_Insertion` benchmark measures the time taken to insert N
unique records into the index.

| Scale (N Items) | Baseline (ns) | Patched (ns) | Speedup | Change |
|----------------:|--------------:|-------------:|--------:|-------:|
| 10              | 9,977         | 11,004       | 0.91x   | +10.3% |
| 64              | 69,249        | 69,166       | 1.00x   | -0.1%  |
| 512             | 1,932,714     | 525,877      | 3.68x   | -72.8% |
| 4,096           | 92,411,535    | 4,589,030    | 20.1x   | -95.0% |
| 10,000          | 577,384,945   | 12,998,039   | 44.4x   | -97.7% |

The patch delivers significant improvements to scalability. At 10,000
items, index construction is **~44 times faster**, which is consistent
with the complexity reduction from O(N^2) to O(N). The crossover point
where the new map-based approach matches the vector-based approach
appears to be around N=64; below that, the map is slightly slower.
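
As a rough cross-check of the complexity claim, the growth exponent can
be estimated from two rows of the table as log(t2/t1) / log(n2/n1). The
helper names below are hypothetical and are not part of the benchmark
code; this is only a sketch of the arithmetic.

```cpp
#include <cmath>

// Speedup as reported in the table: baseline time over patched time.
double speedup(double BaselineNs, double PatchedNs) {
  return BaselineNs / PatchedNs;
}

// Empirical growth exponent k, assuming time scales like N^k between
// the two measurements (N1, T1) and (N2, T2).
double scalingExponent(double N1, double T1, double N2, double T2) {
  return std::log(T2 / T1) / std::log(N2 / N1);
}
```

Applied to the rows for N=4,096 and N=10,000, this gives an exponent of
roughly 2.05 for the baseline and roughly 1.17 for the patched version,
in line with quadratic and near-linear growth respectively.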

Since the index typically holds more than 64 entries for files of
nontrivial complexity, and users will typically be building
documentation for an entire project with many files, all normal usage
should benefit from this change.

Other benchmarks show minor regressions, though in a typical build of
the LLVM documentation, index construction takes up a larger share of
runtime than any of those other components.


  Commit: b204f736e11a4fb130ebe2d9ff48dad8d04bce6b
      https://github.com/llvm/llvm-project/commit/b204f736e11a4fb130ebe2d9ff48dad8d04bce6b
  Author: Paul Kirth <paulkirth at google.com>
  Date:   2026-02-23 (Mon, 23 Feb 2026)

  Changed paths:
    M clang-tools-extra/clang-doc/JSONGenerator.cpp
    M clang-tools-extra/clang-doc/MDGenerator.cpp
    M clang-tools-extra/clang-doc/Representation.cpp
    M clang-tools-extra/clang-doc/YAMLGenerator.cpp

  Log Message:
  -----------
  Format


Compare: https://github.com/llvm/llvm-project/compare/0f9ef9e718c0...b204f736e11a
