[PATCH] D51585: [clangd] Define a compact binary serialization format for symbol slab/index.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Sep 3 04:12:29 PDT 2018


sammccall created this revision.
sammccall added a reviewer: ioeric.
Herald added subscribers: cfe-commits, kadircet, arphaman, mgrang, jkorous, MaskRay, ilya-biryukov, mgorny.

This is intended to replace the current YAML format for general use.
It's ~10x more compact than YAML, and ~40% more compact than gzipped YAML:

  llvmidx.riff = 20M, llvmidx.yaml = 272M, llvmidx.yaml.gz = 32M

It's also simpler/faster to read and write.

The format is a RIFF container (chunks of (type, size, data)) with:

- a compressed string table
- simple binary encoding of symbols (with varints for compactness)
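
To illustrate the two ideas above, here is a minimal sketch of chunk framing and varint encoding. It is not the code from the patch: the names (Chunk, writeChunk, readChunks, writeVarint, readVarint) and the chunk IDs used with it are hypothetical. It assumes the usual RIFF conventions (4-byte ID, 32-bit little-endian size, data padded to an even length) and a LEB128-style varint; the outer RIFF header is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical chunk representation: a 4-byte type tag plus raw payload.
struct Chunk {
  std::string ID;   // exactly 4 bytes
  std::string Data; // payload, without padding
};

// Append a 32-bit integer in little-endian byte order.
void writeLE32(uint32_t V, std::string &Out) {
  for (int I = 0; I < 4; ++I)
    Out.push_back(static_cast<char>((V >> (8 * I)) & 0xff));
}

// Read a 32-bit little-endian integer; caller guarantees 4 bytes at Pos.
uint32_t readLE32(const std::string &In, size_t Pos) {
  uint32_t V = 0;
  for (int I = 0; I < 4; ++I)
    V |= static_cast<uint32_t>(static_cast<unsigned char>(In[Pos + I]))
         << (8 * I);
  return V;
}

// Serialize one chunk as (type, size, data), plus a pad byte if size is odd.
void writeChunk(const Chunk &C, std::string &Out) {
  assert(C.ID.size() == 4 && "chunk IDs are four bytes");
  Out += C.ID;
  writeLE32(static_cast<uint32_t>(C.Data.size()), Out);
  Out += C.Data;
  if (C.Data.size() % 2)
    Out.push_back('\0');
}

// Parse a sequence of chunks. Returns false if the input is truncated.
bool readChunks(const std::string &In, std::vector<Chunk> &Out) {
  size_t Pos = 0;
  while (Pos < In.size()) {
    if (In.size() - Pos < 8)
      return false; // no room for ID + size
    Chunk C;
    C.ID = In.substr(Pos, 4);
    uint32_t Len = readLE32(In, Pos + 4);
    Pos += 8;
    if (In.size() - Pos < Len)
      return false; // payload runs past end of input
    C.Data = In.substr(Pos, Len);
    Pos += Len;
    if (Len % 2) { // skip the pad byte after odd-sized data
      if (Pos >= In.size())
        return false;
      ++Pos;
    }
    Out.push_back(std::move(C));
  }
  return true;
}

// Varint: 7 bits per byte, low bits first, high bit set on all but the last.
void writeVarint(uint32_t V, std::string &Out) {
  while (V >= 0x80) {
    Out.push_back(static_cast<char>(0x80 | (V & 0x7f)));
    V >>= 7;
  }
  Out.push_back(static_cast<char>(V));
}

// Decode a varint at Pos, advancing Pos. Returns false on truncated or
// over-long input (more than 5 bytes for a 32-bit value).
bool readVarint(const std::string &In, size_t &Pos, uint32_t &V) {
  V = 0;
  for (unsigned Shift = 0; Shift < 35; Shift += 7) {
    if (Pos >= In.size())
      return false;
    unsigned char B = static_cast<unsigned char>(In[Pos++]);
    V |= static_cast<uint32_t>(B & 0x7f) << Shift;
    if (!(B & 0x80))
      return true;
  }
  return false;
}
```

Small values (string table indices, short lengths) take one byte instead of four, which is where the varint saving comes from. Because each chunk carries its own size, a reader can skip chunk types it does not recognize.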

It can be extended to include occurrences, Dex posting lists, etc.

There's no rich backwards-compatibility scheme, but a version number is included
so we can detect incompatible files and do ad-hoc back-compat.
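
As a rough sketch of what such a version check could look like (the constant CurrentVersion, the VersionStatus enum, and the idea of storing the version as a 32-bit little-endian integer at the start of a metadata chunk are all assumptions for illustration, not taken from the patch):

```cpp
#include <cstdint>
#include <string>

// Hypothetical current format version.
constexpr uint32_t CurrentVersion = 2;

enum class VersionStatus { OK, TooOld, TooNew, Malformed };

// Inspect the payload of a (hypothetical) metadata chunk whose first four
// bytes hold the format version in little-endian order.
VersionStatus checkVersion(const std::string &MetaData) {
  if (MetaData.size() < 4)
    return VersionStatus::Malformed;
  uint32_t V = 0;
  for (int I = 0; I < 4; ++I)
    V |= static_cast<uint32_t>(static_cast<unsigned char>(MetaData[I]))
         << (8 * I);
  if (V == CurrentVersion)
    return VersionStatus::OK;
  // An older version is where ad-hoc back-compat code would hook in;
  // a newer one can only be rejected.
  return V < CurrentVersion ? VersionStatus::TooOld : VersionStatus::TooNew;
}
```

A reader would run this before touching any other chunk, so an incompatible file fails with a clear error instead of being misparsed.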

Alternatives considered:

- compressed YAML or JSON: bulky and slow to load
- llvm bitstream: confusing model and libraries are hard to use. My attempt produced slightly larger files, and the code was longer and slower.
- protobuf or similar: would be really nice (esp for back-compat) but the dependency is a big hassle
- ad-hoc binary format without a container: it seems clear we're going to add posting lists and occurrences here, and that they will benefit from sharing a string table. The container makes it easy to debug these pieces in isolation, and make them optional.
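
To make the shared string table idea concrete, here is a minimal sketch of a string interner. The class name StringTableBuilder and its layout (sorted, deduplicated, NUL-terminated strings referenced by index) are assumptions for illustration, and compression is omitted; the patch may do this differently.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical string table: every string is stored once, and records
// (symbols, and later occurrences or posting lists) refer to it by index.
class StringTableBuilder {
  std::vector<std::string> Strings;

public:
  // Register a string; duplicates are fine and are merged by finalize().
  void add(const std::string &S) { Strings.push_back(S); }

  // Sort and deduplicate. Must be called before index() or serialize().
  void finalize() {
    std::sort(Strings.begin(), Strings.end());
    Strings.erase(std::unique(Strings.begin(), Strings.end()), Strings.end());
  }

  size_t size() const { return Strings.size(); }

  // Index of a previously added string, found by binary search.
  uint32_t index(const std::string &S) const {
    auto It = std::lower_bound(Strings.begin(), Strings.end(), S);
    return static_cast<uint32_t>(It - Strings.begin());
  }

  // Concatenate all strings, each followed by a NUL terminator. In a real
  // format this blob is what would be compressed and stored in a chunk.
  std::string serialize() const {
    std::string Out;
    for (const std::string &S : Strings) {
      Out += S;
      Out.push_back('\0');
    }
    return Out;
  }
};
```

With a table like this, a symbol's name, scope, and file URI each become a small integer, which pairs well with varint encoding since low indices fit in a single byte.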


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D51585

Files:
  clangd/CMakeLists.txt
  clangd/RIFF.cpp
  clangd/RIFF.h
  clangd/global-symbol-builder/GlobalSymbolBuilderMain.cpp
  clangd/index/Index.cpp
  clangd/index/Index.h
  clangd/index/Serialization.cpp
  clangd/index/Serialization.h
  clangd/tool/ClangdMain.cpp
  unittests/clangd/CMakeLists.txt
  unittests/clangd/RIFFTests.cpp
  unittests/clangd/SerializationTests.cpp
  unittests/clangd/SymbolCollectorTests.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D51585.163687.patch
Type: text/x-patch
Size: 38216 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20180903/14397607/attachment-0001.bin>

