[PATCH] D53427: [clangd] Replace StringRef in SymbolLocation with a char pointer.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Nov 14 06:38:46 PST 2018


sammccall added inline comments.


================
Comment at: clang-tools-extra/trunk/clangd/index/Index.h:71
+  // via llvm::StringRef.
+  const char *FileURI = "";
 };
----------------
alexfh wrote:
> hokein wrote:
> > alexfh wrote:
> > > If the users of the SymbolLocation have a way to get a common "context", you could keep a list of filenames in this context and just store an index into this list (which could be even shorter). If the structure is used where no context is available, you could still keep a list of StringRefs somewhere (probably close to where the string data pointed by the StringRefs lives) and keep a `StringRef *` here. At a cost of one indirection you'd avoid calling strlen.
> > > 
> > > Not sure what trade-off is better for this particular use case though.
> > `SymbolLocation` is a public interface, and the underlying data is stored in arena (which is not visible to clients), so clients don't know any `Context` except `SymbolLocation`.
> > 
> >  Using `StringRef *` is an interesting idea -- keeping a list of `StringRefs` needs extra space, the number should be relatively small (only the number of different files), so we will spend  extra `<num of files> * sizeof(StringRef)` bytes (for llvm, ~128 KB memory) , compared with the raw pointer solution.
> Another option mentioned by Ilya offline is similar to MS's BSTR: keep the length along with the string data in the arena. Something like:
> 
> ```
> class InlineString {
>   size_t Length;
>   char Data[0];  // Open-ended array with a null-terminated string in it.
>  public:
>    operator StringRef() const { return StringRef(&Data[0], Length); }
>    ...
> };
> ```
> 
> And then keep `const InlineString *` here.
Yeah, operator `StringRef()` doesn't really work well unfortunately (I tried a `StringRefz` which just wrapped a char pointer, in a bid to avoid accidentally using pointer semantics).

Conversion rules are too complicated and existing APIs are already maximally constrained, so you can't satisfy all of {functions that take StringRef only, functions that take Twine only, functions that overload for StringRef/char*/std::string, functions that *also* overload for Twine, ...}


Repository:
  rL LLVM

https://reviews.llvm.org/D53427





More information about the cfe-commits mailing list