[PATCH] D115043: [clang][deps] Use MemoryBuffer in minimizing FS

Jan Svoboda via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Dec 8 04:52:15 PST 2021


jansvoboda11 marked 2 inline comments as done.
jansvoboda11 added inline comments.


================
Comment at: llvm/include/llvm/Support/SmallVectorMemoryBuffer.h:54
+  /// and invoke the given function right after the move.
+  SmallVectorMemoryBuffer(
+      SmallVectorImpl<char> &&SV,
----------------
dexonsmith wrote:
> jansvoboda11 wrote:
> > I'm not happy with introducing new (hacky) constructor.
> > 
> > But, the best alternative I could come up with is to avoid using `SmallVectorMemoryBuffer`, store the `SmallString` with minimized contents somewhere in the filesystem and refer to it via regular `MemoryBuffer`. This seems needlessly complicated, so hacky constructor it is...
> > 
> > Side question: is the hassle with implicit null-termination (expected by Clang's lexer) worth the complications? What are the benefits anyway?
> It'd be better to add a `RequiresNullTerminator` argument to the constructor, matching the MemoryBuffer factory functions, which (if set) does the `push_back`/`pop_back` dance, and in either case passes it through to `init()`.
> ```
> lang=c++
>   SmallVectorMemoryBuffer(SmallVectorImpl<char> &&SV, bool RequiresNullTerminator = true)
>       : SV(std::move(SV)), BufferName("<in-memory object>") {
>     if (RequiresNullTerminator) {
>       this->SV.push_back(0);
>       this->SV.pop_back(0);
>     }
>     init(this->SV.begin(), this->SV.end(), RequiresNullTerminator);
>   }
> ```
> 
> If you do that, you should update the (few) existing callers to explicitly pass `false`. Note that `RequiresNullTerminator` defaults to true in all the other `MemoryBuffer` APIs so it'd be confusing to default it to `false` here.
> 
> > Side question: is the hassle with implicit null-termination (expected by Clang's lexer) worth the complications? What are the benefits anyway?
> 
> Not sure whether it's a micro-optimization for getting better codegen or if it just predates `StringRef`, which made it much easier to reason about a byte range as a single unit (rather than tracking two separate pointers, which is more error-prone). I somewhat doubt it's worth it, which is why when I wrote the minimizer I used `StringRef`, but I don't have data and it'd be a lot of work to change. (The minimizer is fast enough as-is, but it does less work than the full Clang lexer. It's plausible that the optimizations matter there?)
You're right that exposing the `RequiresNullTerminator` parameter is much clearer than passing in a random lambda. I've extracted the change into D115331.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115043/new/

https://reviews.llvm.org/D115043



More information about the llvm-commits mailing list