[PATCH] D104137: Optimize lld::elf::ScriptLexer::getLineNumber by avoiding repeated work
Colin Cross via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jun 11 11:16:20 PDT 2021
ccross created this revision.
ccross added reviewers: srhines, pirama, MaskRay.
Herald added subscribers: arichardson, emaste.
ccross requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.
getLineNumber() was counting the number of line feeds from the start
of the buffer to the current token. For large linker scripts this
became a performance bottleneck: for one 4MB linker script, over 4
minutes were spent in getLineNumber's StringRef::count.
Store the line number from the last token, and only count the additional
line feeds since the last token.
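
The idea can be sketched outside of lld as follows. This is a minimal
standalone illustration, not the patch itself: the names LineCounter and
naiveLineNumber are hypothetical, and std::string_view stands in for
llvm::StringRef. It contrasts rescanning from the start of the buffer on
every query with resuming from the previously queried offset.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

// Baseline (hypothetical name): rescan from the start of the buffer on every
// query, so each call costs time proportional to tokOffset.
inline std::size_t naiveLineNumber(std::string_view buffer,
                                   std::size_t tokOffset) {
  std::string_view prefix = buffer.substr(0, tokOffset);
  return static_cast<std::size_t>(
             std::count(prefix.begin(), prefix.end(), '\n')) + 1;
}

// Hypothetical helper, not part of lld: remembers the offset and line number
// of the last query and only counts the line feeds added since then.
class LineCounter {
public:
  explicit LineCounter(std::string_view buffer) : buffer(buffer) {}

  // Returns the 1-based line number of the byte at tokOffset.
  // tokOffset must not exceed the buffer size.
  std::size_t lineNumber(std::size_t tokOffset) {
    // For the first query, or when going backwards, start from the beginning.
    std::size_t line = 1;
    std::size_t start = 0;

    // If this offset is at or after the cached one, resume from the cache.
    if (lastOffset > 0 && tokOffset >= lastOffset) {
      start = lastOffset;
      line = lastLine;
    }

    // Count only the line feeds in [start, tokOffset).
    std::string_view region = buffer.substr(start, tokOffset - start);
    line += static_cast<std::size_t>(
        std::count(region.begin(), region.end(), '\n'));

    // Cache the result for the next query.
    lastOffset = tokOffset;
    lastLine = line;
    return line;
  }

private:
  std::string_view buffer;
  std::size_t lastOffset = 0;
  std::size_t lastLine = 1;
};
```

When offsets are queried in increasing order, as a lexer walking forward
through its tokens would, each byte of the buffer is scanned at most once
in total. A backwards query falls back to a full rescan, so the result
always matches the naive count.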
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D104137
Files:
lld/ELF/ScriptLexer.cpp
lld/ELF/ScriptLexer.h
Index: lld/ELF/ScriptLexer.h
===================================================================
--- lld/ELF/ScriptLexer.h
+++ lld/ELF/ScriptLexer.h
@@ -40,6 +40,9 @@
bool inExpr = false;
size_t pos = 0;
+ size_t lastLineNumber = 0;
+ size_t lastLineNumberOffset = 0;
+
protected:
MemoryBufferRef getCurrentMB();
Index: lld/ELF/ScriptLexer.cpp
===================================================================
--- lld/ELF/ScriptLexer.cpp
+++ lld/ELF/ScriptLexer.cpp
@@ -56,7 +56,28 @@
return 1;
StringRef s = getCurrentMB().getBuffer();
StringRef tok = tokens[pos - 1];
- return s.substr(0, tok.data() - s.data()).count('\n') + 1;
+
+ // For the first token, or when going backwards, start from the beginning of
+ // the buffer.
+ size_t line = 1;
+ size_t start = 0;
+
+ const size_t tokOffset = tok.data() - s.data();
+
+ // If this token is after the previous token start from the previous token.
+ if (lastLineNumberOffset > 0 && tokOffset >= lastLineNumberOffset) {
+ start = lastLineNumberOffset;
+ line = lastLineNumber;
+ }
+
+ // Add the number of linefeeds since the start of the region of interest.
+ line += s.substr(start, tokOffset - start).count('\n');
+
+ // Store the line number of this token for reuse.
+ lastLineNumberOffset = tokOffset;
+ lastLineNumber = line;
+
+ return line;
}
// Returns 0-based column number of the current token.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D104137.351508.patch
Type: text/x-patch
Size: 1411 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210611/a4a20f77/attachment.bin>