Introduce StringBuilder utility around raw_string_ostream
Alp Toker
alp at nuanti.com
Thu Jun 12 08:16:15 PDT 2014
On 12/06/2014 16:54, Chandler Carruth wrote:
> How about an even simpler approach.
>
> Put a std::string inside raw_string_ostream, and a default constructor
> which binds the reference to its internal std::string. I'm moderately
> confident that our optimizers will remove the "dead" std::string that
> is left in the class by users that want to leverage external storage.
>
> This has the advantage of not introducing any new types or patterns,
> merely removing the requirement to pass in a std::string to provide
> storage by using internal storage.
>
> That's getting far enough down in the simplicity (for folks learning
> about the APIs) to make me quite happy.
Okay, but that's still causing std::string heap allocations.
With my approach I can do:
typedef SmallStringBuilder<128> StringBuilder;
That turns a whole lot of allocations and strlens across the board into
no-ops with a simple centralised change (patch attached, based on Yaron
Keren's suggestion). This really *does* make an impact and there are
secondary benefits to consider like reduced fragmentation (though I
don't have figures, it's clearly going to help). Stack allocation always
wins for small strings outside of recursion scenarios.
And the patch also enforces clean storage reset() which is a really
tricky operation to do manually with an ostream (requires sequential
flush, storage reset, and sync -- do we really expect people to
hand-write that without bugs?)
And it'll be visible to further PGO optimisations (anything that tunes a
SmallString will apply here too).
On the other hand the approach you suggested just adds some syntactic
sugar, at the cost of tying the code closer to std::string heap storage
-- exactly what I'm incrementally working to avoid.
Alp.
>
>
> On Thu, Jun 12, 2014 at 2:39 PM, Alp Toker <alp at nuanti.com
> <mailto:alp at nuanti.com>> wrote:
>
>
> On 12/06/2014 14:17, Yaron Keren wrote:
>
> This looks like a very useful class. It should work for the
> similar SmallString pattern, right?
>
> SmallString<256> OutName;
> llvm::raw_svector_ostream Out(OutName);
> cast<ItaniumMangleContext>(CGM.getCXXABI().getMangleContext()).mangleCXXVTT(RD,
> Out);
> Out.flush();
> StringRef Name = OutName.str();
>
> ->
>
> StringBuilder Out;
> cast<ItaniumMangleContext>(CGM.getCXXABI().getMangleContext()).mangleCXXVTT(RD,
> Out);
> StringRef Name = Out.str();
>
>
> Yeah, that'd work. But for cases like the above we should first
> provide a SmallString<> backed specialization that permits
> StringBuilder<256> Out;
>
> The initial patch I posted is aimed more at the std::string-backed
> cases. A stack-allocating specialization should follow nicely from
> that.
>
> Alp.
>
>
>
> Yaron
>
>
>
> 2014-06-12 8:01 GMT+03:00 Alp Toker <alp at nuanti.com
> <mailto:alp at nuanti.com> <mailto:alp at nuanti.com
> <mailto:alp at nuanti.com>>>:
>
>
>
> On 12/06/2014 07:41, David Blaikie wrote:
>
> (would find this a little easier to review if it were
> in Phab)
>
> One off-the-cuff thought: If it's not already
> documented, you
> might
> want to put a comment in raw_string_ostream's ctor that
> declares that
> it /really shouldn't/ touch the string (since you're
> passing in an
> object in the derived class that won't itself be
> constructed until
> after the raw_string_ostream's ctor has finished). Or,
> better
> yet, if
> there's a way to build a raw_string_ostream without a
> string, then
> attach it later (in the constructor of StringBuilder) that
> might be
> good.
>
>
> Sure, I'll add a comment. (But I don't think it's a big
> concern to
> justify any major shuffle -- the ctor is right there
> inline to see
> a few lines above, and it seems unlikely a stream adapter
> would
> ever touch its storage unless written to.)
>
>
> (insert various bikeshedding about the name - though
> I'm not
> really
> sure raw_ostream ever needed to adopt the C++ standard
> library
> naming
> convention in the first place, but I haven't looked
> closely)
>
>
> So, I thought about that..
>
> Having tried raw_string_builder_ostream,
> string_builder_ostream,
> raw_string_builder, I ended up with StringBuilder as the
> only one
> that looked beautiful in use.
>
> Hard to say why, but it's probably because it has a dual
> role as
> storage and streaming interface unlike the ostream adapter
> classes, making the distinction significant.
>
> Alp.
>
>
>
> On Wed, Jun 11, 2014 at 9:25 PM, Alp Toker
> <alp at nuanti.com <mailto:alp at nuanti.com>
> <mailto:alp at nuanti.com <mailto:alp at nuanti.com>>> wrote:
>
> Replaces the pattern:
>
> std::string buf;
> raw_string_ostream os(buf);
> ...
> os.flush();
> use(buf);
>
> with:
> StringBuilder os;
> ...
> use (os.str());
>
> Benefits:
>
> * Provides an inherently safe and opaque
> interface for
> building std::string
> instances
> * Opens up the possibility of changing the
> underlying
> storage in future.
>
> Patch also converts various uses, some of which were
> accessing the storage
> without flushing.
>
> Alp.
>
> --
> http://www.nuanti.com
> the browser experts
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
> <mailto:llvm-commits at cs.uiuc.edu
> <mailto:llvm-commits at cs.uiuc.edu>>
>
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
> -- http://www.nuanti.com
> the browser experts
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
> <mailto:llvm-commits at cs.uiuc.edu
> <mailto:llvm-commits at cs.uiuc.edu>>
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>
> --
> http://www.nuanti.com
> the browser experts
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
--
http://www.nuanti.com
the browser experts
-------------- next part --------------
diff --git a/include/llvm/Support/raw_ostream.h b/include/llvm/Support/raw_ostream.h
index 4d342da..ae2c113 100644
--- a/include/llvm/Support/raw_ostream.h
+++ b/include/llvm/Support/raw_ostream.h
@@ -15,6 +15,7 @@
#define LLVM_SUPPORT_RAW_OSTREAM_H
#include "llvm/ADT/StringRef.h"
+#include "llvm/ADT/SmallVector.h"
#include "llvm/Support/Compiler.h"
#include "llvm/Support/DataTypes.h"
@@ -463,6 +464,12 @@ class raw_svector_ostream : public raw_ostream {
/// current_pos - Return the current position within the stream, not
/// counting the bytes currently in the buffer.
uint64_t current_pos() const override;
+
+protected:
+ explicit raw_svector_ostream(SmallVectorImpl<char> &O, std::nullptr_t)
+ : OS(O){};
+ void init();
+
public:
/// Construct a new raw_svector_ostream.
///
@@ -497,22 +504,26 @@ public:
/// StringBuilder - A raw_ostream that builds an std::string. This is
/// a raw_string_ostream with storage.
-class StringBuilder : public raw_string_ostream {
- std::string Buffer;
+template <unsigned InternalLen>
+class SmallStringBuilder : public raw_svector_ostream {
+ SmallVector<char, InternalLen> Buffer;
protected:
// There's no need to explicitly flush a StringBuilder.
- using raw_string_ostream::flush;
+ using raw_svector_ostream::flush;
public:
- StringBuilder() : raw_string_ostream(Buffer) {}
+ SmallStringBuilder() : raw_svector_ostream(Buffer, 0) { init(); }
void clear() {
flush();
Buffer.clear();
+ resync();
}
};
+typedef SmallStringBuilder<128> StringBuilder;
+
} // end llvm namespace
#endif
diff --git a/lib/Analysis/Analysis.cpp b/lib/Analysis/Analysis.cpp
index 24f57cd..3c07e8c 100644
--- a/lib/Analysis/Analysis.cpp
+++ b/lib/Analysis/Analysis.cpp
@@ -87,7 +87,7 @@ LLVMBool LLVMVerifyModule(LLVMModuleRef M, LLVMVerifierFailureAction Action,
report_fatal_error("Broken module found, compilation aborted!");
if (OutMessages)
- *OutMessages = strdup(MsgsOS.str().c_str());
+ *OutMessages = strndup(MsgsOS.str().data(), MsgsOS.str().size());
return Result;
}
diff --git a/lib/IR/Core.cpp b/lib/IR/Core.cpp
index 2431d70..ee4ce48 100644
--- a/lib/IR/Core.cpp
+++ b/lib/IR/Core.cpp
@@ -62,6 +62,11 @@ void LLVMShutdown() {
/*===-- Error handling ----------------------------------------------------===*/
+static char *LLVMCreateMessage(StringRef Message) {
+ assert(Message.find('\0') == Message.npos);
+ return strndup(Message.data(), Message.size());
+}
+
char *LLVMCreateMessage(const char *Message) {
return strdup(Message);
}
@@ -113,7 +118,7 @@ char *LLVMGetDiagInfoDescription(LLVMDiagnosticInfoRef DI) {
StringBuilder Msg;
DiagnosticPrinterRawOStream DP(Msg);
unwrap(DI)->print(DP);
- return LLVMCreateMessage(Msg.str().c_str());
+ return LLVMCreateMessage(Msg.str());
}
LLVMDiagnosticSeverity LLVMGetDiagInfoSeverity(LLVMDiagnosticInfoRef DI){
@@ -199,7 +204,7 @@ LLVMBool LLVMPrintModuleToFile(LLVMModuleRef M, const char *Filename,
char *LLVMPrintModuleToString(LLVMModuleRef M) {
StringBuilder os;
unwrap(M)->print(os, nullptr);
- return strdup(os.str().c_str());
+ return LLVMCreateMessage(os.str());
}
/*--.. Operations on inline assembler ......................................--*/
@@ -273,7 +278,7 @@ char *LLVMPrintTypeToString(LLVMTypeRef Ty) {
StringBuilder os;
assert(unwrap(Ty) != nullptr && "Expecting non-null Type");
unwrap(Ty)->print(os);
- return strdup(os.str().c_str());
+ return LLVMCreateMessage(os.str());
}
/*--.. Operations on integer types .........................................--*/
@@ -520,7 +525,7 @@ char* LLVMPrintValueToString(LLVMValueRef Val) {
StringBuilder os;
assert(unwrap(Val) != nullptr && "Expecting non-null Value");
unwrap(Val)->print(os);
- return strdup(os.str().c_str());
+ return LLVMCreateMessage(os.str());
}
void LLVMReplaceAllUsesWith(LLVMValueRef OldVal, LLVMValueRef NewVal) {
diff --git a/lib/IRReader/IRReader.cpp b/lib/IRReader/IRReader.cpp
index c1f6c71..cb64592 100644
--- a/lib/IRReader/IRReader.cpp
+++ b/lib/IRReader/IRReader.cpp
@@ -110,7 +110,7 @@ LLVMBool LLVMParseIRInContext(LLVMContextRef ContextRef,
if (OutMessage) {
StringBuilder os;
Diag.print(nullptr, os, false);
- *OutMessage = strdup(os.str().c_str());
+ *OutMessage = strndup(os.str().data(), os.str().size());
}
return 1;
}
diff --git a/lib/LTO/LTOCodeGenerator.cpp b/lib/LTO/LTOCodeGenerator.cpp
index c7c516f..debfbd1 100644
--- a/lib/LTO/LTOCodeGenerator.cpp
+++ b/lib/LTO/LTOCodeGenerator.cpp
@@ -568,10 +568,13 @@ void LTOCodeGenerator::DiagnosticHandler2(const DiagnosticInfo &DI) {
DiagnosticPrinterRawOStream DP(Msg);
DI.print(DP);
+ // Null-terminate the C string.
+ Msg << '\0';
+
// If this method has been called it means someone has set up an external
// diagnostic handler. Assert on that.
assert(DiagHandler && "Invalid diagnostic handler");
- (*DiagHandler)(Severity, Msg.str().c_str(), DiagContext);
+ (*DiagHandler)(Severity, Msg.str().drop_back().data(), DiagContext);
}
void
diff --git a/lib/Support/raw_ostream.cpp b/lib/Support/raw_ostream.cpp
index f55838e..0cc9427 100644
--- a/lib/Support/raw_ostream.cpp
+++ b/lib/Support/raw_ostream.cpp
@@ -704,6 +704,10 @@ void raw_string_ostream::write_impl(const char *Ptr, size_t Size) {
// and we only need to set the vector size when the data is flushed.
raw_svector_ostream::raw_svector_ostream(SmallVectorImpl<char> &O) : OS(O) {
+ init();
+}
+
+void raw_svector_ostream::init() {
// Set up the initial external buffer. We make sure that the buffer has at
// least 128 bytes free; raw_ostream itself only requires 64, but we want to
// make sure that we don't grow the buffer unnecessarily on destruction (when
diff --git a/lib/Target/TargetMachineC.cpp b/lib/Target/TargetMachineC.cpp
index cf04faa..d1c683d 100644
--- a/lib/Target/TargetMachineC.cpp
+++ b/lib/Target/TargetMachineC.cpp
@@ -241,9 +241,9 @@ LLVMBool LLVMTargetMachineEmitToMemoryBuffer(LLVMTargetMachineRef T,
formatted_raw_ostream Out(Code);
bool Result = LLVMTargetMachineEmit(T, M, Out, codegen, ErrorMessage);
- std::string &Data = Code.str();
- *OutMemBuf = LLVMCreateMemoryBufferWithMemoryRangeCopy(Data.c_str(),
- Data.length(), "");
+ StringRef Buffer = Code.str();
+ *OutMemBuf = LLVMCreateMemoryBufferWithMemoryRangeCopy(Buffer.data(),
+ Buffer.size(), "");
return Result;
}
More information about the llvm-commits
mailing list