Runtime stack overflow prevention for Twines
Marco Alesiani
marco.diiga at gmail.com
Fri May 16 06:34:14 PDT 2014
Compile-time prevention of runtime stack overflow when concatenating Twines
(infinite recursion). Attached patch, test case and a plain English
explanation of the problem.
--
*__________________________*
*Marco Alesiani*
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140516/b3ec6afe/attachment.html>
-------------- next part --------------
From 02bb519b4b09614e0d3b0e48208bbbc5772be43a Mon Sep 17 00:00:00 2001
From: Alesiani Marco <marco.diiga at gmail.com>
Date: Thu, 15 May 2014 21:12:14 +0200
Subject: [PATCH] Runtime stack overflow prevention for Twines
Compile-time prevention of runtime stack overflow when concatenating
Twines (infinite recursion)
---
include/llvm/ADT/Twine.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/include/llvm/ADT/Twine.h b/include/llvm/ADT/Twine.h
index a54fd74..356ce93 100644
--- a/include/llvm/ADT/Twine.h
+++ b/include/llvm/ADT/Twine.h
@@ -182,6 +182,10 @@ namespace llvm {
assert(isValid() && "Invalid twine!");
}
+ /// Since the intended use of twines is as temporary objects, assignments
+ /// when concatenating might cause undefined behavior or stack corruptions
+ Twine& operator=(const Twine &Other) LLVM_DELETED_FUNCTION;
+
/// isNull - Check for the null twine.
bool isNull() const {
return getLHSKind() == NullKind;
--
1.9.3
-------------- next part --------------
== Explanation and test case for Twine stack overflow problem ==
As the documentation stands, Twines are inherently dangerous and they should only be used as temporary arguments
to a function. No temporaries should be stored as in the following
________________________________________________________________
void foo(const Twine &T);
...
StringRef X = ...
unsigned i = ...
const Twine &Tmp = X + "." + Twine(i);
foo(Tmp);
________________________________________________________________
anyway there's a different case much more subtle that doesn't seem to create problems at first and (tested on a Win8
machine - VS2012) sometimes only shows up in different project configurations (e.g. Release but not in Debug):
________________________________________________________________
const char *ptr1 = "_text1";
const char *ptr2 = "_text2";
const char *ptr3 = "_text3";
llvm::Twine data = llvm::Twine("data") + ptr1 + ptr2 + ptr3;
function_call(data); // function_call(Twine& tw);
________________________________________________________________
This is a totally different case where data consistency is put at risk and only at runtime one might encounter (or
not) a quite substantial variety of errors, mostly stack overflows.
The reason is explained by the following C++ test-case which causes a runtime stack-overflow on a Win8 VS2012 system
________________________________________________________________
#include <iostream>
using namespace std;
class Twine;
enum Kind {twineType, stringType, emptyKind};
union Child {
const char *cStringPtr;
const Twine* twine;
};
// Simplified model of the Twine class - relevant excerpts
class Twine {
public:
Child LHS;
Kind LHSKind;
Child RHS;
Kind RHSKind;
Twine(const char* pt) {
LHSKind = Kind::stringType;
LHS.cStringPtr = pt;
RHSKind = Kind::emptyKind;
}
explicit Twine(Child _LHS, Kind _LHSKind,
Child _RHS, Kind _RHSKind)
: LHS(_LHS), RHS(_RHS), LHSKind(_LHSKind), RHSKind(_RHSKind) {
}
Twine Twine::concat(const Twine &Suffix) const {
Child NewLHS, NewRHS;
NewLHS.twine = this;
NewRHS.twine = &Suffix;
Kind NewLHSKind = Kind::twineType, NewRHSKind = Kind::twineType;
if(RHSKind == Kind::emptyKind) { // If LHS is unused, just copy the POD stuff and forget about the object (save indirection)
NewLHS = LHS;
NewLHSKind = LHSKind;
}
if(Suffix.RHSKind == Kind::emptyKind) { // If RHS is unused, just copy the POD stuff and forget about the object (save indirection)
NewRHS = Suffix.LHS;
NewRHSKind = Suffix.LHSKind;
}
return Twine(NewLHS, NewLHSKind, NewRHS, NewRHSKind);
}
};
Twine operator+(const Twine &LHS, const Twine &RHS) {
return LHS.concat(RHS);
}
int main(int argc, char* argv[])
{
Twine dt("A");
dt = dt + "B"; // First time this works since the temporary Child data is entirely copied (POD = deep copy)
dt = dt + "C"; // Second time this doesn't work! dt's Child LHS (which contained the pointer to the data) is
// substituted with a pointer to itself. No more data.
// dt.str(); // Infinite recursion when exploring LHS and stack overflow
return 0;
}
________________________________________________________________
Whether or not this is considered a bug, it can introduce subtle memory problems in an application and aside
from the "Twine is inherently dangerous" line, there's no mention of it. The proposed fix attempts to avoid such
a (likely) unintended use.
As for the "Rule of Three", I believe the issue only stands for already-initialized reuse of objects, thus: only
for assignment operators.
More information about the llvm-commits
mailing list