Runtime stack overflow prevention for Twines

Marco Alesiani marco.diiga at
Fri May 16 06:34:14 PDT 2014

Compile-time prevention of runtime stack overflow when concatenating Twines
(infinite recursion). Attached patch, test case and a plain English
explanation of the problem.



 *Marco Alesiani*
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <>
-------------- next part --------------
From 02bb519b4b09614e0d3b0e48208bbbc5772be43a Mon Sep 17 00:00:00 2001
From: Alesiani Marco <marco.diiga at>
Date: Thu, 15 May 2014 21:12:14 +0200
Subject: [PATCH] Runtime stack overflow prevention for Twines

Compile-time prevention of runtime stack overflow when concatenating
Twines (infinite recursion)
 include/llvm/ADT/Twine.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/llvm/ADT/Twine.h b/include/llvm/ADT/Twine.h
index a54fd74..356ce93 100644
--- a/include/llvm/ADT/Twine.h
+++ b/include/llvm/ADT/Twine.h
@@ -182,6 +182,10 @@ namespace llvm {
       assert(isValid() && "Invalid twine!");
+    /// Since the intended use of twines is as temporary objects, assignments 
+    /// when concatenating might cause undefined behavior or stack corruptions
+    Twine& operator=(const Twine &Other) LLVM_DELETED_FUNCTION;
     /// isNull - Check for the null twine.
     bool isNull() const {
       return getLHSKind() == NullKind;

-------------- next part --------------
== Explanation and test case for Twine stack overflow problem ==

As the documentation stands, Twines are inherently dangerous and they should only be used as temporary arguments 
to a function. No temporaries should be stored as in the following
void foo(const Twine &T);
StringRef X = ...
unsigned i = ...
const Twine &Tmp = X + "." + Twine(i);

anyway there's a different case much more subtle that doesn't seem to create problems at first and (tested on a Win8 
machine - VS2012) sometimes only shows up in different project configurations (e.g. Release but not in Debug):
 const char *ptr1 = "_text1";
 const char *ptr2 = "_text2";
 const char *ptr3 = "_text3";
 llvm::Twine data = llvm::Twine("data") + ptr1 + ptr2 + ptr3;
 function_call(data); // function_call(Twine& tw);
This is a totally different case where data consistency is put at risk and only at runtime one might encounter (or 
not) a quite substantial variety of errors, mostly stack overflows.

The reason is explained by the following C++ test-case which causes a runtime stack-overflow on a Win8 VS2012 system

#include <iostream>
using namespace std;

class Twine;

enum Kind {twineType, stringType, emptyKind};

union Child {
  const char *cStringPtr;
  const Twine* twine;

// Simplified model of the Twine class - relevant excerpts
class Twine {

  Child LHS;
  Kind LHSKind;
  Child RHS;
  Kind RHSKind;

  Twine(const char* pt) {
    LHSKind = Kind::stringType;
    LHS.cStringPtr = pt;
    RHSKind = Kind::emptyKind;

  explicit Twine(Child _LHS, Kind _LHSKind,
    Child _RHS, Kind _RHSKind)
    : LHS(_LHS), RHS(_RHS), LHSKind(_LHSKind), RHSKind(_RHSKind) {

  Twine Twine::concat(const Twine &Suffix) const {
    Child NewLHS, NewRHS;
    NewLHS.twine = this;
    NewRHS.twine = &Suffix;
    Kind NewLHSKind = Kind::twineType, NewRHSKind = Kind::twineType;

    if(RHSKind == Kind::emptyKind) { // If LHS is unused, just copy the POD stuff and forget about the object (save indirection)
      NewLHS = LHS;
      NewLHSKind = LHSKind;
    if(Suffix.RHSKind == Kind::emptyKind) { // If RHS is unused, just copy the POD stuff and forget about the object (save indirection)
      NewRHS = Suffix.LHS;
      NewRHSKind = Suffix.LHSKind;
    return Twine(NewLHS, NewLHSKind, NewRHS, NewRHSKind);


Twine operator+(const Twine &LHS, const Twine &RHS) {
  return LHS.concat(RHS);

int main(int argc, char* argv[])

  Twine dt("A");
  dt = dt + "B"; // First time this works since the temporary Child data is entirely copied (POD = deep copy)

  dt = dt + "C"; // Second time this doesn't work! dt's Child LHS (which contained the pointer to the data) is 
                 // substituted with a pointer to itself. No more data.

  // dt.str(); // Infinite recursion when exploring LHS and stack overflow

  return 0;

Whether or not this is considered a bug, it can introduce subtle memory problems in an application and aside 
from the "Twine is inherently dangerous" line, there's no mention of it. The proposed fix attempts to avoid such 
a (likely) unintended use.

As for the "Rule of Three", I believe the issue only stands for already-initialized reuse of objects, thus: only 
for assignment operators.

More information about the llvm-commits mailing list