[llvm-commits] Twines

Daniel Dunbar daniel at zuster.org
Thu Jul 23 18:26:48 PDT 2009


On Thu, Jul 23, 2009 at 11:41 AM, Chris Lattner<clattner at apple.com> wrote:
> On Jul 23, 2009, at 12:26 AM, Daniel Dunbar wrote:
>> There are also a number of hacks to make introducing Twines easier,
>> which should be removed:
>> 1. Twine shouldn't actually have a conversion operator to std::string.
>> 2. The patch hacks in support for writing StringRef's to
>> std::ostream, we should move clients to raw_ostream instead.
>> 3. We probably don't want Twine.h in Value.h, clients should
>> explicitly import it if they want to get implicit string
>> concatenation.
>
>
> You're planning to do this as a follow-on, right?

Yes to #1 and #2. Chris and I discussed #3 and decided that including
Twine.h in Value.h was reasonable. I need to do some more prep on #s 1
and 2 to know exactly how much work is involved -- hopefully it
doesn't amount to converting everything to raw_ostream. :)

> It would also be nice to remove a bunch of the Value::*Name* methods.

Definitely.

> Please also add a short mention of this in the ProgrammersManual.html
> file (datastructure section).

Certainly. I have adding both StringRef and Twine docs on my TODO
list, I'll probably try to have patches for all this stuff before I
actually bring it in.

> +++ include/llvm/ADT/Twine.h    (revision 0)
>
> +  /// A Twine is a kind of rope, it represents a concatenated string
> by a
> +  /// binary-tree, where the string is the preorder of the nodes.
> Since the
>
> "by a" -> "with a" or "as a"


> +  // FIXME: We could normalize "" (in its C string, StringRef, and
> std::string)
> +  // spellings to Empty. I decided this was not worth the effort,
> complexity, or
> +  // increased code size; clients rarely concatenate empty strings?
>
> I agree, don't worry about empty strings.

Ok. One case Chris and I did decide to handle is to make sure that a call to
--
void f0(const Twine &Name = "");
--
can be optimized so that there isn't a reference to an empty string.
This is important, since otherwise we would want to write such
functions using Twine() as a default value, and I much prefer the
readability of the former.

> +    /// LHSKind - The NodeKind of the left hand side, \see
> getLHSKind().
> +    unsigned LHSKind : 8;
> +    /// RHSKind - The NodeKind of the left hand side, \see
> getLHSKind().
> +    unsigned RHSKind : 8;
>
> Please declare as "NodeKind LHSKind : 8;"

Ok. I get paranoid about using enumerations in bitfields but this case is safe.

> +++ lib/Support/Twine.cpp       (revision 0)
> @@ -0,0 +1,83 @@
> +//===-- Twine.h - Fast Temporary String Concatenation
> ---------------------===//
>
> Please s/.h/.cpp.

Ok.

> +std::string Twine::str() const {
> +  std::string Res;
> +  raw_string_ostream OS(Res);
> +  print(OS);
> +  return Res;
> +}
>
> Do you anticipate the performance of this method to matter?  If so,
> you could do a pre-traversal to get a bound for the length of the
> resultant string, then do a reserve, then get the contents.

I thought about that, but I think the current method is likely faster.
raw_ostream has a large buffer, so I think the negative effects of
chasing the rope twice would be worse on average than the cost of
occasionally concatenating multiple times to an std::string. Of
course, it could be rewritten to use a small vector, but if we cared
we should just do that inside raw_string_ostream. Actually I think I
just realized that is a nice idea for a different reason...

> +  case Twine::CStringKind:
> +    OS << static_cast<const char*>(Ptr); break;
> +  case Twine::StdStringKind:
> +    OS << *static_cast<const std::string*>(Ptr); break;
> +  case Twine::StringRefKind:
> +    OS << *static_cast<const StringRef*>(Ptr); break;
> +  case Twine::TwineKind:
> +    static_cast<const Twine*>(Ptr)->print(OS); break;
>
> Please put the breaks on their own line unless the whole statement is
> on one line with the 'case'.

Ok.

> +++ include/llvm/ADT/StringRef.h        (working copy)
> @@ -171,6 +171,16 @@
>
>    /// @}
>
> +  struct StringRefWrapper {
> +    const StringRef &Str;
> +    StringRefWrapper(const StringRef &_Str) : Str(_Str) {}
> +  };
> +
> +  template<typename T>
> +  T &operator<<(T &OS, const StringRefWrapper &RHS) {
> +    OS.write(RHS.Str.data(), RHS.Str.size());
> +    return OS;
> +  }
>  }
>
> Why do you need StringRefWrapper?  I assume this is due to ambiguity,
> please add a comment.

Yes, its a temporary hack just to minimize the changes in any one
patch. I'll add an explanation and FIXME, but my hope is to bring it
in then eliminate it.

> +++ tools/bugpoint/ExtractFunction.cpp  (working copy)
> @@ -353,7 +353,8 @@
>      // If the BB doesn't have a name, give it one so we have
> something to key
>      // off of.
>      if (!BB->hasName()) BB->setName("tmpbb");
> -    BlocksToNotExtractFile << BB->getParent()->getName() << " "
> +    // FIXME: Remove explicit .str() call once ambiguous overload
> goes away.
> +    BlocksToNotExtractFile << BB->getParent()->getName().str() << " "
>                             << BB->getName() << "\n";
>    }
>    BlocksToNotExtractFile.close();
> Index: unittests/ADT/StringRefTest.cpp
> ===================================================================
> --- unittests/ADT/StringRefTest.cpp     (revision 76864)
> +++ unittests/ADT/StringRefTest.cpp     (working copy)
> @@ -70,7 +70,8 @@
>
>    std::string Storage;
>    raw_string_ostream OS(Storage);
> -  OS << StringRef("hello");
> +  // FIXME: Remove .str() when overload is fixed.
> +  OS << StringRef("hello").str();
>    EXPECT_EQ("hello", OS.str());
>  }
>
> When will the fixme's be resolved?

I hope to resolve them immediately, just in separate patches.

 - Daniel




More information about the llvm-commits mailing list