[LLVMdev] Correct use of StringRef and Twine

David Blaikie dblaikie at gmail.com
Thu Jul 21 00:00:13 PDT 2011


> And for arguments, generally always use Twine as the default, it allows construction of complex things, and is still efficient when passed the equiv of a StringRef (with the toStringRef method).  The only annoying thing about it is that the API to do this requires a temporary SmallVector to scribble in, which makes it more difficult to use.
>
> It seems that there should be a good way to fix this API problem.

This is the problem I'm still trying to figure out. It seems that
while Twine is efficient to pass, it's not easy to use efficiently & I
don't think it'd be appropriate (correct me if I'm wrong) to go adding
in temporary SmallVectors all over the place to try to consume Twine
parameters in any code that needs to handle string contents after
migrating the argument types to Twines.
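
To make the cost concrete, here is a minimal sketch of the scratch-buffer pattern, written against the C++17 standard library so it compiles without LLVM. MiniTwine and countDashes are hypothetical stand-ins, not LLVM classes: std::string_view plays the role of StringRef and a std::string plays the role of the temporary SmallVector. It only mirrors the shape of the toStringRef method mentioned above and is not a description of how Twine is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for a Twine-like type: a node that refers to its
// pieces without copying them. It must not outlive its operands.
class MiniTwine {
  const MiniTwine *LHS = nullptr; // children, set only for concat nodes
  const MiniTwine *RHS = nullptr;
  std::string_view Leaf;          // text, used only for leaf nodes

public:
  MiniTwine(std::string_view S) : Leaf(S) {}
  MiniTwine(const char *S) : Leaf(S) {}
  MiniTwine(const std::string &S) : Leaf(S) {}
  MiniTwine(const MiniTwine &L, const MiniTwine &R) : LHS(&L), RHS(&R) {}

  bool isLeaf() const { return LHS == nullptr; }

  // Append the flattened text of this node to Out.
  void appendTo(std::string &Out) const {
    if (isLeaf()) {
      Out.append(Leaf);
      return;
    }
    LHS->appendTo(Out);
    RHS->appendTo(Out);
  }

  // A leaf is returned directly with no copy; anything else is flattened
  // into the caller-provided scratch storage.
  std::string_view toStringRef(std::string &Scratch) const {
    if (isLeaf())
      return Leaf;
    assert(Scratch.empty() && "scratch buffer is expected to start empty");
    appendTo(Scratch);
    return Scratch;
  }
};

inline MiniTwine operator+(const MiniTwine &L, const MiniTwine &R) {
  return MiniTwine(L, R);
}

// A sink that has to look at individual characters must declare the
// scratch buffer itself before it can do any work.
inline std::size_t countDashes(const MiniTwine &Name) {
  std::string Scratch;
  std::string_view S = Name.toStringRef(Scratch);
  std::size_t N = 0;
  for (char C : S)
    if (C == '-')
      ++N;
  return N;
}
```

A call such as countDashes(MiniTwine(Arch) + "-" + OS) builds the concatenation lazily and only flattens it inside the sink; a plain countDashes("plain") takes the leaf path and never touches the scratch buffer. The boilerplate in countDashes is what every character-reading consumer would have to repeat.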

One place I experimented with fixing tonight (after trying broader
goals - like changing all StringRef args in clang only, say) was to
add a Twine(char) ctor to enable llvm::Triple's ctor to take Twines
rather than StringRefs, and then do Twine op+ to build the Data
member.

The problem I see with this is that the current implementation of
Triple's ctor is still more efficient than the simple Twine version:

: Data(x)
{
Data += '-';
Data += y;
Data += '-';
Data += z;
}

(essentially), as opposed to:

: Data((x + '-' + y + '-' + z).str())

Which requires an extra string copy of the final value in all cases...
actually, now that I think about it, since it's returning into a ctor,
this might be nicely optimized - so perhaps this is the "right way" to
write this code. [diff attached]
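
For comparison, a small sketch of the two constructor shapes using only std::string. AppendTriple and ConcatTriple are hypothetical stand-ins for llvm::Triple reduced to its Data member, and plain std::string concatenation stands in for the Twine expression followed by str(). The relevant point is that a member initialized from a temporary std::string is move-constructed (or the copy is elided), so the final value is not copied a second time.

```cpp
#include <string>

// Incremental form: copy the first piece, then append the rest in place.
struct AppendTriple {
  std::string Data;

  AppendTriple(const std::string &X, const std::string &Y,
               const std::string &Z)
      : Data(X) {
    Data += '-';
    Data += Y;
    Data += '-';
    Data += Z;
  }
};

// Single-expression form: build the whole value as a temporary and
// initialize Data from it. The temporary is moved into Data, not copied.
struct ConcatTriple {
  std::string Data;

  ConcatTriple(const std::string &X, const std::string &Y,
               const std::string &Z)
      : Data(X + '-' + Y + '-' + Z) {}
};
```

Both forms produce the same Data. Note that in this stand-in the std::string concatenation creates intermediate temporaries that a Twine would avoid, so the sketch says nothing about which form is faster in LLVM; it only illustrates why the final initialization need not cost an extra copy.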

So then here's another example (can't find the exact piece of code I
had been working on, but taking
tools/clang/lib/Basic/Targets.cpp::getCPUDefineSuffix as an example
anyway - just something using a StringSwitch, really). If this
function were to take a const Twine& and pass it along to
StringSwitch, ultimately StringSwitch (being the sink for the Twine -
the code needing to read the individual character elements) would be
responsible for allocating a temporary SmallVectorImpl, passing that
in, getting a StringRef & then using that for its work.

Another example of a string sink is, say, tools/llvmc/src/Hooks.cpp. I
found this while looking for uses of "+= '-'" to use the char support
for Twine I'd just added. But I can't upgrade the Arg argument from
const std::string& to const Twine& easily since it needs to actually
manipulate the string - find, access elements, and substr it.
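
As an illustration of that kind of sink, here is a hypothetical helper (not the code in Hooks.cpp) that does a find and two substr calls. It takes std::string_view as a stand-in for StringRef: those operations need contiguous characters, which a view over an existing string already provides without a copy, whereas a Twine-like argument would first have to be flattened.

```cpp
#include <cstddef>
#include <string_view>
#include <utility>

// Hypothetical helper: split Arg at its first '-'. Both halves are views
// into the caller's storage, so nothing is allocated or copied.
inline std::pair<std::string_view, std::string_view>
splitAtFirstDash(std::string_view Arg) {
  std::size_t Pos = Arg.find('-');
  if (Pos == std::string_view::npos)
    return {Arg, std::string_view()};
  return {Arg.substr(0, Pos), Arg.substr(Pos + 1)};
}
```

The returned views are only valid while the original string is alive, which is the same lifetime rule a StringRef parameter imposes.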

Should this work just be done case by case? If so, I don't think I'll
end up with much Twine usage, probably just a lot of StringRef usage &
lots of str() calls.
If Twine were able to be a more drop-in replacement (by providing a
StringRef in a more convenient manner or, ideally, by subsuming
StringRef's functionality entirely and simply allocating a backing
buffer when necessary - the suggestion you mentioned was dangerous,
though I'm still tossing it around as something to try), it'd be more
practical to use it as the go-to argument type for anything that
needs a string.

Looking forward to hearing anyone's thoughts on this.

- David
-------------- next part --------------
A non-text attachment was scrubbed...
Name: twine_triple.diff
Type: application/octet-stream
Size: 3201 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110721/eab246db/attachment.obj>

