Introduce StringBuilder utility around raw_string_ostream

Alp Toker alp at nuanti.com
Thu Jun 12 05:58:07 PDT 2014


On 12/06/2014 15:42, Chandler Carruth wrote:
>
> On Thu, Jun 12, 2014 at 1:31 PM, Alp Toker <alp at nuanti.com 
> <mailto:alp at nuanti.com>> wrote:
>
>     Okay, but that's aiming for a hypothetical LLVM project where
>     developers are paying attention to things like string storage. And
>     that they're aiming not only for correctness, but successfully
>     choosing efficient backing to avoid heap allocations.
>
>
>     In reality, there are a bunch of developers trying to write
>     various compiler backends, frontends and analyses. Their patches
>     get peer reviewed by people who history has shown don't care about
>     the intricacies of stack string allocation.
>
>
> Your view and experience with the LLVM project differs from mine, I 
> have seen the reverse of what you describe.
>
>
>     The enforced safety in this patch comes without any cost,
>
>
> Having yet another way to build up a string using a streaming 
> interface is a cost.
>
>
>         , and remove the misuses of flush when we find them (either
>         through an audit or in code review). Now, if there are simple
>         ways to make the interfaces of the stream classes better so
>         that this is easier to get right and harder to get wrong, I'm
>         all for that.
>
>
>     Simpler? The proposed interface is 13 lines of code :-)
>
>     I'm pretty sure a lot of people will appreciate this. It's just
>     too easy to access the string when it's sitting around in an
>     inconsistent state accessible to the caller, and it's a fact that
>     review didn't catch the errors found in-tree.
>
>
> How many such bugs are in the tree right now? That might be a 
> compelling data point for arguing that we need this.

I didn't do an exhaustive search, here are a few:

Invalid access:

     std::string getEdgeDestinations() {
       std::string EdgeDestinations;
       raw_string_ostream EDOS(EdgeDestinations);
       Function *F = Blocks.begin()->first->getParent();
       for (Function::iterator I = F->begin(), E = F->end(); I != E; ++I) {
         GCOVBlock &Block = *Blocks[I];
         for (int i = 0, e = Block.OutEdges.size(); i != e; ++i)
           EDOS << Block.OutEdges[i]->Number;
       }
       return EdgeDestinations;
     }

Invalid access:

   std::string getNodeLabel(const BasicBlock *Node,
                            const BlockFrequencyInfo *Graph) {
     std::string Result;
     raw_string_ostream OS(Result);

     OS << Node->getName().str() << ":";
     switch (ViewBlockFreqPropagationDAG) {
     case GVDT_Fraction:
       Graph->printBlockFreq(OS, Node);
       break;
     case GVDT_Integer:
       OS << Graph->getBlockFreq(Node).getFrequency();
       break;
     case GVDT_None:
       llvm_unreachable("If we are not supposed to render a graph we 
should "
                        "never reach this point.");
     }

     return Result;
   }

Encourages inefficient use (see updated version):

static std::string createJSONText(size_t MemoryMB, unsigned ValueSize) {
   std::string JSONText;
   llvm::raw_string_ostream Stream(JSONText);
   Stream << "[\n";
   size_t MemoryBytes = MemoryMB * 1024 * 1024;
   while (JSONText.size() < MemoryBytes) {
     Stream << " {\n"
            << "  \"key1\": \"" << std::string(ValueSize, '*') << "\",\n"
            << "  \"key2\": \"" << std::string(ValueSize, '*') << "\",\n"
            << "  \"key3\": \"" << std::string(ValueSize, '*') << "\"\n"
            << " }";
     Stream.flush();
     if (JSONText.size() < MemoryBytes) Stream << ",";
     Stream << "\n";
   }
   Stream << "]\n";
   Stream.flush();
   return JSONText;
}

Works only thanks to consecutive use of .str() and .clear():

bool Lint::runOnFunction(Function &F) {
   Mod = F.getParent();
   AA = &getAnalysis<AliasAnalysis>();
   DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();
   DataLayoutPass *DLP = getAnalysisIfAvailable<DataLayoutPass>();
   DL = DLP ? &DLP->getDataLayout() : nullptr;
   TLI = &getAnalysis<TargetLibraryInfo>();
   visit(F);
   dbgs() << MessagesStr.str();
   Messages.clear();
   return false;
}

There are more in clang, and that's where I came up with the need.


>
> If lots of people show up on the review thread saying they strongly 
> prefer the approach of yet-another-stream interface

I'm not going to rally the troops, the code hopefully speaks for itself :-)

Alp.


> which embeds storage rather than wrapping external storage (because 
> this is a stream interface, not a builder, and the stream interface 
> already has a '.str()' method), then great. But my judgement is that 
> we don't need this, we need more care when using raw_string_ostream.



-- 
http://www.nuanti.com
the browser experts




More information about the llvm-commits mailing list