[LLVMdev] FoldingSetNodeID operations inefficiency

Roman Levenstein romixlev at yahoo.com
Wed Apr 23 03:22:53 PDT 2008


Hi,

While profiling LLVM on my test-cases with huge MBBs, I noticed that 
FoldingSetNodeID operations (ComputeHash, insertion, etc.) may become 
really inefficient for nodes that have very many operands.

I can give you an example of what is meant by "very many". In my 
test-case (you can fetch it from here 
http://llvm.org/bugs/attachment.cgi?id=1275), which is just one HUGE MBB 
with tens of thousands of instructions, there is a TokenFactor node N 
that has ... 3200 operands! (These operands are PendingLoads created by 
SelectionDAGLowering::getLoadFrom and then copied into the 
TokenFactor node by SelectionDAGLowering::getRoot, called from 
SelectionDAGLowering::visitStore.)

During the code selection phase, SelectionDAG::ReplaceAllUsesWith is 
called many times, and in many cases it tries to update this node by 
replacing some of its 3200 operands with a different operand. After each 
such operand replacement, this snippet of code is typically executed 
(see SelectionDAG.cpp):

  // Now that we have modified U, add it back to the CSE maps.
  // If it already exists there, recursively merge the results together.
  if (SDNode *Existing = AddNonLeafNodeToCSEMaps(U)) {
    ReplaceAllUsesWith(U, Existing, UpdateListener);
    // U is now dead.  Inform the listener if it exists and delete it.
    if (UpdateListener)
      UpdateListener->NodeDeleted(U);
    DeleteNodeNotInCSEMaps(U);
  } else {
    // If the node doesn't already exist, we updated it.
    // Inform a listener if it exists.
    if (UpdateListener)
      UpdateListener->NodeUpdated(U);
  }

So, basically, the CSE map has to be updated. To achieve that, a hash of 
the updated node N is computed (and its internal FoldingSetNodeID::Bits 
array contains about 6400 elements!) and the node is re-inserted.

My observations show that the amount of FoldingSetNodeID work on this 
node N is almost quadratic. It manifests itself by consuming about 20% 
of the compilation time on these operations alone.

Question:
Does anyone see an opportunity to solve this issue?
My current ideas are:
  - Avoid recomputing hash values for nodes that have not changed since 
the hash was last computed.
  - Do some sort of lazy CSE-map update: accumulate operand changes for 
a given node and then do only one update. But I don't yet see how this 
can be done practically.
  - Optimize SelectionDAG::ReplaceAllUsesWith to do bulk updates, i.e. 
update many user nodes of the given node at once, or update all operands 
of the given node at once. Again, I don't see how to do this 
practically.
  - Maybe the issue is related to the strange TokenFactor node with the 
3200 operands. Maybe creating a node with so many operands can be 
avoided?

Any ideas are appreciated!

-Roman









