bug in PrescheduleNodesWithMultipleUses()

Fri Aug 23 03:11:01 PDT 2013

Hi

(this picks up from the thread in llvmdev: "PrescheduleNodesWithMultipleUses() causing failure in PickNodeToScheduleBottomUp() ???")

PrescheduleNodesWithMultipleUses() is a function that tries to transform a DAG where nodes have multiple uses, with the aim of reducing the live range.
(see comment in lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp)

However, it does not check to see if callResources will be affected by the transformation.

The problem
========
 Take the following correctly scheduled DAG (arrow to predecessor):

  SetUp2                SetUp1
    ^                     ^
    |                     |
    |                     |
 Destroy2---->PredSU <----SU
    ^           ^         ^
    |           |         |
    |           |         |
    ----------- | ---------
              | | |
            Destroy1
                ^
                |

 In this example there are two successors of 'PredSU' with type
 getCallFrameDestroyOpcode (Destroy) and one is a successor of the other.
 Taking the successor of the two Destroys (Destroy1), note that it's
 matching getCallFrameSetupOpcode (Setup1) is a predecessor of 'SU'.
 In this situation, re-routing the dependency on 'PredSU' through 'SU' will
 cause a dead lock Viz:

  SetUp2      PredSU    SetUp1
    ^            ^        ^
    |            |        |
    |            |        |
 Destroy2-----> SU -------
    ^           ^ ^
    |           | |
    |           | |
    ----------- | |
              | | |
             Destroy1
                ^
                |

The new function callResourceDeadLockDanger() will check for this situation
and prevent PrescheduleNodesWithMultipleUses() from making the transformation
if it will cause a callResource dead lock.

Outstanding issues
============

1. Is it too aggressive in searching predecessors and successors?
   Should the algorithm give up and assume the worst if the depth of search reaches a predefined limit?
   Can someone confirm that the extra processing time is not onerous.

2. Should the initial search for 'SetUp1' and 'Destroy1' only search along chains? viz conditional upon II->isCtrl()
   This will reduce the search space, but are getCallFrameSetupOpcode & getCallFrameDestroyOpcode always 'chained'?
    (Later searches for Destroy2 need to check all predecessors)

3. What is the best way to construct the test case?
    Using an IR as input does not guarantee the required DAG will be output for testing.
    The test IR only produces the correct DAG when built for the XCore target.
    (see attached test case)

4. Where should the test be placed?
    Currently I have placed it under test/CodeGen/XCore but it relates to lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp

Thank you

Robert

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130823/a0bb0c94/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PatchCallResourceDeadLock
Type: application/octet-stream
Size: 6952 bytes
Desc: PatchCallResourceDeadLock
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130823/a0bb0c94/attachment.obj>