<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/55393>55393</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [IROutliner] CTMark/consumer-typeset slow to compile for AArch64 @ -Oz and -O2
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            llvm:optimizations,
            llvm:compiletime
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          ornata
      </td>
    </tr>
</table>

<pre>
I've observed that compiling with the IR outliner enabled is 59.8% slower than the baseline at -Oz, and 52.6% slower than the baseline at -O2.

This was done by compiling [CTMark](https://github.com/llvm/llvm-test-suite/blob/main/CTMark/README.md) at both optimization levels with and without the outliner. CTMark was compiled using [LNT](https://lnt.readthedocs.io/en/latest/).

I collected the average over 3 samples, compiled with a single thread.
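
For anyone re-running the measurement, here is a minimal Python sketch of how a slowdown percentage like the ones above can be derived from averaged samples. The timing values are placeholders for illustration only, not measurements from this report, and `percent_slower` is a hypothetical helper name.

```python
from statistics import mean

def percent_slower(baseline_samples, outliner_samples):
    """Return the slowdown of the outliner build as a percentage of the baseline mean."""
    base = mean(baseline_samples)
    with_outliner = mean(outliner_samples)
    return (with_outliner - base) / base * 100.0

# Placeholder compile times in seconds (NOT real measurements from this issue).
baseline = [10.0, 10.0, 10.0]
outliner = [15.0, 16.0, 17.0]

print(f"{percent_slower(baseline, outliner):.1f}% slower")  # prints "60.0% slower"
```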

I used a debug build of Clang, but I expect the issue to persist with a release build or a release + asserts build.

# Reproducing

Here's IR for one of the files in the benchmark which compiles slowly with the outliner: https://godbolt.org/z/boYe1TPTW

You could also compile consumer-typeset from CTMark for AArch64 to do an end-to-end test.

I think we should be able to take out a big chunk of the slowdown using just the IR above, though. :)

# Analysis

Focusing on -O2 because I don't want to think about the MachineOutliner getting in the way...

I collected time traces using `clang -ftime-trace` and looked at all of them. The worst outlier I found was `z12.c`.

![image](https://user-images.githubusercontent.com/4722725/167938830-ee4422a4-a4bb-4bc1-977e-468b83edf655.png)
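
If you'd rather rank the traces with a script than inspect them by eye, something like the following Python sketch may help. It assumes the `-ftime-trace` output is Chrome trace-event JSON with a top-level `traceEvents` list whose complete events carry `"ph": "X"`, a `name`, and a `dur` in microseconds. The event names in the example (`PassA`, `PassB`) and the helper names are made up for illustration.

```python
import json
from collections import defaultdict

def total_time_by_event(trace):
    """Sum the duration (microseconds) of complete events ("ph" == "X") by name."""
    totals = defaultdict(int)
    for event in trace.get("traceEvents", []):
        if event.get("ph") == "X":
            totals[event["name"]] += event.get("dur", 0)
    return dict(totals)

def slowest_events(trace, top=5):
    """Return the `top` event names with the largest summed duration, largest first."""
    totals = total_time_by_event(trace)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top]

# Tiny hand-written trace standing in for a real -ftime-trace JSON file.
# The event names are hypothetical placeholders, not real Clang event names.
example = json.loads("""
{"traceEvents": [
  {"ph": "X", "name": "PassA", "ts": 0,   "dur": 400},
  {"ph": "X", "name": "PassB", "ts": 400, "dur": 300},
  {"ph": "X", "name": "PassA", "ts": 700, "dur": 300},
  {"ph": "M", "name": "process_name", "args": {"name": "clang"}}
]}
""")

for name, micros in slowest_events(example):
    print(name, micros)
```

Note that trace events nest, so summed durations for different names can overlap; the ranking is only a rough guide to where to look first.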

After that, I used Instruments to find the heaviest stack trace. The innermost frame is listed first, with its callers below it:

```
1.26 s    5.2%    1.26 s           llvm::isa_impl_wrap<llvm::Instruction, llvm::Value const* const, llvm::Value const*>::doit(llvm::Value const* const&)
1.10 s    4.6%    0 s            bool llvm::isa<llvm::Instruction, llvm::Value const*>(llvm::Value const* const&)
595.00 ms    2.4%    0 s             bool llvm::isa<llvm::Instruction, llvm::Value const*>(llvm::Value const* const&)
385.00 ms    1.6%    0 s              llvm::CallInst::classof(llvm::Value const*)
385.00 ms    1.6%    0 s               llvm::isa_impl<llvm::CallInst, llvm::Value, void>::doit(llvm::Value const&)
385.00 ms    1.6%    0 s                llvm::isa_impl_cl<llvm::CallInst, llvm::Value const*>::doit(llvm::Value const*)
385.00 ms    1.6%    0 s                 llvm::isa_impl_wrap<llvm::CallInst, llvm::Value const*, llvm::Value const*>::doit(llvm::Value const* const&)
385.00 ms    1.6%    0 s                  llvm::isa_impl_wrap<llvm::CallInst, llvm::Value const* const, llvm::Value const*>::doit(llvm::Value const* const&)
385.00 ms    1.6%    0 s                   bool llvm::isa<llvm::CallInst, llvm::Value const*>(llvm::Value const* const&)
378.00 ms    1.5%    0 s                    llvm::IntrinsicInst::classof(llvm::Value const*)
364.00 ms    1.5%    0 s                     llvm::isa_impl<llvm::IntrinsicInst, llvm::Value, void>::doit(llvm::Value const&)
364.00 ms    1.5%    0 s                      llvm::isa_impl_cl<llvm::IntrinsicInst, llvm::Value const*>::doit(llvm::Value const*)
364.00 ms    1.5%    0 s                       llvm::isa_impl_wrap<llvm::IntrinsicInst, llvm::Value const*, llvm::Value const*>::doit(llvm::Value const* const&)
363.00 ms    1.5%    0 s                        llvm::isa_impl_wrap<llvm::IntrinsicInst, llvm::Value const* const, llvm::Value const*>::doit(llvm::Value const* const&)
363.00 ms    1.5%    0 s                         bool llvm::isa<llvm::IntrinsicInst, llvm::Value const*>(llvm::Value const* const&)
295.00 ms    1.2%    0 s                          llvm::DbgInfoIntrinsic::classof(llvm::Value const*)
295.00 ms    1.2%    0 s                           llvm::isa_impl<llvm::DbgInfoIntrinsic, llvm::Instruction, void>::doit(llvm::Instruction const&)
295.00 ms    1.2%    0 s                            llvm::isa_impl_cl<llvm::DbgInfoIntrinsic, llvm::Instruction const>::doit(llvm::Instruction const&)
295.00 ms    1.2%    0 s                             llvm::isa_impl_wrap<llvm::DbgInfoIntrinsic, llvm::Instruction const, llvm::Instruction const>::doit(llvm::Instruction const&)
295.00 ms    1.2%    0 s                              bool llvm::isa<llvm::DbgInfoIntrinsic, llvm::Instruction>(llvm::Instruction const&)
295.00 ms    1.2%    0 s                               llvm::BasicBlock::instructionsWithoutDebug(bool)::$_1::operator()(llvm::Instruction&) const
295.00 ms    1.2%    0 s                                decltype(static_cast<llvm::BasicBlock::instructionsWithoutDebug(bool)::$_1&>(fp)(static_cast<llvm::Instruction&>(fp0))) std::__1::__invoke<llvm::BasicBlock::instructionsWithoutDebug(bool)::$_1&, llvm::Instruction&>(llvm::BasicBlock::instructionsWithoutDebug(bool)::$_1&, llvm::Instruction&)
295.00 ms    1.2%    0 s                                 bool std::__1::__invoke_void_return_wrapper<bool, false>::__call<llvm::BasicBlock::instructionsWithoutDebug(bool)::$_1&, llvm::Instruction&>(llvm::BasicBlock::instructionsWithoutDebug(bool)::$_1&, llvm::Instruction&)
295.00 ms    1.2%    0 s                                  std::__1::__function::__alloc_func<llvm::BasicBlock::instructionsWithoutDebug(bool)::$_1, std::__1::allocator<llvm::BasicBlock::instructionsWithoutDebug(bool)::$_1>, bool (llvm::Instruction&)>::operator()(llvm::Instruction&)
295.00 ms    1.2%    0 s                                   std::__1::__function::__func<llvm::BasicBlock::instructionsWithoutDebug(bool)::$_1, std::__1::allocator<llvm::BasicBlock::instructionsWithoutDebug(bool)::$_1>, bool (llvm::Instruction&)>::operator()(llvm::Instruction&)
295.00 ms    1.2%    0 s                                    std::__1::__function::__value_func<bool (llvm::Instruction&)>::operator()(llvm::Instruction&) const
295.00 ms    1.2%    0 s                                     std::__1::function<bool (llvm::Instruction&)>::operator()(llvm::Instruction&) const
295.00 ms    1.2%    0 s                                      llvm::filter_iterator_base<llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, true, false, void>, false, false>, std::__1::function<bool (llvm::Instruction&)>, std::__1::bidirectional_iterator_tag>::findNextValid()
270.00 ms    1.1%    0 s                                       llvm::filter_iterator_base<llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, true, false, void>, false, false>, std::__1::function<bool (llvm::Instruction&)>, std::__1::bidirectional_iterator_tag>::operator++()
261.00 ms    1.0%    0 s                                        llvm::CodeExtractorAnalysisCache::CodeExtractorAnalysisCache(llvm::Function&)
261.00 ms    1.0%    0 s                                         llvm::CodeExtractorAnalysisCache::CodeExtractorAnalysisCache(llvm::Function&)
260.00 ms    1.0%    0 s                                          getCodeExtractorArguments(llvm::OutlinableRegion&, std::__1::vector<unsigned int, std::__1::allocator<unsigned int> >&, llvm::DenseSet<unsigned int, llvm::DenseMapInfo<unsigned int> >&, llvm::DenseMap<llvm::Value*, llvm::Value*, llvm::DenseMapInfo<llvm::Value*>, llvm::detail::DenseMapPair<llvm::Value*, llvm::Value*> >&, llvm::SetVector<llvm::Value*, std::__1::vector<llvm::Value*, std::__1::allocator<llvm::Value*> >, llvm::DenseSet<llvm::Value*, llvm::DenseMapInfo<llvm::Value*> > >&, llvm::SetVector<llvm::Value*, std::__1::vector<llvm::Value*, std::__1::allocator<llvm::Value*> >, llvm::DenseSet<llvm::Value*, llvm::DenseMapInfo<llvm::Value*> > >&)
260.00 ms    1.0%    0 s                                           llvm::IROutliner::findAddInputsOutputs(llvm::Module&, llvm::OutlinableRegion&, llvm::DenseSet<unsigned int, llvm::DenseMapInfo<unsigned int> >&)
260.00 ms    1.0%    0 s                                            llvm::IROutliner::doOutline(llvm::Module&)
```
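
Reading the trace from the bottom up, `IROutliner::doOutline` reaches `getCodeExtractorArguments` through `findAddInputsOutputs`, which takes a single `OutlinableRegion`, and that in turn constructs a `CodeExtractorAnalysisCache` that walks the function's instructions. If that reading is right, the cache appears to be rebuilt once per region. The toy Python model below is not LLVM code; it only illustrates how the amount of scanning would scale under that assumption. All function names in it are hypothetical, and it does not check whether sharing a cache would be valid in the real pass, where the function changes as regions are extracted.

```python
def build_cache(function_instructions):
    """Stand-in for a per-function analysis cache: one full scan of the function."""
    scanned = 0
    for _ in function_instructions:
        scanned += 1
    return scanned

def scans_rebuilding_per_region(function_instructions, num_regions):
    """Instructions visited if every region rebuilds the cache from scratch."""
    return sum(build_cache(function_instructions) for _ in range(num_regions))

def scans_with_shared_cache(function_instructions, num_regions):
    """Instructions visited if one cache is built and reused for all regions."""
    return build_cache(function_instructions) if num_regions else 0

# Hypothetical sizes: a 10,000-instruction function with 50 candidate regions.
instructions = range(10_000)
print(scans_rebuilding_per_region(instructions, 50))  # prints 500000
print(scans_with_shared_cache(instructions, 50))      # prints 10000
```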

cc @AndrewLitteken 
</pre>