<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/75666>75666</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            JumpThreading blows up LLVM IR size by a factor of N^2 with @llvm.lifetime.end instrinsic
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          terrelln
      </td>
    </tr>
</table>

<pre>
    # Repro

This issues causes compiling a file with `-O3 -fsanitize=fuzzer` to take hours and exhibit N^3 behavior or worse.

1. Download the file [TimeZoneDatabase.cpp](https://github.com/facebookincubator/velox/blob/3bb636b4d97b9b661a94804c0007752aa3edade3/velox/type/tz/TimeZoneDatabase.cpp)
2. `clang++ -O3 -fsanitize=fuzzer TimeZoneDatabase.cpp` and see that it runs for hours
3. Disable the `JumpThreadingPass` and re-run the compilation, and see that it finishes in ~5 minutes

# Debugging

@apolloww profiled and saw that a lot of time was being spent inside of DCE, specifically eliminating dead lifetime intrinsics here:

https://github.com/llvm/llvm-project/blob/d6f772074c48cf9bb57191cb065b5ce60012ed74/llvm/lib/Transforms/Utils/Local.cpp#L486-L491

That led me to find out that the `JumpThreadingPass` was blowing up the code size by a factor of N^2 in the size of the map.

If I truncate the map to the first 3 elements and print the IR before/after the first `JumpThreadingPass` I see this:

https://gist.github.com/terrelln/f8ac8f1b8caf83590a829c5d9d88e471

Specifically before `JumpThreadingPass` the cleanup blocks look like this:

```ll
35:                                               ; preds = %13
  %36 = landingpad { ptr, i32 }
          cleanup
  br label %60

37:                                               ; preds = %14
  %38 = landingpad { ptr, i32 }
          cleanup
  br label %56

39:                                               ; preds = %16
  %40 = landingpad { ptr, i32 }
          cleanup
  br label %53
```

After the `JumpThreadingPass` they look like this:

```
34:                                               ; preds = %13
  %35 = landingpad { ptr, i32 }
          cleanup
  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %2) #13
  br label %67

36: ; preds = %14
  %37 = landingpad { ptr, i32 }
          cleanup
  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %3) #13
  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %2) #13
  br label %54

38: ; preds = %16
  %39 = landingpad { ptr, i32 }
          cleanup
  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %4) #13
  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %3) #13
  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %2) #13
  br label %54
```

You can see the first block has one `@llvm.lifetime.end` intrinsics, the second has two, and the third has three. This extends all the way to the 2234 entries that the original file has, leading to (2234 * 2235) / 2 = 2,496,495 `@llvm.lifetime.end` intrinsics.

Normally,  the `JumpThreadingPass` has a limit in how many instructions it will duplicate controlled by [BBDuplicateThreshold](https://github.com/llvm/llvm-project/blob/aaa3f72c1ce6e1757df79c0d02e0675201ee07a3/llvm/lib/Transforms/Scalar/JumpThreading.cpp#L88-L91). However, [lifetime annotation intrinsics have a cost of 0](https://github.com/llvm/llvm-project/blob/aaa3f72c1ce6e1757df79c0d02e0675201ee07a3/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h#L719), so it is allowed to blow up the LLVM IR temporarily, since it will eventually be cleaned up.

This is all fine when the number of intrinsics added is sane, but when adding 2 million extra intrinsics, that ends up slowing down later passes quadratically. Specifically, now each of the N `%2 = alloca i32, align 4` has up to N `@llvm.lifetime.end` uses, which really slows down [wouldInstructionBeTriviallyDead](https://github.com/llvm/llvm-project/blob/d6f772074c48cf9bb57191cb065b5ce60012ed74/llvm/lib/Transforms/Utils/Local.cpp#L486-L491) because it iterates over all of them.

# Fix

Ultimately, I think `JumpThreadingPass` should refuse to duplicate blocks with too many instructions, including intrinsics. Or stop if it has grown the total IR size by a certain factor. However, I don't know what would be the most sane way to approach this, since I'm not familiar with the code.

If someone has a pointer to how they would like it fixed, I'd be interested in putting up a PR.
</pre>
<img width="1px" height="1px" alt="" src="http://email.email.llvm.org/o/eJzEWFtv4zjS_TXMSyGGTN2sBz_E4zG-fMj2DGYyC-y-laSSxQ1FakkqTvphf_uiKLtjd6fTM5jsdNCw0BIvdTmneIrovdoborXINyLfXuEUeuvWgZwjrc1VbdvntZAp_EKjsyLZiuRm_r3vlQfl_UQeGpw8P-wwKq3MHhA6pQkOKvQgiuT6pxSuO49GBfWRRLrtpo8fyYkigWAh4ANBbyfnAU0L9NSrWgX4IPIfU6ipx0dlHVgHB-s8Lc6NWC5gaw9GW2wh9DTvKvLNvRron9bQFgPW6GnRjKPIt0Ku-hBGL9IbIXdC7vYq9FO9aOwg5K7DhmprH5RpphqDdULuHknbJyF3tba1kLu0rou0qLO2KuuqLoolVtkqyZokScoyl4gptdhSejYzPI_Ej49C7l41S1azL3LBoWo0mr2QGyE38JWwwavLFEmMnieC0GMAFcBNxkNn3RzceZd0AVvlsdYUAyaK5P-nYbzvHWGrzP5n9P60lKNrN5k4bM4sBmWNkD98sVGnjPI9eVAG_pPDoMwUyJ8niiG0pXra75XZX3zIEhyt1vZwgNFZTmA7r4-HeX0EbQPYDoIaCA7ooSbGmB_JBFDGq5b48_aHH9k2P1KjOtWg1s9AWg3KYODxLWELWnUU11EmOJ7beOjJESPizKo3UKL14-lxPTr7L2rCCz7aoitLmZRZk62arqrrvFxWy6ZOirzOGyqSZCmpLbOzdRTPu3dofGfd4IXc_RaU5uedbVDPEEnvslVxfZdVy0sKYgCO10DMo06ZFuwU5rC9ld0YRG0PHJZpPGa4JfDqI0H9zPTFJjDpukhDyYnlUXEAp6InGHC84OJtB7cQ3GQaDHQaEfkdiel8gBRI00AmzEwfnTKzobe_QE2ddcwU7AK5s0lfc-L2CEHlv5E8HxYXGTwVN6b8CptVt6xXDXarNK8SXMmqyduqXa0oKy-i_es5sGZrv2pbjKgmNNPIgW4ePGhrH0Crh1dNFkUy_9P6SNNcpDfwx_5EuoHRUetBpFsQMl-m82LA_0mL-FqjYUNHbEGUGxiDY9KoVIIot6fhp7-jC6fXtQONNWlerkjO7U_L9zA3Ozd39a7m5sWFudV7mFucmZsl72tu-hkuzq2_-USQN-D3_DsBdwxI9u5wy_90QJhp8GhVCyJLuFwuTtV7QaZdjImQK1VkkPGSY3BgrDGTjgGUQlYgZPpi0wV4yws0FOz8N-BYfld30i_c-d8FJ88ugrN6PTjn4E-r7xqc7F2D811C_RrN_2EnaNAcT7nTaRhPE-jRgzWxALxmE9eAF4HDZsXDmxpr2jg3HOxJx_GX0Ct3_NA7ogVEcU9PgUzrgd3nUQd8Ph3nUqYZEO9A_kVuWKf2yqCeZXiPcWc9lyaeKeQqThTyhlfI59jsQEb4SCF_yKoi_ua_z7ML_fHBuoHPZt70zerIfiKwMmT5CL09wIDmmaVkcFPDItezqj0oraGdRq2iommsCc5q1lv1MzcZm8329JH38L3V7TebjDflIyKmXSmbZUMFLcu8bLuyapI2kZQUZS6TJVFSYvqmfPy1QY3cu1w4f9KRq9X1XbUUslrA_9kDPVIkqcg3n5QxGmNDlPoXIhkfCRAa66MWT_5qR5Vp9NTSy4sbg_rZK_b4Ht2ewqcg3JrO3g6jXvTscLmsuMXizsByVlUEtD1Qy5BkFXySwHd3f_8bK9FAw2gdOjVjySvT0Cc80COZMB014FzFqIXpUgofe-PInE4ZgkNPs4A201BT1NVnscW2pZbHezTEW9ZTmKdgG7kjYVBac0boKTj8gtsYIDJ1GsEfZX1rDwY0slIY0XN3_u8JW4dhFrALOJezvIqxByBs-pO8_xApKPOZnByyBrmYx7qh1d5AdiLTFGX-h7c4O3mKth561fTgKAaQbfWzpSLfHOyk29sXDm7o3qlHxSO3hH-SWX9BYyYrqCnehESYBXIYyIN9JBeBMMd1WHzeFu_U0_mr33RQAwaas3LLtdk8fLWW-Z6jBo463jfYs3p17DviHUyw9ssaF0_nSCsGzFlVhZ8c-GBHUB27wineO85SPCtsQM00eWkWG3IBlTk2jReF5RZaa4QsAzwwwA4M1ZhoZk_sEbmgMO5PxwuOo7OMw6haP_HvVshyAGMDdDgordAdPTv2rp-3ot4OxCfkXOxHq0wUzTaW-6iPZzOiQI5XGE_URouFLKN1cQb5wNQ0ME4hHNtlhJ9_WVy167St0gqvaL0sE1mUVV5kV_26S5arLi-rZV6maV6skiKpslWWFG3d5SUVV2otE5ku5TKXMpVZtlglVVU2ad7VXVa1SSmyhAZUehGZZN3-Kl6zrcu8KIqrKB58vK-T0tBhvoMTUop8e-XWEf71tPfMROWDf1klqKBpfQGjWP8igU_F7-tXAPNl3iv8nkEVsXM1Ob3-wxydbxGF3EUP_xsAAP__pj1OOA">