<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/60151>60151</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [clang-format, clangd] Out of memory in clang::format::guessLanguage()
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          danielrparks
      </td>
    </tr>
</table>

<pre>
    I have a file [blis.h](https://github.com/llvm/llvm-project/files/10460065/blis.h.gz) which causes the guessLanguage() function to consume memory continuously, possibly without bound. My friend @bagel897, who has more RAM than I do, observed it consume 25GB before she ran out of memory.

I have clang 15.0.7 on Arch Linux.

This affects clangd when editing the problem file as well as clang-format when formatting it. clangd is able to index the file as long as it is not opened directly in the editor, and clang is able to compile it without problems.

I believe that guessLanguage() is to blame for the issue because if I change the file extension from `.h` to `.hpp`, clang-format and clangd work correctly with the file.
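
To illustrate why the extension might matter, here is a rough model of the behavior I suspect. It assumes, from my reading of `Format.cpp` and not verified against 15.0.7, that guessLanguage() only inspects the file contents (to check for Objective-C) when the name ends in `.h` or has no extension. The function `runs_content_guess` is a hypothetical name for this sketch, not a clang API.

```python
import os


def runs_content_guess(filename: str) -> bool:
    """Hypothetical model of the suspected guessLanguage() behavior.

    Assumption: the content-based check (which parses the whole file)
    only happens for '.h' files and files with no extension.
    """
    ext = os.path.splitext(filename)[1]
    return ext in ("", ".h")


# Compare the failing name with the renamed one that works.
for name in ("blis.h", "blis.hpp"):
    print(name, runs_content_guess(name))
```

If that assumption holds, `blis.hpp` never reaches the parsing step that appears in the stack trace, which would match what I see after renaming the file.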

Here is a stack trace from a clangd process experiencing the issue:
```gdb
(gdb) info threads
  Id   Target Id                                           Frame 
* 1    Thread 0x7f692a554dc0 (LWP 30485) "clangd.main" __GI___libc_read (nbytes=4096, 
    buf=0x55a22f0ebe00, fd=0) at ../sysdeps/unix/sysv/linux/read.c:26
  2    Thread 0x7f691d3ff6c0 (LWP 30486) "clangd.main"     __futex_abstimed_wait_common64 (private=0, 
 cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x55a22f13d858) at futex-internal.c:57
  3    Thread 0x7f691cbfe6c0 (LWP 30487) "ground-worker-1" __futex_abstimed_wait_common64 (private=0, 
 cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x55a22f13e5a8) at futex-internal.c:57
  4    Thread 0x7f691c3fd6c0 (LWP 30488) "ground-worker-2" __futex_abstimed_wait_common64 (private=0, 
 cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x55a22f13e5a8) at futex-internal.c:57
  5    Thread 0x7f691bbfc6c0 (LWP 30489) "ground-worker-3" __futex_abstimed_wait_common64 (private=0, 
 cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x55a22f13e5a8) at futex-internal.c:57
  6    Thread 0x7f69133fb6c0 (LWP 30490) "ground-worker-4" __futex_abstimed_wait_common64 (private=0, 
 cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x55a22f13e5a8) at futex-internal.c:57
  7    Thread 0x7f6913fff6c0 (LWP 30491) "ground-worker-5" __futex_abstimed_wait_common64 (private=0, 
 cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x55a22f13e5a8) at futex-internal.c:57
  8    Thread 0x7f691b3fb6c0 (LWP 30492) "ground-worker-6" __futex_abstimed_wait_common64 (private=0, 
 cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x55a22f13e5a8) at futex-internal.c:57
  9    Thread 0x7f691abfa6c0 (LWP 30493) "ground-worker-7" __futex_abstimed_wait_common64 (private=0, 
 cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x55a22f13e5a8) at futex-internal.c:57
  10   Thread 0x7f691a3f96c0 (LWP 30494) "ground-worker-8" __futex_abstimed_wait_common64 (private=0, 
 cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x55a22f13e5a8) at futex-internal.c:57
  11   Thread 0x7f6918c5f6c0 (LWP 30495) "STWorker:blis.h" 0x00007f69299082ff in llvm::SmallVectorImpl<clang::format::UnwrappedLine>::~SmallVectorImpl () at /usr/include/llvm/ADT/SmallVector.h:587
  12   Thread 0x7f6912bfa6c0 (LWP 30496) "leWorker:blis.h" __futex_abstimed_wait_common64 (private=0, 
 cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x55a22f11cc14) at futex-internal.c:57
(gdb) thread 11
[Switching to thread 11 (Thread 0x7f6918c5f6c0 (LWP 30495))]
#0  0x00007f69299082ff in llvm::SmallVectorImpl<clang::format::UnwrappedLine>::~SmallVectorImpl ()
 at /usr/include/llvm/ADT/SmallVector.h:587
587        if (!this->isSmall())
(gdb) bt
#0  0x00007f69299082ff in llvm::SmallVectorImpl<clang::format::UnwrappedLine>::~SmallVectorImpl ()
 at /usr/include/llvm/ADT/SmallVector.h:587
#1 llvm::SmallVector<clang::format::UnwrappedLine, 0u>::~SmallVector ()
 at /usr/include/llvm/ADT/SmallVector.h:1193
#2 clang::format::UnwrappedLineNode::~UnwrappedLineNode ()
    at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/UnwrappedLineParser.h:336
#3 clang::format::UnwrappedLineParser::pushToken ()
    at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/UnwrappedLineParser.cpp:4340
#4 0x00007f69299a9f54 in clang::format::UnwrappedLineParser::nextToken(int) [clone .constprop.0] ()
    at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/UnwrappedLineParser.cpp:4182
#5 0x00007f6929936572 in clang::format::UnwrappedLineParser::parsePPDefine ()
    at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/UnwrappedLineParser.cpp:1199
#6 0x00007f6929936ab6 in clang::format::UnwrappedLineParser::parsePPDirective ()
    at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/UnwrappedLineParser.cpp:1066
#7 0x00007f6929936d50 in clang::format::UnwrappedLineParser::readToken ()
 at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/UnwrappedLineParser.cpp:4315
#8 0x00007f69299a9f72 in clang::format::UnwrappedLineParser::nextToken(int) [clone .constprop.0] ()
    at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/UnwrappedLineParser.cpp:4185
#9 0x00007f69299339c1 in clang::format::UnwrappedLineParser::parseStructuralElement ()
    at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/UnwrappedLineParser.cpp:1933
#10 0x00007f6929934689 in operator() ()
    at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/UnwrappedLineParser.cpp:506
#11 clang::format::UnwrappedLineParser::parseLevel ()
    at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/UnwrappedLineParser.cpp:602
#12 0x00007f6929931c0f in clang::format::UnwrappedLineParser::parseBlock ()
 at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/UnwrappedLineParser.cpp:890
#13 0x00007f6929932ef8 in clang::format::UnwrappedLineParser::parseStructuralElement ()
    at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/UnwrappedLineParser.cpp:1830
#14 0x00007f6929934689 in operator() ()
    at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/UnwrappedLineParser.cpp:506
#15 clang::format::UnwrappedLineParser::parseLevel ()
    at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/UnwrappedLineParser.cpp:602
#16 0x00007f6929936259 in clang::format::UnwrappedLineParser::parseFile ()
 at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/UnwrappedLineParser.cpp:399
#17 0x00007f6929937471 in clang::format::UnwrappedLineParser::parse ()
    at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/UnwrappedLineParser.cpp:358
#18 0x00007f69299379a8 in clang::format::TokenAnalyzer::process ()
    at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/TokenAnalyzer.cpp:112
#19 0x00007f69298ed00c in clang::format::guessLanguage ()
    at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/Format.cpp:3500
#20 0x00007f69298ed2be in clang::format::getStyle ()
    at /usr/src/debug/clang/clang-15.0.7.src/lib/Format/Format.cpp:3531
#21 0x000055a22d425aca in clang::clangd::getFormatStyleForFile ()
    at /usr/src/debug/clang/clang-15.0.7.src/tools/extra/clangd/SourceCode.cpp:582
#22 0x000055a22d3e7804 in clang::clangd::ParsedAST::build ()
    at /usr/src/debug/clang/clang-15.0.7.src/tools/extra/clangd/ParsedAST.cpp:556
#23 0x000055a22d432cda in generateDiagnostics ()
    at /usr/src/debug/clang/clang-15.0.7.src/tools/extra/clangd/TUScheduler.cpp:1186
#24 0x000055a22d433a0a in clang::clangd::(anonymous namespace)::ASTWorker::updatePreamble(std::unique_ptr<clang::CompilerInvocation, std::default_delete<clang::CompilerInvocation> >, clang::clangd::ParseInputs, std::shared_ptr<clang::clangd::PreambleData const>, std::vector<clang::clangd::Diag, std::allocator<clang::clangd::Diag> >, clang::clangd::WantDiagnostics)::{lambda()#1}::operator()() [clone .part.0] [clone .lto_priv.0] () at /usr/src/debug/clang/clang-15.0.7.src/tools/extra/clangd/TUScheduler.cpp:1119
#25 0x000055a22de8d47d in llvm::unique_function<void ()>::operator()() ()
 at /usr/include/llvm/ADT/FunctionExtras.h:384
#26 llvm::function_ref<void ()>::callback_fn<llvm::unique_function<void ()> >(long) ()
    at /usr/include/llvm/ADT/STLFunctionalExtras.h:45
#27 llvm::function_ref<void ()>::operator()() const () at /usr/include/llvm/ADT/STLFunctionalExtras.h:68
#28 clang::clangd::(anonymous namespace)::ASTWorker::runTask(llvm::StringRef, llvm::function_ref<void ()>) [clone .constprop.0] () at /usr/src/debug/clang/clang-15.0.7.src/tools/extra/clangd/TUScheduler.cpp:1299
#29 0x000055a22d42c41c in run ()
    at /usr/src/debug/clang/clang-15.0.7.src/tools/extra/clangd/TUScheduler.cpp:1432
#30 0x000055a22d582d37 in llvm::unique_function<void ()>::operator()() ()
 at /usr/include/llvm/ADT/FunctionExtras.h:384
#31 operator() () at /usr/src/debug/clang/clang-15.0.7.src/tools/extra/clangd/support/Threading.cpp:100
#32 Apply<clang::clangd::AsyncTaskRunner::runAsync(const llvm::Twine&, llvm::unique_function<void()>)::<lambda()> > () at /usr/include/llvm/Support/thread.h:42
#33 GenericThreadProxy<std::tuple<clang::clangd::AsyncTaskRunner::runAsync(const llvm::Twine&, llvm::unique_function<void()>)::<lambda()> > > () at /usr/include/llvm/Support/thread.h:50
#34 ThreadProxy<std::tuple<clang::clangd::AsyncTaskRunner::runAsync(const llvm::Twine&, llvm::unique_function<void()>)::<lambda()> > >(void) () at /usr/include/llvm/Support/thread.h:60
#35 0x00007f691f49f8fd in start_thread (arg=<optimized out>) at pthread_create.c:442
#36 0x00007f691f521d20 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
```
</pre>