<table border="1" cellspacing="0" cellpadding="8">

    <tr>

        <th>Issue</th>

        <td>

            <a href=https://github.com/llvm/llvm-project/issues/72856>72856</a>

        </td>

    </tr>

    <tr>

        <th>Summary</th>

        <td>

            TableGen Jupyter notebook should have a way to filter compiler output

        </td>

    </tr>

    <tr>

      <th>Labels</th>

      <td>

            enhancement,

            tablegen

      </td>

    </tr>

    <tr>

      <th>Assignees</th>

      <td>

      </td>

    </tr>

    <tr>

      <th>Reporter</th>

      <td>

          DavidSpickett

      </td>

    </tr>

</table>

<pre>

    Filing this to collect some ideas for future work. Only one person I know of has tried this, and they probably worked around it by using the compiler directly.

**Problem**

If you use LLVM TableGen files like `Target.td` in a notebook, the `llvm-tblgen` output is more than 320,000 lines. This exceeds the output limit Jupyter sets, and removing that limit would likely make the client crash.

You might do this if you wanted to make a notebook about adding some LLVM internal thing like a scheduler or an instruction. You wouldn't want every cell to be massive even if the notebook could handle the text.

It's a niche that most people won't hit, so it needs input from people who do to decide what the best tradeoffs are. I don't want to create more things for folks to learn in the process.

**Possible Solutions**

* Arbitrary cut off for the output, basically `tail <N>`.

  * Easiest to understand, but zero nuance.

  * If the content of the includes changes between versions then your `<N>` may need to change.

  * Let's not do this, but writing it here as the "baseline" from which to compare better options.

* Detect the output is too large and return an error to the notebook telling them to use the compiler directly.

  * We're not actually fixing anything, but at least it's clearer.

* Emitting JSON and running one of the JSON query languages on it.

  * Now you’re learning yet another language.

  * The result is more JSON, not the record format you’re used to.

* Pragmas/notes to mark include file content in the output.

  * No way to tell “user” vs. “system” includes apart right now.

  * You may want to see some subset of an included file anyway.

* Regular expression for class and definition names.

  * If we use JSON, same issues as before.

  * Probably could match on the output, but likely easier to make it a compiler option.

  * You are now learning regex, but at least there are sites that make building a regex easy, which I expect is not the case for a JSON query language.

  * Are 2 expressions enough? What about multiclasses?

* Marking "new" records somehow by comparing the previous output.

  * You may want a mix of old and new in the output.

  * Still leaves 300k lines in the first cell even if you only want new stuff in the next ones.

  * Not sure we can reliably detect "new" given that the order may not be deterministic.
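
To make the regular expression option a bit more concrete, here is a rough sketch of matching on the text output after the fact, rather than adding a compiler option. Nothing here exists in the notebook kernel today: `filter_records` is a hypothetical helper, and it assumes each record in the output starts with `class` or `def` at the beginning of a line and ends with a `}` on a line of its own.

```python
import re

# Assumption: a record header in the text output looks like
# "class Name..." or "def Name ..." at the start of a line.
HEADER = re.compile(r"^(class|def)\s+(\w+)")


def filter_records(output, name_pattern):
    """Keep only records whose name matches name_pattern (hypothetical helper)."""
    wanted = re.compile(name_pattern)
    kept = []
    in_record = False
    keeping = False
    for line in output.splitlines():
        if not in_record:
            header = HEADER.match(line)
            if header is None:
                # Section banners and blank lines between records are dropped.
                continue
            in_record = True
            keeping = wanted.search(header.group(2)) is not None
        if keeping:
            kept.append(line)
        # Assumption: a record body ends with a closing brace on its own line.
        if line.rstrip() == "}":
            in_record = False
    return "\n".join(kept)
```

A cell could then show something like `filter_records(tblgen_output, r"^ADD")` instead of the full output, where `tblgen_output` is whatever text the compiler produced. This uses a single expression for both classes and definitions, so it does not answer the question of whether they need separate patterns.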

</pre>
