<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/125946>125946</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [Clang] HTML comments inside C comments are treated as text
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            clang
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          Nerixyz
      </td>
    </tr>
</table>

<pre>
    I mentioned this briefly in https://github.com/llvm/llvm-project/pull/120843#issuecomment-2593556040.

The `CommentParser` currently emits HTML comments (`/* <!-- foo --> */`) as text:

```c
/// <!-- foo
/// bar
///
/// baz -->
/// <!--! text -->
int Test;
``` 
This leads to the following AST dump:
```text
VarDecl 0x30543910 <<source>:6:1, col:5> col:5 Test 'int'
`-FullComment 0x30543b90 <line:1:4, line:5:18>
  |-ParagraphComment 0x30543aa0 <line:1:4, line:2:7>
  | |-TextComment 0x30543a00 <line:1:4> Text=" "
  | |-TextComment 0x30543a20 <col:5> Text="<"
  | |-TextComment 0x30543a40 <col:6, col:12> Text="!-- foo"
  | `-TextComment 0x30543a60 <line:2:4, col:7> Text=" bar"
  `-ParagraphComment 0x30543b60 <line:4:4, line:5:18>
    |-TextComment 0x30543ac0 <line:4:4, col:11> Text=" baz -->"
    |-TextComment 0x30543ae0 <line:5:4> Text=" "
    |-TextComment 0x30543b00 <col:5> Text="<"
    `-TextComment 0x30543b20 <col:6, col:18> Text="!--! text -->"
``` 
https://godbolt.org/z/sonfj47xo

Doxygen, however, ignores HTML comments ([docs](https://www.doxygen.nl/manual/htmlcmds.html)).

I can see two possible strategies here:

1. Mirror Doxygen - skip anything between `<!--` and `-->` and include the content between `<!--!` and `-->`.
2. Emit a dedicated `HTMLComment(Comment)` which contains the text inside the comment (`<!-- ... -->`) and `DoxygenHTMLComment(Comment)` that contains a subtree of a parsed comment (`<!--! ... -->`). Not sure about the naming.

Option 1 is easier to implement and has better backwards compatibility. Option 2 might be useful for tools that specifically want to extract these fragments (e.g. because they contain some commands for the tool that shouldn't be shown in Doxygen). I'm not sure what the best approach is.
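
To make option 1 concrete, here is a rough sketch of the intended behavior on plain text. It is not based on Clang's comment lexer or parser; `stripHtmlComments` is a hypothetical helper name, and dropping the rest of the text after an unterminated `<!--` is an assumption on my part, not something I checked against Doxygen.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not part of Clang): applies option 1 to raw comment text.
//   <!-- ... -->   the markers and everything between them are dropped
//   <!--! ... -->  only the markers are dropped, the content is kept verbatim
std::string stripHtmlComments(std::string_view Text) {
  constexpr std::string_view Open = "<!--";
  constexpr std::string_view Close = "-->";

  std::string Result;
  std::size_t Pos = 0;
  while (Pos < Text.size()) {
    std::size_t Start = Text.find(Open, Pos);
    if (Start == std::string_view::npos) {
      // No further HTML comment: copy the remaining text unchanged.
      Result.append(Text.substr(Pos));
      break;
    }
    // Copy the text in front of the comment opener.
    Result.append(Text.substr(Pos, Start - Pos));

    std::size_t Body = Start + Open.size();
    // A '!' directly after the opener marks the "keep the content" form.
    bool Keep = Body < Text.size() && Text[Body] == '!';
    if (Keep)
      ++Body;

    std::size_t End = Text.find(Close, Body);
    if (End == std::string_view::npos) {
      // Assumption: an unterminated comment extends to the end of the text.
      if (Keep)
        Result.append(Text.substr(Body));
      break;
    }
    if (Keep)
      Result.append(Text.substr(Body, End - Body));
    Pos = End + Close.size();
  }
  return Result;
}
```

For example, `stripHtmlComments("<!--! text -->")` returns `" text "`, while `stripHtmlComments("a <!-- foo --> b")` returns `"a  b"`. A real implementation would have to work on lexer tokens and keep source locations intact, which this sketch ignores.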

</pre>