<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/128024>128024</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            clang SARIF output fails to escape brace characters in text (§3.11.5)
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            clang
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          davidmalcolm
      </td>
    </tr>
</table>

<pre>
    Given this invalid C code:

```
void foo (void)
{
```

clang trunk [correctly reports](https://godbolt.org/z/rrTTjKrY3):

```
<source>:2:2: error: expected '}'
    2 | {
      |  ^
<source>:2:1: note: to match this '{'
    2 | {
      | ^
1 error generated.
Compiler returned: 1
```

With `-fdiagnostics-format=sarif`, clang [emits this](https://godbolt.org/z/7K3Gv3MT3):

```
{"$schema":"https://docs.oasis-open.org/sarif/sarif/v2.1.0/cos02/schemas/sarif-schema-2.1.0.json","runs":[{"artifacts":[{"length":18,"location":{"index":0,"uri":"file://<source>"},"mimeType":"text/plain","roles":["resultFile"]}],"columnKind":"unicodeCodePoints","results":[{"level":"error","locations":[{"physicalLocation":{"artifactLocation":{"index":0,"uri":"file://<source>"},"region":{"endColumn":2,"startColumn":2,"startLine":2}}}],"message":{"text":"expected '}'"},"ruleId":"14","ruleIndex":0},{"level":"note","locations":[{"physicalLocation":{"artifactLocation":{"index":0,"uri":"file://<source>"},"region":{"endColumn":1,"startColumn":1,"startLine":2}}}],"message":{"text":"to match this '{'"},"ruleId":"109","ruleIndex":1}],"tool":{"driver":{"fullName":"","informationUri":"https://clang.llvm.org/docs/UsersManual.html","language":"en-US","name":"clang","rules":[{"defaultConfiguration":{"enabled":true,"level":"error","rank":50},"fullDescription":{"text":""},"id":"14","name":""},{"defaultConfiguration":{"enabled":true,"level":"note","rank":-1},"fullDescription":{"text":""},"id":"109","name":""}],"version":"21.0.0git"}}}],"version":"2.1.0"}
```

which, when formatted, is:

```
{
    "$schema": "https://docs.oasis-open.org/sarif/sarif/v2.1.0/cos02/schemas/sarif-schema-2.1.0.json",
    "runs": [
        {
            "artifacts": [
                {
                    "length": 18,
                    "location": {
                        "index": 0,
                        "uri": "file://<source>"
                    },
                    "mimeType": "text/plain",
                    "roles": [
                        "resultFile"
                    ]
                }
            ],
            "columnKind": "unicodeCodePoints",
            "results": [
                {
                    "level": "error",
                    "locations": [
                        {
                            "physicalLocation": {
                                "artifactLocation": {
                                    "index": 0,
                                    "uri": "file://<source>"
                                },
                                "region": {
                                    "endColumn": 2,
                                    "startColumn": 2,
                                    "startLine": 2
                                }
                            }
                        }
                    ],
                    "message": {
                        "text": "expected '}'"
                    },
                    "ruleId": "14",
                    "ruleIndex": 0
                },
                {
                    "level": "note",
                    "locations": [
                        {
                            "physicalLocation": {
                                "artifactLocation": {
                                    "index": 0,
                                    "uri": "file://<source>"
                                },
                                "region": {
                                    "endColumn": 1,
                                    "startColumn": 1,
                                    "startLine": 2
                                }
                            }
                        }
                    ],
                    "message": {
                        "text": "to match this '{'"
                    },
                    "ruleId": "109",
                    "ruleIndex": 1
                }
            ],
            "tool": {
                "driver": {
                    "fullName": "",
                    "informationUri": "https://clang.llvm.org/docs/UsersManual.html",
                    "language": "en-US",
                    "name": "clang",
                    "rules": [
                        {
                            "defaultConfiguration": {
                                "enabled": true,
                                "level": "error",
                                "rank": 50
                            },
                            "fullDescription": {
                                "text": ""
                            },
                            "id": "14",
                            "name": ""
                        },
                        {
                            "defaultConfiguration": {
                                "enabled": true,
                                "level": "note",
                                "rank": -1
                            },
                            "fullDescription": {
                                "text": ""
                            },
                            "id": "109",
                            "name": ""
                        }
                    ],
                    "version": "21.0.0git"
                }
            }
        }
    ],
    "version": "2.1.0"
}
```

Note the unescaped brace characters within the `text` strings.

[3.11.5 Messages with placeholders](https://docs.oasis-open.org/sarif/sarif/v2.1.0/errata01/os/sarif-v2.1.0-errata01-os-complete.html#_Toc141790716) has this text:

"Within both plain text and formatted message strings, the characters “{” and “}” SHALL be represented by the character sequences “{{” and “}}” respectively."
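
As an illustration of that rule, here is a minimal Python sketch of the escaping a producer would apply to literal message text before writing it out. The function name `escape_sarif_braces` is hypothetical; this is not clang's code, and it assumes the text contains no intentional placeholders such as `{0}`.

```python
def escape_sarif_braces(text: str) -> str:
    """Double every literal brace so a consumer does not read it as placeholder syntax.

    Assumption: `text` is plain diagnostic text with no intended placeholders.
    """
    return text.replace("{", "{{").replace("}", "}}")


# The two message strings from the clang output above.
print(escape_sarif_braces("expected '}'"))       # expected '}}'
print(escape_sarif_braces("to match this '{'"))  # to match this '{{'
```

The escaping has to happen before any placeholders are added to the string, otherwise the placeholders themselves would be doubled.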


GCC 15 onwards has a `sarif-replay` tool (I'm the author) which consumes SARIF files and replays the results within them as if they were GCC diagnostics.  Running the above through `sarif-replay` gives this error:

```
$ sarif-replay tmp.sarif
tmp.sarif:37:32: error: unescaped '}' within message string [SARIF v2.1.0 §3.11.11]
   37 |                     "message": {
      |                                ^
   38 |                         "text": "expected '}'"
      |                         ~~~~~~~~~~~~~~~~~~~~~~
   39 |                     },
      |                     ~           
```
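
For illustration, a consumer-side check along these lines can be sketched in a few lines of Python. This is not how `sarif-replay` is implemented; the name `find_unescaped_braces` is hypothetical, and the sketch assumes a placeholder is always `{` followed by decimal digits and `}`.

```python
import re

# Alternatives are tried left to right: an escaped brace pair, a placeholder
# such as {0}, then any single leftover brace (the error case).
_TOKEN = re.compile(r"\{\{|\}\}|\{\d+\}|[{}]")


def find_unescaped_braces(text: str) -> list[int]:
    """Return offsets of braces that are neither doubled nor part of a placeholder."""
    return [m.start() for m in _TOKEN.finditer(text) if len(m.group()) == 1]


print(find_unescaped_braces("expected '}'"))   # [10]
print(find_unescaped_braces("expected '}}'"))  # []
```

Run against the two `text` strings in the clang output, this flags one brace in each.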

I believe the error message from `sarif-replay` is correct [1] and that the clang-produced SARIF file is invalid; it should be:

```
   "text": "expected '}}'"
```

Looks like the
```
                        "text": "to match this '{'"
```
is also invalid, and it should be:
```
                        "text": "to match this '{{'"
```

Hope this is constructive
Dave

[1] Ideally it would underline the '}' at line 38 and refer to §3.11.5 rather than §3.11.11, but that's a bug for me to fix.

</pre>