<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/128024>128024</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
clang SARIF output fails to escape brace characters in text (§3.11.5)
</td>
</tr>
<tr>
<th>Labels</th>
<td>
clang
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
davidmalcolm
</td>
</tr>
</table>
<pre>
Given this invalid C code:
```
void foo (void)
{
```
clang trunk [correctly reports](https://godbolt.org/z/rrTTjKrY3):
```
<source>:2:2: error: expected '}'
2 | {
| ^
<source>:2:1: note: to match this '{'
2 | {
| ^
1 error generated.
Compiler returned: 1
```
With `-fdiagnostics-format=sarif`, clang [emits this](https://godbolt.org/z/7K3Gv3MT3):
```
{"$schema":"https://docs.oasis-open.org/sarif/sarif/v2.1.0/cos02/schemas/sarif-schema-2.1.0.json","runs":[{"artifacts":[{"length":18,"location":{"index":0,"uri":"file://<source>"},"mimeType":"text/plain","roles":["resultFile"]}],"columnKind":"unicodeCodePoints","results":[{"level":"error","locations":[{"physicalLocation":{"artifactLocation":{"index":0,"uri":"file://<source>"},"region":{"endColumn":2,"startColumn":2,"startLine":2}}}],"message":{"text":"expected '}'"},"ruleId":"14","ruleIndex":0},{"level":"note","locations":[{"physicalLocation":{"artifactLocation":{"index":0,"uri":"file://<source>"},"region":{"endColumn":1,"startColumn":1,"startLine":2}}}],"message":{"text":"to match this '{'"},"ruleId":"109","ruleIndex":1}],"tool":{"driver":{"fullName":"","informationUri":"https://clang.llvm.org/docs/UsersManual.html","language":"en-US","name":"clang","rules":[{"defaultConfiguration":{"enabled":true,"level":"error","rank":50},"fullDescription":{"text":""},"id":"14","name":""},{"defaultConfiguration":{"enabled":true,"level":"note","rank":-1},"fullDescription":{"text":""},"id":"109","name":""}],"version":"21.0.0git"}}}],"version":"2.1.0"}
```
which when formatted is:
```
{
"$schema": "https://docs.oasis-open.org/sarif/sarif/v2.1.0/cos02/schemas/sarif-schema-2.1.0.json",
"runs": [
{
"artifacts": [
{
"length": 18,
"location": {
"index": 0,
"uri": "file://<source>"
},
"mimeType": "text/plain",
"roles": [
"resultFile"
]
}
],
"columnKind": "unicodeCodePoints",
"results": [
{
"level": "error",
"locations": [
{
"physicalLocation": {
"artifactLocation": {
"index": 0,
"uri": "file://<source>"
},
"region": {
"endColumn": 2,
"startColumn": 2,
"startLine": 2
}
}
}
],
"message": {
"text": "expected '}'"
},
"ruleId": "14",
"ruleIndex": 0
},
{
"level": "note",
"locations": [
{
"physicalLocation": {
"artifactLocation": {
"index": 0,
"uri": "file://<source>"
},
"region": {
"endColumn": 1,
"startColumn": 1,
"startLine": 2
}
}
}
],
"message": {
"text": "to match this '{'"
},
"ruleId": "109",
"ruleIndex": 1
}
],
"tool": {
"driver": {
"fullName": "",
"informationUri": "https://clang.llvm.org/docs/UsersManual.html",
"language": "en-US",
"name": "clang",
"rules": [
{
"defaultConfiguration": {
"enabled": true,
"level": "error",
"rank": 50
},
"fullDescription": {
"text": ""
},
"id": "14",
"name": ""
},
{
"defaultConfiguration": {
"enabled": true,
"level": "note",
"rank": -1
},
"fullDescription": {
"text": ""
},
"id": "109",
"name": ""
}
],
"version": "21.0.0git"
}
}
}
],
"version": "2.1.0"
}
```
Note the brace characters within the `text` strings.
[3.11.5 Messages with placeholders](https://docs.oasis-open.org/sarif/sarif/v2.1.0/errata01/os/sarif-v2.1.0-errata01-os-complete.html#_Toc141790716) has this text:
"Within both plain text and formatted message strings, the characters “{” and “}” SHALL be represented by the character sequences “{{” and “}}” respectively."
GCC 15 onwards has a `sarif-replay` tool (I'm the author) which consumes SARIF files and replays the results within them as if they were GCC diagnostics. Running the above through `sarif-replay` gives this error:
```
$ sarif-replay tmp.sarif
tmp.sarif:37:32: error: unescaped '}' within message string [SARIF v2.1.0 §3.11.11]
37 | "message": {
| ^
38 | "text": "expected '}'"
| ~~~~~~~~~~~~~~~~~~~~~~
39 | },
| ~
```
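For anyone wanting to reproduce this kind of check without GCC 15 to hand, here is a minimal sketch of the rule in Python. It is not how `sarif-replay` is implemented; the function name `find_unescaped_braces` is made up for illustration, and it assumes the only legitimate uses of braces in a message string are the doubled forms `{{` / `}}` and numeric placeholders such as `{0}`.

```python
import re

# Tokens tried in order at each position: an escaped "{{", an escaped "}}",
# a numeric placeholder like {0}, or (last resort) a single lone brace.
_BRACE_TOKEN = re.compile(r"\{\{|\}\}|\{\d+\}|[{}]")


def find_unescaped_braces(text):
    """Return offsets of braces that are neither doubled nor part of a {N} placeholder.

    Hypothetical helper for illustration only.
    """
    return [m.start() for m in _BRACE_TOKEN.finditer(text) if len(m.group()) == 1]


if __name__ == "__main__":
    for message in ("expected '}'", "expected '}}'", "value {0} of {{x}}"):
        print(repr(message), find_unescaped_braces(message))
```

An empty list means the string satisfies the brace rule under the assumptions above; a non-empty list gives the offsets a checker could underline.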
I believe the error message from `sarif-replay` is correct [1], and that the clang-produced SARIF file is invalid; it should be:
```
"text": "expected '}}'"
```
Looks like the
```
"text": "to match this '{'"
```
is also invalid, and should be:
```
"text": "to match this '{{'"
```
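Both corrected strings come from doubling each literal brace in the diagnostic text before it is written into the `text` property. A rough sketch of that transformation in Python follows. It is not clang's code; `escape_sarif_braces` is a hypothetical name, and it assumes the input is plain diagnostic text that contains no intended `{N}` placeholders.

```python
def escape_sarif_braces(text):
    """Double every literal brace, as SARIF v2.1.0 section 3.11.5 requires.

    Hypothetical helper; assumes `text` holds no intended placeholders.
    """
    # The two replacements cannot interfere: each touches a different character.
    return text.replace("{", "{{").replace("}", "}}")


if __name__ == "__main__":
    print(escape_sarif_braces("expected '}'"))
    print(escape_sarif_braces("to match this '{'"))
```

A consumer reverses this by collapsing `{{` to `{` and `}}` to `}` when rendering the message.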
Hope this is constructive
Dave
[1] although ideally it would underline the '}' at line 38 and refer to §3.11.5 rather than to §3.11.11, but that's a bug for me
</pre>