<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Clang-format should use UTF-8 character's width shown in editor rather than storage byte size."
href="https://bugs.llvm.org/show_bug.cgi?id=47989">47989</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Clang-format should use UTF-8 character's width shown in editor rather than storage byte size.
</td>
</tr>
<tr>
<th>Product</th>
<td>clang
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Formatter
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedclangbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>panzhongxian@126.com
</td>
</tr>
<tr>
<th>CC</th>
<td>djasper@google.com, klimek@google.com, llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>When I use UTF-8 characters other than plain ASCII, clang-format measures
each character's width by the number of bytes it is stored in, rather than
by the number of columns it occupies in a terminal.
For example:
`测` is stored in 3 bytes, but it occupies 2 ASCII columns in vim and other editors.
The expected formatted code is as follows:
#define test \
/* 测试 */ \
"aa" \
"bb" \
"bb"
But what I actually get is as follows:
#define test \
/* 测试 */ \
"aa" \
"bb" \
"bb"
I have tried both clang-format 11.0.0 and the newest 12.0.0.
I think the on-screen width should be used instead of the storage byte size. (Or
is there a configuration option to control this?)</pre>
</div>
</p>
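<p>To make the byte-count versus column-count difference concrete, here is a minimal sketch. It is not clang-format's code: the helper names <code>decodeUtf8</code>, <code>isWide</code> and <code>displayWidth</code> are hypothetical, the input is assumed to be well-formed UTF-8, and the wide-character test covers only a few CJK ranges rather than the full Unicode East Asian Width tables.</p>

```cpp
#include <cstddef>
#include <string>

// Decode one UTF-8 code point starting at s[i] and advance i past it.
// Assumption: the input is well-formed UTF-8; no validation is done.
static char32_t decodeUtf8(const std::string &s, std::size_t &i) {
  unsigned char b0 = static_cast<unsigned char>(s[i]);
  char32_t cp = 0;
  int extra = 0;  // number of continuation bytes that follow the lead byte
  if (b0 < 0x80) {
    cp = b0;
  } else if ((b0 & 0xE0) == 0xC0) {
    cp = b0 & 0x1F;
    extra = 1;
  } else if ((b0 & 0xF0) == 0xE0) {
    cp = b0 & 0x0F;
    extra = 2;
  } else {
    cp = b0 & 0x07;
    extra = 3;
  }
  ++i;
  for (int k = 0; k < extra && i < s.size(); ++k, ++i)
    cp = (cp << 6) | (static_cast<unsigned char>(s[i]) & 0x3F);
  return cp;
}

// Hypothetical, deliberately incomplete test for double-width characters.
// A real implementation would consult the full East Asian Width data.
static bool isWide(char32_t cp) {
  return (cp >= 0x3400 && cp <= 0x4DBF) ||  // CJK Unified Ideographs Ext. A
         (cp >= 0x4E00 && cp <= 0x9FFF) ||  // CJK Unified Ideographs
         (cp >= 0xAC00 && cp <= 0xD7A3) ||  // Hangul Syllables
         (cp >= 0xF900 && cp <= 0xFAFF);    // CJK Compatibility Ideographs
}

// Number of terminal columns the string occupies under the assumptions
// above: 2 per wide code point, 1 for everything else.
static std::size_t displayWidth(const std::string &s) {
  std::size_t columns = 0;
  for (std::size_t i = 0; i < s.size();)
    columns += isWide(decodeUtf8(s, i)) ? 2 : 1;
  return columns;
}
```

<p>With these helpers, the comment line from the report (written with byte escapes, <code>"/* \xE6\xB5\x8B\xE8\xAF\x95 */"</code>) has a <code>size()</code> of 12 bytes but a <code>displayWidth</code> of 10 columns, so aligning the trailing backslash by byte count places it two columns away from where it is aligned on screen.</p>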
</body>
</html>