[cfe-dev] clang-format emacs integration fails with multi-byte utf8 characters in the buffer

Kirill Ignatiev kirill.ignatiev at gmail.com
Sun Jan 18 19:12:01 PST 2015


Is this the right mailing list?

The clang-format Emacs package asks clang-format to return a list of
replacements, which are specified using byte offsets rather than
character offsets. Emacs, however, always works with character offsets.

So if a buffer contains multi-byte UTF-8 characters, the Emacs function
clang-format-region applies the replacements at the wrong positions:
past the first such character, byte offsets no longer equal character
offsets.
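
The mismatch is easy to reproduce in a scratch buffer. The following is
a minimal sketch and not part of the patch; the helper name
my/char-and-byte-positions is made up for illustration. It inserts a
string into an empty temporary buffer and reports the character position
and the byte position of the end of the text.

```elisp
;; Hypothetical helper, not part of clang-format.el.
;; Insert STRING into an empty temporary (multibyte) buffer and return
;; a list (CHAR-POSITION BYTE-POSITION) for the end of the inserted text.
(defun my/char-and-byte-positions (string)
  (with-temp-buffer
    (insert string)
    (list (point) (position-bytes (point)))))

;; ASCII only: character and byte positions agree.
(my/char-and-byte-positions "abc")        ; => (4 4)

;; \u00e9 is e-acute, which occupies two bytes in the buffer, so the
;; byte position runs one ahead of the character position.
(my/char-and-byte-positions "a\u00e9c")   ; => (4 5)
```

Any offset reported by clang-format for text after the accented
character would therefore land one character too far if used directly
as a buffer position.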

In the function clang-format-region, the input parameters start and
end are character positions, so they should be converted to byte
positions before they are passed to clang-format:

112a122,124
>   (setq start (position-bytes start)
>         end (position-bytes end))

and in clang-format--replace, the byte offset and length returned by
clang-format should be converted to character positions, for example,
like so:

93a95,101
> (defun clang-format--position-from-bytes (offset)
>   (when offset
>     (save-excursion
>       (goto-char offset)
>       (while (> (position-bytes (point)) offset) (forward-char -1))
>       (point))))

95,98c103,107
<   (goto-char offset)
<   (delete-char length)
<   (when text
<     (insert text)))
---
>   (let ((start (clang-format--position-from-bytes offset))
>         (end (clang-format--position-from-bytes (+ offset length))))
>     (goto-char start)
>     (delete-region start end))
>   (when text (insert text)))
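
For comparison, Emacs also has a built-in function byte-to-position,
which is the inverse of position-bytes. The sketch below shows a variant
of the helper that delegates to it. It is an untested alternative, not
what the patch above does; the name my/position-from-bytes is
hypothetical, and it assumes OFFSET is a 1-based byte position in the
same sense as the value returned by position-bytes.

```elisp
;; Hypothetical alternative to the loop-based helper in the patch.
;; Return the character position for byte position OFFSET in the
;; current buffer, or nil if OFFSET is nil or out of range.
(defun my/position-from-bytes (offset)
  (when offset
    (byte-to-position offset)))

;; Round trip in a buffer holding "a", e-acute (two bytes), "c".
(with-temp-buffer
  (insert "a\u00e9c")
  (let* ((char-pos 3)                           ; just after the e-acute
         (byte-pos (position-bytes char-pos)))  ; one byte for "a", two for e-acute
    (list byte-pos (my/position-from-bytes byte-pos))))
;; => (4 3)
```

Unlike the loop in the patch, byte-to-position returns nil rather than
clamping when the byte position lies outside the buffer, so a caller
would need to handle that case.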


