[PATCH] UTF-8 support for clang-format.

Wed Jun 5 00:23:54 PDT 2013


================
Comment at: lib/Format/FormatToken.h:96
@@ -94,3 +95,3 @@
   /// with the token.
   unsigned TokenLength;
 
----------------
Daniel Jasper wrote:
> How about we make these slightly easier to understand and shorter?
> 
> What are the remaining usages of TokenLength? Would it make sense to rename that to "ByteCount"? And would it then make sense to rename CodePointCount to "TokenLength"? Or even just "Length" as we are in a class ..Token?
I'd like to keep CodePointCount, as it makes it really obvious that we're not counting bytes. I'm in favor of renaming TokenLength to ByteCount, though, so the duality is clearer.
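
To illustrate why the two names are worth keeping apart, here is a minimal sketch of how the counts diverge for non-ASCII text. It is not taken from the patch: countCodePoints, TokenSizes, and measure are hypothetical names invented for this example, and the code assumes the input is well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the patch: counts UTF-8 code points by
// skipping continuation bytes (those of the form 10xxxxxx).
// Assumes Text is well-formed UTF-8; malformed input is not detected.
inline std::size_t countCodePoints(const std::string &Text) {
  std::size_t Count = 0;
  for (unsigned char C : Text)
    if ((C & 0xC0) != 0x80) // Every non-continuation byte starts a code point.
      ++Count;
  return Count;
}

// Hypothetical stand-in for the two fields under discussion.
struct TokenSizes {
  std::size_t ByteCount;      // Size of the token text in bytes.
  std::size_t CodePointCount; // Number of code points in the token text.
};

inline TokenSizes measure(const std::string &Text) {
  return {Text.size(), countCodePoints(Text)};
}
```

For pure ASCII text the two values are equal; for a token containing a two-byte character such as U+00E9, ByteCount is one larger than CodePointCount.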

================
Comment at: lib/Format/Utils.h:1
@@ +1,2 @@
+//===--- Utils.h - Format C++ code ----------------------------------------===//
+//
----------------
Daniel Jasper wrote:
> Please don't call this "Utils", this is far too generic. How about "Encodings"? I think hex/octal escape sequences are also a kind of encoding ..
+1 to not having a "Utils" header. Alternatively, create two headers, one for encodings and one for string-literal-related functions, and name them appropriately.


http://llvm-reviews.chandlerc.com/D918