<div dir="ltr">Consider landing the obvious bits (print function, mostly) separately and use this only for the interesting bits.</div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Mar 9, 2017 at 11:10 AM, Michał Górny via Phabricator via cfe-commits <span dir="ltr"><<a href="mailto:cfe-commits@lists.llvm.org" target="_blank">cfe-commits@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">mgorny added inline comments.<br>

<br>

<br>

================<br>

Comment at: tools/clang-format/git-clang-<wbr>format:293<br>

<br>

+def to_bytes(str):<br>

+    # Encode to UTF-8 to get binary data.<br>

----------------<br>

Pretty much a nit but using variable names that match type names can be a bit confusing here.<br>

<br>

<br>

================<br>

Comment at: tools/clang-format/git-clang-<wbr>format:302<br>

<span class="">+        return bytes<br>

+    return to_bytes(bytes)<br>

</span>+<br>

----------------<br>

Shouldn't this be:<br>

<br>

    return bytes.decode('utf-8')<br>

<br>

?<br>

<br>

Otherwise, unless I'm missing something this function will always return the parameter [with no modifications], either in the first conditional if it's of `str` type, or in the conditional inside `to_bytes()` if it's of `bytes` type.<br>

<br>

<br>

================<br>

Comment at: tools/clang-format/git-clang-<wbr>format:306<br>

+    try:<br>

+        return to_string(bytes.decode('utf-8'<wbr>))<br>

+    except AttributeError: # 'str' object has no attribute 'decode'.<br>

----------------<br>

This logic looks really weird to me. What is the purpose of having both `to_string()` and `convert_string()`? Why do `to_bytes()` and `to_string()` use `isinstance()` to recognize types, and here you rely on exceptions? Why is `to_string()` called after decoding?<br>

<br>

<br>

================<br>

Comment at: tools/clang-format/git-clang-<wbr>format:310<br>

+    except UnicodeError:<br>

+        return str(bytes)<br>

+<br>

----------------<br>

I don't think this can ever succeed. If the argument is not valid utf8, it certainly won't be valid ASCII.<br>

<span class=""><br>

<br>

================<br>

Comment at: tools/clang-format/git-clang-<wbr>format:323<br>

</span>   for line in patch_file:<br>

-    match = re.search(r'^\+\+\+\ [^/]+/(.*)', line)<br>

+    match = re.search(to_bytes(r'^\+\+\+\ [^/]+/(.*)'), line)<br>

     if match:<br>

----------------<br>

Any reason not to use:<br>

<br>

    br'^\+...'<br>

<br>

? i.e. make it bytes immediately instead of converting.<br>

<br>

<br>

================<br>

Comment at: tools/clang-format/git-clang-<wbr>format:344<br>

+  keys_to_delete = []<br>

<span class="">+  for filename in list(dictionary.keys()):<br>

</span>+    filename_cp = convert_string(bytes(filename)<wbr>)<br>

----------------<br>

Since you are using `keys_to_delete` now, you can remove the `list()`.<br>

<br>

<br>

================<br>

Comment at: tools/clang-format/git-clang-<wbr>format:348<br>

     if len(base_ext) == 1 or base_ext[1].lower() not in allowed_extensions:<br>

-      del dictionary[filename]<br>

+      keys_to_delete += [filename]<br>

+  for key in keys_to_delete:<br>

----------------<br>

Another nit. I think it'd be better to just append a single item instead of a list of 1 item ;-).<br>

<br>

    keys_to_delete.append(<wbr>filename)<br>

<div class="HOEnZb"><div class="h5"><br>

<br>

<a href="https://reviews.llvm.org/D30773" rel="noreferrer" target="_blank">https://reviews.llvm.org/<wbr>D30773</a><br>

<br>

<br>

<br>

______________________________<wbr>_________________<br>

cfe-commits mailing list<br>

<a href="mailto:cfe-commits@lists.llvm.org">cfe-commits@lists.llvm.org</a><br>

<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/cfe-commits</a><br>

</div></div></blockquote></div><br></div>