r197608 - clang-format-diff.py: fix -regex/-iregex matching

Alp Toker alp at nuanti.com
Thu Dec 19 01:56:03 PST 2013


On 19/12/2013 09:08, Daniel Jasper wrote:
>
>
>
> On Wed, Dec 18, 2013 at 10:34 PM, Alp Toker <alp at nuanti.com 
> <mailto:alp at nuanti.com>> wrote:
>
>     Author: alp
>     Date: Wed Dec 18 15:34:07 2013
>     New Revision: 197608
>
>     URL: http://llvm.org/viewvc/llvm-project?rev=197608&view=rev
>     Log:
>     clang-format-diff.py: fix -regex/-iregex matching
>
>     While debating the finer points of file extension matching, we
>     somehow missed
>     the bigger problem that the current code will match anything
>     starting with the
>     default or user-specified pattern (e.g. lit.site.cfg.in
>     <http://lit.site.cfg.in>).
>
>     Fix this by doing what find(1) does, implicitly wrapping the
>     pattern with ^$.
>
>     Modified:
>         cfe/trunk/tools/clang-format/clang-format-diff.py
>
>     Modified: cfe/trunk/tools/clang-format/clang-format-diff.py
>     URL:
>     http://llvm.org/viewvc/llvm-project/cfe/trunk/tools/clang-format/clang-format-diff.py?rev=197608&r1=197607&r2=197608&view=diff
>     ==============================================================================
>     --- cfe/trunk/tools/clang-format/clang-format-diff.py (original)
>     +++ cfe/trunk/tools/clang-format/clang-format-diff.py Wed Dec 18
>     15:34:07 2013
>     @@ -43,7 +43,7 @@ def main():
>                            help='apply edits to files instead of
>     displaying a diff')
>        parser.add_argument('-p', metavar='NUM', default=0,
>                            help='strip the smallest prefix containing
>     P slashes')
>     -  parser.add_argument('-regex', metavar='PATTERN', default='',
>     +  parser.add_argument('-regex', metavar='PATTERN', default=None,
>                            help='custom pattern selecting file paths
>     to reformat '
>                            '(case sensitive, override -iregex)')
>        parser.add_argument('-iregex', metavar='PATTERN', default=
>     @@ -66,11 +66,11 @@ def main():
>          if filename == None:
>            continue
>
>     -    if args.regex != '':
>     -      if not re.match(args.regex, filename):
>     +    if args.regex is not None:
>
>
> I am not opposed, but was there a particular reason to change '' to None?

Yes. The empty string '' maps to '^$' which matches nothing, in the same 
way 'a' maps to '^a$' which matches only the path "a".

It's fine to permit it, or perhaps disallow it as find(1) does, but 
special-casing empty string to match everything was the one behaviour 
that didn't add up.

I noticed this a couple of days when everything in my diff got formatted 
instead of just matching no files (with arguments passed in from a script).


>     +      if not re.match('^%s$' % args.regex, filename):
>
>
> AFAIK, re.match() only matches if the regex matches from the beginning 
> of the string. So, the "^" is unnecessary. The "$" is a good idea, 
> though :-).

Using both ^ and $ serves an important purpose here -- namely, the three 
of us managed to consistently miss what was going on here. Writing out 
both start and end matchers explicitly makes it much clearer.

Alp.



>              continue
>          else:
>     -      if not re.match(args.iregex, filename, re.IGNORECASE):
>     +      if not re.match('^%s$' % args.iregex, filename, re.IGNORECASE):
>              continue
>
>          match = re.search('^@@.*\+(\d+)(,(\d+))?', line)
>
>
>     _______________________________________________
>     cfe-commits mailing list
>     cfe-commits at cs.uiuc.edu <mailto:cfe-commits at cs.uiuc.edu>
>     http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>
>

-- 
http://www.nuanti.com
the browser experts




More information about the cfe-commits mailing list