[libcxx-commits] [PATCH] D74678: [libcxxabi] Replace names with single letters in test_demangle.

James Y Knight via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Sat Feb 15 14:19:54 PST 2020


jyknight created this revision.
jyknight added a reviewer: EricWF.
Herald added subscribers: libcxx-commits, ldionne, christof.
Herald added a project: libc++.
jyknight updated this revision to Diff 244847.
Herald added subscribers: mstorsjo, jfb, mgrang, fedor.sergeev.
Herald added a reviewer: mclow.lists.

...and then remove all the duplicates after this process.

Many test-cases here were taken from the actual symbols defined in a
build of llvm/clang. But, having this file contain almost every
llvm/clang symbol name is somewhat irritating when using 'git grep',
and makes this file much larger than is necessary.

So, for the set of llvm/clang-extracted symbols, rewrite the names to
single-letter names using this rather hacky Python script:




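# Note: written for Python 2. string.maketrans does not exist in Python 3,
# where str.maketrans is the equivalent.
import sys, re, string
from collections import OrderedDict

# Temporary placeholder characters (0x80..0x99), mapped to 'a'..'z' at the end.
highletters=''.join([chr(0x80+n) for n in range(26)])
letters=''.join([chr(ord('a')+n) for n in range(26)])
translation = string.maketrans(highletters, letters)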

word_re = re.compile('[a-zA-Z_][a-zA-Z0-9_]*')

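def rewrite_demangle(mangled, demangled):
  # Collect identifiers from the demangled name that also appear in the
  # mangled name in length-prefixed form (e.g. '4llvm'), skipping 'const'.
  allwords = [word for word in word_re.findall(demangled)
              if word != 'const' and
                 mangled.find('%d%s' % (len(word), word)) != -1]
  # Number the unique names in order of first appearance, then process the
  # longest names first.
  allwords = list(enumerate(OrderedDict.fromkeys(allwords)))
  allwords.sort(key=lambda x: len(x[1]), reverse=True)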
  
  # Replace names with a unique character first, so that subsequent
  # replacements don't accidentally replace it a second time.
  for namenum, word in allwords:
    mangled = mangled.replace('%d%s' % (len(word), word),
                              '1' + chr(namenum + 0x80))
    demangled = re.sub(r'\b'+word+r'\b', chr(namenum + 0x80), demangled)
  
  # Then return, with actual alphabetic characters.
  return mangled.translate(translation), demangled.translate(translation)

# Each input line is '<mangled> <demangled>'; emit one C++ initializer entry.
for l in sys.stdin:
  mangled, unmangled = l.rstrip('\n').split(' ', 1)
  sys.stdout.write("    {\"%s\", \"%s\"},\n" % rewrite_demangle(mangled, unmangled))
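
To illustrate what the rewrite does to a single entry, here is a minimal
Python 3 sketch of the same approach. It is not the script that was run for
this patch: the use of str.maketrans and dict.fromkeys (in place of
string.maketrans and OrderedDict), the re.escape call, and the sample symbol
are all assumptions made for illustration.

```python
import re

# Placeholder characters 0x80..0x99 stand in for names until the final step.
PLACEHOLDERS = ''.join(chr(0x80 + n) for n in range(26))
LETTERS = ''.join(chr(ord('a') + n) for n in range(26))
TRANSLATION = str.maketrans(PLACEHOLDERS, LETTERS)

WORD_RE = re.compile(r'[a-zA-Z_][a-zA-Z0-9_]*')


def rewrite_demangle(mangled, demangled):
    # Keep identifiers that occur in the mangled name as <length><name>.
    words = [w for w in WORD_RE.findall(demangled)
             if w != 'const' and '%d%s' % (len(w), w) in mangled]
    # Number unique names by first appearance, then handle longest first.
    numbered = list(enumerate(dict.fromkeys(words)))
    numbered.sort(key=lambda x: len(x[1]), reverse=True)
    for num, word in numbered:
        placeholder = chr(0x80 + num)
        # Every replaced name has length 1, hence the literal '1' prefix.
        mangled = mangled.replace('%d%s' % (len(word), word),
                                  '1' + placeholder)
        demangled = re.sub(r'\b' + re.escape(word) + r'\b',
                           placeholder, demangled)
    # Map the placeholders to real letters.
    return mangled.translate(TRANSLATION), demangled.translate(TRANSLATION)


# Hypothetical sample input, chosen only to show the transformation.
result = rewrite_demangle('_ZN4llvm5Twine5printERNS_11raw_ostreamE',
                          'llvm::Twine::print(llvm::raw_ostream&)')
print(result)  # ('_ZN1a1b1cERNS_1dE', 'a::b::c(a::d&)')
```

Names are lettered in order of first appearance in the demangled string, so
llvm becomes a, Twine becomes b, print becomes c, and raw_ostream becomes d.
Like the original, this sketch only has 26 placeholders, so it handles at
most 26 distinct names per symbol.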

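Once names are reduced to single letters, many formerly distinct entries
become identical. The patch does not say how the duplicates were removed; one
possible way, sketched here under that assumption, is an order-preserving
pass over the generated lines. The helper name dedupe_lines and the sample
entries are hypothetical.

```python
def dedupe_lines(lines):
    # dict keeps insertion order (Python 3.7+), so the first copy of each
    # line is kept and later repeats are dropped.
    return list(dict.fromkeys(lines))


# Hypothetical generated entries, two of which are identical.
entries = [
    '    {"_ZN1a1bEv", "a::b()"},',
    '    {"_ZN1a1cEv", "a::c()"},',
    '    {"_ZN1a1bEv", "a::b()"},',
]
unique = dedupe_lines(entries)
print(len(unique))  # 2
```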




Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D74678

Files:
  libcxxabi/test/test_demangle.pass.cpp




