[llvm] r187732 - Introduce an optimisation for special case lists with large numbers of literal entries.

Peter Collingbourne peter at pcc.me.uk
Mon Aug 5 12:51:00 PDT 2013


On Mon, Aug 05, 2013 at 12:31:09PM -0700, Sean Silva wrote:
> On Mon, Aug 5, 2013 at 10:48 AM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> 
> > Author: pcc
> > Date: Mon Aug  5 12:48:04 2013
> > New Revision: 187732
> >
> > URL: http://llvm.org/viewvc/llvm-project?rev=187732&view=rev
> > Log:
> > Introduce an optimisation for special case lists with large numbers of
> > literal entries.
> >
> > Our internal regex implementation does not cope with large numbers
> > of anchors very efficiently.  Given a ~3600-entry special case list,
> > regex compilation can take on the order of seconds.  This patch solves
> > the problem for the special case of patterns matching literal global
> > names (i.e. patterns with no regex metacharacters).  Rather than
> > forming regexes from literal global name patterns, add them to
> > a StringSet which is checked before matching against the regex.
> > This reduces regex compilation time by a factor on the order of
> > thousands when reading the aforementioned special case list,
> > according to a completely unscientific study.
> >
> > No test cases.  I figure that any new tests for this code should
> > check that regex metacharacters are properly recognised.  However,
> > I could not find any documentation stating that the syntax of global
> > names in special case lists is based on regexes.
> >
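
For readers who want to see the shape of the idea in the log above, here
is a minimal sketch, not the code from r187732. It uses std::regex and
std::unordered_set as stand-ins for llvm::Regex and StringSet, and the
names hasMetachar and NameMatcher are hypothetical. It also assumes that
patterns arrive already in regex form. Patterns without metacharacters go
into a set that is consulted first; only the remaining patterns are
compiled into one alternation.

```cpp
#include <regex>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: true if the pattern contains any character that is
// special in a regex, meaning it cannot be treated as a literal name.
inline bool hasMetachar(const std::string &pattern) {
  return pattern.find_first_of("()^$|*+?.[]\\{}") != std::string::npos;
}

// Hypothetical matcher illustrating the literal fast path.
// std::regex (default grammar) stands in for llvm::Regex here.
class NameMatcher {
public:
  explicit NameMatcher(const std::vector<std::string> &patterns) {
    std::string alternation;
    for (const std::string &p : patterns) {
      if (!hasMetachar(p)) {
        literals_.insert(p); // literal name: exact lookup, never compiled
        continue;
      }
      if (!alternation.empty())
        alternation += '|';
      alternation += "(" + p + ")";
    }
    // Compile a regex only if at least one non-literal pattern exists.
    if (!alternation.empty()) {
      regex_ = std::regex(alternation);
      hasRegex_ = true;
    }
  }

  // Check the literal set first, then fall back to the combined regex.
  bool matches(const std::string &name) const {
    if (literals_.count(name) != 0)
      return true;
    return hasRegex_ && std::regex_match(name, regex_);
  }

private:
  std::unordered_set<std::string> literals_;
  std::regex regex_;
  bool hasRegex_ = false;
};
```

With a list that is mostly literal names, the regex in this sketch stays
small (or is never built at all), which is where the compile-time saving
described in the log would come from.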
> 
> The header comment in `include/llvm/Transforms/Utils/SpecialCaseList.h`
> says:
> 
> ```
> // Note that the wild card is in fact an llvm::Regex, but * is automatically
> // replaced with .*
> ```

I was referring to user-level documentation.
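
As a rough illustration of what the header comment quoted above describes,
a wildcard expansion could look like the sketch below. The function name
expandWildcards is hypothetical and this is not the code in
SpecialCaseList; it only shows the "* becomes .*" substitution.

```cpp
#include <string>

// Hypothetical helper: expand each '*' wildcard into the regex ".*",
// leaving every other character (including regex metacharacters) as-is.
inline std::string expandWildcards(const std::string &pattern) {
  std::string out;
  out.reserve(pattern.size());
  for (char c : pattern) {
    if (c == '*')
      out += ".*";
    else
      out += c;
  }
  return out;
}
```

Note that under this sketch a pattern that already contains ".*" would
turn into "..*", which is one reason the supported syntax needs to be
pinned down in user-level documentation before tests are written.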

> > The extent to which regex syntax is supported in special case lists
> > should probably be decided on/documented before writing tests.
> >
> 
> <http://llvm-reviews.chandlerc.com/D1268> is working towards getting things
> documented. Since I presume this feature is already being used in the wild,
> we probably need to maintain the current format.

OK.  Once that patch lands we can add tests based on whatever we end
up documenting (it appears to be somewhat in flux right now).

Thanks,
-- 
Peter


