[cfe-commits] r162407 - in /cfe/trunk: lib/AST/CommentLexer.cpp test/Sema/warn-documentation.cpp unittests/AST/CommentLexer.cpp

Matthieu Monrocq matthieu.monrocq at gmail.com
Sun Aug 26 03:41:42 PDT 2012


On Sat, Aug 25, 2012 at 10:23 PM, Dmitri Gribenko <gribozavr at gmail.com> wrote:

> On Fri, Aug 24, 2012 at 6:33 PM, Jordan Rose <jordan_rose at apple.com>
> wrote:
> > On Aug 22, 2012, at 15:56 , Dmitri Gribenko <gribozavr at gmail.com> wrote:
> >> +bool isHTMLTagName(StringRef Name) {
> >> +  return llvm::StringSwitch<bool>(Name)
> >> +      .Cases("em", "strong", true)
> >> +      .Cases("tt", "i", "b", "big", "small", true)
> >> +      .Cases("strike", "s", "u", "font", true)
> >> +      .Case("a", true)
> >> +      .Case("hr", true)
> >> +      .Cases("div", "span", true)
> >> +      .Cases("h1", "h2", "h3", true)
> >> +      .Cases("h4", "h5", "h6", true)
> >> +      .Case("code", true)
> >> +      .Case("blockquote", true)
> >> +      .Cases("sub", "sup", true)
> >> +      .Case("img", true)
> >> +      .Case("p", true)
> >> +      .Case("br", true)
> >> +      .Case("pre", true)
> >> +      .Cases("ins", "del", true)
> >> +      .Cases("ul", "ol", "li", true)
> >> +      .Cases("dl", "dt", "dd", true)
> >> +      .Cases("table", "caption", true)
> >> +      .Cases("thead", "tfoot", "tbody", true)
> >> +      .Cases("colgroup", "col", true)
> >> +      .Cases("tr", "th", "td", true)
> >> +      .Default(false);
> >> +}
> >
> > This is going to be very slow (StringSwitch just chains the string
> > comparisons). Maybe we should use a StringMap instead?
>
> It is a good idea, but it would require some refactoring since
> StringMap should be populated at runtime and we want to do that only
> once.
>
> Dmitri
>

I am not sure a StringMap would be a good idea memory-wise; it seems
overkill and wasteful. If you want a simple whitelist, the best thing to do
is to keep a simple sorted C array and run std::binary_search over it.

I think this should work:

static char const* const HTMLTags[] = { "a", "b", ... };
static size_t const HTMLTagsSize = sizeof(HTMLTags) / sizeof(char const*);

bool isHTMLTagName(StringRef Name) {
#ifndef NDEBUG
    // Checked once: no element may compare greater than its successor.
    static bool const IsSorted =
        std::adjacent_find(HTMLTags, HTMLTags + HTMLTagsSize,
                           std::greater<StringRef>()) == HTMLTags + HTMLTagsSize;
    assert(IsSorted && "The HTMLTags should be sorted in lexicographical order.");
#endif
    return std::binary_search(HTMLTags, HTMLTags + HTMLTagsSize, Name);
}

From past experience, it seems best not to build an array of StringRef,
because that triggers the use of the constructors (unfortunately).
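
To make the sorted-array idea concrete, here is a minimal self-contained
sketch that compiles without the LLVM headers. It is an illustration under
stated assumptions, not the code that was committed: the tag list is a
hypothetical subset, lessCStr is a helper invented for this example, and the
lookup takes a NUL-terminated char const* where the real function takes a
StringRef (which need not be NUL-terminated).

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>

// Hypothetical subset of the tag list; it must stay in lexicographical order.
// An array of plain pointers to literals needs no constructor at startup.
static char const* const HTMLTags[] = {
    "a", "b", "blockquote", "br", "code", "div", "em", "p", "span", "strong"
};

// Helper assumed for this sketch: orders C strings by content, not by address.
static bool lessCStr(char const* LHS, char const* RHS) {
    return std::strcmp(LHS, RHS) < 0;
}

// Assumes Name is NUL-terminated, unlike a general StringRef.
bool isHTMLTagName(char const* Name) {
    // Debug-only sanity check: binary search is only valid on sorted input.
    assert(std::is_sorted(std::begin(HTMLTags), std::end(HTMLTags), lessCStr) &&
           "The HTMLTags should be sorted in lexicographical order.");
    return std::binary_search(std::begin(HTMLTags), std::end(HTMLTags),
                              Name, lessCStr);
}
```

Each lookup costs O(log n) string comparisons, and the table itself is
read-only data with no runtime initialization.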

-- Matthieu.