[cfe-commits] r171262 - /cfe/trunk/lib/AST/CommentLexer.cpp

Dmitri Gribenko gribozavr at gmail.com
Thu Jan 3 16:50:07 PST 2013


On Mon, Dec 31, 2012 at 1:38 PM, Matthieu Monrocq
<matthieu.monrocq at gmail.com> wrote:
> On Sun, Dec 30, 2012 at 8:45 PM, Dmitri Gribenko <gribozavr at gmail.com>
> wrote:
>>
>> Author: gribozavr
>> Date: Sun Dec 30 13:45:46 2012
>> New Revision: 171262
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=171262&view=rev
>> Log:
>> Comment lexing: replace manual comparison with StringRef::find_first_of
>>
>> This gives about a 1.8% improvement on Clang bootstrap with
>> -Wdocumentation
>>
>> Modified:
>>     cfe/trunk/lib/AST/CommentLexer.cpp
>>
>> Modified: cfe/trunk/lib/AST/CommentLexer.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/AST/CommentLexer.cpp?rev=171262&r1=171261&r2=171262&view=diff
>>
>> ==============================================================================
>> --- cfe/trunk/lib/AST/CommentLexer.cpp (original)
>> +++ cfe/trunk/lib/AST/CommentLexer.cpp Sun Dec 30 13:45:46 2012
>> @@ -415,15 +415,12 @@
>>          return;
>>
>>        default: {
>> -        while (true) {
>> -          TokenPtr++;
>> -          if (TokenPtr == CommentEnd)
>> -            break;
>> -          const char C = *TokenPtr;
>> -          if(C == '\n' || C == '\r' ||
>> -             C == '\\' || C == '@' || C == '&' || C == '<')
>> -            break;
>> -        }
>> +        size_t End = StringRef(TokenPtr, CommentEnd - TokenPtr).
>> +                         find_first_of("\n\r\\@&<");
>> +        if (End != StringRef::npos)
>> +          TokenPtr += End;
>> +        else
>> +          TokenPtr = CommentEnd;
>>          formTextToken(T, TokenPtr);
>>          return;
>>        }
>>
>
> Interesting! I just had a look at the implementation of "find_first_of" and
> realized it builds an array of bits (thanks to bitset), so it has
> O(number of chars + string length) complexity.
>
> This table implementation seems very much akin to what a compiler would
> generate for a switch though, and thus I wonder whether the compiler could
> optimize a manual for loop with a switch even better; notably because the
> set of characters to look for is known at compile time.
>
> Did you test this approach?

Just checked -- the code generated for a 'switch' is conceptually the
same as for 'if (C == ... || ... || ...)'.  Seems like LLVM thinks
that converting this switch to a bitmap is not profitable.
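
For reference, here is a rough standalone sketch of the variants discussed
in this thread. It is not the code from CommentLexer.cpp: it uses
std::string_view in place of llvm::StringRef, all function names are
hypothetical, and every variant scans from the first character and returns
the index of the first stop character (or the string length if there is
none). It only shows that the variants compute the same result; it makes no
claim about which one any particular compiler optimizes best.

```cpp
#include <bitset>
#include <cstddef>
#include <string_view>

// Hypothetical helper names; none of these exist in Clang.

// Chained comparisons, in the style of the loop removed by r171262.
std::size_t scanWithIf(std::string_view S) {
  std::size_t I = 0;
  for (; I != S.size(); ++I) {
    const char C = S[I];
    if (C == '\n' || C == '\r' ||
        C == '\\' || C == '@' || C == '&' || C == '<')
      break;
  }
  return I;
}

// The same test written as a switch, as suggested in the quoted message.
std::size_t scanWithSwitch(std::string_view S) {
  for (std::size_t I = 0; I != S.size(); ++I) {
    switch (S[I]) {
    case '\n': case '\r': case '\\': case '@': case '&': case '<':
      return I;
    default:
      break;
    }
  }
  return S.size();
}

// An explicit bit table built at run time: one pass over the stop
// characters to fill the table, then one pass over the string.
std::size_t scanWithTable(std::string_view S) {
  std::bitset<256> Stop;
  for (char C : std::string_view("\n\r\\@&<"))
    Stop.set(static_cast<unsigned char>(C));
  for (std::size_t I = 0; I != S.size(); ++I)
    if (Stop.test(static_cast<unsigned char>(S[I])))
      return I;
  return S.size();
}

// Library search, mirroring the shape of the committed code
// (std::string_view::find_first_of stands in for StringRef::find_first_of).
std::size_t scanWithFindFirstOf(std::string_view S) {
  const std::size_t End = S.find_first_of("\n\r\\@&<");
  return End != std::string_view::npos ? End : S.size();
}
```

All four functions should agree on any input, so the choice between them is
purely a question of generated code and measured speed.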

Dmitri

-- 
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr at gmail.com>*/


