[llvm-commits] PATCH: replace if-chain in AsmParser's directive parsing code

Tue Jan 8 09:43:52 PST 2013

On Thu, Jan 3, 2013 at 9:53 AM, Eli Bendersky <eliben at google.com> wrote:
> On Wed, Dec 19, 2012 at 3:04 PM, Eli Bendersky <eliben at google.com> wrote:
>> Ping?
>
> Ping #2
>

Ping #3

Eli


>
>
>>
>> On Tue, Dec 18, 2012 at 2:05 PM, Eli Bendersky <eliben at google.com> wrote:
>>>>> AsmParser has a couple of if-chains when parsing directives. The
>>>>> attached patch replaces the if-chains with enum+switch. No change in
>>>>> functionality.
>>>>>
>>>>> Unfortunately I didn't measure any significant speedups for the whole
>>>>> llvm-mc (only 1-2%), so I'll do more fine-grained measurements once I
>>>>> get the opportunity.
>>>>>
>>>>> Also, I should add that in general AsmParser's parsing code can be
>>>>> cleaned up even more (for example w.r.t. the extensions registersing
>>>>> directive parsers, which can be unified with other directive parsing
>>>>> code). So this patch should just be seen as a first step in a
>>>>> (hopefully) right direction.
>>>>
>>>> Just MHO, but this makes the code more complex and hopefully won't provide a speedup.
>>>> We generally expect that:
>>>>
>>>> StringRef X = …
>>>>
>>>> if (X == "foo")
>>>>   return ...
>>>> if (x == "bar")
>>>>   return …
>>>>
>>>> to be turned into a switch on X[0].
>>>>
>>>
>>> I may be missing something, but this code:
>>>
>>> int main(int argc, char** argv) {
>>>   StringRef r(argv[1]);
>>>
>>>   if (r == "something")
>>>     return 17;
>>>   if (r == "foo")
>>>     return 31;
>>>   if (r == "bar")
>>>     return 71;
>>>   if (r == "baz")
>>>     return 97;
>>>
>>>   return 0;
>>> }
>>>
>>> When compiled with 'clang -O3' (Clang 3.2) does not produce a jump
>>> table, but a chain of tests and jumps as usual from an if-chain (**).
>>> I tried several variations (else-ifs...), and the result is similar.
>>> By the way, this is also what GCC does (and I suspect that many
>>> developers have their LLVM compiled with GCC).
>>> Perhaps I am missing something?
>>>
>>> Besides, the actual pattern in AsmParser is not a pure "if X return Y"
>>> chain. Some "then" statements don't return, and some conditions are
>>> logical ANDs of several tests, and so on. So detecting a clean switch
>>> pattern could be difficult.
>>>
>>>> Just MHO, but this makes the code more complex and hopefully won't provide a speedup.
>>>
>>> To address the "more complex" point. Whimsically, it would be hard to
>>> make that code path in AsmParser more complex ;-) But seriously, a
>>> longer-term plan would be to combine this lookup table with another
>>> that already exists in the same function - DirectiveMap, which is used
>>> for assembler extensions registering themselves on certain directives.
>>> My thinking was to first get the general if-chain removal approved,
>>> and then try to fuse the two lookup tables to attain some level of
>>> consistency in the code. IMHO the end result would be much cleaner and
>>> more readable than what it has now.
>>>
>>> (**) Methodology:
>>>
>>> $ ~/llvm/3.2rc2/bin/clang++ -fno-rtti -O3 stringref_switch.cpp
>>> `~/llvm/3.2rc2/bin/llvm-config --cxxflags --libs`
>>> `~/llvm/3.2rc2/bin/llvm-config --ldflags` -o stringref_switch
>>>
>>> Followed by objdump -d. Here's a typical piece of assembly output:
>>>
>>>   4005c0:       53                      push   %rbx
>>>   4005c1:       48 8b 5e 08             mov    0x8(%rsi),%rbx
>>>   4005c5:       48 89 df                mov    %rbx,%rdi
>>>   4005c8:       e8 e3 fe ff ff          callq  4004b0 <strlen at plt>
>>>   4005cd:       48 89 c1                mov    %rax,%rcx
>>>   4005d0:       48 83 f9 03             cmp    $0x3,%rcx
>>>   4005d4:       75 60                   jne    400636 <main+0x76>
>>>   4005d6:       48 8d 35 89 01 00 00    lea    0x189(%rip),%rsi
>>> # 400766 <_IO_stdin_used+0xe>
>>>   4005dd:       48 89 df                mov    %rbx,%rdi
>>>   4005e0:       48 89 ca                mov    %rcx,%rdx
>>>   4005e3:       e8 d8 fe ff ff          callq  4004c0 <memcmp at plt>
>>>   4005e8:       89 c1                   mov    %eax,%ecx
>>>   4005ea:       b8 1f 00 00 00          mov    $0x1f,%eax
>>>   4005ef:       85 c9                   test   %ecx,%ecx
>>>   4005f1:       74 6f                   je     400662 <main+0xa2>
>>>   4005f3:       48 8d 35 70 01 00 00    lea    0x170(%rip),%rsi
>>> # 40076a <_IO_stdin_used+0x12>
>>>   4005fa:       48 89 df                mov    %rbx,%rdi
>>>   4005fd:       ba 03 00 00 00          mov    $0x3,%edx
>>>   400602:       e8 b9 fe ff ff          callq  4004c0 <memcmp at plt>
>>>   400607:       89 c1                   mov    %eax,%ecx
>>>   400609:       b8 47 00 00 00          mov    $0x47,%eax
>>>   40060e:       85 c9                   test   %ecx,%ecx
>>>   400610:       74 50                   je     400662 <main+0xa2>
>>>   400612:       48 8d 35 55 01 00 00    lea    0x155(%rip),%rsi
>>> # 40076e <_IO_stdin_used+0x16>
>>>   400619:       48 89 df                mov    %rbx,%rdi
>>>   40061c:       ba 03 00 00 00          mov    $0x3,%edx
>>>   400621:       e8 9a fe ff ff          callq  4004c0 <memcmp at plt>
>>>   400626:       89 c1                   mov    %eax,%ecx
>>>   400628:       ba 61 00 00 00          mov    $0x61,%edx
>>>   40062d:       31 c0                   xor    %eax,%eax
>>>
>>> I do notice that the compiler tries to be smart about the length of
>>> the string, but for strings of the same length it has no choice but
>>> sequentially "memcmp" them. Note that the vast majority of assembly
>>> directives fall into very few bins in terms of string length.
>>>
>>> Eli