[llvm-dev] (RFC) JumpMaps: switch statement optimization

Fri Mar 31 02:04:33 PDT 2017

On Fri, Mar 31, 2017 at 10:22 AM, Marta Wszola
<marta.wszola at myrelabs.com> wrote:
> We have the prototype version working on X86. Our prototype:
> a) analyses switch statements conditions' and keys' byte sizes,
> b) finds jump maps (of course, jump tables (also partitioned) have
> priority),
> c) creates appropriate function calls,
> d) creates simple jump map structure in memory.
> We've prepared a few tests in C to verify correctness.
>
> We've got a few questions regarding implementation:
> 1) Right now, the data structure that we emit uses position independent
> addresses. Of course, it's not an optimal solution. Should we implement
> relative addresses and various relocations support in our patch or
> should we leave it for X86 target developers?
> 2) What is the best way to insert search functions? At which step should
> we do it? We were thinking about creating all functions early in IR and
> deleting the unused ones after CodeGen (our prototype does not create
> functions, it "assumes" that they are linked). We can alternatively
> write a library to be linked later.
> 3) We've come into problems with the analysis step. Since it needs to
> know the minimal sizes of jump tables and jump maps (and it probably
> will have even more target hooks in the future), it's a TargetMachine
> pass required by ISel. However, we cannot register a TargetMachine
> Module pass, it makes PassManager either throw an assertion stating that
> the pass cannot be scheduled or enter an infinite loop. We didn't have
> this problem with older LLVM version. Right now, we decided to turn it
> into a function pass, but it's not what we really want. How (where?)
> should we register a TargetMachine Module analysis pass?
> 4) What is the best way to test it? We need to test not only calls
> creation (which should be easy with FileCheck) but also the data
> structure and the functions being called.
>
> Finally, should we submit the code we have so far to Phabricator for the
> early review now? As you can see, the implementation is not complete.

Yes, I think uploading the patch would be useful to help the discussion.

Thanks,
Hans

> On 15.02.2017 07:12, Nema, Ashutosh via llvm-dev wrote:
>> Agree with Hans points as not all targets link against compiler-rt, if these functions get generated in IR then the compiler-rt dependency won't be there, also keeping these functions in IR may fetch more performance gains by inlining.
>>
>> Looking forward to try this optimization.
>>
>> Regards,
>> Ashutosh
>>
>> -----Original Message-----
>> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Hans Wennborg via llvm-dev
>> Sent: Wednesday, February 15, 2017 12:09 AM
>> To: Witold Waligora <witold.waligora at myrelabs.com>
>> Cc: llvm-dev <llvm-dev at lists.llvm.org>
>> Subject: Re: [llvm-dev] (RFC) JumpMaps: switch statement optimization
>>
>> I wonder if it would make sense to emit the jumpmap_find_ functions in IR rather than in compiler-rt.
>>
>> Many targets don't link against compiler-rt, e.g. x86 Linux typically uses libgcc.
>>
>> If they're emitted in IR, the functions could potentially be inlined.
>> For example if the size of the switch is known to be small so no binary search is done, could inlining the find_ function be smaller than setting up the call?
>>
>> On Tue, Feb 14, 2017 at 5:47 AM, Witold Waligora via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> I didn't answer compiler-rt and benchmarks question.
>>>
>>> compiler-rt would have to implement the decoding function(s) to undo
>>> whatever the compiler did. This would be target-dependent.
>>> Our target implements four of them, one for each basic type:
>>> jumpmap_find_i8, jumpmap_find_i16, jumpmap_find_i32, jumpmap_find_i64.
>>>
>>> We don't have any benchmarks for any of the in-tree targets yet.
>>>
>>> Witold
>>>
>>>
>>> W dniu 2017-02-14 o 14:28, Witold Waligora via llvm-dev pisze:
>>>> JumpMap lowering is nearly identical to that of JumpTables with the
>>>> exception of lack of range-check basic-block.
>>>> We introduce JumpMapInfo structure which follows the same flow as
>>>> JumpTableInfo and is finally emitted by AsmPrinter.
>>>>
>>>> There are many ways a Target may want to encode jumpmaps (deltas,
>>>> compression, relative vs absolute), so we plan to keep this flexible
>>>> and target-driven when upstreaming.
>>>>
>>>> Witold
>>>>
>>>> W dniu 2017-02-14 o 10:53, Nema, Ashutosh pisze:
>>>>> Hi Witold,
>>>>>
>>>>> Can you explain what you meant by adding support in compiler_rt & changes in lowering.
>>>>>
>>>>> Did you tried any benchmarks, please share if you have any numbers with performance improvements & code size reduction ?
>>>>>
>>>>> Regards,
>>>>> Ashutosh
>>>>>
>>>>> -----Original Message-----
>>>>> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
>>>>> Witold Waligora via llvm-dev
>>>>> Sent: Monday, February 13, 2017 6:41 PM
>>>>> To: llvm-dev at lists.llvm.org
>>>>> Subject: [llvm-dev] (RFC) JumpMaps: switch statement optimization
>>>>>
>>>>> Hi All,
>>>>>
>>>>> This is our first appearance on llvm-dev so a small introduction is in order.
>>>>> We are a team of two working for good few years now on an out-of-tree embedded back-end. Our primary goal is code size, secondary performance.
>>>>> We've recently implemented an optimization that we believe others could use. It's about time to contribute back.
>>>>>
>>>>> -------------------------------------------------------
>>>>> JumpMaps: a generalization of JumpTables JumpTables produce fast and small code but have a limitation - they only work if case values are easy to compute from the variable being switch()ed. To overcome this limitation we introduce a {key,value} structure - a JumpMap.
>>>>>
>>>>> Simplifying somewhat, LLVM would currently generate a sequential
>>>>> if-elseif- for small switches and an inline binary-search for large switches, if it can't produce a JumpTable. JumpMaps would instead generate a call followed by a jump-through-register.
>>>>> Pseudo-code example:
>>>>>
>>>>> switch(x) {
>>>>>   case 0x6851: {BB1}
>>>>>   case 0x1383: {BB2}
>>>>>   case 0x2224: {BB3}
>>>>>   default:     {defaultBB}
>>>>> }
>>>>>
>>>>> Without Jump Maps:
>>>>>   if(x==0x6851) {
>>>>>     goto BB1
>>>>>   }
>>>>>   else if(x==0x1383) {
>>>>>     goto BB2
>>>>>   }
>>>>>   else if(x==0x2224){
>>>>>     goto BB3
>>>>>   }
>>>>>   else{
>>>>>     goto defaultBB
>>>>>   }
>>>>>
>>>>> With Jump Maps:
>>>>>   jumpmap_0 = {
>>>>>     keys = {3, 0x6851, 0x1383, 0x2224}
>>>>>     vals = {defaultBB, BB1, BB2, BB3}
>>>>>   }
>>>>>   addr dst = __jumpmap_find(&jumpmap_0, x)
>>>>>   goto dst;
>>>>>
>>>>> On our target Jump Maps produce both smaller and faster code even for quite small switch statements. We believe other architectures would benefit as well, X86 and ARM included:
>>>>> - jumpmap struct has a good chance of being cached more efficiently
>>>>> than a large chunk of inline binary-search code
>>>>> - for large switches branch prediction should have an easier time
>>>>> compared to inline binary-search
>>>>> - embedded architectures (ARM, MSP430) would benefit from smaller
>>>>> code (trading off .text for .rodata)
>>>>>
>>>>> In terms of lowering, JumpMap flow is very similar to that of JumpTables. The lowering is even slightly easier as JumpMaps do not require an extra range-check basic-block - the default case is also handled by __jumpmap_find.
>>>>> Targets need to provide lowering for the map structure and __jumpmap_find function, which would typically become a part of compier-rt library and could be written in hand-crafted asm for extra speed.
>>>>>
>>>>>
>>>>> Questions or comments?
>>>>> Please let me know.
>>>>>
>>>>>
>>>>> Best Regards,
>>>>> Witold Waligóra
>>>>> Myre Laboratories
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>