[llvm-commits] Final SwitchInst for PR1255: case-ranges.

Stepan Dyatkovskiy stpworld at narod.ru
Fri Jul 20 12:33:00 PDT 2012


Hi Nuno,
Thank you for you review!

> I really would like to see some numbers about performance/memory regression.
Currently on my local llvm copy I still need update several passes (add 
support for case ranges): SCCP, LazyValueInfo, InstructionCombining and 
GVN. Though first of all we should agree on the SwitchInst design.
I can measure performance/memory for passes already updated 
(SparsePropogation, CodeExtractor, CloneFunction, LowerSwitch, 
SimplifyCFG, LowerExpectIntrinsic, Local, JumpThreading and GlobalOpt).

>
> Some comments about the patch itself:
>   - I don't understand why you need the IntRange class. We already have
> ConstantRange, which does exactly the same thing, and it's even more
> memory efficient (it only stores 2 APInts; no extra bits needed).  The
> optimizations expect ConstantRanges as well.
Now ConstantRange is fine, since I moved out all specific operations 
from IntRange and now it is really duplicates the first one.

>   - I'm not sure you need IntegersSubset and IntegersSubsetGeneric. Can
> you explain why do have this extra layer?
>   - I believe IntegersSubset should store ConstantRanges directly. It's
> as memory efficient as storing 2 APInts, and the code should become
> simpler.  I also missed the point of RangeLinks.
So the main point here is that we represent as case the *set* of ranges. 
Going in this direction Chris offered to pack ranges, since most of them 
are not a ranges instead, but just a single numbers. He offered to store 
all values in flat collection, and define range as pair of indices of 
items from this collection.
Originally described here:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120213/136954.html
In patch with "final switch" I removed IntegersSubsetGeneric. 
IntegersSubset represts ranges in style proposed by Chris. I created 
flat collection and RangeLinks with pair of pointers to the items in 
flat collection.
A few days ago it hit me: using pointer doesn't gives us economy in 
memory consumption. We can only achieve this effect using short integer 
indices (that are unsigned short for example).

>   - How is SwitchInst serialized to bitcode?  We cannot break the
> bitcode format until llvm 4.0, so we need a way to be binary compatible 																												
> or provide an auto-upgrade path for older bit-code files.
Serialization is already implemented, *but* I kept backward 
compatability and tested it. I spent a lot of time testing it, so hope 
it will not updated again. It is not matters what we will finally use: 
IntegersSubset or set of ConstantRanges. The main point is concept, that 
case may be a number, set of numbers, or set of ranges and numbers. Of 
course already has optimizations when case is just a number, or case 
contains numbers only. Anyway it is better to receive critics as soon as 
possible, since I didn't got explicit reviews on this patch. You can 
find current implementation in BitcodeWriter.cpp, WriteInstruction 
function, string "case Instruction::Switch".

>
> I didn't look at IntegersSubsetMapping. Can you explain where and how it
> is used?  I think I remember some commits about that in CodeGen, but I
> don't remember the purpose.
Storing case value in packed format restricts manipulation with case 
ranges. It is expensive to add or remove ranges to/from packed set of 
ranges. So I created some kind of factory with additional features and 
called it 'IntegersSubsetMapping'. If we will store 'packed' cases I 
propose to work with SwitchInst in next style:
If we want to modify SwitchInst do the next:
1. Unpack all cases: build IntegersSubsetMapping, that represents Switch 
in unpacked format (see IntegersSubsetMapping::add).
2. Add/remove/combine/intersect/etc. ranges.  The most significant 
methods are "optimize", "diff" and "verify".
3. Pack it back and store new 'switch' case values (using 
SwitchInst::CaseIt::setSubset method).

I can describe features of IntegersSubsetMapping with more details, just 
let me know if you need it.

Thanks!

-Stepan.

>
> Thanks,
> Nuno
>
>
> Quoting Stepan Dyatkovskiy <stpworld at narod.ru>:
>
>> Hi all.
>> Please find the patch that converts current SwitchInst into one that
>> supports case-ranges. This patch is not compilable, since I removed
>> all backward compatibility layer. This is how I see the final SwitchInst.
>>
>> Main features
>>  - Case value is the subset of integers, based on APInt values
>> (implemented as IntegersSubset). IntegersSubset uses packed data
>> representation. It has two collections: flat numbers collection, and
>> collection of pairs with indices (pointers) to the flat numbers
>> collection.
>>  - To construct/modify subsets for SwitchInst IntegersSubsetMapping
>> was introduced. It uses unpacked data representation and stores
>> SwitchInst data as set of clusters. Cluster is integer range + successor.
>>  - Case value removed from instruction operands list. We needn't
>> User-Use concept here, since it is just a case value, it will always
>> constant, and case value couldn't reused with another
>> instructions/values.
>>
>> P.S.:  Patch contains a lot of changes. Probably you want to apply it
>> locally and look at final instruction class.
>>
>> P.S.2: I also have patch with backward compatibility level. It is
>> compilable and passes regression tests and test-suite tests.
>>
>> Thanks!
>>
>> -Stepan.




More information about the llvm-commits mailing list