[llvm-commits] Final SwitchInst for PR1255: case-ranges.

Mon May 27 10:40:55 PDT 2013

Hi Duncan,

All notes, you mentioned, I've implemented in new implementation of 
'switch'.

1. No default cases.

2. Internally all ranges set are just a flat array.
For your example (8 bit items):
 >     0 .. 120 -> destA
 >     121 .. 250 -> destB
 >     251 .. 255 -> destC
We get array: 0, 121, 251.
In more details it will array with three pairs:
{0,   destA } ; means [0, 121)   --> destA
{121, destB } ; means [121, 250) --> destB
{251, destC } ; means [251, 0)   --> destC

3. Users can't access to internal data. They uses iterators. Iterator 
presents ConstantRange instance and BasicBlock pointer to them.
BTW, case iterators implementation moved out from .h file.

4. API. Of course all default-related cases has been removed. I've kept 
'removeCase' operation. It just replaced successor for given range with 
"tombstone" one.

Please find patch in attachment.

Patch contains implementation of SwitchRInst class (yes new 
instruction). It is working patch, I cut it out from my llvm-branch that 
uses 'switchr' instead of switch.

Thanks!

-Stepan.

Duncan Sands wrote:
> Hi Stepan, thanks for the patch, it looks pretty good.
>
>> Please find the patch that converts current SwitchInst into one that supports
>> case-ranges. This patch is not compilable, since I removed all backward
>> compatibility layer. This is how I see the final SwitchInst.
>
> I'm not going to comment on IntegersSubsetMapping as it is only used as a
> helper for codegen (it is #included in Instructions.h but this is probably
> just a mistake) so isn't relevant to the general switch design.
>
> General design comments:
>
> (1) is there any point in having the notion of a default case?  Since cases are
> now arbitrary sets of integers, there doesn't seem to be a useful distinction
> between the default case and the others, you might as well say that the union
> of the sets should cover every possible value.  For example, rather than saying
> that the cases are:  -2..0 -> destA, 1,2 -> destB, 4 -> destC, default -> destD,
> you may as well say -2..0 -> destA, 1,2 -> destB, 4 -> destC, 3, 5..-3 -> destD.
>
> I think this is an important point to decide on.
>
> That said, it may be convenient internally, as an implementation detail, to have
> an "everything else" case.  I mean, since most cases will just be single values
> or small ranges of values, while the complement of those cases will be a more
> baroque set, you could just not bother storing the complement.  In the above
> example you could store -2..0 (destA), 1,2 (destB) and 4 (destC) and mark destD
> as corresponding to "everything else".  This distinction wouldn't be visible to
> users though.
>
> (2) users will see a list of cases, each case will have a corresponding set of
> integers, each set consisting of ranges that can be iterated over.  Thus it is
> natural to expose a range type (IntRange class) and a set type (IntegersSubset).
> In order to not build in any notion of signedness/unsignedness into case values,
> the range type should be wrapping [in C switch values are always signed, but in
> other languages like Ada they can be signed or unsigned].  For example, consider
> the following unsigned i8 switch:
>     0 .. 120 -> destA
>     121 .. 250 -> destB
>     251 .. 255 -> destC
> Here the ranges should be [0, 121), [121, 251), [251, 0), and not artificially
> broken up into [0, 121), [121, 128) union [128, 251), [251, 0) because of some
> notion that they should be non-wrapping when viewed as signed numbers (the
> range [121, 251) wraps as signed numbers since 127 is positive and 128 is
> negative).  The ConstantRange class is already designed to deal with this kind
> of thing, and I think you should just drop IntRange and use ConstantRange
> instead.
>
> (3) storing values.  If there is no default case, then the union of the case
> sets covers every value.  As Chris pointed out (well, maybe this is not what
> he pointed out), that means that the end of one range must be the start of
> another range, so there is no point in storing [Lo, Hi) pairs for each range,
> you can just store the start point of each range.  In the example from (1),
> rather than storing <-2, 0> for destA and <1, 2> for destB you can store
> <-2, 1, 2> and have destA, destB store indices into this.  The complete set
> of APInt values you would store for the above example is:
>     -2, 1, 3, 4, 5
>
> Of course you would want to store them ordered.  I suggest ordering them as
> signed numbers.  This isn't the same as considered switch cases to be signed,
> it is just an internal convenience and ensures that two switch instructions
> with the same ranges will have exactly the same list of APInt values stored
> in them.
>
> Note that with this scheme if cases have only one value, then you have to
> store only one value if cases are consecutive, but you have to store two
> values if they are not consecutive.  For example:
>     1 -> A
>     2 -> B
>     3 -> C
>     5 -> D
> complement -> E
> You need to store: 1, 2, 3, 4, 5
> The 4 is the lower endpoint of one of the E case ranges.
>
> Do you have some statistics on whether case values are typically consecutive,
> or whether they tend to be lots of gaps between them?
>
> Note that if you go this route then you would probably want to abandon the idea
> of exposing a real set type (IntegersSubset) directly to the user.  Instead the
> user could query the case iterator for information about the "set" for that
> case, like the number of ranges in it, and get an iterator for visiting the
> ranges.  All of this would just be referencing and indexing into the above list
> of APInt values stored in the switch.
>
> If the decision is that there should be a default case then this scheme is less
> interesting (there are also several other possibilities for storing values).
>
> Maybe the best thing to do is to only provide users with methods that are
> agnostic about internal storage.  That can be done by only exporting ways of
> querying information about sets without exporting the sets themselves (see
> above), so methods like this
>     const IntegersSubset *findCaseDest(BasicBlock *BB) const {
> and
>     IntegersSubsetTy &getCaseSubset() {
> would have to go.  Note that you could still use a type like IntegersSubset for
> adding new cases, i.e. as a way of passing a set to the SwitchInst code, but you
> wouldn't ever get an IntegersSubset by querying SwitchInst methods.
>
> Choosing a good SwitchInst API is extremely important and needs some discussion.
> As well as the above "storage agnostic" consideration, there should also be
> thought about what case manipulation methods need to be built in.
>
> For example, maybe a method for saying "replace destination X with destination
> Y" should be built in.  If Y is not already a destination for the switch then
> this would be trivial.  If it is already a destination then the case sets for
> X and Y would be joined (union).  In a world with no default cases, this would
> replace removeCase (which wouldn't mean anything anymore - right now it means:
> merge the set of values for this case into the default case).  What I'm saying
> is that maybe there should be a set of high level manipulation methods built
> into SwitchInst rather than exposing sets and requiring users to do all kinds
> of set manipulations.  But what should these methods be?  As well as the above
> method, which is essentially "union", there should probably be an intersection
> method too, for splitting a case set into two subsets with different
> destinations.  However really there needs to be an examination of what kinds
> of switch manipulations are done in practice, and an appropriate set of methods
> distilled out of that.
>
> Ciao, Duncan.
>
> PS: I noticed some mistakes, for example
>
>     unsigned getNumCases() const {
>       return getNumOperands()/2 - 1;
>     }
>
> is now wrong (shouldn't divide by 2 any more), but these kind of things can be
> fixed later.
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0-instruction.patch
Type: text/x-diff
Size: 47975 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130527/9f9f66f8/attachment.patch>