[llvm-commits] New classes for PR1255: Should enhance LLVM switch instruction to take case ranges.
Stepan Dyatkovskiy
STPWORLD at narod.ru
Sun Mar 11 01:00:07 PST 2012
Hi Chris.
Summary of previous discussion:
1. Finally ConstantInt should be replaced with APInt.
2. How to store case ranges. I remind your last proposal:
[quote]
The mapping of the range to entries in CaseValues could look like this:
[N, N+1] -> Single element in CaseValues indicates the case value for the destination.
[MAXINT, MININT] -> The entire case is "default"
Otherwise, the pair indicates a series of ranges (inclusive). If one of the ranges is MAXINT/MININT, then it is the default case. For example [0, 4, 6,6] would handle 0,1,2,3,4,6. [4,6, MAXINT,MININT] would handle 4,5,6 and default.
[/quote]
My proposal was to create ConstantRangesSet that will store all numbers and ranges for complex case. I still propose to use this schema.
You can ask what about most probable case, when for most of successors will be the only case value?
OK. We can create collection for "classic" cases vector<pair<SingleNumber,Success> > too. Then finally on the top level we have 2 collections:
class SwitchInst {
..
vector<pair<SingleNumber, BasicBlock*> > ClassicCases;
vector<pair<ConstantRangesSet, BasicBlock> > ComplexCases;
..
}
So, what do you think about this?
-Stepan.
20.02.2012, 19:58, "Stepan Dyatkovskiy" <STPWORLD at narod.ru>:
>> On Feb 14, 2012, at 3:39 AM, Stepan Dyatkovskiy wrote:
>>> Hi, Chris.
>>> You last thoughts relative to pr1255 was here:
>>> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120102/134424.html
>> Aha, I thought that I had suggested arrays/vectors somewhere. Unfortunately, it doesn't work well because arrays all want the element type to be the same, having some that are scalars and some that are vectors isn't kosher.
>
> I propose to store single values as ranges two. Vector with one element - single value; two elements means the range. Instead I mean just a structure here, just a concept. I propose to store union range/single value in array.
> But now I had undestood you mean - it is non-economical implementation relative to the RAM consumption. Let me think more.
> The only thing I can propose afterall is ConstantRangesSet + CRSBuilder concept as interface for user. If we accepted some interface we can be free with its implementation.
>
>>>> Please remind me why it makes sense to use ConstantArray (or something like it) here. It seems that the case values of the switch could be very reasonably represented with APInts, eliminating the Value*'s for the case values all together (which is what I recommended in Comment 15 of PR1255). OTOH, I do vaguely remember making the comment about arrays and vectors in some PR (but can't find it). Thinking about this a third time now, I really think that avoiding Constant*'s is the right thing to do.
>>> What is the main reason to use APInts?
>> There are a bunch of reasons. For one, operands (including in a ConstantArray) are very expensive/heavyweight: sizeof(Use) = 64 bytes. APInt is lighter weight: sizeof(APInt) = 8 bytes.
>
> You right here. So as I said I try to think more about it.
>
>> The other reason is one of purity: these are not IR values that can be used and manipulated like other values. There are no defs and uses here. The case values are purely an implementation detail of the switch instruction, so they should not be Value*'s or Use's.
>
> Its clean for me. Though as I saw Value + Use idea was growed up to something more big and complex. For example AsmPrinter can print Values only. Verifier can assert Values only. It seems that it is in llvm developers mind, that all values are inharited from Value.
> From another side (mathematical). SwitchInst is an instruction. Instruction has some arguments and case-values are operands instead. So SwitchInst is S*= I(S, c1,s1, c2, s2, cn, sn), where
> S - current state.
> S* - new state.
> c1 - case #1
> s1 - successor #1
> ..
> So "case" should be an operand from this side.
>
>>> I was need to do a set of workarounds adapting LLVM infrastructure to this feature: back-ends, asm-writer, lazy-values, Verifier.cpp with its original assertions and so on. APInt numbers will like an aliens from another planet, since the will need to work with entities that are from LLVMContext, and often we need to move it back using ConstantInt::get(...). If you have a time you can look at second draft patch attached to this post (cr-0.5.0). I made it two months ago or so on. It also contains changes in indexing I already applied, filtering it you can see what will wait LLVM if we will use APInt for case values. Patch is really huge, just look how SimplifyCFG, AsmWriter and BitcodeWriter will changed.
>> I'm not sure what you mean here. The first thing that codegen does today is get the APInt out of a ConstantInt. Removing the layer should simplify a number of things, particularly if SwitchInst has a good set of helper methods.
>
> There are several props and cons. From one side we should to invoke getValue always. From another side we should extend AsmPrinter, Verify asserts adding support fo native APInt and so on. We should brake a little bit c u r r e n t concept of usage Value+Use classes.
>
> It is on IMHO rights. I can implement cases either based on APInts or on ConstantInts. But I propose replace ConstantInt with APInt on the last stage of PR1255 implementation.
>
> P.S.: Sorry for latency in reply. I'm on my vocation till 5th March.
>
> -Stepan.
More information about the llvm-commits
mailing list