[LLVMdev] FP emulation (continued)

Mon Nov 27 11:44:13 PST 2006

On Mon, 20 Nov 2006, Roman Levenstein wrote:
>> The first step is to get something simple like this working:
>>
>> void %foo(double* %P) {
>>    store double 0.0, double* %P
>>    ret void
>> }
>>
>> This will require the legalizer to turn the double 0.0 into two
>> integer zeros, and the store into two integer stores.
>
> Sample code like this, i.e. simple stores, loads or even some
> arithmetic operations works fine now. No problems.

Great.

> But there are big issues with correct legalization and expansion, i.e.
> with ExpandOp() and LegalizeOp(). I don't know how to explain it
> properly, but basically these functions assume at many places that in
> the case where an MVT requires more than one register this MVT is
> always an integer type. There are some assertions checking for it, and
> there are quite a few places where it is assumed. Moreover, since
> getTypeAction(MVT::f64) now returns Expand, the legalizer tries to
> expand too much and BTW it does not check for getOperationAction or
> something like that in this case. For example, it tries to expand also
> all the operations like ADD, SUB, etc into operations on the halves of
> f64 (probably because it thinks it is an integer ;-) even though for
> such operations I do not need any expansion, since they are
> implemented as library functions.

Right.  These places will have to be updated, and code to handle the new 
cases needs to be added.
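
As a rough illustration of the kind of check that is missing, here is a toy
C++ sketch.  Every name in it (ToyType, ToyOp, Lowering, chooseLowering) is
made up for the example and is not the actual legalizer code.  It only models
the decision: before splitting an operation on a two-register type into
halves, ask whether the type is floating point and whether the operation is
meant to become a library call.

```cpp
#include <cassert>

// Hypothetical toy model; none of these names exist in LLVM.
enum class ToyType { I32, I64, F64 };
enum class ToyOp { Load, Store, Add, Sub };
enum class Lowering { Keep, SplitIntoHalves, LibCall };

// Assumption: a 32-bit target, so 64-bit values need two registers.
bool needsTwoRegisters(ToyType ty) {
  return ty == ToyType::I64 || ty == ToyType::F64;
}

Lowering chooseLowering(ToyOp op, ToyType ty) {
  if (!needsTwoRegisters(ty))
    return Lowering::Keep;
  bool isMemoryOp = (op == ToyOp::Load || op == ToyOp::Store);
  // An expanded f64 is not an integer: its arithmetic goes to a library
  // function instead of being rewritten as operations on the halves.
  if (ty == ToyType::F64 && !isMemoryOp)
    return Lowering::LibCall;
  // Loads and stores of either type, and i64 arithmetic, are split.
  return Lowering::SplitIntoHalves;
}
```

In this model an f64 ADD maps to a library call while an f64 store is still
split into two i32 stores, which is the distinction the existing code does
not make.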

> For most of the places assuming the integer type to be expanded, I
> inserted some code to explicitly check if MVT::f64 is being expanded.
> This worked for most of the cases, but not for all. In particular I
> cannot solve the SELECT_CC on f64 expansion. It generates a target
> specific SELECT_CC node that correctly contains pairs of i32 for the
> TrueValue and FalseValue. But when the value of this operation is used
> later, the expander tries to expand its result. And it cannot do
> it, since it seems to have a problem with EXTRACT_ELEMENT applied to
> SELECT_CC mentioned above. The problem is probably that it cannot
> extract the corresponding halves from the target specific SELECT_CC
> node (and it can do it without problems for usual integer-based
> ISD::SELECT_CC nodes). At this place I got stuck, since I do not see
> how I can overcome it.

I don't follow.  Can you try explaining it again and include the relevant
code that isn't working for you?

> Overall, changing the legalizer to support the expansion of
> MVT::f64 proves to be more complicated than I initially expected. And it
> also seems to be a bit of overkill. Therefore I was thinking about the
> special pass after code selection, but before register allocation.

Ok.

> After all, I just want to do a transformation on all instructions that
> read or write from/into virtual f64 regs.
>
>  load/store vregf64, val
> ->
>  load/store vregi32_1, val_low
>  load/store vregi32_2, val_high
>
> My subjective feeling is that it can be done more easily in a separate
> pass rather than changing the legalizer all over the place in a rather
> non-elegant way.

You could do this, but it's not the "right" way to go, for the same 
reason that expanding i64 -> 2x i32 after isel isn't the right thing to 
do.  Doing this 'late' requires lots of bogus instructions/register files 
to be added to the target, it doesn't allow the dag combiner to optimize 
and eliminate redundant expressions (for example, 'store double 0.0' 
should only materialize 0 into one 32-bit register, not two zeros), and it 
generally isn't in the spirit of the current infrastructure.
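
To make the 'store double 0.0' point concrete, here is a small self-contained
C++ sketch.  splitDouble and halvesShareRegister are hypothetical helpers, not
LLVM functions, and the code assumes a 64-bit IEEE-754 double.  It splits the
bit pattern of a constant into the two 32-bit halves that the two integer
stores would write, and reports whether both halves are the same value, in
which case one materialized register could feed both stores.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <utility>

// Hypothetical helper: (low, high) 32-bit halves of a double's bit pattern.
std::pair<uint32_t, uint32_t> splitDouble(double value) {
  static_assert(sizeof(double) == sizeof(uint64_t),
                "sketch assumes a 64-bit double");
  uint64_t bits;
  std::memcpy(&bits, &value, sizeof bits);  // well-defined type punning
  return {static_cast<uint32_t>(bits), static_cast<uint32_t>(bits >> 32)};
}

// True when both halves are the same constant, so a single 32-bit register
// holding that constant could be used for both integer stores.
bool halvesShareRegister(double value) {
  std::pair<uint32_t, uint32_t> halves = splitDouble(value);
  return halves.first == halves.second;
}
```

Under these assumptions 0.0 splits into two zero halves, so one register is
enough, while 1.0 splits into a low half of 0 and a high half of 0x3FF00000
and needs two different constants.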

I realize that you may not be interested in getting the best possible
solution (time pressures may be more important), but be aware that you will
lose performance if you do a pass after isel time to handle this.

>> The best approach is to make the legalizer do this transformation.
>
> I believe you, since you certainly know it better than I do. But I
> experienced quite some problems, as I described above. Now, if we would
> assume for a second that this approach with a separate pass makes some
> sense. I'm just curious how I could insert a new pass after the code
> selection, but before any other passes including register allocation? I

This should be workable.

> have not found any easy way to do it yet. For post-RA pass it is very
> easy and supported, but for pre-RA or post-code-selection it is
> non-obvious.

I suggest a third approach:

1. Add an f64 register class to the target.
2. Add FP pseudo instructions that are produced by the isel, these use the
    f64 register class.
3. Write a machine function pass that runs before the RA that translates
    these instructions into libcalls or other integer ops.  This would
    lower the f64 pseudo regs into 2x i32 pseudo regs.  The real RA should
    never see the bogus f64 regs.
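
Step 3 could look roughly like the following toy C++ sketch.  It does not use
the real machine-code classes: ToyInstr, ToyFunction and lowerF64Pseudos are
invented for the example, only loads and stores are modeled (libcalls are
left out), and it assumes the low half lives at the lower address, 4 bytes
before the high half.  The idea it shows is the rewrite itself: each f64
virtual register is mapped once to a pair of fresh i32 virtual registers, and
every instruction that touches it becomes two i32 instructions.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Toy stand-ins for machine instructions; none of these are LLVM types.
enum class Kind { Load, Store };
enum class RegClass { I32, F64 };

struct ToyInstr {
  Kind kind;
  unsigned vreg;      // virtual register that is loaded into or stored from
  RegClass regClass;  // register class of vreg
  int offset;         // byte offset of the memory operand
};

struct ToyFunction {
  std::vector<ToyInstr> instrs;
  unsigned nextVReg = 0;  // first unused virtual register number
};

// Rewrites every f64 pseudo load/store into two i32 operations so that
// nothing in the F64 class is left for the register allocator.
void lowerF64Pseudos(ToyFunction &fn) {
  // f64 vreg -> (low i32 vreg, high i32 vreg)
  std::map<unsigned, std::pair<unsigned, unsigned>> halves;
  std::vector<ToyInstr> lowered;
  for (const ToyInstr &mi : fn.instrs) {
    if (mi.regClass != RegClass::F64) {
      lowered.push_back(mi);  // integer instructions pass through unchanged
      continue;
    }
    auto it = halves.find(mi.vreg);
    if (it == halves.end()) {
      it = halves.emplace(mi.vreg,
                          std::make_pair(fn.nextVReg, fn.nextVReg + 1)).first;
      fn.nextVReg += 2;
    }
    // Assumed layout: low half at offset, high half at offset + 4.
    lowered.push_back({mi.kind, it->second.first, RegClass::I32, mi.offset});
    lowered.push_back({mi.kind, it->second.second, RegClass::I32,
                       mi.offset + 4});
  }
  fn.instrs = std::move(lowered);
}
```

A real pass would also have to rewrite the FP arithmetic pseudos into calls
and deal with copies and other cross-block uses, which this sketch ignores.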

> P.S. A minor off-topic question: Is it possible to explain to the LLVM
> backend that "float" is the same type as "double" on my target? I
> managed to explain it for immediates and also told it to promote f32 to
> f64. But it does not work for float variables or parameters, because
> LLVM considers them to be float in any case and to have a 32-bit
> representation in memory. Or do I need to handle this equivalence in
> the front-end only?

If you tell the code generator to promote f32 to f64, it will handle 90% 
of the work for you.  The remaining pieces are in the 
lowercall/lowerarguments/lower return code, where you do need to specify 
how these are passed.  Usually just saying they are in f64 registers 
should be enough.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/