[llvm-commits] [PATCH] UINT_TO_FP of vectors

Dirk Steinke steinke.dirk.ml at googlemail.com
Wed Mar 16 14:08:36 PDT 2011


On 03/16/2011 10:00 PM, Dirk Steinke wrote:
> Hi Nadav,
>
> On 03/16/2011 04:37 PM, Rotem, Nadav wrote:
>> Hi,
>>
>> I attached a patch for legalizing UINT_TO_FP of vectors on platforms
>> which do not have this operation (such as X86). The legalized code uses
>> the vector INT_TO_FP operations and is faster than scalarizing.
>>
> [snip]
>> SDValue VectorLegalizer::ExpandUINT_TO_FLOAT(SDValue Op) {
>> +
>> +
>> + EVT VT = Op.getOperand(0).getValueType();
>> + DebugLoc DL = Op.getDebugLoc();
>> +
>> + // Make sure that the SINT_TO_FP and SRL instructions are available.
>> + if (!TLI.isOperationLegalOrCustom(ISD::SINT_TO_FP, VT) ||
>> + !TLI.isOperationLegalOrCustom(ISD::SRL, VT))
>> + return DAG.UnrollVectorOp(Op.getNode());
>> +
>> + EVT SVT = VT.getScalarType();
>> + assert((SVT.getSizeInBits() == 64 || SVT.getSizeInBits() == 32) &&
>> + "Elements in vector-UINT_TO_FP must be 32 or 64 bits wide");
>> +
>> + unsigned BW = SVT.getSizeInBits();
>> + SDValue HalfWord = DAG.getConstant(BW/2, VT);
>> +
>> + // Constants to clear the upper part of the word.
>> + // Notice that we can also use SHL+SHR, but using a constant is slightly
>> + // faster on x86.
>> + uint64_t HWMask = (SVT.getSizeInBits()==64)?0x00000000FFFFFFFF:0x0000FFFF;
>> + SDValue HalfWordMask = DAG.getConstant(HWMask, VT);
>> +
>> + // Two to the power of half-word-size.
>> + SDValue TWOHW = DAG.getConstantFP((1ULL<<(BW/2)), Op.getValueType());
>> +
>> + // Clear upper part of LO, lower HI
>> + SDValue HI = DAG.getNode(ISD::SRL, DL, VT, Op.getOperand(0), HalfWord);
>> + SDValue LO = DAG.getNode(ISD::AND, DL, VT, Op.getOperand(0), HalfWordMask);
>> +
>> + // Convert hi and lo to floats
>> + // Convert the hi part back to the upper values
>> + SDValue fHI = DAG.getNode(ISD::SINT_TO_FP, DL, Op.getValueType(), HI);
>> + fHI = DAG.getNode(ISD::FMUL, DL, Op.getValueType(), fHI, TWOHW);
>> + SDValue fLO = DAG.getNode(ISD::SINT_TO_FP, DL, Op.getValueType(), LO);
>> +
>> + // Add the two halves
>> + return DAG.getNode(ISD::FADD, DL, Op.getValueType(), fHI, fLO);
>> +}
>> +
>
> thanks for working on this, but your code seems suboptimal to me. If I'm
> not mistaken, you should be able to turn
> UINT_TO_FP(a) into SINT_TO_FP(a & ~SIGNBIT) - SINT_TO_FP(a & SIGNBIT)
> which gets rid of one floating point multiplication, and replaces one
> shift by an AND, but at the cost of one extra vector constant. In
> theory, using PANDN on x86, one memory load should be enough, but
> well... What do you think?
>
> Dirk
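
For readers following the arithmetic, here is a minimal scalar sketch of the two expansions discussed above, for 32-bit elements only. It is not the patch itself: the function names and the as_signed helper are hypothetical, and the result is a double so that every step is exact. With a float result the intermediate conversions round separately and the outcome can differ in the last bit, which this sketch does not model.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret the bits of a 32-bit unsigned value as
// signed. memcpy avoids relying on implementation-defined narrowing.
static int32_t as_signed(uint32_t u) {
  int32_t s;
  std::memcpy(&s, &u, sizeof s);
  return s;
}

// Half-word split, as in the quoted patch: both halves fit in 16 bits,
// so the signed conversions never see a set sign bit.
double uint_to_fp_split(uint32_t a) {
  double hi = static_cast<double>(as_signed(a >> 16));
  double lo = static_cast<double>(as_signed(a & 0xFFFFu));
  return hi * 65536.0 + lo;  // 65536 = 2^(32/2)
}

// Sign-bit masking: (a & SIGNBIT) reads as 0 or -2^31 when signed,
// so subtracting its conversion adds 2^31 back.
double uint_to_fp_mask(uint32_t a) {
  const uint32_t SignBit = 0x80000000u;
  double low = static_cast<double>(as_signed(a & ~SignBit));
  double top = static_cast<double>(as_signed(a & SignBit));
  return low - top;
}

// Compare both variants against the direct conversion for one input.
void check(uint32_t a) {
  assert(uint_to_fp_split(a) == static_cast<double>(a));
  assert(uint_to_fp_mask(a) == static_cast<double>(a));
}
```

Calling check(0x80000000u) or check(0xFFFFFFFFu) exercises the inputs where a plain signed conversion would go negative.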

Hi Nadav,
now that I think about it, why not simply
UINT_TO_FP(a) = SINT_TO_FP(a - SIGNBIT) + FLOAT(SIGNBIT)?

Dirk
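
A minimal scalar sketch of the bias form suggested in this message, for 32-bit elements, may help check the identity. The names uint_to_fp_bias, check_bias and as_signed are hypothetical, and a double result is assumed so that both steps are exact. With a float result the conversion and the addition each round, so the sum is not guaranteed to equal a single correctly rounded conversion.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret the bits of a 32-bit unsigned value as
// signed. memcpy avoids relying on implementation-defined narrowing.
static int32_t as_signed(uint32_t u) {
  int32_t s;
  std::memcpy(&s, &u, sizeof s);
  return s;
}

// Bias form: subtract the sign bit in integer arithmetic, convert as signed,
// then add the bias back in floating point.
double uint_to_fp_bias(uint32_t a) {
  const uint32_t SignBit = 0x80000000u;
  // Unsigned subtraction wraps, which flips the top bit; read as signed,
  // the result equals a - 2^31 for every a.
  double shifted = static_cast<double>(as_signed(a - SignBit));
  return shifted + 2147483648.0;  // FLOAT(SIGNBIT) = 2^31
}

// Compare against the direct conversion for one input.
void check_bias(uint32_t a) {
  assert(uint_to_fp_bias(a) == static_cast<double>(a));
}
```

Compared with the masking form, this needs a single signed conversion, at the cost of one integer subtract and one floating-point add.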


