[LLVMdev] SSE Scalar Convert Intrinsics

Fri Jun 5 13:33:28 PDT 2009

On Jun 5, 2009, at 1:19 PM, Dan Gohman wrote:

>
> On Jun 5, 2009, at 8:51 AM, David Greene wrote:
>
>> I have a question about the SSE scalar convert intrinsics.
>>
>> cvtsd2si is defined thusly:
>>
>> def int_x86_sse2_cvtsd2si64 :
>> GCCBuiltin<"__builtin_ia32_cvtsd2si64">,
>>             Intrinsic<[llvm_i64_ty, llvm_v2f64_ty], [IntrNoMem]>;
>>
>> This matches the signature of the GCC intrinsic.  The fact that the GCC
>> intrinsic has a type mismatch on the input (vector rather than scalar)
>> is strange, but ok, we'll run with it.
>>
>> Until this:
>>
>> def Int_CVTSD2SIrm : SDI<0x2D, MRMSrcMem, (outs GR32:$dst),
>>                        (ins f128mem:$src),
>>                        "cvtsd2si\t{$src, $dst|$dst, $src}",
>>                        [(set GR32:$dst, (int_x86_sse2_cvtsd2si
>>                                          (load addr:$src)))]>;
>>
>> Er, this makes us load a 128-bit quantity, which is almost certainly not
>> what we want.
>
> Yes, that looks wrong, even if it happens to produce something that
> works.
>
>>
>>
>> Do we need two intrinsics for these scalar converts, one to satisfy the
>> (arguably broken) GCC interface and one to really reflect the operation
>> as specified by the ISA?
>
> That's what's done for most other instructions, unfortunately.
> For cvtsd2si, there's currently no "normal" version in the tree,
> but if you add one, it wouldn't be alone.
>
> One thing we'd like to do at some point is have front-ends lower
> intrinsics for scalar instructions into
> extractelement+op+insertelement, so that we don't need two
> versions of each of the instructions.  Doing this for everything
> will require some work to make sure that the extra insert/extract
> operators don't incur unnecessary copying, but that's also
> something we'd like to do regardless.

Agreed!

Nate
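
For readers following the thread, the extractelement+op+insertelement
lowering Dan describes can be sketched in plain C. This is only a rough
model, not LLVM code: the v2f64 struct and the helper names
(extract_element, insert_element, add_sd, cvtt_low_to_i32) are
hypothetical stand-ins invented for illustration. The convert helper
uses a truncating cast, so it models the truncating form of the
conversion rather than cvtsd2si's rounding-mode-dependent behaviour.

```c
#include <stdint.h>

/* Hypothetical stand-in for a <2 x double> (v2f64) vector value. */
typedef struct { double lane[2]; } v2f64;

/* Model of extractelement: read one lane of the vector. */
double extract_element(v2f64 v, int i) { return v.lane[i]; }

/* Model of insertelement: return a copy of v with lane i replaced. */
v2f64 insert_element(v2f64 v, double x, int i) {
    v.lane[i] = x;
    return v;
}

/* A scalar SSE-style add expressed as extract + op + insert:
   only lane 0 is computed, and the upper lane of a passes through. */
v2f64 add_sd(v2f64 a, v2f64 b) {
    double sum = extract_element(a, 0) + extract_element(b, 0);
    return insert_element(a, sum, 0);
}

/* A scalar convert needs only extract + op, since the result is an
   integer.  Assumption: truncation toward zero, input in int32 range. */
int32_t cvtt_low_to_i32(v2f64 v) {
    return (int32_t)extract_element(v, 0);
}
```

In this model only the low 64 bits of the vector ever feed the
operation, which is why a memory form of the convert would only need a
64-bit load.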