[LLVMdev] SSE Scalar Convert Intrinsics
Dan Gohman
gohman at apple.com
Fri Jun 5 13:19:01 PDT 2009
On Jun 5, 2009, at 8:51 AM, David Greene wrote:
> I have a question about the SSE scalar convert intrinsics.
>
> cvtsd2si is defined thusly:
>
> def int_x86_sse2_cvtsd2si64 :
> GCCBuiltin<"__builtin_ia32_cvtsd2si64">,
> Intrinsic<[llvm_i64_ty, llvm_v2f64_ty], [IntrNoMem]>;
>
> This matches the signature of the GCC intrinsic. The fact that the GCC
> intrinsic has a type mismatch on the input (vector rather than scalar)
> is strange, but ok, we'll run with it.
>
> Until this:
>
> def Int_CVTSD2SIrm : SDI<0x2D, MRMSrcMem, (outs GR32:$dst),
>                          (ins f128mem:$src),
>                          "cvtsd2si\t{$src, $dst|$dst, $src}",
>                          [(set GR32:$dst, (int_x86_sse2_cvtsd2si
>                            (load addr:$src)))]>;
>
> Er, this makes us load a 128-bit quantity, which is almost certainly not
> what we want.
Yes, that looks wrong, even if it happens to produce something
that works.
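
To make the width mismatch concrete, here is a rough C sketch. It
assumes the GCC/Clang vector extensions; the type name v2f64 and both
helper functions are made up for illustration and are not anything in
the tree. The first helper reads 16 bytes, which is what an f128mem
operand implies; the second reads only the 8 bytes the conversion
actually consumes.

```c
#include <string.h>

/* Assumed stand-in for LLVM's <2 x double>, via GCC/Clang vector extensions. */
typedef double v2f64 __attribute__((vector_size(16)));

/* Hypothetical: models a 128-bit memory operand. Reads 16 bytes from p,
   then keeps only element 0 (the lowest-addressed double). */
double low_via_vector_load(const void *p)
{
    v2f64 v;
    memcpy(&v, p, sizeof v);   /* 16-byte read */
    return v[0];
}

/* Hypothetical: models a 64-bit memory operand. Reads just the 8 bytes
   that a scalar convert needs. */
double low_via_scalar_load(const void *p)
{
    double d;
    memcpy(&d, p, sizeof d);   /* 8-byte read */
    return d;
}
```

Both return the same value whenever 16 bytes are readable at p, but
only the second is safe when the double is the last thing in a mapped
region.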
>
>
> Do we need two intrinsics for these scalar converts, one to satisfy the
> (arguably broken) GCC interface and one to really reflect the operation
> as specified by the ISA?
That's what's done for most other instructions, unfortunately.
For cvtsd2si, there's currently no "normal" version in the tree,
but if you add one, it wouldn't be alone.
One thing we'd like to do at some point is have front-ends lower
intrinsics for scalar instructions into
extractelement+op+insertelement, so that we don't need two
versions of each of the instructions. Doing this for everything
will require some work to make sure that the extra insert/extract
operators don't incur unnecessary copying, but that's also
something we'd like to do regardless.
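
As a rough illustration of the extractelement+op+insertelement shape,
here is a minimal C sketch using the GCC/Clang vector extensions. The
type name v2f64 and the function add_sd_lowered are hypothetical, and a
scalar add stands in for whichever scalar instruction is being lowered;
this is not how any front-end currently does it.

```c
/* Assumed stand-in for LLVM's <2 x double>, via GCC/Clang vector extensions. */
typedef double v2f64 __attribute__((vector_size(16)));

/* Hypothetical helper: a scalar add on element 0, spelled out as
   extract + op + insert. Element 1 of a passes through unchanged. */
v2f64 add_sd_lowered(v2f64 a, v2f64 b)
{
    double lhs = a[0];       /* extractelement a, 0 */
    double rhs = b[0];       /* extractelement b, 0 */
    double sum = lhs + rhs;  /* plain scalar add */
    a[0] = sum;              /* insertelement a, sum, 0 */
    return a;
}
```

For example, calling it with a = {1.5, 10.0} and b = {2.25, 20.0}
yields {3.75, 10.0}. The extra extract/insert steps are exactly the
ones that would need to fold away to avoid unnecessary copying.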
Dan