[PATCH] [X86][FastISel] Teach how to select float-half conversion intrinsics.
Andrea Di Biagio
Andrea_DiBiagio at sn.scee.net
Fri Feb 20 08:06:10 PST 2015
Here is a new version of the patch, which hopefully addresses all your comments.
This patch checks that the operand type of intrinsic 'convert_to_fp16' is 'float', and that the return type of intrinsic 'convert_from_fp16' is 'float'. Those checks are required because both intrinsics may accept 'any' floating point type (even 'double' and 'long double').
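To make the two type checks concrete, here is a minimal, self-contained C++ sketch. It does not use the LLVM API: the enums 'Ty' and 'Conv' and the function 'canSelectConversion' are hypothetical stand-ins invented for illustration. The actual check in the patch operates on the operand and return types of the intrinsic call.

```cpp
// Hypothetical stand-ins for the IR types involved; these are NOT LLVM types.
enum class Ty { I16, Float, Double, LongDouble };

// Hypothetical tags for the two intrinsics discussed above.
enum class Conv { ToFP16, FromFP16 };

// Returns true only for the float-to-half and half-to-float cases.
// For convert_to_fp16 the operand type must be 'float';
// for convert_from_fp16 the return type must be 'float'.
// Anything else (e.g. 'double' or 'long double') is rejected, so the
// selector would bail out instead of picking a wrong instruction.
constexpr bool canSelectConversion(Conv kind, Ty operandTy, Ty resultTy) {
  return kind == Conv::ToFP16 ? operandTy == Ty::Float
                              : resultTy == Ty::Float;
}

// Compile-time sanity checks for the two supported cases.
static_assert(canSelectConversion(Conv::ToFP16, Ty::Float, Ty::I16),
              "float-to-half should be accepted");
static_assert(canSelectConversion(Conv::FromFP16, Ty::I16, Ty::Float),
              "half-to-float should be accepted");
```

With this model, a call such as canSelectConversion(Conv::ToFP16, Ty::Double, Ty::I16) returns false, which corresponds to fast-isel giving up on a double-to-half conversion.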
As you suggested, I added another test (named 'fast-isel-float-double-convertion.ll') to check that fast-isel doesn't accidentally select a wrong instruction for double-to-half conversions. This new test is currently marked XFAIL since fast-isel only knows how to select float-to-half and half-to-float conversions.
In your review of the previous patch you suggested using an INSERT_SUBREG to perform an element insertion into a vector.
However, INSERT_SUBREG requires a valid sub-register index operand to identify which sub-register we want to address. Unfortunately, register class VR128 doesn't allow the use of any sub-register index; therefore we cannot use insert_subreg to address the lower 32 bits of a VR128 register.
Instead, I implemented the element insertion (from GR32 to VR128) using the tablegen'd function 'fastEmit_r' to emit the equivalent of a SCALAR_TO_VECTOR.
Conversions from FR32 to VR128 are implicitly handled by method 'constrainOperandRegClass' (used by all the 'fastEmitInst_*' methods in FastISel).
We cannot use an 'extract_subreg' to extract an FR32 from a VR128 for the same reason we cannot use 'insert_subreg' to promote an FR32 to VR128 (i.e. there is no sub-register index that we can use). I found out that it is perfectly ok to 'copy' from register class VR128 to class FR32; the two classes are basically identical except for the accepted value types. This is also what ISel normally does when promoting FR32 to VR128 (and when going from VR128 to FR32). See for example the following tablegen patterns in X86InstrSSE.td:
(f32 (vector_extract (v4f32 VR128:$src), (iPTR 0))) ->
(COPY_TO_REGCLASS (v4f32 VR128:$src), FR32)
(v4f32 (scalar_to_vector FR32:$src)) ->
(COPY_TO_REGCLASS FR32:$src, VR128)
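The reasoning above (no usable sub-register index, but a plain copy between the two classes is fine) can be illustrated with a toy model. This is only a sketch: 'ToyRegClass' and the helper functions are hypothetical names, not LLVM's TargetRegisterClass API, and the register list (XMM0 to XMM15) and value-type sets are simplifying assumptions.

```cpp
#include <set>
#include <string>

// Hypothetical, simplified model of a register class (not an LLVM type).
struct ToyRegClass {
  std::set<std::string> regs;          // physical registers in the class
  std::set<std::string> valueTypes;    // value types the class accepts
  std::set<std::string> subRegIndices; // sub-register indices usable on it
};

// Assumed register list shared by both toy classes: XMM0 .. XMM15.
inline std::set<std::string> makeXmmRegs() {
  std::set<std::string> regs;
  for (int i = 0; i < 16; ++i)
    regs.insert("XMM" + std::to_string(i));
  return regs;
}

// Toy FR32: scalar f32 values, no sub-register indices.
inline ToyRegClass makeToyFR32() { return {makeXmmRegs(), {"f32"}, {}}; }

// Toy VR128: 128-bit vector values (subset listed), no sub-register indices.
inline ToyRegClass makeToyVR128() {
  return {makeXmmRegs(), {"v4f32", "v2f64", "v4i32"}, {}};
}

// insert_subreg / extract_subreg need an index naming the addressed part.
inline bool hasSubRegIndex(const ToyRegClass &rc) {
  return !rc.subRegIndices.empty();
}

// In this model a plain register copy is possible when both classes
// cover the same physical registers, regardless of value types.
inline bool canPlainCopy(const ToyRegClass &from, const ToyRegClass &to) {
  return from.regs == to.regs;
}
```

In this model hasSubRegIndex(makeToyVR128()) is false while canPlainCopy(makeToyVR128(), makeToyFR32()) is true, which mirrors why the patch relies on a register copy (the equivalent of COPY_TO_REGCLASS) rather than on insert_subreg or extract_subreg.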
Please let me know if ok to submit.