[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

Thu Jul 26 11:16:22 PDT 2012

Jakob Stoklund Olesen <jolesen at apple.com> writes:

>> If the 128-bit register is not ever used as a 128-bit register,
>> shouldn't the coalescer pick the 64- or 32-bit register?
>
> That optimization is not currently implemented for sub-registers. For
> example, if you create a GR64 virtual register and only ever use the
> sub_32bit sub-register, it would be possible to replace the virtual
> register with a GR32 register. It's not impossible to do, but it
> doesn't come up a lot.

It does come up a lot in vector code.  Extracting scalar values from
vectors is pretty common, especially given the limitations of SSE/AVX,
and we have typically done it with EXTRACT_SUBREG.  So we would either
have to prevent coalescing to avoid a 128-bit spill, or always use a
128-bit spill even when nothing but the scalar value is ever used.

Neither option is a good one.

> When not using sub-registers, the optimization does exist. For
> example, if you have a VR128 virtual register, but all the
> instructions using it only require FR32, MRI->recomputeRegClass() will
> figure it out, and downgrade to FR32.

I don't think this optimization applies here, because the SSE/AVX
instruction defines a full vector register even though we never use the
upper elements.
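
To make the point concrete, here is a minimal C++ sketch of the argument
above.  It is a toy model and not what MRI->recomputeRegClass() does
internally: the enum, the use of spill sizes in bytes as enum values, and
the helper narrowestClassFor are all made up for illustration.  The only
assumption carried over from the discussion is that every instruction
defining or using the virtual register contributes a class requirement.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy stand-ins for three X86 register classes that share the XMM
// registers. Each enum value is the spill size in bytes (an assumption
// of this sketch, chosen so that "wider" compares as "greater").
enum RegClass : unsigned { FR32 = 4, FR64 = 8, VR128 = 16 };

// Hypothetical helper, not an LLVM API: given the class that each
// defining or using instruction requires for one virtual register,
// return the narrowest class that still satisfies every operand.
// In this toy model that is simply the widest requirement seen.
RegClass narrowestClassFor(const std::vector<RegClass> &OperandClasses) {
  assert(!OperandClasses.empty() && "need at least one operand");
  return *std::max_element(OperandClasses.begin(), OperandClasses.end());
}
```

With only scalar operands, narrowestClassFor({FR32, FR32}) yields FR32,
so a 4-byte spill slot would do.  As soon as one operand is a vector
def, as in narrowestClassFor({VR128, FR32}), the result stays at VR128
and the spill stays at 16 bytes, even if the upper elements are dead.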

Would adding Fs patterns for these cases, forcing the result register to
FR64, help?

What does Fs mean anyway, "fake scalar?"  :)

                                -Dave