[LLVMdev] X86 sub_ss and sub_sd sub-register indexes
Jakob Stoklund Olesen
jolesen at apple.com
Thu Jul 26 10:04:34 PDT 2012
On Jul 26, 2012, at 9:43 AM, dag at cray.com wrote:
> Jakob Stoklund Olesen <jolesen at apple.com> writes:
>
>> As far as I can tell, all sub-register operations involving sub_ss and
>> sub_sd can simply be replaced with COPY_TO_REGCLASS:
>>
>> def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)),
>> (VMOVSDrr VR128:$src1, (EXTRACT_SUBREG (v4i32 VR128:$src2),
>> sub_sd))>;
>>
>> Becomes:
>>
>> def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)),
>> (VMOVSDrr VR128:$src1, (COPY_TO_REGCLASS VR128:$src2, FR64))>;
>
> A few questions:
>
> Will COPY_TO_REGCLASS actually generate a copy instruction or can
> TableGen/isel fold it away?
Both EXTRACT_SUBREG and COPY_TO_REGCLASS are emitted as COPY instructions by InstrEmitter: the former as a sub-register copy, the latter as a full register copy. Both are handled by the register coalescer.
It would actually be possible to have EmitCopyToRegClassNode() try to call MRI->constrainRegClass() first, just like AddRegisterOperand() does. That could avoid the copy in some cases, and you would simply get a VR128 register as the second VMOVSDrr operand. I am not proposing we do that for now. Let the register coalescer deal with that.
> What happens if the result of the above pattern using COPY_TO_REGCLASS
> is spilled? Will we get a 64-bit store or a 128-bit store?
This behavior isn't affected by the change. FR64 registers are spilled with 64-bit stores, and VR128 registers are spilled with 128-bit stores.
When the register coalescer removes a copy between VR128 and FR64 registers, it chooses the larger spill size for the result. This is the same for sub-register copies and full register copies.
The important point here is that VR128 is a sub-class of FR64, so getCommonSubClass(VR128, FR64) -> VR128. This is the Liskov substitution principle for register classes: any VR128 register can be used wherever an FR64 register is required.
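The sub-class relation can be illustrated with a small stand-alone toy model. This is only a sketch and not LLVM's implementation: RegClass, isSubClassOf, commonSubClass and makeToyClasses are hypothetical names, registers are plain integers, and the sub-class rule used here (register subset plus a spill size at least as large) is a simplifying assumption.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy model (hypothetical, not LLVM code): a register class is a named
// set of register numbers plus the size of its spill slot in bits.
struct RegClass {
    std::string name;
    std::set<int> regs;
    unsigned spillBits;
};

// Assumed rule: A is a sub-class of B if every register in A is also in B
// and A's spill slot is at least as large as B's.
bool isSubClassOf(const RegClass &a, const RegClass &b) {
    if (a.spillBits < b.spillBits)
        return false;
    for (int r : a.regs)
        if (!b.regs.count(r))
            return false;
    return true;
}

// Returns the largest known class that is a sub-class of both a and b,
// or nullptr if there is none.
const RegClass *commonSubClass(const std::vector<RegClass> &all,
                               const RegClass &a, const RegClass &b) {
    const RegClass *best = nullptr;
    for (const RegClass &c : all)
        if (isSubClassOf(c, a) && isSubClassOf(c, b) &&
            (!best || c.regs.size() > best->regs.size()))
            best = &c;
    return best;
}

// Three toy classes over the same 16 registers (standing in for XMM0-XMM15),
// differing only in spill size.
std::vector<RegClass> makeToyClasses() {
    std::set<int> xmm;
    for (int i = 0; i < 16; ++i)
        xmm.insert(i);
    return {
        {"FR64", xmm, 64},
        {"VR128", xmm, 128},
        {"FR32", xmm, 32},
    };
}
```

In this model, asking for the common sub-class of the toy VR128 and FR64 classes yields VR128 with its 128-bit spill slot, which mirrors the coalescer picking the larger spill size described above.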
/jakob